Good call. We can't use the conventional web-based JT UI due to corporate access issues, but I looked at the job_XXX.xml file directly, and sure enough, it sets mapred.output.compress to true. Now I just need to figure out how that happened. I ran the wordcount example straight off the command line; I didn't specify any overridden conf settings for the job.
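For what it's worth, this is how I've been checking the flag without the JT UI. A minimal sketch: the XML below is a fabricated stand-in for a real job_XXX.xml, which (as I understand it) is the merge of hadoop-default.xml, the cluster's site config, and any per-job overrides, so a site-wide default could flip mapred.output.compress even when the job itself specifies nothing.

```shell
# Fabricated stand-in for job_XXX.xml; a real job file merges
# hadoop-default.xml, the cluster's hadoop-site.xml, and per-job -D overrides.
cat > /tmp/job_demo.xml <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapred.output.compress</name>
    <value>true</value>
  </property>
</configuration>
EOF

# Pull out the property's value; prints: <value>true</value>
grep -A1 '<name>mapred.output.compress</name>' /tmp/job_demo.xml \
  | grep -o '<value>[^<]*</value>'
```

To rule out a site-wide default, the job could be rerun with an explicit override, e.g. `-Dmapred.output.compress=false` before the other arguments, assuming the example driver accepts generic options on this version.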
Ultimately, the solution (or part of it) is to move away from 0.19 to a more up-to-date version of Hadoop. I would actually prefer 2.0 over 1.0, but due to a remarkable lack of concise EC2/Hadoop documentation (and the fact that what docs I did find were very old and therefore conformed to 0.19-style Hadoop), I fell back on old versions of Hadoop for my initial tests. In the long run, I will need a more modern version of Hadoop to deploy successfully on EC2.

Thanks.

On Feb 14, 2013, at 15:02 , Harsh J wrote:

> Did the job.xml of the job that produced this output also carry
> mapred.output.compress=false in it? The file should be viewable on the
> JT UI page for the job. Unless explicitly turned out, even 0.19
> wouldn't have enabled compression on its own.

________________________________________________________________________________
Keith Wiley     [email protected]     keithwiley.com     music.keithwiley.com

"The easy confidence with which I know another man's religion is folly teaches
me to suspect that my own is also."
   --  Mark Twain
________________________________________________________________________________
