Hi Marcos and Keith,

Thanks for bringing this to our attention. Saurabh is currently OOF, so I’ll pass this along to the EMR team.
Jeff

From: Marcos Ortiz Valmaseda [mailto:[email protected]]
Sent: Thursday, February 14, 2013 7:10 PM
To: [email protected]
Cc: Saurabh Baji; Barr, Jeffrey
Subject: Re: .deflate trouble

Regards, Keith.
For EMR issues, you can contact Jeff Barr (Chief Evangelist for AWS) or Saurabh Baji (Product Manager for AWS EMR) directly.
Best wishes.

________________________________
From: "Keith Wiley" <[email protected]>
To: [email protected]
Sent: Thursday, February 14, 2013 15:46:05
Subject: Re: .deflate trouble

Good call. We can't use the conventional web-based JT due to corporate access issues, but I looked at the job_XXX.xml file directly, and sure enough, it set mapred.output.compress to true. Now I just need to work out how that happened. I simply ran the wordcount example straight off the command line; I didn't specify any overridden conf settings for the job.

Ultimately, the solution (or part of it) is to get away from 0.19 to a more up-to-date version of Hadoop. I would prefer 2.0 over 1.0 in fact, but due to a remarkable lack of concise EC2/Hadoop documentation (and the fact that what docs I did find were very old and therefore conformed to 0.19-style Hadoop), I have fallen back on old versions of Hadoop for my initial tests. In the long run, I will need to get a more modern version of Hadoop to deploy successfully on EC2.

Thanks.

On Feb 14, 2013, at 15:02, Harsh J wrote:

> Did the job.xml of the job that produced this output also carry
> mapred.output.compress=false in it? The file should be viewable on the
> JT UI page for the job. Unless explicitly turned on, even 0.19
> wouldn't have enabled compression on its own.

________________________________________________________________________________
Keith Wiley [email protected] keithwiley.com music.keithwiley.com

"The easy confidence with which I know another man's religion is folly teaches
me to suspect that my own is also."
-- Mark Twain
________________________________________________________________________________

--
Marcos Ortiz Valmaseda,
Product Manager && Data Scientist at UCI<http://www.uci.cu>
Blog: http://marcosluis2186.posterous.com
LinkedIn: http://www.linkedin.com/in/marcosluis2186
Twitter: @marcosluis2186<http://twitter.com/marcosluis2186>
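
For anyone else who cannot reach the JT UI, the check Keith describes (reading the job's XML conf file directly) could be scripted roughly as below. This is only a sketch: it assumes the file uses the usual Hadoop `<configuration>/<property>/<name>/<value>` layout, the helper names `read_job_conf` and `output_compression_enabled` are made up for illustration, and the inline `SAMPLE` stands in for a real job_XXX.xml, which was not posted in this thread.

```python
import xml.etree.ElementTree as ET


def read_job_conf(xml_text):
    """Return a dict mapping property name -> value from a Hadoop conf XML string."""
    root = ET.fromstring(xml_text)
    conf = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            conf[name.strip()] = (prop.findtext("value") or "").strip()
    return conf


def output_compression_enabled(conf):
    """True if the job conf asks for compressed job output (hypothetical helper)."""
    return conf.get("mapred.output.compress", "false").lower() == "true"


# Illustrative stand-in for a job_XXX.xml; the values here are assumptions.
SAMPLE = """<configuration>
  <property><name>mapred.output.compress</name><value>true</value></property>
  <property><name>mapred.output.compression.codec</name>
            <value>org.apache.hadoop.io.compress.DefaultCodec</value></property>
</configuration>"""

conf = read_job_conf(SAMPLE)
print("output compression enabled:", output_compression_enabled(conf))
print("codec:", conf.get("mapred.output.compression.codec", "(not set)"))
```

To run it against a real file, read the file's contents into a string and pass that to `read_job_conf` in place of `SAMPLE`.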
