Re: [core-user] Help deflating output files

Martin Davidsson Fri, 31 Oct 2008 08:46:53 -0700

You can override this property by passing in -jobconf
mapred.output.compress=false to the hadoop binary, e.g.


hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-0.18.0-streaming.jar \
    -input "/user/root/input" -mapper 'cat' -reducer 'wc -l' \
    -output "/user/root/output" \
    -jobconf mapred.job.name="Experiment" \
    -jobconf mapred.output.compress=false
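
For part files that have already been written as .deflate, something along these lines may work without any Java. This is only a sketch: it assumes the files were produced by Hadoop's default codec and are therefore plain zlib streams, and that they have first been copied out of HDFS (for example with hadoop fs -get). The script name inflate.py and the function name inflate_stream are made up for illustration.

```python
import sys
import zlib


def inflate_stream(src, dst, chunk_size=64 * 1024):
    """Decompress a zlib stream read from src and write the raw bytes to dst.

    Assumption: the .deflate file is an ordinary zlib stream, which is what
    Hadoop's default codec is expected to write. Other codecs need other tools.
    """
    decompressor = zlib.decompressobj()
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        # Feed the data in pieces so large part files are never held in memory.
        dst.write(decompressor.decompress(chunk))
    # Emit whatever the decompressor is still buffering.
    dst.write(decompressor.flush())


if __name__ == "__main__":
    # Each argument is a local .deflate file (hypothetical usage:
    #   python inflate.py part-00000.deflate > part-00000.txt)
    for path in sys.argv[1:]:
        with open(path, "rb") as src:
            inflate_stream(src, sys.stdout.buffer)
```

If zlib raises an error on the first chunk, the file was probably written by a different codec and the assumption above does not hold.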

-- Martin


Jim R. Wilson wrote:
> 
> Hi all,
> 
> I'm using hadoop-streaming to execute Python jobs in an EC2 cluster.
> The output directory in HDFS has part-00000.deflate files - how can I
> deflate them back into regular text?
> 
> In my hadoop-site.xml, I unfortunately have:
> <property>
>   <name>mapred.output.compress</name>
>   <value>true</value>
> </property>
> <property>
>   <name>mapred.output.compression.type</name>
>   <value>BLOCK</value>
> </property>
> 
> Of course, I could rebuild my AMIs without this option, but is there
> some way I can read my deflate files without going through that
> hassle?  I'm hoping there's a command-line program to read these files
> since none of my code is Java.
> 
> Thanks in advance for any help. :)
> 
> -- Jim R. Wilson (jimbojw)
> 
> 

