Harsh,

Thanks, that was why the native libs were not being loaded -- my cluster is 
Linux, but I was submitting the command from a Mac.

Is there any way to force Hadoop to use the built-in Java codec classes, to avoid 
this native-library dependency?
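
For what it's worth, here is the kind of thing I was imagining -- I'm guessing 
at the property name (it appears to be "hadoop.native.lib" in 0.20.x/1.x and 
"io.native.lib.available" in later releases), and the DefaultCodec swap is just 
a hunch that its pure-Java zlib fallback would sidestep the problem:

        // Guess: a config knob to disable use of the native library; the
        // property name may differ between Hadoop versions.
        job.getConfiguration().setBoolean("hadoop.native.lib", false);

        // Or: swap GzipCodec for org.apache.hadoop.io.compress.DefaultCodec,
        // which I believe can fall back to java.util.zip when native zlib
        // isn't available.
        SequenceFileOutputFormat.setOutputCompressorClass(job, DefaultCodec.class);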

Thanks so much for your help!!!
Scott Farrar


On Feb 13, 2012, at 12:43 PM, Harsh J wrote:

> Scott,
> 
> The Linux native libraries are only loaded if your platform is Linux
> and if the binaries are compatible with the architecture. Could you
> try the same command under Linux (VM or otherwise)?
> 
> On Tue, Feb 14, 2012 at 2:09 AM, Scott Farrar <[email protected]> wrote:
>> I'm trying to use "hadoop fs -text <sequencefile>" to print the contents of 
>> a sequence file via the command line.  This works fine when the sequence 
>> file is uncompressed, but when I turn on compression:
>>        ...
>>        job.setOutputKeyClass(Text.class);
>>        job.setOutputValueClass(Text.class);
>>        job.setOutputFormatClass(SequenceFileOutputFormat.class);
>>        SequenceFileOutputFormat.setCompressOutput(job, true);
>>        SequenceFileOutputFormat.setOutputCompressorClass(job, GzipCodec.class);
>>        SequenceFileOutputFormat.setOutputCompressionType(job, CompressionType.BLOCK);
>>        ...
>> 
>> I get the following:
>> 
>> 12/02/13 11:08:59 WARN util.NativeCodeLoader: Unable to load native-hadoop 
>> library for your platform... using builtin-java classes where applicable
>> 12/02/13 11:08:59 INFO compress.CodecPool: Got brand-new decompressor
>> text: null
>> 
>> Some questions about the error message:
>> 
>> (1) I have verified that the native-hadoop library is at 
>> $HADOOP_HOME/lib/native/Linux-i386-32/libhadoop.so. I am curious why Hadoop 
>> can't load it. The native library isn't necessary for my purposes -- I don't 
>> need native-level decompression performance, I'm just trying to manually 
>> spot-check my data -- so this is mostly curiosity.
>> 
>> (2) The message "text: null" suggests to me that a NullPointerException is 
>> being thrown. But I'm pretty sure there are no nulls in my data, because if I 
>> turn off compression (comment out the last three lines above) and run "hadoop 
>> fs -text <sequencefile>", I see the data I expect. Is there some other way I 
>> can verify that my data is not the cause of the problem?
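>> 
>> For what it's worth, I had been thinking of reading the file back directly 
>> with something like the sketch below (placeholder path, and assuming the 
>> usual org.apache.hadoop.conf/fs/io imports) -- though I suppose the reader 
>> would hit the same decompressor problem as "fs -text":
>> 
>>        Configuration conf = new Configuration();
>>        FileSystem fs = FileSystem.get(conf);
>>        Path path = new Path("<sequencefile>");  // placeholder path
>>        SequenceFile.Reader reader = new SequenceFile.Reader(fs, path, conf);
>>        Text key = new Text();
>>        Text value = new Text();
>>        while (reader.next(key, value)) {
>>            System.out.println(key + "\t" + value);
>>        }
>>        reader.close();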
>> 
>> Any pointers or suggestions you may be able to provide would be greatly 
>> appreciated.
>> 
>> Thank you,
>> Scott Farrar
>> 
> 
> 
> 
> -- 
> Harsh J
> Customer Ops. Engineer
> Cloudera | http://tiny.cloudera.com/about

 Scott Farrar
[email protected]



