I tried to do a quick and dirty inspection of some of our data feeds, which
are stored as gzip-compressed SequenceFiles.

Basically, I did:

a = load 'myfile' using ......SequenceFileLoader() AS ( mykey, myvalue);

but it failed with the following error:
2013-09-16 17:34:28,915 [Thread-5] INFO  org.apache.hadoop.io.compress.CodecPool - Got brand-new decompressor
2013-09-16 17:34:28,915 [Thread-5] INFO  org.apache.hadoop.io.compress.CodecPool - Got brand-new decompressor
2013-09-16 17:34:28,915 [Thread-5] INFO  org.apache.hadoop.io.compress.CodecPool - Got brand-new decompressor
2013-09-16 17:34:28,961 [Thread-5] WARN  org.apache.pig.piggybank.storage.SequenceFileLoader - Unable to translate key class com.mycompany.model.VisitKey to a Pig datatype
2013-09-16 17:34:28,962 [Thread-5] WARN  org.apache.hadoop.mapred.FileOutputCommitter - Output path is null in cleanup
2013-09-16 17:34:28,963 [Thread-5] WARN  org.apache.hadoop.mapred.LocalJobRunner - job_local_0001
org.apache.pig.backend.BackendException: ERROR 0: Unable to translate class com.mycompany.model.VisitKey to a Pig datatype
    at org.apache.pig.piggybank.storage.SequenceFileLoader.setKeyType(SequenceFileLoader.java:78)
    at org.apache.pig.piggybank.storage.SequenceFileLoader.getNext(SequenceFileLoader.java:133)


In the Pig script, I have already REGISTERed the jar that contains the class
com.mycompany.model.VisitKey.


If Pig doesn't work for this, the only other approach is probably to use one of
the newer "pseudo-scripting" languages like Cascalog or Scala.
Thanks,
Yang
