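The problem is that Pig only understands its own data types, so you need to tell it how to translate your custom Writable into a Pig datatype. Registering the jar makes VisitKey loadable, but judging by the warning in your log, the piggybank SequenceFileLoader still has no mapping for that class.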
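
To make "translate" concrete, here is a rough sketch of the idea in plain Java. I don't know what VisitKey actually contains, so the two fields (visitorId, site) and the helper names (toPigFields, roundTrip) are made up for illustration. The class only mimics the write/readFields contract of Hadoop's Writable and does not depend on Hadoop or Pig, so treat it as a sketch under those assumptions rather than the real loader code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;

public class VisitKeyTranslation {

    // Hypothetical stand-in for com.mycompany.model.VisitKey.
    // The real class and its fields are unknown; these two are assumptions.
    static class VisitKey {
        long visitorId;
        String site;

        // Same method shapes as Hadoop's Writable contract.
        void write(DataOutput out) throws IOException {
            out.writeLong(visitorId);
            out.writeUTF(site);
        }

        void readFields(DataInput in) throws IOException {
            visitorId = in.readLong();
            site = in.readUTF();
        }
    }

    // The "translation" step: flatten the custom key into simple values
    // (Long, String) that have direct Pig equivalents (long, chararray).
    // In a real loader or converter these values would go into a Pig Tuple.
    static List<Object> toPigFields(VisitKey key) {
        return Arrays.<Object>asList(Long.valueOf(key.visitorId), key.site);
    }

    // Serialize and deserialize the key, roughly what happens when a
    // SequenceFile record is written and later read back.
    static VisitKey roundTrip(VisitKey original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));
            VisitKey copy = new VisitKey();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        VisitKey key = new VisitKey();
        key.visitorId = 42L;
        key.site = "example.com";
        System.out.println(toPigFields(roundTrip(key))); // prints [42, example.com]
    }
}
```

Whatever mechanism you end up using (a custom loader or a converter class), the part you have to supply is the equivalent of toPigFields for your own key and value classes.
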

Apparently elephant-bird has some support for doing this type of thing... take a look at this SO post:
http://stackoverflow.com/questions/16540651/apache-pig-can-we-convert-a-custom-writable-object-to-pig-format

On Mon, Sep 16, 2013 at 5:37 PM, Yang <[email protected]> wrote:

> I tried to do a quick and dirty inspection of some of our data feeds, which
> are encoded in gzipped SequenceFile.
>
> basically I did
>
> a = load 'myfile' using ......SequenceFileLoader() AS ( mykey, myvalue);
>
> but it gave me some error:
> 2013-09-16 17:34:28,915 [Thread-5] INFO org.apache.hadoop.io.compress.CodecPool - Got brand-new decompressor
> 2013-09-16 17:34:28,915 [Thread-5] INFO org.apache.hadoop.io.compress.CodecPool - Got brand-new decompressor
> 2013-09-16 17:34:28,915 [Thread-5] INFO org.apache.hadoop.io.compress.CodecPool - Got brand-new decompressor
> 2013-09-16 17:34:28,961 [Thread-5] WARN org.apache.pig.piggybank.storage.SequenceFileLoader - Unable to translate key class com.mycompany.model.VisitKey to a Pig datatype
> 2013-09-16 17:34:28,962 [Thread-5] WARN org.apache.hadoop.mapred.FileOutputCommitter - Output path is null in cleanup
> 2013-09-16 17:34:28,963 [Thread-5] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local_0001
> org.apache.pig.backend.BackendException: ERROR 0: Unable to translate class com.mycompany.model.VisitKey to a Pig datatype
>     at org.apache.pig.piggybank.storage.SequenceFileLoader.setKeyType(SequenceFileLoader.java:78)
>     at org.apache.pig.piggybank.storage.SequenceFileLoader.getNext(SequenceFileLoader.java:133)
>
> in the pig file, I have already REGISTERED the jar that contains the class
> com.mycompany.model.VisitKey
>
> if PIG doesn't work, the only other approach is probably to use some of the
> newer "pseudo-scripting " languages like cascalog or scala
>
> thanks
> Yang
