SequenceFileTableSource will let you read the file as a PTable, which is probably the quickest way to get what you want.
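Roughly like this, a minimal sketch against the com.cloudera.crunch API of that era. The path is the one from your log; the exact class and factory-method names (SeqFileTableSourceTarget, Writables.tableOf, Writables.writables) are from memory and may need adjusting to your Crunch version:

```java
import com.cloudera.crunch.PTable;
import com.cloudera.crunch.Pipeline;
import com.cloudera.crunch.impl.mr.MRPipeline;
import com.cloudera.crunch.io.seq.SeqFileTableSourceTarget;
import com.cloudera.crunch.type.writable.Writables;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;

public class ReadSeqFile {
  public static void main(String[] args) throws Exception {
    Pipeline pipeline = new MRPipeline(ReadSeqFile.class);
    // A table source carries the key and value PTypes explicitly, so the
    // reader knows to instantiate LongWritable keys and Text values.
    PTable<LongWritable, Text> table = pipeline.read(
        new SeqFileTableSourceTarget<LongWritable, Text>(
            "/home/rahul/software/crunch/sampleFile",
            Writables.tableOf(
                Writables.writables(LongWritable.class),
                Writables.writables(Text.class))));
    // ... process the table, then:
    pipeline.done();
  }
}
```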
On Mon, Jul 9, 2012 at 1:55 AM, Rahul <[email protected]> wrote:
> Guys,
>
> I have a SequenceFile with LongWritable keys and Text as values. I am using
> SequenceFileSource with MRPipeline, but when I use MemPipeline it gives
> back the following exception:
>
> 3503 [main] INFO com.cloudera.crunch.io.seq.SeqFileReaderFactory - Error
> reading from path: file:/home/rahul/software/crunch/sampleFile
> java.io.IOException: wrong key class: org.apache.hadoop.io.ObjectWritable is
> not class org.apache.hadoop.io.LongWritable
>     at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1895)
>     at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1947)
>     at com.cloudera.crunch.io.seq.SeqFileReaderFactory$1.hasNext(SeqFileReaderFactory.java:68)
>     at com.cloudera.crunch.io.CompositePathIterable$2.hasNext(CompositePathIterable.java:81)
>
> This is due to the fact that the file contains LongWritable keys but the
> reader is using a NullWritable to read them. This error occurs in MemPipeline
> only; it works in MRPipeline because the key class is passed there via
> Hadoop's MapContext and is therefore the correct one. I modified
> SeqFileReaderFactory to pass the key class as well, but is this the correct
> way of doing so?
>
> regards
> Rahul
