Re: Strange error when using custom LoadFunc

Christian Decker Thu, 14 Oct 2010 05:23:38 -0700

Right now all my tuple values are of type String. Actually my code looks
like this, still pretty basic but it's doing what it's supposed to:


> List<ColumnOrSuperColumn> cf =
>     (List<ColumnOrSuperColumn>) reader.getCurrentValue();
> HashMap<String, Object> res = new HashMap<String, Object>();
> for (ColumnOrSuperColumn c : cf){
>   res.put(new String(c.column.name), new String(c.column.value));
> }
> Tuple output = TupleFactory.getInstance().newTuple(3);
> output.set(0, res.get("col1"));
> output.set(1, res.get("col2"));
> output.set(2, res.get("col3"));
>

Any idea?

Regards,
Chris
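
The tuple-building step in the snippet above can be sketched without the Cassandra and Pig classes. This is only an illustration under stated assumptions: the `Column` class is a hypothetical stand-in for a column carrying its name and value as raw bytes, and a plain `List<Object>` stands in for the Pig `Tuple`; neither is the real API. The sketch also decodes the bytes with an explicit charset, whereas `new String(byte[])` uses the platform default.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ColumnsToFields {
    // Hypothetical stand-in for a Cassandra column: raw name and value bytes.
    static final class Column {
        final byte[] name;
        final byte[] value;

        Column(byte[] name, byte[] value) {
            this.name = name;
            this.value = value;
        }
    }

    // Index the columns by name, then pick the wanted fields in a fixed order.
    // A column that is absent from the row leaves null in its slot.
    static List<Object> toFields(List<Column> columns, String... wanted) {
        Map<String, Object> byName = new HashMap<>();
        for (Column c : columns) {
            byName.put(new String(c.name, StandardCharsets.UTF_8),
                       new String(c.value, StandardCharsets.UTF_8));
        }
        List<Object> fields = new ArrayList<>(wanted.length);
        for (String name : wanted) {
            fields.add(byName.get(name));
        }
        return fields;
    }

    public static void main(String[] args) {
        List<Column> row = new ArrayList<>();
        row.add(new Column("col2".getBytes(StandardCharsets.UTF_8),
                           "b".getBytes(StandardCharsets.UTF_8)));
        row.add(new Column("col1".getBytes(StandardCharsets.UTF_8),
                           "a".getBytes(StandardCharsets.UTF_8)));
        System.out.println(toFields(row, "col1", "col2", "col3"));
    }
}
```

Run as written, the sketch prints `[a, b, null]`: the fields come out in the requested order regardless of the order of the columns in the row, and every value that is present is a `String`.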

On Tue, Oct 12, 2010 at 11:23 PM, Dmitriy Ryaboy <[email protected]> wrote:

> What are the objects underlying col1, col2, and col3? You can only use the
> set of objects Pig understands (so, String, various Number derivatives,
> DataByteArray, Map<String, Object>, Tuple, DataBag).
>
> -D
>
> On Tue, Oct 12, 2010 at 12:37 PM, Christian Decker <[email protected]> wrote:
>
> > Hi,
> >
> > I'm currently working on a simple Cassandra Loader that reads an index and
> > then works on that data. Now whenever I try to work on the retrieved data I
> > get a strange error:
> >
> > > java.io.IOException: Type mismatch in key from map: expected org.apache.pig.impl.io.NullableBytesWritable, recieved org.apache.pig.impl.io.NullableText
> > >     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:845)
> > >     at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:541)
> > >     at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
> > >     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:115)
> > >     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:234)
> > >     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:227)
> > >     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:52)
> > >     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> > >     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
> > >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> > >     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> > >
> >
> > The script is pretty simple right now:
> >
> > > rows = LOAD 'cassandra://localhost:9160/...' USING CassandraIndexReader()
> > >     as (col1, col2, col3);
> > > dump rows;
> > > grouped = GROUP rows BY col1;
> > > dump grouped;
> > >
> >
> > The first dump works fine, while the second just dies with the above error.
> > Strangely, when I store it on disk and then load it with PigStorage() again,
> > it just works as expected.
> >
> > Am I doing something wrong with my Custom Loader?
> >
> > Regards,
> > Chris
> >
>
