That did the trick exactly; now I'm even able to pass Doubles and Longs
directly without having them converted to and from strings all over the
place :-)

Thanks mate,
Chris

On Thu, Oct 14, 2010 at 5:46 PM, Dmitriy Ryaboy <[email protected]> wrote:
> He should be able to put in Strings.
> Christian, do you have getLoadCaster implemented?
>
> From LoadFunc:
>
>     /**
>      * This will be called on the front end during planning and not on the back
>      * end during execution.
>      * @return the {@link LoadCaster} associated with this loader. Returning null
>      * indicates that casts from byte array are not supported for this loader.
>      * @throws IOException if there is an exception during LoadCaster
>      * construction
>      */
>     public LoadCaster getLoadCaster() throws IOException {
>         return new Utf8StorageConverter();
>     }
>
> -D
>
> On Thu, Oct 14, 2010 at 8:00 AM, Jeff Zhang <[email protected]> wrote:
>
>> Hi Christian,
>>
>> Like Dmitriy said, you should put Pig types into the tuple:
>>
>>     Tuple output = TupleFactory.getInstance().newTuple(3);
>>     output.set(0, new DataByteArray(((String) res.get("col1")).getBytes("UTF-8")));
>>     output.set(1, new DataByteArray(((String) res.get("col2")).getBytes("UTF-8")));
>>     output.set(2, new DataByteArray(((String) res.get("col3")).getBytes("UTF-8")));
>>
>> On Thu, Oct 14, 2010 at 8:22 PM, Christian Decker
>> <[email protected]> wrote:
>>> Right now all my tuple values are of type String. Actually my code looks
>>> like this, still pretty basic but it's doing what it's supposed to:
>>>
>>>     List<ColumnOrSuperColumn> cf =
>>>         (List<ColumnOrSuperColumn>) reader.getCurrentValue();
>>>     HashMap<String, Object> res = new HashMap<String, Object>();
>>>     for (ColumnOrSuperColumn c : cf) {
>>>         res.put(new String(c.column.name), new String(c.column.value));
>>>     }
>>>     Tuple output = TupleFactory.getInstance().newTuple(3);
>>>     output.set(0, res.get("col1"));
>>>     output.set(1, res.get("col2"));
>>>     output.set(2, res.get("col3"));
>>>
>>> Any idea?
>>>
>>> Regards,
>>> Chris
>>>
>>> On Tue, Oct 12, 2010 at 11:23 PM, Dmitriy Ryaboy <[email protected]> wrote:
>>>
>>>> What are the objects underlying col1, col2, and col3? You can only use the
>>>> set of objects Pig understands (so, String, various Number derivatives,
>>>> DataByteArray, Map<String, Object>, Tuple, DataBag).
>>>>
>>>> -D
>>>>
>>>> On Tue, Oct 12, 2010 at 12:37 PM, Christian Decker <
>>>> [email protected]> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I'm currently working on a simple Cassandra loader that reads an index
>>>>> and then works on that data. Now whenever I try to work on the retrieved
>>>>> data I get a strange error:
>>>>>
>>>>>     java.io.IOException: Type mismatch in key from map: expected
>>>>>     org.apache.pig.impl.io.NullableBytesWritable, recieved
>>>>>     org.apache.pig.impl.io.NullableText
>>>>>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:845)
>>>>>         at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:541)
>>>>>         at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>>>>>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:115)
>>>>>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:234)
>>>>>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:227)
>>>>>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:52)
>>>>>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>>>>>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
>>>>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>>>>>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
>>>>>
>>>>> The script is pretty simple right now:
>>>>>
>>>>>     rows = LOAD 'cassandra://localhost:9160/...' USING CassandraIndexReader()
>>>>>         as (col1, col2, col3);
>>>>>     dump rows;
>>>>>     grouped = GROUP rows BY col1;
>>>>>     dump grouped;
>>>>>
>>>>> The first dump works fine, while the second just dies with the above
>>>>> error. Strangely, when I store it on disk and then load it with
>>>>> PigStorage() again it just works as expected.
>>>>>
>>>>> Am I doing something wrong with my custom loader?
>>>>>
>>>>> Regards,
>>>>> Chris
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
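
[Editor's sketch] The fix discussed above, emitting `DataByteArray` fields from the loader and implementing `getLoadCaster()`, can be sketched roughly as follows. This is not Christian's actual `CassandraIndexReader`; the class name and the helper method are hypothetical, and only the `getLoadCaster()` body comes verbatim from Dmitriy's mail.

```java
import java.io.IOException;
import org.apache.pig.LoadCaster;
import org.apache.pig.LoadFunc;
import org.apache.pig.builtin.Utf8StorageConverter;
import org.apache.pig.data.DataByteArray;
import org.apache.pig.data.Tuple;
import org.apache.pig.data.TupleFactory;

// Rough sketch of a loader that emits DataByteArray fields (one of the
// types Pig understands) and supplies a LoadCaster so Pig can cast those
// bytearray fields to chararray, long, etc. in the script.
public abstract class SketchLoader extends LoadFunc {

    @Override
    public LoadCaster getLoadCaster() throws IOException {
        // Utf8StorageConverter interprets the raw bytes as UTF-8 text
        // whenever a cast from bytearray is required by the script.
        return new Utf8StorageConverter();
    }

    // Hypothetical helper for getNext(): wrap raw column values as
    // DataByteArray instead of java.lang.String, which Pig's map-reduce
    // key handling does not accept (hence the NullableText mismatch).
    protected Tuple toTuple(byte[] col1, byte[] col2, byte[] col3)
            throws IOException {
        Tuple output = TupleFactory.getInstance().newTuple(3);
        output.set(0, new DataByteArray(col1));
        output.set(1, new DataByteArray(col2));
        output.set(2, new DataByteArray(col3));
        return output;
    }
}
```

With this in place, `GROUP rows BY col1` sees a proper Pig bytearray key rather than a raw `String`, which is what triggered the `NullableBytesWritable` / `NullableText` mismatch.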
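
[Editor's sketch] Christian's final resolution, passing Doubles and Longs straight through, amounts to setting Pig-native Java values on the tuple directly once a `LoadCaster` exists. The field values below are made up purely for illustration.

```java
import org.apache.pig.data.Tuple;
import org.apache.pig.data.TupleFactory;

// Sketch only: a loader's getNext() can emit Pig-native types directly
// (String -> chararray, Long -> long, Double -> double), with no
// to-and-from-string conversions. Values here are hypothetical.
public class TypedTupleSketch {
    public static Tuple build() throws Exception {
        Tuple output = TupleFactory.getInstance().newTuple(3);
        output.set(0, "row-key"); // chararray
        output.set(1, 42L);       // long, passed directly
        output.set(2, 2.5);       // double, passed directly
        return output;
    }
}
```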
