Lack of conversion from Cassandra's TimeUUIDType to any compatible type in Pig.
-------------------------------------------------------------------------------

                 Key: CASSANDRA-2954
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2954
             Project: Cassandra
          Issue Type: Bug
          Components: Hadoop
    Affects Versions: 0.8.1
            Reporter: Jacek Gerbszt
            Priority: Minor


CassandraStorage passes the wrong data types to Pig. When I try to access a column 
family with comparator=TimeUUIDType, I get an exception:
java.lang.RuntimeException: Unexpected data type -1 found in stream.
        at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:478)
        at org.apache.pig.data.BinInterSedes.writeTuple(BinInterSedes.java:541)
        at org.apache.pig.data.BinInterSedes.writeBag(BinInterSedes.java:522)
        at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:361)
        at org.apache.pig.data.BinInterSedes.writeTuple(BinInterSedes.java:541)
        at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:357)
        at org.apache.pig.impl.io.InterRecordWriter.write(InterRecordWriter.java:73)
        at org.apache.pig.impl.io.InterStorage.putNext(InterStorage.java:87)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:138)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:97)
        at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:531)
        at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:239)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:232)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:210)

The reason is that CassandraStorage composes the TimeUUIDType column name into 
a UUID using org.apache.cassandra.db.marshal.TimeUUIDType and puts that UUID 
directly into a Pig Tuple:
CassandraStorage.java:148:            pair.set(0, marshallers.get(0).compose(name));

Pig cannot serialize a UUID, so it throws an exception.
There is a need for some mechanism to convert Cassandra types to Pig types, 
because this is probably not an isolated case.
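
As a rough illustration of the kind of conversion step meant here, the sketch 
below maps a composed value to something Pig could serialize. It is not the 
actual fix: the class and method names (PigTypeSketch, toPigCompatible, 
uuidToBytes) are hypothetical, and a plain byte[] stands in for whatever 
Pig type the real code would use.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

// Hypothetical helper, not part of CassandraStorage or Pig.
final class PigTypeSketch {

    // Encode a UUID as its 16-byte big-endian form:
    // most significant 8 bytes first, then the least significant 8 bytes.
    static byte[] uuidToBytes(UUID uuid) {
        return ByteBuffer.allocate(16)
                .putLong(uuid.getMostSignificantBits())
                .putLong(uuid.getLeastSignificantBits())
                .array();
    }

    // Assumed conversion hook: values that Pig cannot serialize (here only
    // UUID) are replaced by a byte-based form; everything else passes
    // through unchanged.
    static Object toPigCompatible(Object composed) {
        if (composed instanceof UUID) {
            return uuidToBytes((UUID) composed);
        }
        return composed;
    }
}
```

With something like this, line 148 could call the hook on the result of 
compose(name) before pair.set(0, ...). In CassandraStorage itself the bytes 
would presumably be wrapped in Pig's DataByteArray rather than stored as a 
raw byte[]; that choice is an assumption, and other marshallers may need 
their own cases.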

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
