[jira] [Commented] (CASSANDRA-5504) Eternal iteration when using newer hadoop version due to next() call and empty key value

Ben Holloway (JIRA) Mon, 22 Apr 2013 09:49:17 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13638159#comment-13638159
 ]


Ben Holloway commented on CASSANDRA-5504:
-----------------------------------------

The patch doesn't fix my issue; I still get:

{quote}
java.lang.RuntimeException: org.apache.thrift.TException: Message length exceeded: 21
        at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.maybeInit(ColumnFamilyRecordReader.java:384)
        at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.computeNext(ColumnFamilyRecordReader.java:390)
        at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.computeNext(ColumnFamilyRecordReader.java:313)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
        at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.getProgress(ColumnFamilyRecordReader.java:103)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.getProgress(PigRecordReader.java:158)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.getProgress(MapTask.java:514)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:539)
        at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Unknown Source)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.thrift.TException: Message length exceeded: 21
        at org.apache.thrift.protocol.TBinaryProtocol.checkReadLength(TBinaryProtocol.java:393)
        at org.apache.thrift.protocol.TBinaryProtocol.readBinary(TBinaryProtocol.java:363)
        at org.apache.cassandra.thrift.Column.read(Column.java:528)
        at org.apache.cassandra.thrift.ColumnOrSuperColumn.read(ColumnOrSuperColumn.java:507)
        at org.apache.cassandra.thrift.KeySlice.read(KeySlice.java:408)
        at org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:12905)
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
        at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:734)
        at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:718)
        at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.maybeInit(ColumnFamilyRecordReader.java:346)
{quote}

Maybe these are separate issues.

> Eternal iteration when using newer hadoop version due to next() call and 
> empty key value
> ----------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-5504
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5504
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>    Affects Versions: 1.2.3
>            Reporter: Oleksandr Petrov
>            Priority: Critical
>         Attachments: patch.diff
>
>
> Currently, when using newer Hadoop versions, the call to
> next(ByteBuffer key, SortedMap<ByteBuffer, IColumn> value)
> within ColumnFamilyRecordReader empties the key, because it calls
> `key.clear();`. That causes the StaticRowIterator and WideRowIterator to
> glitch: when Iterables.getLast(rows).key is called, the key is already
> empty, so Hadoop requests the same range again and again.
> Please see the attached patch/diff. It simply adds lastRowKey (ByteBuffer)
> and saves it for the next iteration along with all the rows, which allows
> the query for the next range to be correct.
> The patch is branched from version 1.2.3.
> Tested against Cassandra 1.2.3 with Hadoop 1.0.3, 1.0.4 and 0.20.2.
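
For readers following the issue description above, here is a minimal sketch of the failure mode and of the lastRowKey idea. It is not the attached patch and uses no Cassandra or Hadoop classes: Row, fetchPage, consume and the five-key data set are hypothetical stand-ins, and the way the key buffer ends up "empty" (its position moved to its limit) is an assumption made for illustration only.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class LastRowKeySketch {
    // Hypothetical stand-in for a row returned by a range query.
    static final class Row {
        final ByteBuffer key;
        Row(String k) { key = ByteBuffer.wrap(k.getBytes(StandardCharsets.UTF_8)); }
    }

    // Hypothetical sorted data set standing in for the column family.
    static final String[] KEYS = {"a", "b", "c", "d", "e"};

    // Decode the remaining bytes without disturbing the buffer's position.
    static String decode(ByteBuffer buf) {
        return StandardCharsets.UTF_8.decode(buf.duplicate()).toString();
    }

    // Return up to batchSize rows whose key sorts after startKey.
    // An empty startKey means "start from the beginning of the range".
    static List<Row> fetchPage(ByteBuffer startKey, int batchSize) {
        String start = decode(startKey);
        List<Row> page = new ArrayList<>();
        for (String k : KEYS) {
            if (k.compareTo(start) > 0 && page.size() < batchSize) {
                page.add(new Row(k));
            }
        }
        return page;
    }

    // Assumption: the caller leaves the shared key buffer with nothing remaining.
    static void consume(ByteBuffer key) {
        key.position(key.limit());
    }

    static List<String> readAll(boolean saveLastRowKey) {
        List<String> seen = new ArrayList<>();
        ByteBuffer startKey = ByteBuffer.allocate(0);
        // Cap the number of rounds so the buggy path terminates in this sketch.
        for (int round = 0; round < 10; round++) {
            List<Row> rows = fetchPage(startKey, 2);
            if (rows.isEmpty()) {
                break;
            }
            Row last = rows.get(rows.size() - 1);
            // The fix: remember the last key before the rows are handed out.
            ByteBuffer lastRowKey = saveLastRowKey ? last.key.duplicate() : null;
            for (Row r : rows) {
                seen.add(decode(r.key));
                consume(r.key);
            }
            // Without the saved copy, the next start key is read from an emptied buffer.
            startKey = saveLastRowKey ? lastRowKey : last.key;
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("with lastRowKey:    " + readAll(true));
        System.out.println("without lastRowKey: " + readAll(false).size() + " rows");
    }
}
```

In this sketch, readAll(true) visits each of the five keys once, while readAll(false) keeps requesting the first page until the round cap stops it. The saved key is a duplicate() of the buffer, so it keeps its own position and is unaffected when the original is consumed.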

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
