[ https://issues.apache.org/jira/browse/GORA-170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491024#comment-13491024 ]
Lewis John McGibbney commented on GORA-170:
-------------------------------------------

When I attempt to generate a fetch list with Nutch 2.x I get the following:

{code}
2012-11-05 22:51:03,951 DEBUG connection.HThriftClient - keyspace reseting from null to webpage
2012-11-05 22:51:04,066 DEBUG connection.HThriftClient - Transport open status true for client CassandraClient<localhost:9160-8>
2012-11-05 22:51:04,066 DEBUG connection.ConcurrentHClientPool - Status of releaseClient CassandraClient<localhost:9160-8> to queue: true
2012-11-05 22:51:04,087 WARN  mapred.FileOutputCommitter - Output path is null in cleanup
2012-11-05 22:51:04,089 WARN  mapred.LocalJobRunner - job_local_0001
java.nio.BufferUnderflowException
    at java.nio.Buffer.nextGetIndex(Buffer.java:480)
    at java.nio.HeapByteBuffer.getInt(HeapByteBuffer.java:336)
    at me.prettyprint.cassandra.serializers.IntegerSerializer.fromByteBuffer(IntegerSerializer.java:35)
    at me.prettyprint.cassandra.serializers.FloatSerializer.fromByteBuffer(FloatSerializer.java:25)
    at me.prettyprint.cassandra.serializers.FloatSerializer.fromByteBuffer(FloatSerializer.java:10)
    at org.apache.gora.cassandra.query.CassandraColumn.fromByteBuffer(CassandraColumn.java:74)
    at org.apache.gora.cassandra.query.CassandraSubColumn.getValue(CassandraSubColumn.java:86)
    at org.apache.gora.cassandra.query.CassandraResult.updatePersistent(CassandraResult.java:90)
    at org.apache.gora.cassandra.query.CassandraResult.nextInner(CassandraResult.java:56)
    at org.apache.gora.query.impl.ResultBase.next(ResultBase.java:112)
    at org.apache.gora.mapreduce.GoraRecordReader.nextKeyValue(GoraRecordReader.java:111)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532)
    at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
2012-11-05 22:51:04,253 ERROR crawl.GeneratorJob - GeneratorJob: java.lang.RuntimeException: job failed: name=generate: 1352155857-1625665918, jobid=job_local_0001
    at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:54)
    at org.apache.nutch.crawl.GeneratorJob.run(GeneratorJob.java:191)
    at org.apache.nutch.crawl.GeneratorJob.generate(GeneratorJob.java:213)
    at org.apache.nutch.crawl.GeneratorJob.run(GeneratorJob.java:241)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.nutch.crawl.GeneratorJob.main(GeneratorJob.java:249)
{code}

This is without the patch attached.

> Getting a BufferUnderflowException in class CassandraColumn, method
> fromByteBuffer()
> ------------------------------------------------------------------------------------
>
>                 Key: GORA-170
>                 URL: https://issues.apache.org/jira/browse/GORA-170
>             Project: Apache Gora
>          Issue Type: Bug
>          Components: storage-cassandra
>    Affects Versions: 0.2.1
>         Environment: Not sure environment matters for this one but Ubuntu
>            Reporter: Chris Gerken
>            Priority: Blocker
>             Fix For: 0.3
>
> When using CassandraStore and GoraMapper to retrieve data previously stored
> in Cassandra, a BufferUnderflowException is thrown in method
> fromByteBuffer() in class CassandraColumn. This results in a complete
> failure of the Hadoop job trying to use the Cassandra data.
> The problem seems to be caused by an invalid assumption in the
> (de)serializer logic. Serializers assume that the bytes in a ByteBuffer to
> be deserialized start at offset 0 (zero) in the ByteBuffer's internal
> buffer. In fact, there are times when a ByteBuffer passed back from the
> Hector/Thrift API will have its data start at a non-zero offset in its
> buffer. When serializers are given these non-zero-offset ByteBuffers an
> exception, usually BufferUnderflowException, is thrown.
> The suggested fix is to use the TBaseHelper class from Cassandra/Thrift:
>
> {code}
> import org.apache.thrift.TBaseHelper;
>
> protected Object fromByteBuffer(Schema schema, ByteBuffer byteBuffer) {
>   Object value = null;
>   Serializer serializer = GoraSerializerTypeInferer.getSerializer(schema);
>   if (serializer == null) {
>     LOG.info("Schema is not supported: " + schema.toString());
>   } else {
>     ByteBuffer corrected = TBaseHelper.rightSize(byteBuffer);
>     value = serializer.fromByteBuffer(corrected);
>   }
>   return value;
> }
> {code}
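For illustration, the offset assumption and the effect of the proposed correction can be sketched in plain Java. The {{rightSize}} method below is a simplified stand-in written for this sketch (copy the remaining bytes into a zero-offset buffer), not the actual Thrift {{TBaseHelper.rightSize()}} implementation, and {{naiveGetInt}} is a hypothetical deserializer that makes the flawed offset-0 assumption described in the report:

{code}
import java.nio.ByteBuffer;
import java.nio.BufferUnderflowException;

public class RightSizeSketch {

    // Simplified stand-in for TBaseHelper.rightSize(): copy the remaining
    // bytes into a fresh buffer whose data starts at offset 0.
    static ByteBuffer rightSize(ByteBuffer b) {
        byte[] copy = new byte[b.remaining()];
        b.duplicate().get(copy);
        return ByteBuffer.wrap(copy);
    }

    // Hypothetical naive deserializer with the flawed assumption: it trusts
    // that the value starts at offset 0 of the backing array.
    static int naiveGetInt(ByteBuffer b) {
        return ByteBuffer.wrap(b.array(), 0, 4).getInt();
    }

    public static void main(String[] args) {
        // Simulate a buffer handed back with its data at a non-zero offset:
        // the int 42 lives at bytes 4..7 of the backing array.
        ByteBuffer raw = ByteBuffer.allocate(8);
        raw.position(4);
        raw.putInt(42);
        raw.position(4);
        ByteBuffer view = raw.slice();   // position 0, but arrayOffset() == 4

        System.out.println(naiveGetInt(view));         // reads the wrong bytes: 0
        System.out.println(rightSize(view).getInt());  // 42

        // A buffer with fewer than 4 readable bytes reproduces the
        // BufferUnderflowException seen in the stack trace above.
        try {
            ByteBuffer.allocate(2).getInt();
        } catch (BufferUnderflowException e) {
            System.out.println("BufferUnderflowException");
        }
    }
}
{code}

After normalization, downstream serializers can read from position 0 with the expected number of bytes remaining, which is why the suggested fix resolves the failure.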