[ https://issues.apache.org/jira/browse/CASSANDRA-13339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16150916#comment-16150916 ]

Jason Brown edited comment on CASSANDRA-13339 at 9/8/17 4:28 PM:
-----------------------------------------------------------------

Here's where I'm at so far in my investigation:
- A mutation is being stored on the coordinator because it is a replica for the 
data, as can be seen from the {{LocalMutationRunnable}} in the stack traces.
- In both 3.0 and 3.9 (the versions reported so far), we execute the 
{{StorageProxy#performLocally}} method that takes an 
{{IAsyncCallbackWithFailure}} as the last parameter (the method has a different 
arity between the two Cassandra versions, but it's basically the same method). 
That method is called from a few different places in {{StorageProxy}} (see the 
sketch just after this list):
-- {{#apply}} - the standard 'write a mutation' function
-- sync batchlog - write the batchlog and block
-- counter write
-- {{syncWriteBatchedMutations}}
-- {{asyncWriteBatchedMutations}}
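
For reference, that method looks roughly like the following (a simplified 
paraphrase of the 3.9 variant, not verbatim source; the exact signature and 
error handling are approximate). The point is just that the mutation is applied 
inside a {{LocalMutationRunnable}} on the mutation stage, which is why the 
failure is logged as "Failed to apply mutation locally" and surfaces via 
{{LocalMutationRunnable.run}} in the stack traces:

{code}
// Simplified paraphrase of StorageProxy#performLocally (details approximate).
private static void performLocally(Stage stage, Optional<IMutation> mutation,
                                   final Runnable runnable,
                                   final IAsyncCallbackWithFailure<?> handler)
{
    StageManager.getStage(stage).maybeExecuteImmediately(new LocalMutationRunnable(mutation)
    {
        public void runMayThrow()
        {
            try
            {
                runnable.run();           // e.g. mutation::apply -- the BufferOverflowException
                handler.response(null);   // escapes from here
            }
            catch (Exception ex)
            {
                logger.error("Failed to apply mutation locally : {}", ex);
                handler.onFailure(FBUtilities.getBroadcastAddress());
            }
        }
    });
}
{code}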

Based on the stack traces everyone has reported so far, what I suspect is that 
the write thread thinks the rows in the Mutation (one of the 
{{PartitionUpdate}}'s {{holder}} instances, to be specific) are empty when we 
compute the serialized size, but not empty when we actually serialize. Here's why:

The stack traces all fail in 
[{{UnfilteredRowIteratorSerializer#serialize}}|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/db/rows/UnfilteredRowIteratorSerializer.java#L120].
 At that point in the serialize method, we've already written out at least two 
bytes (one for the partition key length, and one for the flags). We then try to 
serialize the {{SerializationHeader}}, which serializes the {{EncodingStats}}, 
and that is where it fails. In 
[{{UnfilteredRowIteratorSerializer#serializedSize}}|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/db/rows/UnfilteredRowIteratorSerializer.java#L150],
 we account for the partition key length and flags *at the minimum*. If the 
{{iterator}} argument to the method is empty ({{#isEmpty}} returns true), it 
simply returns the currently computed size. Thus we always serialize the 'basic 
data' about a row, but then nothing else; we knew we needed to write something 
about the row, but didn't have full knowledge of the row when we calculated the 
size.
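
To make that failure mode concrete, here's a tiny self-contained illustration 
(not Cassandra code - the class and method names below are made up for the 
example): if the size calculation takes the "empty" early exit but the writer 
later sees a non-empty view of the same data, the fixed-size buffer sized from 
the first answer overflows with exactly this kind of {{BufferOverflowException}}:

{code}
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Standalone illustration (hypothetical names, not Cassandra classes) of sizing against
// one view of the data and serializing against another.
public class SizeVsSerializeMismatch
{
    static final List<byte[]> rows = new ArrayList<>();   // stands in for the partition's rows

    // Mirrors the serializedSize() shape: always count the "basic data",
    // then return early if the rows look empty.
    static int serializedSize()
    {
        int size = 2;                 // partition key length byte + flags byte
        if (rows.isEmpty())
            return size;              // early exit: nothing else is accounted for
        for (byte[] row : rows)
            size += row.length;
        return size;
    }

    // Mirrors the serialize() shape: always write the basic data, then whatever it sees now.
    static void serialize(ByteBuffer out)
    {
        out.put((byte) 1).put((byte) 0);   // the two "basic" bytes
        for (byte[] row : rows)
            out.put(row);                  // anything extra overflows the fixed buffer
    }

    public static void main(String[] args)
    {
        ByteBuffer buffer = ByteBuffer.allocate(serializedSize()); // sized while rows looked empty
        rows.add(new byte[8]);                                     // rows appear before the write
        serialize(buffer);                                         // -> java.nio.BufferOverflowException
    }
}
{code}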

I think there may be some thread visibility issue or some race condition where 
the {{iterator}} is empty at 
{{UnfilteredRowIteratorSerializer#serializedSize}}, yet not empty at 
{{UnfilteredRowIteratorSerializer#serialize}}. Note that there may be something 
funny going on with the {{PartitionUpdate#holder}}, but I could not see 
anything obvious (without grasping at straws).
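
To illustrate the kind of visibility problem I mean (purely a hypothetical 
sketch - these are not the real {{PartitionUpdate}} fields or classes): if a 
holder-like reference is swapped by another thread without any happens-before 
edge, the read done while sizing and the read done while serializing can 
legitimately observe two different snapshots:

{code}
// Hypothetical sketch of the suspected pattern; class and field names are made up.
class PartitionLike
{
    static final class Holder
    {
        static final Holder EMPTY = new Holder(0);
        final int rowCount;
        Holder(int rowCount) { this.rowCount = rowCount; }
    }

    private Holder holder = Holder.EMPTY;          // plain, non-volatile field

    void addRows(int n)                            // called from some other thread
    {
        holder = new Holder(holder.rowCount + n);  // unsynchronized swap of the reference
    }

    long serializedSize()
    {
        // First read: may still observe EMPTY and take the "nothing but basic data" path.
        return holder.rowCount == 0 ? 2 : 2 + holder.rowCount * 16L;
    }

    void serialize(java.nio.ByteBuffer out)
    {
        // Second read: may observe the new holder and write more than serializedSize() allowed for.
        Holder h = holder;
        out.put((byte) 1).put((byte) 0);
        for (int i = 0; i < h.rowCount; i++)
            out.putLong(0L).putLong(0L);           // 16 bytes per row
    }
}
{code}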

Without more details or a way to reproduce, I'm kind of at a standstill short 
of just flailing at all the things. Thanks to all those who have commented, 
especially [~crichards].


> java.nio.BufferOverflowException: null
> --------------------------------------
>
>                 Key: CASSANDRA-13339
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13339
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Chris Richards
>
> I'm seeing the following exception running Cassandra 3.9 (with Netty updated 
> to 4.1.8.Final) on a 2-node cluster.  It would have been processing 
> around 50 queries/second at the time (a mixture of 
> inserts/updates/selects/deletes): there's a collection of tables (some with 
> counters, some without) and a single materialized view.
> {code}
> ERROR [MutationStage-4] 2017-03-15 22:50:33,052 StorageProxy.java:1353 - 
> Failed to apply mutation locally : {}
> java.nio.BufferOverflowException: null
>       at 
> org.apache.cassandra.io.util.DataOutputBufferFixed.doFlush(DataOutputBufferFixed.java:52)
>  ~[apache-cassandra-3.9.jar:3.9]
>       at 
> org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.write(BufferedDataOutputStreamPlus.java:132)
>  ~[apache-cassandra-3.9.jar:3.9]
>       at 
> org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.writeUnsignedVInt(BufferedDataOutputStreamPlus.java:262)
>  ~[apache-cassandra-3.9.jar:3.9]
>       at 
> org.apache.cassandra.db.rows.EncodingStats$Serializer.serialize(EncodingStats.java:233)
>  ~[apache-cassandra-3.9.jar:3.9]
>       at 
> org.apache.cassandra.db.SerializationHeader$Serializer.serializeForMessaging(SerializationHeader.java:380)
>  ~[apache-cassandra-3.9.jar:3.9]
>       at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:122)
>  ~[apache-cassandra-3.9.jar:3.9]
>       at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:89)
>  ~[apache-cassandra-3.9.jar:3.9]
>       at 
> org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.serialize(PartitionUpdate.java:790)
>  ~[apache-cassandra-3.9.jar:3.9]
>       at 
> org.apache.cassandra.db.Mutation$MutationSerializer.serialize(Mutation.java:393)
>  ~[apache-cassandra-3.9.jar:3.9]
>       at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:279) 
> ~[apache-cassandra-3.9.jar:3.9]
>       at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:493) 
> ~[apache-cassandra-3.9.jar:3.9]
>       at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:396) 
> ~[apache-cassandra-3.9.jar:3.9]
>       at org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:215) 
> ~[apache-cassandra-3.9.jar:3.9]
>       at org.apache.cassandra.db.Mutation.apply(Mutation.java:227) 
> ~[apache-cassandra-3.9.jar:3.9]
>       at org.apache.cassandra.db.Mutation.apply(Mutation.java:241) 
> ~[apache-cassandra-3.9.jar:3.9]
>       at 
> org.apache.cassandra.service.StorageProxy$8.runMayThrow(StorageProxy.java:1347)
>  ~[apache-cassandra-3.9.jar:3.9]
>       at 
> org.apache.cassandra.service.StorageProxy$LocalMutationRunnable.run(StorageProxy.java:2539)
>  [apache-cassandra-3.9.jar:3.9]
>       at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_121]
>       at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>  [apache-cassandra-3.9.jar:3.9]
>       at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
>  [apache-cassandra-3.9.jar:3.9]
>       at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
> [apache-cassandra-3.9.jar:3.9]
>       at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
> {code}
> and then again shortly afterwards
> {code}
> ERROR [MutationStage-3] 2017-03-15 23:27:36,198 StorageProxy.java:1353 - 
> Failed to apply mutation locally : {}
> java.nio.BufferOverflowException: null
>       at 
> org.apache.cassandra.io.util.DataOutputBufferFixed.doFlush(DataOutputBufferFixed.java:52)
>  ~[apache-cassandra-3.9.jar:3.9]
>       at 
> org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.write(BufferedDataOutputStreamPlus.java:132)
>  ~[apache-cassandra-3.9.jar:3.9]
>       at 
> org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.writeUnsignedVInt(BufferedDataOutputStreamPlus.java:262)
>  ~[apache-cassandra-3.9.jar:3.9]
>       at 
> org.apache.cassandra.db.rows.EncodingStats$Serializer.serialize(EncodingStats.java:233)
>  ~[apache-cassandra-3.9.jar:3.9]
>       at 
> org.apache.cassandra.db.SerializationHeader$Serializer.serializeForMessaging(SerializationHeader.java:380)
>  ~[apache-cassandra-3.9.jar:3.9]
>       at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:122)
>  ~[apache-cassandra-3.9.jar:3.9]
>       at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:89)
>  ~[apache-cassandra-3.9.jar:3.9]
>       at 
> org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.serialize(PartitionUpdate.java:790)
>  ~[apache-cassandra-3.9.jar:3.9]
>       at 
> org.apache.cassandra.db.Mutation$MutationSerializer.serialize(Mutation.java:393)
>  ~[apache-cassandra-3.9.jar:3.9]
>       at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:279) 
> ~[apache-cassandra-3.9.jar:3.9]
>       at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:493) 
> ~[apache-cassandra-3.9.jar:3.9]
>       at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:396) 
> ~[apache-cassandra-3.9.jar:3.9]
>       at org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:215) 
> ~[apache-cassandra-3.9.jar:3.9]
>       at org.apache.cassandra.db.Mutation.apply(Mutation.java:227) 
> ~[apache-cassandra-3.9.jar:3.9]
>       at org.apache.cassandra.db.Mutation.apply(Mutation.java:241) 
> ~[apache-cassandra-3.9.jar:3.9]
>       at 
> org.apache.cassandra.service.StorageProxy$8.runMayThrow(StorageProxy.java:1347)
>  ~[apache-cassandra-3.9.jar:3.9]
>       at 
> org.apache.cassandra.service.StorageProxy$LocalMutationRunnable.run(StorageProxy.java:2539)
>  [apache-cassandra-3.9.jar:3.9]
>       at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_121]
>       at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>  [apache-cassandra-3.9.jar:3.9]
>       at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
>  [apache-cassandra-3.9.jar:3.9]
>       at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
> [apache-cassandra-3.9.jar:3.9]
>       at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
> {code}


