[
https://issues.apache.org/jira/browse/CASSANDRA-11393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15226377#comment-15226377
]
Benjamin Lerer commented on CASSANDRA-11393:
--------------------------------------------
I believe, based on the different stack traces, that we actually have 2
different scenarios:
# When the assertion is thrown by the {{LegacyReadCommandSerializer}}, the
coordinator thought, at the time the message was created, that the replica
was on version 2.1, and then discovered before serializing the message that
the replica had been upgraded to version 3.0.
# When the assertion is thrown by the {{ReadCommandSerializer}}, the
coordinator thought, at the time the message was created, that the replica
was on version 3.0, and then discovered before serializing the message that
the replica is in fact a 2.1 node.
My guess is that this case could happen if the coordinator has just restarted
and the message is created just after the endpoint has been added but before
its version is set ({{MessagingService}} returns the current version if it has
no version associated with the endpoint).
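
To make the suspected race in the second scenario concrete, here is a minimal, self-contained sketch. It is not Cassandra code: the class {{VersionRaceSketch}}, its methods, the string endpoint key and the numeric version constants are all hypothetical stand-ins. It only models the behaviour described above, namely a version lookup that falls back to the current version when nothing is recorded for the endpoint, so the serializer chosen at message creation can differ from the one chosen at serialization.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of the race described above; not the real MessagingService.
class VersionRaceSketch {
    // Illustrative version numbers (assumptions for this sketch only).
    static final int VERSION_21 = 8;
    static final int VERSION_30 = 10;
    static final int CURRENT_VERSION = VERSION_30;

    // Known messaging versions, keyed by endpoint address.
    private final Map<String, Integer> versions = new ConcurrentHashMap<>();

    // Falls back to the current version when the endpoint has no recorded version.
    int getVersion(String endpoint) {
        Integer v = versions.get(endpoint);
        return v == null ? CURRENT_VERSION : v;
    }

    // Records the version learned for an endpoint (e.g. once it has been negotiated).
    void setVersion(String endpoint, int version) {
        versions.put(endpoint, version);
    }

    // Picks a serializer name from the version the caller currently believes in.
    static String chooseSerializer(int version) {
        return version < VERSION_30 ? "legacy" : "current";
    }

    public static void main(String[] args) {
        VersionRaceSketch ms = new VersionRaceSketch();
        String replica = "127.0.0.2";

        // Message creation: the endpoint was just added, no version is recorded yet.
        String atCreation = chooseSerializer(ms.getVersion(replica));

        // The real version (2.1) is recorded before the message is serialized.
        ms.setVersion(replica, VERSION_21);
        String atSerialization = chooseSerializer(ms.getVersion(replica));

        System.out.println("at creation:      " + atCreation);
        System.out.println("at serialization: " + atSerialization);
        System.out.println("mismatch:         " + !atCreation.equals(atSerialization));
    }
}
```

In this model the two lookups disagree, which is the kind of inconsistency that would trip a version assertion in a serializer. The first scenario is the mirror image: a recorded 2.1 version is replaced by 3.0 between the two lookups.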
> dtest failure in
> upgrade_tests.upgrade_through_versions_test.ProtoV3Upgrade_2_1_UpTo_3_0_HEAD.rolling_upgrade_test
> ------------------------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-11393
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11393
> Project: Cassandra
> Issue Type: Bug
> Reporter: Philip Thompson
> Assignee: Benjamin Lerer
> Labels: dtest
>
> We are seeing a failure in the upgrade tests that go from 2.1 to 3.0
> {code}
> node2: ERROR [SharedPool-Worker-2] 2016-03-10 20:05:17,865 Message.java:611 -
> Unexpected exception during request; channel = [id: 0xeb79b477,
> /127.0.0.1:39613 => /127.0.0.2:9042]
> java.lang.AssertionError: null
> at
> org.apache.cassandra.db.ReadCommand$LegacyReadCommandSerializer.serializedSize(ReadCommand.java:1208)
> ~[main/:na]
> at
> org.apache.cassandra.db.ReadCommand$LegacyReadCommandSerializer.serializedSize(ReadCommand.java:1155)
> ~[main/:na]
> at org.apache.cassandra.net.MessageOut.payloadSize(MessageOut.java:166)
> ~[main/:na]
> at
> org.apache.cassandra.net.OutboundTcpConnectionPool.getConnection(OutboundTcpConnectionPool.java:72)
> ~[main/:na]
> at
> org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:609)
> ~[main/:na]
> at
> org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:758)
> ~[main/:na]
> at
> org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:701)
> ~[main/:na]
> at
> org.apache.cassandra.net.MessagingService.sendRRWithFailure(MessagingService.java:684)
> ~[main/:na]
> at
> org.apache.cassandra.service.AbstractReadExecutor.makeRequests(AbstractReadExecutor.java:110)
> ~[main/:na]
> at
> org.apache.cassandra.service.AbstractReadExecutor.makeDataRequests(AbstractReadExecutor.java:85)
> ~[main/:na]
> at
> org.apache.cassandra.service.AbstractReadExecutor$AlwaysSpeculatingReadExecutor.executeAsync(AbstractReadExecutor.java:330)
> ~[main/:na]
> at
> org.apache.cassandra.service.StorageProxy$SinglePartitionReadLifecycle.doInitialQueries(StorageProxy.java:1699)
> ~[main/:na]
> at
> org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:1654)
> ~[main/:na]
> at
> org.apache.cassandra.service.StorageProxy.readRegular(StorageProxy.java:1601)
> ~[main/:na]
> at
> org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1520)
> ~[main/:na]
> at
> org.apache.cassandra.db.SinglePartitionReadCommand.execute(SinglePartitionReadCommand.java:302)
> ~[main/:na]
> at
> org.apache.cassandra.service.pager.AbstractQueryPager.fetchPage(AbstractQueryPager.java:67)
> ~[main/:na]
> at
> org.apache.cassandra.service.pager.SinglePartitionPager.fetchPage(SinglePartitionPager.java:34)
> ~[main/:na]
> at
> org.apache.cassandra.cql3.statements.SelectStatement$Pager$NormalPager.fetchPage(SelectStatement.java:297)
> ~[main/:na]
> at
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:333)
> ~[main/:na]
> at
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:209)
> ~[main/:na]
> at
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:76)
> ~[main/:na]
> at
> org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:206)
> ~[main/:na]
> at
> org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:472)
> ~[main/:na]
> at
> org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:449)
> ~[main/:na]
> at
> org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:130)
> ~[main/:na]
> at
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:507)
> [main/:na]
> at
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:401)
> [main/:na]
> at
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
> [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
> [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at
> io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
> [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at
> io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324)
> [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> [na:1.8.0_51]
> at
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
> [main/:na]
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105)
> [main/:na]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_51]
> {code}
> example failure:
> http://cassci.datastax.com/job/upgrade_tests-all/24/testReport/upgrade_tests.upgrade_through_versions_test/ProtoV3Upgrade_2_1_UpTo_3_0_HEAD/rolling_upgrade_test
> Failed on CassCI build upgrade_tests-all #24
> The stack trace and context match that of CASSANDRA-10122. It looks like it
> may be the same issue.