[ https://issues.apache.org/jira/browse/CASSANDRA-14444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16523989#comment-16523989 ]
Xiaodong Xie commented on CASSANDRA-14444: ------------------------------------------ Ah, thanks for the explanation, [~eperott], what you saw in your test actually happened during our upgrade. Without the patch, we were seeing tens to hundreds of NPEs per seconds in the Cassandra cluster; while after this patch, the Cassandra cluster was quiet, but the app started to throw the exception you mentioned (far less often, probably several per minute in average, which was tolerable in our use case). I thought we might encounter this bug in the driver: [https://datastax-oss.atlassian.net/browse/JAVA-1740,] but the exception still happened after we bumped the driver version to 3.5.0. Thanks again for your test and explanation, [~eperott]. (y) > Got NPE when querying Cassandra 3.11.2 > -------------------------------------- > > Key: CASSANDRA-14444 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14444 > Project: Cassandra > Issue Type: Bug > Components: CQL > Environment: Ubuntu 14.04, JDK 1.8.0_171. > Cassandra 3.11.2 > Reporter: Xiaodong Xie > Priority: Blocker > > We just upgraded our Cassandra cluster from 2.2.6 to 3.11.2 > After upgrading, we immediately got exceptions in Cassandra like this one: > > {code} > ERROR [Native-Transport-Requests-1] 2018-05-11 17:10:21,994 > QueryMessage.java:129 - Unexpected error during query > java.lang.NullPointerException: null > at > org.apache.cassandra.dht.RandomPartitioner.getToken(RandomPartitioner.java:248) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.dht.RandomPartitioner.decorateKey(RandomPartitioner.java:92) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at org.apache.cassandra.config.CFMetaData.decorateKey(CFMetaData.java:666) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.service.pager.PartitionRangeQueryPager.<init>(PartitionRangeQueryPager.java:44) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.db.PartitionRangeReadCommand.getPager(PartitionRangeReadCommand.java:268) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.cql3.statements.SelectStatement.getPager(SelectStatement.java:475) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:288) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:118) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:224) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:255) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:240) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:116) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:517) > [apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:410) > [apache-cassandra-3.11.2.jar:3.11.2] > at > io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) > [netty-all-4.0.44.Final.jar:4.0.44.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357) > [netty-all-4.0.44.Final.jar:4.0.44.Final] > at > io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35) > [netty-all-4.0.44.Final.jar:4.0.44.Final] > at > io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:348) > [netty-all-4.0.44.Final.jar:4.0.44.Final] > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > [na:1.8.0_171] > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) > [apache-cassandra-3.11.2.jar:3.11.2] > at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) > [apache-cassandra-3.11.2.jar:3.11.2] > at java.lang.Thread.run(Thread.java:748) [na:1.8.0_171] > {code} > > The table schema is like: > {code} > CREATE TABLE example.example_table ( > id bigint, > hash text, > json text, > PRIMARY KEY (id, hash) > ) WITH COMPACT STORAGE > {code} > > The query is something like: > {code} > "select * from example.example_table;" // (We do know this is bad practise, > and we are trying to fix that right now) > {code} > with fetch-size as 200, using DataStax Java driver. > This table contains about 20k rows. > > Actually, the fix is quite simple, > > {code} > --- a/src/java/org/apache/cassandra/service/pager/PagingState.java > +++ b/src/java/org/apache/cassandra/service/pager/PagingState.java > @@ -46,7 +46,7 @@ public class PagingState > public PagingState(ByteBuffer partitionKey, RowMark rowMark, int remaining, > int remainingInPartition) > { > - this.partitionKey = partitionKey; > + this.partitionKey = partitionKey == null ? ByteBufferUtil.EMPTY_BYTE_BUFFER > : partitionKey; > this.rowMark = rowMark; > this.remaining = remaining; > this.remainingInPartition = remainingInPartition; > {code} > > "partitionKey == null ? ByteBufferUtil.EMPTY_BYTE_BUFFER : partitionKey;" is > in 2.2.6 and 2.2.8. But it was removed for some reason. > The interesting part is that, we have: > {code} > public final ByteBuffer partitionKey; // Can be null for single partition > queries. > {code} > It seems "partitionKey" could be null. > Thanks a lot. > > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org