Jaydeepkumar Chovatia created CASSANDRA-14525:
-------------------------------------------------

             Summary: streaming failure during bootstrap makes new node into 
inconsistent state
                 Key: CASSANDRA-14525
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14525
             Project: Cassandra
          Issue Type: Bug
          Components: Core
            Reporter: Jaydeepkumar Chovatia
            Assignee: Jaydeepkumar Chovatia


If bootstrap fails for newly joining node (most common reason is due to 
streaming failure) then Cassandra state remains in {{joining}} state which is 
fine but Cassandra also enables Native transport which makes overall state 
inconsistent. This further creates NullPointer exception if auth is enabled on 
the new node, please find reproducible steps here:

For example if bootstrap fails due to streaming errors like
{quote}
java.util.concurrent.ExecutionException: 
org.apache.cassandra.streaming.StreamException: Stream failed
        at 
com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)
 ~[guava-18.0.jar:na]
        at 
com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
 ~[guava-18.0.jar:na]
        at 
com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) 
~[guava-18.0.jar:na]
        at 
org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1256) 
[apache-cassandra-3.0.16.jar:3.0.16]
        at 
org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:894)
 [apache-cassandra-3.0.16.jar:3.0.16]
        at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:660) 
[apache-cassandra-3.0.16.jar:3.0.16]
        at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:573) 
[apache-cassandra-3.0.16.jar:3.0.16]
        at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:330) 
[apache-cassandra-3.0.16.jar:3.0.16]
        at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:567) 
[apache-cassandra-3.0.16.jar:3.0.16]
        at 
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:695) 
[apache-cassandra-3.0.16.jar:3.0.16]
Caused by: org.apache.cassandra.streaming.StreamException: Stream failed
        at 
org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
 ~[apache-cassandra-3.0.16.jar:3.0.16]
        at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310) 
~[guava-18.0.jar:na]
        at 
com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
 ~[guava-18.0.jar:na]
        at 
com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
 ~[guava-18.0.jar:na]
        at 
com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145) 
~[guava-18.0.jar:na]
        at 
com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
 ~[guava-18.0.jar:na]
        at 
org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:211)
 ~[apache-cassandra-3.0.16.jar:3.0.16]
        at 
org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:187)
 ~[apache-cassandra-3.0.16.jar:3.0.16]
        at 
org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:440)
 ~[apache-cassandra-3.0.16.jar:3.0.16]
        at 
org.apache.cassandra.streaming.StreamSession.onError(StreamSession.java:540) 
~[apache-cassandra-3.0.16.jar:3.0.16]
        at 
org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:307)
 ~[apache-cassandra-3.0.16.jar:3.0.16]
        at 
org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
 ~[apache-cassandra-3.0.16.jar:3.0.16]
        at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
{quote}


then variable [StorageService.java::dataAvailable 
|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/StorageService.java#L892]
 will be {{false}}. Since {{dataAvailable}} is {{false}} hence at it will not 
call [StorageService.java::finishJoiningRing 
|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/StorageService.java#L933]
 and as a result 
[StorageService.java::doAuthSetup|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/StorageService.java#L999]
 will not be invoked. 

API [StorageService.java::joinTokenRing | 
https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/StorageService.java#L763]
 returns without any problem. After this 
[CassandraDaemon.java::start|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/CassandraDaemon.java#L584]
 is invoked which starts native transport at 
[CassandraDaemon.java::startNativeTransport 
|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/CassandraDaemon.java#L478]
 

At this point daemon’s bootstrap is still not finished and transport is 
enabled. So client will connect to the node and will encounter 
{{java.lang.NullPointerException}} as following:


{quote}
ERROR [SharedPool-Worker-2] Message.java:647 - Unexpected exception during 
request; channel = [id: 0x412a26b3, L:/a.b.c.d:9042 - R:/p.q.r.s:20121]
java.lang.NullPointerException: null
        at 
org.apache.cassandra.auth.PasswordAuthenticator.doAuthenticate(PasswordAuthenticator.java:160)
 ~[apache-cassandra-3.0.16.jar:3.0.16]
        at 
org.apache.cassandra.auth.PasswordAuthenticator.authenticate(PasswordAuthenticator.java:82)
 ~[apache-cassandra-3.0.16.jar:3.0.16]
        at 
org.apache.cassandra.auth.PasswordAuthenticator.access$100(PasswordAuthenticator.java:54)
 ~[apache-cassandra-3.0.16.jar:3.0.16]
        at 
org.apache.cassandra.auth.PasswordAuthenticator$PlainTextSaslAuthenticator.getAuthenticatedUser(PasswordAuthenticator.java:198)
 ~[apache-cassandra-3.0.16.jar:3.0.16]
        at 
org.apache.cassandra.transport.messages.AuthResponse.execute(AuthResponse.java:78)
 ~[apache-cassandra-3.0.16.jar:3.0.16]
        at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:535)
 [apache-cassandra-3.0.16.jar:3.0.16]
        at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:429)
 [apache-cassandra-3.0.16.jar:3.0.16]
        at 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
 [netty-all-4.1.0.CR6.jar:4.1.0.CR6]
        at 
io.netty.channel.ChannelHandlerInvokerUtil.invokeChannelReadNow(ChannelHandlerInvokerUtil.java:83)
 [netty-all-4.1.0.CR6.jar:4.1.0.CR6]
        at 
io.netty.channel.DefaultChannelHandlerInvoker$7.run(DefaultChannelHandlerInvoker.java:159)
 [netty-all-4.1.0.CR6.jar:4.1.0.CR6]
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[na:1.8.0_121]
        at 
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
 [apache-cassandra-3.0.16.jar:3.0.16]
        at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
[apache-cassandra-3.0.16.jar:3.0.16]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
{quote}

At this point if we run {{nodetool status}} then it will show this new node in 
{{UJ}} state, however clients can connect to this node over {{CQL}} and will 
receive {{java.lang.NullPointerException}}





 





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to