[ https://issues.apache.org/jira/browse/CASSANDRA-11825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joel Knighton reassigned CASSANDRA-11825:
-----------------------------------------

    Assignee: Joel Knighton

> NPE in gossip
> -------------
>
>                 Key: CASSANDRA-11825
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11825
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: T Jake Luciani
>            Assignee: Joel Knighton
>              Labels: fallout
>             Fix For: 3.0.x
>
>
> We have a test that triggers an NPE in the gossip code.
> The test basically toggles gossip with nodetool disablegossip / enablegossip.
> From the debug log:
> {quote}
> WARN  [RMI TCP Connection(17)-54.153.70.214] 2016-05-17 18:58:44,423 StorageService.java:395 - Starting gossip by operator request
> DEBUG [RMI TCP Connection(17)-54.153.70.214] 2016-05-17 18:58:44,424 StorageService.java:1996 - Node /172.31.24.76 state NORMAL, token [-9223372036854775808]
> INFO  [RMI TCP Connection(17)-54.153.70.214] 2016-05-17 18:58:44,424 StorageService.java:1999 - Node /172.31.24.76 state jump to NORMAL
> DEBUG [RMI TCP Connection(17)-54.153.70.214] 2016-05-17 18:58:44,424 YamlConfigurationLoader.java:102 - Loading settings from file:/mnt/ephemeral/automaton/cassandra-src/conf/cassandra.yaml
> DEBUG [PendingRangeCalculator:1] 2016-05-17 18:58:44,425 PendingRangeCalculatorService.java:66 - finished calculation for 5 keyspaces in 0ms
> DEBUG [GossipStage:1] 2016-05-17 18:58:45,346 FailureDetector.java:456 - Ignoring interval time of 75869093776 for /172.31.31.1
> DEBUG [GossipStage:1] 2016-05-17 18:58:45,347 FailureDetector.java:456 - Ignoring interval time of 75869214424 for /172.31.17.32
> INFO  [GossipStage:1] 2016-05-17 18:58:45,347 Gossiper.java:1028 - Node /172.31.31.1 has restarted, now UP
> DEBUG [GossipStage:1] 2016-05-17 18:58:45,347 StorageService.java:1996 - Node /172.31.31.1 state NORMAL, token [-3074457345618258603]
> INFO  [GossipStage:1] 2016-05-17 18:58:45,347 StorageService.java:1999 - Node /172.31.31.1 state jump to NORMAL
> INFO  [HANDSHAKE-/172.31.31.1] 2016-05-17 18:58:45,348 OutboundTcpConnection.java:514 - Handshaking version with /172.31.31.1
> ERROR [GossipStage:1] 2016-05-17 18:58:45,354 CassandraDaemon.java:195 - Exception in thread Thread[GossipStage:1,5,main]
> java.lang.NullPointerException: null
>       at org.apache.cassandra.gms.Gossiper.getHostId(Gossiper.java:846) ~[main/:na]
>       at org.apache.cassandra.service.StorageService.handleStateNormal(StorageService.java:2008) ~[main/:na]
>       at org.apache.cassandra.service.StorageService.onChange(StorageService.java:1729) ~[main/:na]
>       at org.apache.cassandra.service.StorageService.onJoin(StorageService.java:2446) ~[main/:na]
>       at org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:1050) ~[main/:na]
>       at org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1133) ~[main/:na]
>       at org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:49) ~[main/:na]
>       at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67) ~[main/:na]
>       at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_40]
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_40]
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_40]
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_40]
>       at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40]
> INFO  [GossipStage:1] 2016-05-17 18:58:45,355 Gossiper.java:1028 - Node /172.31.17.32 has restarted, now UP
> DEBUG [GossipStage:1] 2016-05-17 18:58:45,355 StorageService.java:1996 - Node /172.31.17.32 state NORMAL, token [3074457345618258602]
> INFO  [GossipStage:1] 2016-05-17 18:58:45,356 StorageService.java:1999 - Node /172.31.17.32 state jump to NORMAL
> INFO  [HANDSHAKE-/172.31.17.32] 2016-05-17 18:58:45,356 OutboundTcpConnection.java:514 - Handshaking version with /172.31.17.32
> DEBUG [PendingRangeCalculator:1] 2016-05-17 18:58:45,357 PendingRangeCalculatorService.java:66 - finished calculation for 5 keyspaces in 0ms
> DEBUG [GossipStage:1] 2016-05-17 18:58:45,357 MigrationManager.java:94 - Not pulling schema because versions match or shouldPullSchemaFrom returned false
> INFO  [GossipStage:1] 2016-05-17 18:58:45,357 TokenMetadata.java:429 - Updating topology for /172.31.17.32
> INFO  [GossipStage:1] 2016-05-17 18:58:45,358 TokenMetadata.java:429 - Updating topology for /172.31.17.32
> DEBUG [SharedPool-Worker-1] 2016-05-17 18:58:45,358 Gossiper.java:993 - removing expire time for endpoint : /172.31.17.32
> INFO  [SharedPool-Worker-1] 2016-05-17 18:58:45,358 Gossiper.java:994 - InetAddress /172.31.17.32 is now UP
> DEBUG [SharedPool-Worker-1] 2016-05-17 18:58:45,358 MigrationManager.java:94 - Not pulling schema because versions match or shouldPullSchemaFrom returned false
> DEBUG [GossipStage:1] 2016-05-17 18:58:45,358 MigrationManager.java:94 - Not pulling schema because versions match or shouldPullSchemaFrom returned false
> DEBUG [SharedPool-Worker-2] 2016-05-17 18:58:45,360 Gossiper.java:993 - removing expire time for endpoint : /172.31.31.1
> DEBUG [SharedPool-Worker-1] 2016-05-17 18:58:45,360 Gossiper.java:993 - removing expire time for endpoint : /172.31.31.1
> INFO  [SharedPool-Worker-2] 2016-05-17 18:58:45,360 Gossiper.java:994 - InetAddress /172.31.31.1 is now UP
> INFO  [SharedPool-Worker-1] 2016-05-17 18:58:45,360 Gossiper.java:994 - InetAddress /172.31.31.1 is now UP
> WARN  [GossipTasks:1] 2016-05-17 18:58:45,429 FailureDetector.java:287 - Not marking nodes down due to local pause of 75131216102 > 5000000000
> DEBUG [GossipTasks:1] 2016-05-17 18:58:45,429 FailureDetector.java:293 - Still not marking nodes down due to local pause
> INFO  [HANDSHAKE-/172.31.31.1] 2016-05-17 18:58:45,431 OutboundTcpConnection.java:514 - Handshaking version with /172.31.31.1
> DEBUG [GossipTasks:1] 2016-05-17 18:58:46,429 FailureDetector.java:293 - Still not marking nodes down due to local pause
> DEBUG [GossipTasks:1] 2016-05-17 18:58:46,429 FailureDetector.java:293 - Still not marking nodes down due to local pause
> DEBUG [GossipTasks:1] 2016-05-17 18:58:47,430 FailureDetector.java:293 - Still not marking nodes down due to local pause
> DEBUG [GossipTasks:1] 2016-05-17 18:58:47,430 FailureDetector.java:293 - Still not marking nodes down due to local pause
> DEBUG [GossipTasks:1] 2016-05-17 18:58:48,430 FailureDetector.java:293 - Still not marking nodes down due to local pause
> DEBUG [GossipTasks:1] 2016-05-17 18:58:48,430 FailureDetector.java:293 - Still not marking nodes down due to local pause
> {quote}
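
The trace shows the NPE surfacing in Gossiper.getHostId while handleStateNormal runs for a peer that has just been marked as restarted. The ticket does not state the root cause, so the following is only a sketch of one plausible shape of the failure, assuming the host ID is read through a chained lookup on per-endpoint state that may not be populated yet. The class HostIdLookupSketch, its map layout, and the "HOST_ID" key are hypothetical stand-ins, not Cassandra's actual code.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for gossip endpoint state; NOT Cassandra's classes.
final class HostIdLookupSketch {
    // endpoint address -> (application state name -> value)
    private final Map<String, Map<String, String>> endpointStates = new ConcurrentHashMap<>();

    void putState(String endpoint, String key, String value) {
        endpointStates.computeIfAbsent(endpoint, e -> new ConcurrentHashMap<>()).put(key, value);
    }

    // Unguarded chained lookup: throws NullPointerException when the endpoint
    // has no entry, or when the entry has no HOST_ID value yet.
    UUID hostIdUnchecked(String endpoint) {
        return UUID.fromString(endpointStates.get(endpoint).get("HOST_ID"));
    }

    // Guarded lookup: returns an empty Optional instead of throwing, so the
    // caller has to decide what to do when the state is incomplete.
    Optional<UUID> hostId(String endpoint) {
        return Optional.ofNullable(endpointStates.get(endpoint))
                       .map(state -> state.get("HOST_ID"))
                       .map(UUID::fromString);
    }

    public static void main(String[] args) {
        HostIdLookupSketch gossip = new HostIdLookupSketch();
        String peer = "/172.31.31.1";

        // State for the peer has not arrived yet: the guarded lookup is empty.
        System.out.println("known before state arrives: " + gossip.hostId(peer).isPresent());

        try {
            gossip.hostIdUnchecked(peer);
        } catch (NullPointerException e) {
            System.out.println("unguarded lookup threw NullPointerException");
        }

        gossip.putState(peer, "HOST_ID", UUID.randomUUID().toString());
        System.out.println("known after state arrives: " + gossip.hostId(peer).isPresent());
    }
}
```

Whether a guard of this kind is the right fix, or whether the state should never be missing at that point, is a question for the actual patch; the sketch only illustrates how an unguarded lookup fails.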



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
