Rick Branson created CASSANDRA-5211:
---------------------------------------

             Summary: Migrating Clusters with gossip tables that have old dead 
nodes causes NPE, inability to join cluster
                 Key: CASSANDRA-5211
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5211
             Project: Cassandra
          Issue Type: Bug
          Components: Core
    Affects Versions: 1.2.1
            Reporter: Rick Branson


I had done a removetoken on this cluster when it was 1.1.x, and it had a 
"ghost" entry for the removed node still in the stored ring data. When the 
nodes loaded the table up after conversion to 1.2 and attempting to migrate to 
VNodes, I got the following traceback:

ERROR [WRITE-/10.0.0.0] 2013-01-31 18:35:44,788 CassandraDaemon.java (line 133) 
Exception in thread Thread[WRITE-/10.0.0.0,5,main]
java.lang.NullPointerException
        at 
org.apache.cassandra.utils.ByteBufferUtil.string(ByteBufferUtil.java:167)
        at 
org.apache.cassandra.utils.ByteBufferUtil.string(ByteBufferUtil.java:124)
        at org.apache.cassandra.cql.jdbc.JdbcUTF8.getString(JdbcUTF8.java:73)
        at org.apache.cassandra.cql.jdbc.JdbcUTF8.compose(JdbcUTF8.java:93)
        at org.apache.cassandra.db.marshal.UTF8Type.compose(UTF8Type.java:32)
        at 
org.apache.cassandra.cql3.UntypedResultSet$Row.getString(UntypedResultSet.java:96)
        at 
org.apache.cassandra.db.SystemTable.loadDcRackInfo(SystemTable.java:402)
        at 
org.apache.cassandra.locator.Ec2Snitch.getDatacenter(Ec2Snitch.java:117)
        at 
org.apache.cassandra.locator.DynamicEndpointSnitch.getDatacenter(DynamicEndpointSnitch.java:127)
        at 
org.apache.cassandra.net.OutboundTcpConnection.isLocalDC(OutboundTcpConnection.java:74)
        at 
org.apache.cassandra.net.OutboundTcpConnection.connect(OutboundTcpConnection.java:270)
        at 
org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:142)

This is because these ghost nodes had a NULL tokens list in the system/peers 
table. A workaround was to delete the offending row in the system/peers table 
and restart the node.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to