[ 
https://issues.apache.org/jira/browse/CASSANDRA-4626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-4626:
----------------------------------------

    Attachment: 4626.txt

I think this can happen because of the commit log. Basically, when you restart 
a node, it may not pick the correct current NodeId if it attempts to read the 
current NodeId before the commit log is fully replayed (and the more recent 
NodeId is in the log, not yet replayed). This would then lead to having 2 
columns in the CurrentLocal row.

However, the main problem is that the way we maintain the CurrentLocal row is 
fragile and honestly dumb (I wrote it so I'm blaming myself). We already store 
all the generated NodeIds sorted by creation time in a separate row, so reading 
the last column of that row is a much simpler and more resilient way to do it. 
Attaching a patch that does just that.
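
To illustrate the idea only (this is not the attached patch, and the class and 
method names below are hypothetical, not Cassandra's SystemTable API): if every 
generated NodeId is kept sorted by creation time, the current one is just the 
last entry, so there is no separate "current" marker that can end up with two 
values. A minimal sketch, assuming an in-memory sorted map stands in for the row:

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.UUID;

// Hypothetical stand-in for the row that holds every generated NodeId,
// ordered by creation time. Not Cassandra's actual code.
class NodeIdHistory {
    // creation timestamp (ms) -> NodeId; TreeMap keeps keys in ascending order
    private final TreeMap<Long, UUID> idsByCreationTime = new TreeMap<>();

    // Record a newly generated NodeId together with its creation time.
    void record(long createdAtMillis, UUID nodeId) {
        idsByCreationTime.put(createdAtMillis, nodeId);
    }

    // The current NodeId is the most recently created one, i.e. the last
    // entry of the sorted row. Returns null if no NodeId was ever recorded.
    UUID current() {
        Map.Entry<Long, UUID> last = idsByCreationTime.lastEntry();
        return last == null ? null : last.getValue();
    }
}
```

With the two columns from the report, the lookup returns the NodeId with the 
later timestamp regardless of the order in which the entries were written.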

The patch also adds a forceFlush in SystemTable.writeCurrentNodeId to avoid the 
problem of not reading the last NodeId because of log replay.

                
> Multiple values for CurrentLocal Node ID
> ----------------------------------------
>
>                 Key: CASSANDRA-4626
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4626
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.11
>            Reporter: Aaron Morton
>         Attachments: 4626.txt
>
>
> From this email thread 
> http://www.mail-archive.com/user@cassandra.apache.org/msg24677.html
> There are multiple columns for the CurrentLocal row in NodeIdInfo:
> {noformat}
> [default@system] list NodeIdInfo ;
> Using default limit of 100
> ...
> -------------------
> RowKey: 43757272656e744c6f63616c
> => (column=01efa5d0-e133-11e1-0000-51be601cd0ff, value=0a1020d2, 
> timestamp=1344414498989)
> => (column=92109b80-ea0a-11e1-0000-51be601cd0af, value=0a1020d2, 
> timestamp=1345386691897)
> {noformat}
> SystemTable.getCurrentLocalNodeId() throws an assertion that occurs when the 
> static constructor for o.a.c.utils.NodeId is in the stack.
> The impact is a java.lang.NoClassDefFoundError when accessing a particular CF 
> (I assume one with counters) on a particular node.
> Cannot see an obvious cause in the code. 

