[
https://issues.apache.org/jira/browse/CASSANDRA-3466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13147054#comment-13147054
]
Jonas Borgström commented on CASSANDRA-3466:
--------------------------------------------
> I haven't been able to reproduce the assertion errors, but I did find what is
> preventing hint delivery in some cases
Brandon, did you verify that removing those lines of code actually fixes hint
delivery?
Instead of changing the code, I ran a quick experiment: "nodetool flush" on
the node holding the hints, followed by a restart of the other node. That was
not enough to trigger hint delivery:
{code}
Node1 notices that node2 is back up
INFO 14:41:50,752 Node /127.0.0.2 has restarted, now UP
INFO 14:41:50,752 InetAddress /127.0.0.2 is now UP
INFO 14:41:50,753 Node /127.0.0.2 state jump to normal
But no hints are delivered...
nodetool flush is used to make sure hints hit the disk on node1:
INFO 14:42:32,675 Enqueuing flush of Memtable-Versions@1503666327(83/103 serialized/live bytes, 3 ops)
INFO 14:42:32,675 Writing Memtable-Versions@1503666327(83/103 serialized/live bytes, 3 ops)
INFO 14:42:32,681 Completed flushing /tmp/node1/data/data/system/Versions-h-1-Data.db (247 bytes)
INFO 14:42:32,682 Enqueuing flush of Memtable-HintsColumnFamily@737188401(177/221 serialized/live bytes, 1 ops)
INFO 14:42:32,682 Writing Memtable-HintsColumnFamily@737188401(177/221 serialized/live bytes, 1 ops)
INFO 14:42:32,688 Completed flushing /tmp/node1/data/data/system/HintsColumnFamily-h-1-Data.db (277 bytes)
INFO 14:42:32,691 Enqueuing flush of Memtable-bar@1831941861(17/21 serialized/live bytes, 1 ops)
INFO 14:42:32,691 Writing Memtable-bar@1831941861(17/21 serialized/live bytes, 1 ops)
INFO 14:42:32,694 Completed flushing /tmp/node1/data/data/foo/bar-h-1-Data.db (68 bytes)
Node2 is restarted once more to check if this will trigger hints delivery:
INFO 14:42:54,650 InetAddress /127.0.0.2 is now dead.
INFO 14:43:02,628 Node /127.0.0.2 has restarted, now UP
INFO 14:43:02,629 InetAddress /127.0.0.2 is now UP
INFO 14:43:02,629 Node /127.0.0.2 state jump to normal
Still nothing... Restarting node1 does deliver the hints within a few seconds,
though...
{code}
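The check done by eye on the log above could be scripted roughly as follows. This is only a sketch: it assumes log lines in the abbreviated "INFO <time> ..." form pasted here, and it assumes that a delivery attempt is announced by a line containing "Started hinted handoff for endpoint". That marker wording, and the function name ups_without_handoff, are assumptions for illustration and should be adjusted to whatever the node actually logs.

```python
import re

# Matches the "is now UP" / "is now dead" lines in the abbreviated form
# shown in the log excerpt above.
STATE_RE = re.compile(r"INFO (\S+) InetAddress (/\S+) is now (UP|dead)")

# ASSUMPTION: wording of the line logged when hint delivery starts.
HANDOFF_MARKER = "Started hinted handoff for endpoint"


def ups_without_handoff(lines, endpoint, marker=HANDOFF_MARKER):
    """Return timestamps of UP events for `endpoint` that were not followed
    by a handoff line before the next state change (or the end of the log)."""
    missing = []
    pending = None  # timestamp of the latest UP event still awaiting a handoff
    for line in lines:
        m = STATE_RE.search(line)
        if m and m.group(2) == endpoint:
            # A new state change closes the previous UP window.
            if pending is not None:
                missing.append(pending)
            pending = m.group(1) if m.group(3) == "UP" else None
        elif pending is not None and marker in line and endpoint in line:
            # Delivery started for this endpoint; the UP event is accounted for.
            pending = None
    if pending is not None:
        missing.append(pending)
    return missing
```

Fed the node1 log lines from this comment with endpoint "/127.0.0.2", it would report both UP events (14:41:50,752 and 14:43:02,629) as having no delivery attempt, which matches what I saw.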
Reproducing the assertion error is a bit tricky, but after letting my two-node
test cluster perform hint delivery for each other a few times I was able to
reproduce it once more. Is there anything special you would like me to test?
> Hinted handoff not working after rolling upgrade from 0.8.7 to 1.0.2
> --------------------------------------------------------------------
>
> Key: CASSANDRA-3466
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3466
> Project: Cassandra
> Issue Type: Bug
> Affects Versions: 1.0.0
> Reporter: Jonas Borgström
> Assignee: Brandon Williams
> Labels: hintedhandoff
> Fix For: 1.0.3
>
>
> While testing rolling upgrades from 0.8.7 to 1.0.2 on a test cluster I've
> noticed that hinted hand-off didn't always work properly. Hints generated on
> an upgraded node do not seem to be delivered to other newly upgraded nodes
> once they rejoin the ring. The only way I've found to get a node to deliver
> its hints is to restart it.
> Here are some steps to reproduce this issue:
> 1. Install Cassandra 0.8.7 on node1 and node2 using default settings.
> 2. Create keyspace foo with {replication_factor: 2}. Create column family bar
> 3. Shut down node2
> 4. Insert data into bar and verify that HintsColumnFamily on node1 contains
> hints for node2
> 5. Start node2 and verify that hinted handoff is performed and
> HintsColumnFamily becomes empty again.
> 6. Upgrade and restart node1
> 7. Shut down node2
> 8. Insert data into bar and verify that HintsColumnFamily on node1 contains
> hints for node2
> 9. Upgrade and start node2
> 10. Notice that hinted handoff is *not* performed when node2 comes back.
> (It is performed only if node1 is restarted.)