Hinted handoffs aren't delivered if/when HintedHandOffManager ends up in an invalid
state.
--------------------------------------------------------------------------------------
Key: CASSANDRA-3546
URL: https://issues.apache.org/jira/browse/CASSANDRA-3546
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 1.0.3
Reporter: Fredrik L Stigbäck
Running Cassandra 1.0.3.
I've done some testing with 2 nodes (node A, node B), replication factor 2.
I take node A down, write some data to node B, and then bring node A back up.
Sometimes hints aren't delivered when node A comes up.
I've done some debugging in org.apache.cassandra.db.HintedHandOffManager and
sometimes node B ends up in a strange state in method
org.apache.cassandra.db.HintedHandOffManager.deliverHints(final InetAddress
to), where org.apache.cassandra.db.HintedHandOffManager.queuedDeliveries
already has node A in its Set, and therefore no hints will ever be delivered to
node A.
The only reason for this that I can see is that in
org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(InetAddress
endpoint) the hintStore.isEmpty() check returns true and the endpoint (node A)
isn't removed from
org.apache.cassandra.db.HintedHandOffManager.queuedDeliveries. Then no hints
will ever be delivered again until node B is restarted.
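The stuck state described above can be modelled in isolation. The following is only a sketch, not the Cassandra code: the class name StuckDeliveryDemo, the hintStoreEmpty flag (standing in for hintStore.isEmpty()), and the use of String endpoints are assumptions made for illustration, and delivery runs synchronously here for simplicity.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class StuckDeliveryDemo
{
    // Stand-in for HintedHandOffManager.queuedDeliveries (endpoints as plain strings).
    static final Set<String> queuedDeliveries = ConcurrentHashMap.newKeySet();
    // Hypothetical flag standing in for hintStore.isEmpty().
    static boolean hintStoreEmpty = true;
    // Counts how many times the (placeholder) delivery body actually ran.
    static int deliveries = 0;

    // Mirrors deliverHints: refuse to queue an endpoint that is already queued.
    static boolean deliverHints(String to)
    {
        if (!queuedDeliveries.add(to))
            return false; // already queued, so nothing is scheduled
        deliverHintsToEndpoint(to);
        return true;
    }

    // Mirrors deliverHintsToEndpoint with the early return outside try/finally.
    static void deliverHintsToEndpoint(String endpoint)
    {
        if (hintStoreEmpty)
            return; // leaves the endpoint in queuedDeliveries
        try
        {
            deliveries++; // placeholder for the real delivery work
        }
        finally
        {
            queuedDeliveries.remove(endpoint);
        }
    }

    public static void main(String[] args)
    {
        System.out.println(deliverHints("nodeA"));              // true: queued, but returned early
        hintStoreEmpty = false;                                 // hints now exist
        System.out.println(deliverHints("nodeA"));              // false: rejected as already queued
        System.out.println(queuedDeliveries.contains("nodeA")); // true: stuck until restart
    }
}
```

In this model, one early return is enough to block every later delivery attempt to the same endpoint.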
Under what conditions will hintStore.isEmpty() return true?
Shouldn't the hintStore.isEmpty() check be inside the try {} finally{} clause,
removing the endpoint from queuedDeliveries in the finally block?
{code}
public void deliverHints(final InetAddress to)
{
    logger_.debug("deliverHints to {}", to);
    if (!queuedDeliveries.add(to))
        return;
    .......
}
{code}
{code}
private void deliverHintsToEndpoint(InetAddress endpoint)
throws IOException, DigestMismatchException, InvalidRequestException, TimeoutException, InterruptedException
{
    ColumnFamilyStore hintStore = Table.open(Table.SYSTEM_TABLE).getColumnFamilyStore(HINTS_CF);
    if (hintStore.isEmpty())
        return; // nothing to do, don't confuse users by logging a no-op handoff
    try
    {
        ......
    }
    finally
    {
        queuedDeliveries.remove(endpoint);
    }
}
{code}
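A possible shape for the change suggested in the description, with the emptiness check moved inside the try block so that the finally block always removes the endpoint. This is a sketch under assumptions, not a patch against HintedHandOffManager: the class GuardedDeliverySketch, the BooleanSupplier standing in for hintStore.isEmpty(), and the deliveries counter are hypothetical, and the real delivery logic is replaced by a placeholder.

```java
import java.net.InetAddress;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BooleanSupplier;

public class GuardedDeliverySketch
{
    private final Set<InetAddress> queuedDeliveries = ConcurrentHashMap.newKeySet();
    // Hypothetical stand-in for hintStore.isEmpty().
    private final BooleanSupplier hintStoreIsEmpty;
    private int deliveries = 0;

    public GuardedDeliverySketch(BooleanSupplier hintStoreIsEmpty)
    {
        this.hintStoreIsEmpty = hintStoreIsEmpty;
    }

    public boolean deliverHints(InetAddress to)
    {
        if (!queuedDeliveries.add(to))
            return false; // a delivery for this endpoint is already in progress
        deliverHintsToEndpoint(to);
        return true;
    }

    private void deliverHintsToEndpoint(InetAddress endpoint)
    {
        try
        {
            if (hintStoreIsEmpty.getAsBoolean())
                return; // nothing to do; finally still runs on this path
            deliveries++; // placeholder for the elided delivery logic
        }
        finally
        {
            // Runs on early return, normal completion, and exceptions alike.
            queuedDeliveries.remove(endpoint);
        }
    }

    public boolean isQueued(InetAddress endpoint) { return queuedDeliveries.contains(endpoint); }

    public int deliveries() { return deliveries; }

    public static void main(String[] args)
    {
        boolean[] empty = { true };
        GuardedDeliverySketch manager = new GuardedDeliverySketch(() -> empty[0]);
        InetAddress nodeA = InetAddress.getLoopbackAddress();

        System.out.println(manager.deliverHints(nodeA)); // true
        System.out.println(manager.isQueued(nodeA));     // false: removed despite early return
        empty[0] = false;
        System.out.println(manager.deliverHints(nodeA)); // true: endpoint can be queued again
        System.out.println(manager.deliveries());        // 1
    }
}
```

With the check inside the try block, an empty hint store no longer leaves the endpoint behind in the set, so a later call for the same endpoint is accepted.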