Yes, I'll do that.

/Fredrik
Sylvain Lebresne wrote 2011-12-01 11:10:
You're right, good catch.
Do you mind opening a ticket on jira
(https://issues.apache.org/jira/browse/CASSANDRA)?

--
Sylvain

On Thu, Dec 1, 2011 at 10:03 AM, Fredrik L Stigbäck
<fredrik.l.stigb...@sitevision.se>  wrote:
Hi,
We're running Cassandra 1.0.3.
I've done some testing with 2 nodes (node A, node B), replication factor 2.
I take node A down, write some data to node B, and then bring node A back up.
Sometimes hints aren't delivered when node A comes up.

I've done some debugging in org.apache.cassandra.db.HintedHandOffManager and
sometimes node B ends up in a strange state in method
org.apache.cassandra.db.HintedHandOffManager.deliverHints(final InetAddress
to), where org.apache.cassandra.db.HintedHandOffManager.queuedDeliveries
already has node A in its Set and therefore no hints will ever be delivered
to node A.
The only reason for this that I can see is that in
org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(InetAddress
endpoint) the hintStore.isEmpty() check returns true and the endpoint (node
A)  isn't removed from
org.apache.cassandra.db.HintedHandOffManager.queuedDeliveries. Then no hints
will ever be delivered again until node B is restarted.
Under what conditions will hintStore.isEmpty() return true?
Shouldn't the hintStore.isEmpty() check be inside the try {} finally{}
clause, removing the endpoint from queuedDeliveries in the finally block?

public void deliverHints(final InetAddress to)
{
    logger_.debug("deliverHints to {}", to);
    if (!queuedDeliveries.add(to))
        return;
    .......
}

private void deliverHintsToEndpoint(InetAddress endpoint) throws
        IOException, DigestMismatchException, InvalidRequestException,
        TimeoutException,
{
    ColumnFamilyStore hintStore = Table.open(Table.SYSTEM_TABLE).getColumnFamilyStore(HINTS_CF);
    if (hintStore.isEmpty())
        return; // nothing to do, don't confuse users by logging a no-op handoff
    try
    {
        ......
    }
    finally
    {
        queuedDeliveries.remove(endpoint);
    }
}
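
To make the suggestion concrete, here is a minimal, self-contained sketch of the change I have in mind. It is not the real HintedHandOffManager: the class HintDeliverySketch, the hintStoreEmpty flag (standing in for hintStore.isEmpty()), the deliveries counter, and the boolean return value of deliverHints are all made up for illustration, and I am assuming queuedDeliveries behaves like a concurrent Set. The only point is that the emptiness check sits inside the try block, so the finally block removes the endpoint on every exit path.

```java
import java.net.InetAddress;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for HintedHandOffManager; not the real Cassandra class.
class HintDeliverySketch
{
    // Assumption: endpoints that currently have a delivery queued or running.
    private final Set<InetAddress> queuedDeliveries = ConcurrentHashMap.newKeySet();

    // Hypothetical stand-in for hintStore.isEmpty().
    private volatile boolean hintStoreEmpty;

    // Counts how many times the (elided) delivery work actually ran.
    private int deliveries;

    void setHintStoreEmpty(boolean empty) { hintStoreEmpty = empty; }
    boolean isQueued(InetAddress endpoint) { return queuedDeliveries.contains(endpoint); }
    int getDeliveries() { return deliveries; }

    // Returns false when the endpoint is already queued, true otherwise.
    boolean deliverHints(InetAddress to)
    {
        if (!queuedDeliveries.add(to))
            return false;
        deliverHintsToEndpoint(to);
        return true;
    }

    private void deliverHintsToEndpoint(InetAddress endpoint)
    {
        try
        {
            // The check now lives inside the try, so the early return
            // still passes through the finally block below.
            if (hintStoreEmpty)
                return;
            deliveries++; // placeholder for the real hint delivery
        }
        finally
        {
            queuedDeliveries.remove(endpoint);
        }
    }

    public static void main(String[] args)
    {
        HintDeliverySketch manager = new HintDeliverySketch();
        InetAddress nodeA = InetAddress.getLoopbackAddress();

        manager.setHintStoreEmpty(true);
        manager.deliverHints(nodeA); // early return, endpoint is still removed

        manager.setHintStoreEmpty(false);
        boolean accepted = manager.deliverHints(nodeA);
        System.out.println("second attempt accepted: " + accepted);
        System.out.println("deliveries: " + manager.getDeliveries());
    }
}
```

With the check outside the try, as in the code quoted above, the first call would leave node A in queuedDeliveries and every later deliverHints call for it would return immediately.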

Regards
/Fredrik
