[ https://issues.apache.org/jira/browse/CASSANDRA-4189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261991#comment-13261991 ]
Vijay commented on CASSANDRA-4189:
----------------------------------
{quote}
Also, on (mis)completion we flush and force a compaction that should clear out
the tombstones (see CASSANDRA-3733) so I'm skeptical this is a real problem.
{quote}
Maybe the above will fix it. The hints CF (about 10GB) is too large for the
node in question, so I have to do more tests.
{quote}
Sure, in a two node cluster maybe the single threaded nature is a problem, but
in any cluster of appreciable size it's always overload that's an issue, so I
don't see much to be gained by multithreading it.
{quote}
No, the problem is when you have tens of nodes and they are all in different
DCs: replay is naturally throttled by the latency of hundreds of milliseconds.
While the single thread is stuck replaying hints to one remote node, no other
node gets its hints. What I am suggesting is to keep the throttle but apply it
in a multi-threaded way.
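A minimal sketch of what "throttle, but multi-threaded" could look like, assuming one replay task per target endpoint and a single byte budget shared by all threads. Every name here (ParallelHintReplaySketch, HintSender, ByteThrottle, replayAll) is hypothetical and only illustrates the idea; none of it is existing Cassandra code.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; none of these names are Cassandra APIs.
final class ParallelHintReplaySketch {

    /** Hypothetical callback standing in for the network send to one endpoint. */
    interface HintSender {
        void send(String endpoint, byte[] hint) throws Exception;
    }

    /** Shared byte budget: callers wait until their bytes fit under the rate. */
    static final class ByteThrottle {
        private final long bytesPerSecond;
        private long nextFreeNanos = System.nanoTime();

        ByteThrottle(long bytesPerSecond) {
            if (bytesPerSecond <= 0) throw new IllegalArgumentException("rate must be positive");
            this.bytesPerSecond = bytesPerSecond;
        }

        void acquire(int bytes) throws InterruptedException {
            long waitNanos;
            synchronized (this) {
                long now = System.nanoTime();
                long start = Math.max(now, nextFreeNanos);
                // Reserve the time slice these bytes consume at the configured rate.
                nextFreeNanos = start + bytes * 1_000_000_000L / bytesPerSecond;
                waitNanos = start - now;
            }
            // Sleep outside the lock so other threads can reserve their own slices.
            if (waitNanos > 0) TimeUnit.NANOSECONDS.sleep(waitNanos);
        }
    }

    /** Replays hints to all endpoints in parallel; returns delivered counts per endpoint. */
    static Map<String, Integer> replayAll(Map<String, List<byte[]>> hintsByEndpoint,
                                          int threads, long bytesPerSecond,
                                          HintSender sender) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        ByteThrottle throttle = new ByteThrottle(bytesPerSecond);
        Map<String, Integer> delivered = new ConcurrentHashMap<>();
        for (Map.Entry<String, List<byte[]>> e : hintsByEndpoint.entrySet()) {
            // One task per endpoint: a slow remote DC only stalls its own task.
            pool.submit(() -> {
                int count = 0;
                try {
                    for (byte[] hint : e.getValue()) {
                        throttle.acquire(hint.length);
                        sender.send(e.getKey(), hint);
                        count++;
                    }
                } catch (Exception ex) {
                    // Stop this endpoint on failure; its remaining hints stay undelivered.
                    if (ex instanceof InterruptedException) Thread.currentThread().interrupt();
                } finally {
                    delivered.put(e.getKey(), count);
                }
            });
        }
        pool.shutdown();
        // Arbitrary upper bound for the sketch; a real implementation would pick its own.
        pool.awaitTermination(1, TimeUnit.HOURS);
        return delivered;
    }
}
```

With this shape the total replay rate stays bounded by the shared budget, while a high-latency endpoint no longer holds up delivery to the others.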
> Improve hints replay
> --------------------
>
> Key: CASSANDRA-4189
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4189
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Affects Versions: 1.2
> Reporter: Vijay
> Assignee: Vijay
> Priority: Minor
> Fix For: 1.2
>
>
> Problem: Hints are stored in one row.
> When a lot of hints are stored, we also store tombstones for the ones that
> have been replayed.
> It might be worth sharding the hints based on the hour at which they are
> stored. This can reduce the complexity of scanning for hints.
> Problem: Hints replay is too slow and single-threaded.
> There are use cases where the hints need to be replayed ASAP to make the
> cluster more consistent.
> In a multi-region cluster, replay is already throttled by the latency,
> which is on the order of hundreds of milliseconds.
> It might be worth trying to replay the hints in parallel and throttle on the
> number of bytes read from disk, or apply the existing sleep-interval
> throttle setting on all the threads.
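The hour-based sharding idea from the first problem in the description could be sketched roughly as follows. This assumes a row key made of the target endpoint plus an hour bucket; the class name, method names, and key layout are all hypothetical and are not the actual hints schema.

```java
// Hypothetical sketch of hour-bucketed hint rows; not the actual hints schema.
final class HourlyHintShardSketch {
    static final long MILLIS_PER_HOUR = 60L * 60L * 1000L;

    /** Hour bucket for the time a hint was written (epoch millis). */
    static long hourBucket(long writeTimeMillis) {
        // floorDiv keeps buckets contiguous even for pre-epoch timestamps.
        return Math.floorDiv(writeTimeMillis, MILLIS_PER_HOUR);
    }

    /** Assumed row key layout: one row per target endpoint per hour. */
    static String rowKey(String targetEndpoint, long writeTimeMillis) {
        return targetEndpoint + ":" + hourBucket(writeTimeMillis);
    }
}
```

Under that assumption, replay scans one hour's row at a time, so tombstones from hours that were already fully replayed never appear in the scan.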