[ https://issues.apache.org/jira/browse/CASSANDRA-2253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis updated CASSANDRA-2253:
--------------------------------------
    Affects Version/s:     (was: 0.7.2)
                           (was: 0.7.1)
        Fix Version/s:     (was: 0.7.0)
                       0.7.4
> Gossiper Starvation
> -------------------
>
> Key: CASSANDRA-2253
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2253
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.7.0
> Environment: linux, windows
> Reporter: Mikael Sitruk
> Fix For: 0.7.4
>
> Attachments: CASSANDRA-0.7-2253.txt
>
> Original Estimate: 2h
> Remaining Estimate: 2h
>
> The Gossiper periodic task is starved when large sstable files need to be
> deleted.
> SSTableDeletingReference uses the same scheduledTasks pool (from
> StorageService) as the Gossiper and other periodic tasks, but the Gossiper
> task must run every second to keep the cluster status (liveness of nodes)
> correct. When large sstable files (several GB) are deleted, the delete
> operation can take more than 30 seconds, which puts the whole cluster into
> a wrong state where nodes are marked as down although they are alive.
> This leads to unneeded additional load such as hinted handoff, a wrong
> cluster state, and increased latency.
> One possible solution is to use separate pools for periodic and
> non-periodic tasks.
> I've implemented this change and it resolves the problem.
> I can provide a patch.
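
The starvation described above, and the effect of splitting the pools, can be reproduced outside Cassandra with plain java.util.concurrent. The following is only a minimal sketch and not the attached patch: the class name, method name, pool sizes, and timings are all hypothetical, and a 500 ms sleep stands in for the slow sstable delete. It counts how many "gossip" heartbeats run while the slow one-shot task is in progress.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; none of these names come from the Cassandra code base.
public class SeparatePoolsSketch {

    // Returns how many heartbeats ran while the slow one-shot task was in progress.
    static int heartbeatsDuringSlowTask(boolean separatePools) {
        // Single-threaded pool for periodic work (stand-in for the shared scheduledTasks pool).
        ScheduledExecutorService periodic = Executors.newScheduledThreadPool(1);
        // Either a dedicated pool for one-shot work, or the same pool again.
        ScheduledExecutorService oneShot =
                separatePools ? Executors.newScheduledThreadPool(1) : periodic;

        AtomicBoolean slowTaskRunning = new AtomicBoolean(false);
        AtomicInteger heartbeats = new AtomicInteger();
        CountDownLatch slowTaskDone = new CountDownLatch(1);

        // Stand-in for the Gossiper task: fires every 50 ms and records a
        // heartbeat only if it manages to run while the slow task is active.
        periodic.scheduleWithFixedDelay(() -> {
            if (slowTaskRunning.get()) {
                heartbeats.incrementAndGet();
            }
        }, 0, 50, TimeUnit.MILLISECONDS);

        // Stand-in for a large sstable delete: blocks its worker thread for 500 ms.
        oneShot.schedule(() -> {
            slowTaskRunning.set(true);
            try {
                Thread.sleep(500);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                slowTaskRunning.set(false);
                slowTaskDone.countDown();
            }
        }, 100, TimeUnit.MILLISECONDS);

        try {
            slowTaskDone.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            periodic.shutdownNow();
            oneShot.shutdownNow();
        }
        return heartbeats.get();
    }

    public static void main(String[] args) {
        System.out.println("shared pool:    " + heartbeatsDuringSlowTask(false));
        System.out.println("separate pools: " + heartbeatsDuringSlowTask(true));
    }
}
```

With the shared single-threaded pool the heartbeat count is zero, because the only worker thread is occupied by the slow task for its whole duration. With a dedicated pool for the one-shot task the heartbeats keep firing on schedule. Enlarging the shared pool would only postpone the problem, since enough concurrent slow deletes could still occupy every thread.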
--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira