[ https://issues.apache.org/jira/browse/CASSANDRA-2253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mikael Sitruk updated CASSANDRA-2253:
-------------------------------------

    Attachment: CASSANDRA-0.7-2253.txt

Patch for bug 2253, Gossiper starvation.

> Gossiper Starvation
> -------------------
>
>                 Key: CASSANDRA-2253
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2253
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.0, 0.7.1, 0.7.2
>         Environment: linux, windows
>            Reporter: Mikael Sitruk
>             Fix For: 0.7.0
>
>         Attachments: CASSANDRA-0.7-2253.txt
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> The Gossiper periodic task is starved when large sstable files need to be
> deleted.
> SSTableDeletingReference uses the same scheduledTasks pool (from
> StorageService) as the Gossiper and other periodic tasks, but the Gossiper
> task must run every second to keep the cluster status (liveness of nodes)
> correct. When large sstable files (several GB) are deleted, the delete
> operation can take more than 30 seconds, which puts the whole cluster into
> a wrong state where nodes are marked as down while they are in fact alive.
> This leads to unneeded additional load such as hinted handoff, a wrong
> cluster state, and increased latency.
> One possible solution is to use separate pools for periodic and
> non-periodic tasks.
> I have implemented such a change and it resolves the problem.
> I can provide a patch.
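
The separate-pool idea from the description can be sketched in Java as follows. This is a minimal illustration, not the attached patch. The class and method names are hypothetical, single-thread ScheduledExecutorService instances stand in for the scheduledTasks pool, and a sleep stands in for a slow sstable delete.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: names are illustrative and not taken from the Cassandra code base.
class SeparatePoolsSketch {

    /**
     * Schedules a periodic "gossip" tick and one slow one-shot task (standing in
     * for a large file delete). Returns the number of ticks that ran while the
     * slow task was still blocking.
     */
    public static int ticksDuringSlowTask(boolean separatePools, long periodMs, long slowTaskMs) {
        ScheduledExecutorService periodic = Executors.newScheduledThreadPool(1);
        // With separatePools == false the slow task shares the periodic pool's only thread.
        ScheduledExecutorService oneShot =
                separatePools ? Executors.newScheduledThreadPool(1) : periodic;

        AtomicBoolean slowRunning = new AtomicBoolean(false);
        AtomicInteger ticks = new AtomicInteger();
        CountDownLatch slowDone = new CountDownLatch(1);
        try {
            // Periodic task: counts only the ticks that overlap the slow task.
            periodic.scheduleWithFixedDelay(() -> {
                if (slowRunning.get()) {
                    ticks.incrementAndGet();
                }
            }, 0, periodMs, TimeUnit.MILLISECONDS);

            // Non-periodic task: blocks its worker thread for slowTaskMs.
            oneShot.schedule(() -> {
                slowRunning.set(true);
                try {
                    Thread.sleep(slowTaskMs);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    slowRunning.set(false);
                    slowDone.countDown();
                }
            }, periodMs, TimeUnit.MILLISECONDS);

            slowDone.await();
            return ticks.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the slow task", e);
        } finally {
            periodic.shutdownNow();
            if (oneShot != periodic) {
                oneShot.shutdownNow();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("shared pool, ticks during slow task:    "
                + ticksDuringSlowTask(false, 20, 300));
        System.out.println("separate pools, ticks during slow task: "
                + ticksDuringSlowTask(true, 20, 300));
    }
}
```

With a shared single-thread pool the periodic tick cannot run while the slow task occupies the worker thread, which mirrors the starvation reported above. With separate pools the tick keeps firing at its normal period.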