[ https://issues.apache.org/jira/browse/CASSANDRA-2253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis updated CASSANDRA-2253:
--------------------------------------

    Affects Version/s:     (was: 0.7.2)
                           (was: 0.7.1)
        Fix Version/s:     (was: 0.7.0)
                           0.7.4

> Gossiper Starvation
> -------------------
>
>                 Key: CASSANDRA-2253
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2253
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.0
>         Environment: linux, windows
>            Reporter: Mikael Sitruk
>             Fix For: 0.7.4
>
>         Attachments: CASSANDRA-0.7-2253.txt
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> The Gossiper periodic task is starved when large sstable files need to be
> deleted.
> SSTableDeletingReference uses the same scheduledTasks pool (from
> StorageService) as the Gossiper and other periodic tasks, but the Gossiper
> task should run every second to keep the cluster status (liveness of nodes)
> correct. When large sstable files (several GB) have to be deleted, the delete
> operation can take more than 30 seconds, which puts the whole cluster into a
> wrong state where nodes are marked as down while they are actually alive.
> This leads to unneeded additional load such as hinted handoff, a wrong
> cluster state, and increased latency.
> One possible solution is to use separate pools for periodic and non-periodic
> tasks.
> I have implemented this change and it resolves the problem.
> I can provide a patch.

--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
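
The starvation described in the report, and the separate-pool remedy it proposes, can be illustrated with a small standalone Java sketch. This is not the attached patch and does not use any Cassandra classes: the class name SeparatePoolsSketch, the method heartbeatsDuringSlowTask, the pool size of 1, and the 10 ms heartbeat interval are all assumptions made for illustration. A sleeping task stands in for the slow sstable delete, and a counter stands in for the Gossiper task.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; not part of Cassandra.
class SeparatePoolsSketch {

    /**
     * Schedules a periodic "heartbeat" (stand-in for the Gossiper task) and one
     * slow one-shot task (stand-in for a large sstable delete).
     * Returns how many heartbeats ran while the slow task was in progress.
     */
    static int heartbeatsDuringSlowTask(boolean separatePools, long slowMillis) throws Exception {
        // Assumption: a single worker thread, so one blocked task stalls the pool.
        ScheduledExecutorService periodicPool = Executors.newScheduledThreadPool(1);
        // Either reuse the periodic pool (the problem) or use a dedicated one (the proposed fix).
        ScheduledExecutorService oneShotPool =
                separatePools ? Executors.newScheduledThreadPool(1) : periodicPool;
        AtomicInteger beats = new AtomicInteger();
        try {
            // Heartbeat every 10 ms (illustrative; the report says the Gossiper runs every second).
            periodicPool.scheduleWithFixedDelay(beats::incrementAndGet, 0, 10, TimeUnit.MILLISECONDS);

            // The slow task records the heartbeat count before and after it blocks.
            Future<Integer> slow = oneShotPool.submit(() -> {
                int before = beats.get();
                Thread.sleep(slowMillis); // simulates a long-running file delete
                return beats.get() - before;
            });
            return slow.get();
        } finally {
            periodicPool.shutdownNow();
            if (oneShotPool != periodicPool) {
                oneShotPool.shutdownNow();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("shared pool:    " + heartbeatsDuringSlowTask(false, 500));
        System.out.println("separate pools: " + heartbeatsDuringSlowTask(true, 500));
    }
}
```

With a shared single-threaded pool no heartbeat can run while the slow task holds the only worker thread, whereas with a dedicated pool for one-shot work the heartbeat keeps its schedule. Giving the shared pool more threads would only postpone the problem, since enough concurrent slow tasks could still occupy every worker.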