[jira] [Updated] (CASSANDRA-5244) Compactions don't work while node is bootstrapping
[ https://issues.apache.org/jira/browse/CASSANDRA-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-5244:
Priority: Critical (was: Minor)

Compactions don't work while node is bootstrapping
--
Key: CASSANDRA-5244
URL: https://issues.apache.org/jira/browse/CASSANDRA-5244
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 1.2.0 beta 1
Reporter: Jouni Hartikainen
Assignee: Brandon Williams
Priority: Critical
Labels: gossip
Fix For: 1.2.2

It seems that there is a race condition in StorageService that prevents compactions from completing while the node is in a bootstrap state. I have been able to reproduce this multiple times by throttling streaming throughput to extend the bootstrap time while simultaneously inserting data into the cluster.

The problem lies in the synchronization of the initServer(int delay) and reportSeverity(double incr) methods, as they both try to acquire the instance lock of StorageService through the synchronized keyword. Because initServer does not return until the bootstrap has completed, all calls to reportSeverity block until then. However, reportSeverity is called when starting compactions in CompactionInfo, and thus all compactions block until bootstrap completes. This can severely degrade the node's performance after bootstrap, as it may have many compactions pending while simultaneously starting to serve reads.

I have been able to solve the issue by adding a separate lock for reportSeverity and removing its class-level synchronization. This is, of course, not a valid approach if we must assume that any of Gossiper's IEndpointStateChangeSubscribers could potentially call back into StorageService's synchronized methods. However, at least at the moment, that does not seem to be the case. Maybe somebody with more experience with the codebase can come up with a better solution?

(This might affect DynamicEndpointSnitch as well, as it also calls reportSeverity in its setSeverity method)

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
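The blocking behavior described above can be sketched in a few lines. The class and method names below are hypothetical stand-ins that mirror StorageService's shape, not the actual Cassandra code: two synchronized instance methods share one monitor, so while initServer() holds it for the whole "bootstrap", any reportSeverity() caller (e.g. a compaction starting up) simply parks on the lock.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

// Minimal sketch of two synchronized methods contending for the same
// instance monitor, as described in the report. Names are illustrative only.
class MonitorContentionSketch {
    private double severity = 0;
    final CountDownLatch bootstrapStarted = new CountDownLatch(1);
    final AtomicBoolean severityReported = new AtomicBoolean(false);

    // Holds the instance lock for the entire (simulated) bootstrap.
    synchronized void initServer(long bootstrapMillis) throws InterruptedException {
        bootstrapStarted.countDown();
        Thread.sleep(bootstrapMillis); // stands in for waiting on bootstrap streaming
    }

    // Needs the same instance lock, so it blocks until initServer returns.
    synchronized void reportSeverity(double incr) {
        severity += incr;
        severityReported.set(true);
    }

    public static void main(String[] args) throws Exception {
        MonitorContentionSketch ss = new MonitorContentionSketch();
        Thread boot = new Thread(() -> {
            try { ss.initServer(500); } catch (InterruptedException ignored) {}
        });
        boot.start();
        ss.bootstrapStarted.await();
        Thread.sleep(50); // initServer is now inside the monitor

        Thread compaction = new Thread(() -> ss.reportSeverity(1.0));
        compaction.start();
        compaction.join(100); // times out: reportSeverity is stuck on the lock
        System.out.println("blocked during bootstrap: " + !ss.severityReported.get());

        boot.join();       // "bootstrap" finishes, releasing the monitor
        compaction.join(); // reportSeverity can now run
        System.out.println("completed after bootstrap: " + ss.severityReported.get());
    }
}
```

Running this prints that the severity report was blocked for as long as the bootstrap ran, which is exactly the shape of the stall the reporter observed in compactions.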
[jira] [Updated] (CASSANDRA-5244) Compactions don't work while node is bootstrapping
[ https://issues.apache.org/jira/browse/CASSANDRA-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-5244:
Attachment: 5244.txt

It seems to me the only reason we're synchronizing here is for the increment, and we don't need to get our own severity out of gossip, so we can just track a local AtomicDouble instead.
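The "local AtomicDouble" idea in the comment above can be sketched without any locking at all. The JDK has no AtomicDouble; Guava provides one (and Cassandra ships Guava), implemented as a compare-and-set loop over the double's raw bits in an AtomicLong. The class below is a hypothetical illustration of that technique, not the actual 5244.txt patch:

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch of a lock-free severity accumulator in the style of
// Guava's AtomicDouble: the double is stored as its raw long bits so that
// AtomicLong's CAS can be used. No monitor is ever taken, so callers can
// never block behind a long-running synchronized method like initServer().
class SeveritySketch {
    private final AtomicLong bits = new AtomicLong(Double.doubleToLongBits(0.0));

    // Atomically adds incr and returns the new value.
    double addAndGet(double incr) {
        while (true) {
            long cur = bits.get();
            double next = Double.longBitsToDouble(cur) + incr;
            if (bits.compareAndSet(cur, Double.doubleToLongBits(next)))
                return next;
        }
    }

    double get() {
        return Double.longBitsToDouble(bits.get());
    }
}
```

Because the increment is the only thing the old synchronization protected, a CAS loop like this preserves correctness while removing the contention with bootstrap entirely.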
[jira] [Updated] (CASSANDRA-5244) Compactions don't work while node is bootstrapping
[ https://issues.apache.org/jira/browse/CASSANDRA-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-5244:
Reviewer: vijay2...@yahoo.com
[jira] [Updated] (CASSANDRA-5244) Compactions don't work while node is bootstrapping
[ https://issues.apache.org/jira/browse/CASSANDRA-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-5244:
Attachment: 5244.txt
[jira] [Updated] (CASSANDRA-5244) Compactions don't work while node is bootstrapping
[ https://issues.apache.org/jira/browse/CASSANDRA-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-5244:
Attachment: (was: 5244.txt)
[jira] [Updated] (CASSANDRA-5244) Compactions don't work while node is bootstrapping
[ https://issues.apache.org/jira/browse/CASSANDRA-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-5244:
Priority: Minor (was: Critical)

Thanks for the detective work, Jouni. I'll let Brandon comment on solutions; in the meantime, marking Minor since, while inconvenient, this does not compromise correctness.
[jira] [Updated] (CASSANDRA-5244) Compactions don't work while node is bootstrapping
[ https://issues.apache.org/jira/browse/CASSANDRA-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-5244:
Component/s: Core
Fix Version/s: 1.2.2
Labels: gossip (was: )
[jira] [Updated] (CASSANDRA-5244) Compactions don't work while node is bootstrapping
[ https://issues.apache.org/jira/browse/CASSANDRA-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-5244:
Affects Version/s: (was: 1.2.1) 1.2.0 beta 1