[jira] [Commented] (CASSANDRA-7139) Default concurrent_compactors is probably too high
[ https://issues.apache.org/jira/browse/CASSANDRA-7139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266493#comment-14266493 ] Benedict commented on CASSANDRA-7139: This is only the default. You are recommended to tune this default based on your own system's behaviour. With modern SSDs and many cores, many concurrent compactors is a great idea. For spinning disk setups, it can be terrible, and we want to avoid terrible default decisions. Either way, I suspect the problem you are encountering is entirely different, i.e. that the default compaction_throughput_mb_per_sec is 10, which would be why you are maxing out at exactly 10MB/s.

Default concurrent_compactors is probably too high
Key: CASSANDRA-7139
URL: https://issues.apache.org/jira/browse/CASSANDRA-7139
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Benedict
Assignee: Jonathan Ellis
Priority: Minor
Fix For: 2.1 rc1
Attachments: 7139.txt

The default number of concurrent compactors is probably too high for modern hardware with spinning disks for storage: a modern blade can easily have 24+ cores, which would result in a default of 24 concurrent compactions. This not only increases random IO, it also keeps around a lot of obsoleted files for an unnecessarily long time, as each compaction keeps references to any possibly overlapping files that it isn't itself compacting - but these can have been obsoleted part way through by compactions that finished earlier.

If you factor in the default compaction throughput rate of 16MB/s, anything but a single default concurrent_compactor makes very little sense, as a single thread should always be able to handle 16MB/s, will cause less interference with other processes, and permits obsoleted files to be immediately removed.

See [http://imgur.com/HDqhxFp] for a graph demonstrating the result of making this change on a box with 24 cores and 8TB of storage (first spike is default settings).

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
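
To make the arithmetic in the issue description concrete, here is a minimal Python sketch. It assumes, purely for illustration, that the node-wide compaction throughput cap is split evenly across the active compactors; the function name `per_compactor_share` is hypothetical and is not part of Cassandra.

```python
def per_compactor_share(throughput_mb_s: float, compactors: int) -> float:
    """Average rate each compactor gets if the cap is split evenly (an assumption)."""
    if compactors < 1:
        raise ValueError("compactors must be at least 1")
    return throughput_mb_s / compactors


if __name__ == "__main__":
    # 16 MB/s is the default throughput quoted in the issue description;
    # 24 is the one-compactor-per-core default on the 24-core box it mentions.
    for n in (1, 8, 24):
        print(f"{n:>2} compactors -> {per_compactor_share(16, n):.2f} MB/s each")
```

Under that even-split assumption, 24 compactors would each average well under 1 MB/s, which is consistent with the description's point that a single thread can already saturate the cap.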
[jira] [Commented] (CASSANDRA-7139) Default concurrent_compactors is probably too high
[ https://issues.apache.org/jira/browse/CASSANDRA-7139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266473#comment-14266473 ] Catalin Alexandru Zamfir commented on CASSANDRA-7139: Our set-up was RAID5, so min(numberOfDisk, numberOfCores) would just be 2, even though we have 40+ cores. The commented-out concurrent_compactors would be 2, meaning that a lot of SSTables accumulate in high-cardinality tables (where the partition key is a UUID type) because compaction is limited to 2. Even though we've set compaction_throughput_in_mb_per_sec to 192 (spinning disk), the disk write rate reported by dstat -lrv1 maxes out at 10MB/s. IMHO, concurrent_compactors should be number_of_cores/compaction_throughput_in_mb_per_sec * 100, which in our case (40 cores) gives around 20 or 21 compactors. On 8 cores, 8/192 * 100 gives about 4 concurrent compactors.
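
The formula proposed in Catalin's comment can be checked with a few lines of Python. This is only a worked example of the commenter's suggestion, not anything Cassandra implements; the function name `suggested_compactors` is hypothetical.

```python
def suggested_compactors(cores: int, throughput_mb_s: float) -> float:
    """Evaluate number_of_cores / compaction_throughput * 100, as proposed in the comment."""
    if throughput_mb_s <= 0:
        raise ValueError("throughput must be positive")
    return cores / throughput_mb_s * 100


if __name__ == "__main__":
    # The two cases from the comment: 40 cores and 8 cores, both at 192 MB/s.
    for cores in (40, 8):
        print(f"{cores} cores -> {suggested_compactors(cores, 192):.2f}")
```

Note that under this formula the result shrinks as the throughput setting grows, since throughput sits in the denominator.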
[jira] [Commented] (CASSANDRA-7139) Default concurrent_compactors is probably too high
[ https://issues.apache.org/jira/browse/CASSANDRA-7139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034093#comment-14034093 ] Jeremiah Jordan commented on CASSANDRA-7139: Can we get this change in 2.0? Have had the default concurrent compactors cause issues on a few clusters.
[jira] [Commented] (CASSANDRA-7139) Default concurrent_compactors is probably too high
[ https://issues.apache.org/jira/browse/CASSANDRA-7139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034309#comment-14034309 ] Jonathan Ellis commented on CASSANDRA-7139: I don't like changing defaults out from under people mid-release. Makes for an unpleasant surprise if those defaults were working for you.
[jira] [Commented] (CASSANDRA-7139) Default concurrent_compactors is probably too high
[ https://issues.apache.org/jira/browse/CASSANDRA-7139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14004524#comment-14004524 ] Benedict commented on CASSANDRA-7139: LGTM, +1
[jira] [Commented] (CASSANDRA-7139) Default concurrent_compactors is probably too high
[ https://issues.apache.org/jira/browse/CASSANDRA-7139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13987647#comment-13987647 ] Jonathan Ellis commented on CASSANDRA-7139: That's a graph of ... something vs time?

Default concurrent_compactors is probably too high
Key: CASSANDRA-7139
URL: https://issues.apache.org/jira/browse/CASSANDRA-7139
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Benedict
Priority: Minor
Fix For: 2.0.8, 2.1 rc1
[jira] [Commented] (CASSANDRA-7139) Default concurrent_compactors is probably too high
[ https://issues.apache.org/jira/browse/CASSANDRA-7139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13987649#comment-13987649 ] Benedict commented on CASSANDRA-7139: disk (space) utilisation vs time
[jira] [Commented] (CASSANDRA-7139) Default concurrent_compactors is probably too high
[ https://issues.apache.org/jira/browse/CASSANDRA-7139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13987735#comment-13987735 ] Jonathan Ellis commented on CASSANDRA-7139: so first spike is defaults, what are the other seven?
[jira] [Commented] (CASSANDRA-7139) Default concurrent_compactors is probably too high
[ https://issues.apache.org/jira/browse/CASSANDRA-7139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13987740#comment-13987740 ] Benedict commented on CASSANDRA-7139: They're flush/compaction spikes during operation with only one concurrent_compactor. i.e. their disk space was exploding prior to changing, and they were having to bounce nodes daily to reclaim disk space - the graph only goes back as far as just before changing the config option.
[jira] [Commented] (CASSANDRA-7139) Default concurrent_compactors is probably too high
[ https://issues.apache.org/jira/browse/CASSANDRA-7139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13987780#comment-13987780 ] Jonathan Ellis commented on CASSANDRA-7139: One could be reasonable with SSD + unlimited compaction throughput, especially with LCS. But on HDD + STCS [still the default] getting compactions piled up behind a huge compaction op is a real thing. How about one per disk, instead of one per core?
[jira] [Commented] (CASSANDRA-7139) Default concurrent_compactors is probably too high
[ https://issues.apache.org/jira/browse/CASSANDRA-7139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13987792#comment-13987792 ] Benedict commented on CASSANDRA-7139: How about: 1 per disk, with a cap of 8, say? Boxes with 12+ (even 24+) disks aren't totally uncommon and you could see the same problem there as well. This should all be less of a problem with CASSANDRA-6696 as we'll be able to actually schedule on a per-disk basis and have no risk of referring to files on other disks, so we just want a sensible number to avoid breaking anyone who hasn't tuned their nodes between now and then.
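
Benedict's proposal in this comment (one compactor per disk, capped at 8) can be sketched as follows. This is an illustration of the rule as stated in the comment, not the code from the attached patch; the function name `proposed_default_compactors` and the floor of 1 for a zero or negative disk count are assumptions added here.

```python
def proposed_default_compactors(num_disks: int, cap: int = 8) -> int:
    """One compactor per data disk, never more than `cap`.

    The floor of 1 is an assumption so the result is always usable.
    """
    return max(1, min(num_disks, cap))


if __name__ == "__main__":
    # Disk counts mentioned in the thread: small setups and 12+/24+ disk boxes.
    for disks in (1, 2, 12, 24):
        print(f"{disks:>2} disks -> {proposed_default_compactors(disks)} compactors")
```

The cap only matters on the 12+ and 24+ disk boxes mentioned in the comment; smaller machines simply get one compactor per disk.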
[jira] [Commented] (CASSANDRA-7139) Default concurrent_compactors is probably too high
[ https://issues.apache.org/jira/browse/CASSANDRA-7139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13987820#comment-13987820 ] Jonathan Ellis commented on CASSANDRA-7139: SGTM.