[jira] [Commented] (CASSANDRA-7139) Default concurrent_compactors is probably too high

2015-01-06 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266493#comment-14266493
 ] 

Benedict commented on CASSANDRA-7139:
-

This is only the default. You are encouraged to tune it based on 
your own system's behaviour. With modern SSDs and many cores, many concurrent 
compactors are a great idea. For spinning-disk setups, they can be terrible, 
and we want to avoid terrible default decisions.

Either way, I suspect the problem you are encountering is entirely different, 
i.e. that the default _compaction_throughput_mb_per_sec_ is 10, which would be 
why you are maxing out at exactly 10MB/s.

 Default concurrent_compactors is probably too high
 --

 Key: CASSANDRA-7139
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7139
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Jonathan Ellis
Priority: Minor
 Fix For: 2.1 rc1

 Attachments: 7139.txt


 The default number of concurrent compactors is probably too high for modern 
 hardware with spinning disks for storage: a modern blade can easily have 24+ 
 cores, which would result in a default of 24 concurrent compactions. This not 
 only increases random IO, it also keeps around a lot of obsoleted files for 
 an unnecessarily long time, as each compaction keeps references to any 
 possibly overlapping files that it isn't itself compacting, even though these 
 may have been obsoleted part way through by compactions that finished earlier. 
 If you factor in the default compaction throughput rate of 16MB/s, anything 
 but a single default concurrent_compactor makes very little sense, as a single 
 thread should always be able to handle 16MB/s, will cause less interference 
 with other processes, and permits obsoleted files to be removed immediately.
 See [http://imgur.com/HDqhxFp] for a graph demonstrating the result of making 
 this change on a box with 24 cores and 8TB of storage (the first spike is the 
 default settings).
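
 The throughput argument in the description can be sketched roughly as below. 
 This is an illustrative model only, not Cassandra code: it assumes the 
 throttle applies to the node as a whole, and per_thread_mb_s is a 
 hypothetical figure for what one compaction thread can sustain.

```python
def effective_compaction_rate(compactors, per_thread_mb_s, throttle_mb_s=16):
    """Aggregate compaction rate in MB/s under a node-wide throttle.

    throttle_mb_s=16 is the default rate quoted in the description;
    per_thread_mb_s is an assumed, hypothetical per-thread capability.
    """
    if compactors < 1:
        raise ValueError("need at least one compactor")
    unthrottled = compactors * per_thread_mb_s
    # The throttle caps the total, however many threads share it.
    return min(unthrottled, throttle_mb_s)


# In this model a single thread that can sustain 50 MB/s already saturates
# the throttle, so 24 threads add contention but no extra throughput.
print(effective_compaction_rate(1, 50))
print(effective_compaction_rate(24, 50))
```

 Under these assumptions, extra compactors only help once the throttle is 
 raised above what a single thread can deliver.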



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7139) Default concurrent_compactors is probably too high

2015-01-06 Thread Catalin Alexandru Zamfir (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266473#comment-14266473
 ] 

Catalin Alexandru Zamfir commented on CASSANDRA-7139:
-

Our set-up was RAID5, so min(numberOfDisks, numberOfCores) would be just 2, 
even though we have 40+ cores. The commented-out concurrent_compactors default 
would be 2, meaning that a lot of SSTables accumulate in high-cardinality 
tables (where the partition key is a UUID type) because compaction is limited 
to 2. Looking at dstat, even though we've set compaction_throughput_mb_per_sec 
to 192 (spinning disk), the dstat -lrv1 disk write maxes out at 10MB/s.

IMHO, concurrent_compactors should be 
number_of_cores / compaction_throughput_mb_per_sec * 100, which in our case (40 
cores) gives around 20 or 21 compactors. On 8 cores, 8/192 * 100 gives about 4 
concurrent compactors.
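
For reference, the arithmetic behind this suggestion can be checked with a 
short sketch. The function name is made up for illustration; this is the 
heuristic proposed in this comment, not anything Cassandra implements.

```python
def proposed_concurrent_compactors(cores, throughput_mb_per_sec):
    """Heuristic from the comment: cores / throughput * 100 (unrounded)."""
    if throughput_mb_per_sec <= 0:
        raise ValueError("throughput must be positive")
    return cores / throughput_mb_per_sec * 100


# The two cases from the comment, both with a throughput setting of 192.
for cores in (40, 8):
    value = proposed_concurrent_compactors(cores, 192)
    print(cores, "cores ->", round(value, 2))
```

Note that the result falls as the throughput setting rises, so a node with a 
higher throttle would be given fewer compactors under this formula.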



[jira] [Commented] (CASSANDRA-7139) Default concurrent_compactors is probably too high

2014-06-17 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14034093#comment-14034093
 ] 

Jeremiah Jordan commented on CASSANDRA-7139:


Can we get this change in 2.0?  We have had the default concurrent compactors 
cause issues on a few clusters.



[jira] [Commented] (CASSANDRA-7139) Default concurrent_compactors is probably too high

2014-06-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14034309#comment-14034309
 ] 

Jonathan Ellis commented on CASSANDRA-7139:
---

I don't like changing defaults out from under people mid-release.  Makes for an 
unpleasant surprise if those defaults were working for you.



[jira] [Commented] (CASSANDRA-7139) Default concurrent_compactors is probably too high

2014-05-21 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004524#comment-14004524
 ] 

Benedict commented on CASSANDRA-7139:
-

LGTM, +1



[jira] [Commented] (CASSANDRA-7139) Default concurrent_compactors is probably too high

2014-05-02 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987647#comment-13987647
 ] 

Jonathan Ellis commented on CASSANDRA-7139:
---

That's a graph of ... something vs time?

 Default concurrent_compactors is probably too high
 --

 Key: CASSANDRA-7139
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7139
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Priority: Minor
 Fix For: 2.0.8, 2.1 rc1




[jira] [Commented] (CASSANDRA-7139) Default concurrent_compactors is probably too high

2014-05-02 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987649#comment-13987649
 ] 

Benedict commented on CASSANDRA-7139:
-

disk (space) utilisation vs time



[jira] [Commented] (CASSANDRA-7139) Default concurrent_compactors is probably too high

2014-05-02 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987735#comment-13987735
 ] 

Jonathan Ellis commented on CASSANDRA-7139:
---

So the first spike is the defaults; what are the other seven?



[jira] [Commented] (CASSANDRA-7139) Default concurrent_compactors is probably too high

2014-05-02 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987740#comment-13987740
 ] 

Benedict commented on CASSANDRA-7139:
-

They're flush/compaction spikes during operation with only one 
concurrent_compactor. i.e. their disk space was exploding prior to changing, 
and they were having to bounce nodes daily to reclaim disk space - the graph 
only goes back as far as just before changing the config option.



[jira] [Commented] (CASSANDRA-7139) Default concurrent_compactors is probably too high

2014-05-02 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987780#comment-13987780
 ] 

Jonathan Ellis commented on CASSANDRA-7139:
---

One could be reasonable with SSD + unlimited compaction throughput, especially 
with LCS.  But on HDD + STCS [still the default] getting compactions piled up 
behind a huge compaction op is a real thing.

How about one per disk, instead of one per core?



[jira] [Commented] (CASSANDRA-7139) Default concurrent_compactors is probably too high

2014-05-02 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987792#comment-13987792
 ] 

Benedict commented on CASSANDRA-7139:
-

How about: 1 per disk, with a cap of 8, say? Boxes with 12+ (even 24+) disks 
aren't totally uncommon and you could see the same problem there as well. 

This should all be less of a problem with CASSANDRA-6696 as we'll be able to 
actually schedule on a per-disk basis and have no risk of referring to files on 
other disks, so we just want a sensible number to avoid breaking anyone who 
hasn't tuned their nodes between now and then.
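
A minimal sketch of the default proposed here, assuming the number of data 
disks is already known. The function name and signature are hypothetical; 
the attached patch is not shown in this thread, so this is not necessarily 
what was committed.

```python
def default_concurrent_compactors(num_data_disks, cap=8):
    """One compactor per data disk, capped at `cap` (8, as suggested above)."""
    if num_data_disks < 1:
        raise ValueError("need at least one data disk")
    return min(num_data_disks, cap)


# A single-disk node gets 1 compactor; 12- and 24-disk boxes are held at 8.
for disks in (1, 4, 12, 24):
    print(disks, "disks ->", default_concurrent_compactors(disks))
```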



[jira] [Commented] (CASSANDRA-7139) Default concurrent_compactors is probably too high

2014-05-02 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987820#comment-13987820
 ] 

Jonathan Ellis commented on CASSANDRA-7139:
---

SGTM.
