[
https://issues.apache.org/jira/browse/CASSANDRA-13780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17427680#comment-17427680
]
Stefan Miklosovic commented on CASSANDRA-13780:
-----------------------------------------------
Hi [~jjirsa], is this still relevant after all these years? I think this
targets 3.0, which we hardly support anymore anyway ...
> ADD Node streaming throughput performance
> -----------------------------------------
>
> Key: CASSANDRA-13780
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13780
> Project: Cassandra
> Issue Type: Improvement
> Components: Legacy/Core
> Environment: Linux 2.6.32-696.3.2.el6.x86_64 #1 SMP Mon Jun 19
> 11:55:55 PDT 2017 x86_64 x86_64 x86_64 GNU/Linux
> Architecture: x86_64
> CPU op-mode(s): 32-bit, 64-bit
> Byte Order: Little Endian
> CPU(s): 40
> On-line CPU(s) list: 0-39
> Thread(s) per core: 2
> Core(s) per socket: 10
> Socket(s): 2
> NUMA node(s): 2
> Vendor ID: GenuineIntel
> CPU family: 6
> Model: 79
> Model name: Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
> Stepping: 1
> CPU MHz: 2199.869
> BogoMIPS: 4399.36
> Virtualization: VT-x
> L1d cache: 32K
> L1i cache: 32K
> L2 cache: 256K
> L3 cache: 25600K
> NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
> NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
>                      total       used       free     shared    buffers     cached
> Mem:                  252G       217G        34G       708K       308M       149G
> -/+ buffers/cache:                67G       185G
> Swap:                  16G         0B        16G
> Reporter: Kevin Rivait
> Priority: Normal
> Fix For: 3.0.x
>
>
> Problem: Adding a new node to a large cluster runs at least 1000x slower than
> the network and node hardware can support, taking several days per new node.
> Adjusting stream throughput and other YAML parameters seems to have no effect
> on performance. Essentially, Cassandra appears to have an architectural
> scalability problem when adding nodes to a cluster with moderate to high data
> ingestion: new node capacity cannot be added fast enough to keep up with
> growing data ingestion volumes.
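> For reference, the knobs usually meant by "stream throughput and other YAML
> parameters" here are the outbound streaming caps in cassandra.yaml. A minimal
> sketch of the relevant 3.0-era settings (defaults shown are from memory and
> worth verifying against the deployed version):
>
>     # cassandra.yaml (3.0.x) -- per-node outbound streaming caps
>     stream_throughput_outbound_megabits_per_sec: 200           # Mb/s, default
>     inter_dc_stream_throughput_outbound_megabits_per_sec: 200  # Mb/s, default
>
>     # the cap can also be raised at runtime on each node, without a restart:
>     #   nodetool setstreamthroughput 400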
> Initial Configuration:
> Running 3.0.9 and have implemented TWCS on one of our largest tables.
> That table is partitioned on (ID, YYYYMM), using 1-day buckets with a TTL of
> 60 days.
> Next release will change partitioning to (ID, YYYYMMDD) so that partitions
> are aligned with daily TWCS buckets.
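> A minimal CQL sketch of the kind of table and TWCS settings described above
> (keyspace, table, and column names are placeholders, not the actual schema;
> the window and TTL values follow the description):
>
>     CREATE TABLE ks.events (                  -- hypothetical keyspace/table
>         id        text,
>         yyyymmdd  int,
>         ts        timestamp,
>         payload   blob,
>         PRIMARY KEY ((id, yyyymmdd), ts)
>     ) WITH compaction = {
>           'class': 'TimeWindowCompactionStrategy',
>           'compaction_window_unit': 'DAYS',
>           'compaction_window_size': '1'
>       }
>       AND default_time_to_live = 5184000;     -- 60 days, in seconds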
> Each node is currently creating roughly a 30GB SSTable per day.
> TWCS is working as expected; daily SSTables drop off after 70 days
> (60-day TTL + 10-day grace).
> Current deployment is a 28-node, 2-datacenter cluster: 14 nodes in each DC,
> replication factor 3.
> Data directories on each node are backed by four 2TB SSDs, plus one 800GB SSD
> for commit logs.
> The requirement is to double cluster size, capacity, and ingestion volume
> within a few weeks.
> Observed Behavior:
> 1. Streaming throughput during add node: we observed a maximum of 6 Mb/s
> streaming from each of the 14 nodes on a 20Gb/s switched network, taking at
> least 106 hours for each node to join the cluster, even though each node is
> only about 2.2 TB in size.
> 2. Compaction on the newly added node: compaction falls behind, with anywhere
> from 4,000 to 10,000 SSTables at any given time. It took 3 weeks for
> compaction to finish on each newly added node. Increasing the number of
> compaction threads to match the number of CPUs (40) and raising compaction
> throughput to 32 MB/s seemed to be the sweet spot (see the settings sketch
> after this list).
> 3. TWCS buckets on the new node: data streamed to this node over 4 1/2 days.
> Compaction correctly placed the data into daily files, but the file dates
> reflect when compaction created each file rather than the date of the last
> record written in that TWCS bucket, which will cause the files to remain
> around much longer than necessary.
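> As a concrete reference for the compaction tuning mentioned in point 2 above,
> a sketch with the quoted values (standard parameter names; verify against the
> deployed 3.0.x configuration):
>
>     # cassandra.yaml -- concurrent_compactors takes effect after a restart
>     concurrent_compactors: 40
>     compaction_throughput_mb_per_sec: 32
>
>     # compaction throughput can also be changed at runtime:
>     #   nodetool setcompactionthroughput 32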
> Two Questions:
> 1. What can be done to substantially improve the performance of adding a new
> node?
> 2. Can compaction on TWCS partitions for newly added nodes change the file
> create date to match the newest record date in the file, or add another piece
> of metadata to the TWCS files that reflects the file drop date, so that TWCS
> partitions can be dropped consistently?
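> A possibly useful check related to question 2 (a sketch; sstablemetadata
> ships with Cassandra, and the data path below is a placeholder): the
> per-SSTable metadata already records minimum and maximum timestamps, which
> can be inspected to see which time window each streamed or compacted file
> really covers:
>
>     # print the timestamp fields from an SSTable's metadata
>     sstablemetadata /var/lib/cassandra/data/<ks>/<table>-*/*-big-Data.db | grep -i timestamp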