[ https://issues.apache.org/jira/browse/CASSANDRA-20571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17949894#comment-17949894 ]
Jai Bheemsen Rao Dhanwada edited comment on CASSANDRA-20571 at 5/7/25 4:52 AM:
-------------------------------------------------------------------------------

Please ignore the comment above: this [commit|https://github.com/apache/cassandra/commit/5be1038c5d38af32d3cbb0545d867f21304f3a46] is probably exposing the problem rather than introducing it. Without this commit, overall streaming is slower, which keeps CPU consumption low. This is clearly visible in the time the bootstrap takes (26 minutes vs 3 minutes).

I have run multiple tests with different scenarios and reviewed the streaming code, and here is what I see.
 * In 4.1, if the property {{streaming.session.parallelTransfers}} is not set, the streaming session opens up to the maximum number of parallel transfer threads, which defaults to the number of available processors.
{code:java}
Ref: org/apache/cassandra/streaming/async/StreamingMultiplexedChannel.java

private static final int DEFAULT_MAX_PARALLEL_TRANSFERS = getAvailableProcessors();
private static final int MAX_PARALLEL_TRANSFERS = parseInt(getProperty(PROPERTY_PREFIX + "streaming.session.parallelTransfers", Integer.toString(DEFAULT_MAX_PARALLEL_TRANSFERS)));

<<truncated output>>

        fileTransferExecutor = executorFactory()
                               .configurePooled("NettyStreaming-Outbound-" + name, MAX_PARALLEL_TRANSFERS)
                               .withKeepAlive(1L, SECONDS).build();
    }

Ref: org/apache/cassandra/utils/FBUtilities.java

public static int getAvailableProcessors()
{
    if (availableProcessors > 0)
        return availableProcessors;
    else
        return Runtime.getRuntime().availableProcessors();
}
{code}
 * However, in 4.0, if the property {{streaming.session.parallelTransfers}} is not set, the number of streaming sessions depends on the ThreadPoolExecutor, which is created with a core pool size of 1 and is only allowed to grow up to {{MAX_PARALLEL_TRANSFERS}}.
{code:java}
Ref: org/apache/cassandra/streaming/async/NettyStreamingMessageSender.java

fileTransferExecutor = new DebuggableThreadPoolExecutor(1, MAX_PARALLEL_TRANSFERS, 1L, TimeUnit.SECONDS, new LinkedBlockingQueue<>(),
                                                        new NamedThreadFactory("NettyStreaming-Outbound-" + name));

Ref: ThreadPoolExecutor.class

    /**
     * Creates a new {@code ThreadPoolExecutor} with the given initial
     * parameters and default rejected execution handler.
     *
     * @param corePoolSize the number of threads to keep in the pool, even
     *        if they are idle, unless {@code allowCoreThreadTimeOut} is set
     * @param maximumPoolSize the maximum number of threads to allow in the
     *        pool
     * @param keepAliveTime when the number of threads is greater than
     *        the core, this is the maximum time that excess idle threads
     *        will wait for new tasks before terminating.
     * @param unit the time unit for the {@code keepAliveTime} argument
     * @param workQueue the queue to use for holding tasks before they are
     *        executed.  This queue will hold only the {@code Runnable}
     *        tasks submitted by the {@code execute} method.
     * @param threadFactory the factory to use when the executor
     *        creates a new thread
     * @throws IllegalArgumentException if one of the following holds:<br>
     *         {@code corePoolSize < 0}<br>
     *         {@code keepAliveTime < 0}<br>
     *         {@code maximumPoolSize <= 0}<br>
     *         {@code maximumPoolSize < corePoolSize}
     * @throws NullPointerException if {@code workQueue}
     *         or {@code threadFactory} is null
     */
    public ThreadPoolExecutor(int corePoolSize,
                              int maximumPoolSize,
                              long keepAliveTime,
                              TimeUnit unit,
                              BlockingQueue<Runnable> workQueue,
                              ThreadFactory threadFactory) {
        this(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue,
             threadFactory, defaultHandler);
    }
{code}
 * This is evident from the DEBUG logs: in 4.1 the threads range from {{NettyStreaming-Outbound-/x.x.x.x.7000:1}} to {{NettyStreaming-Outbound-/x.x.x.x.7000:16}} during streaming, whereas in the 4.0 version it is always thread 1:
{{NettyStreaming-Outbound-/x.x.x.x.7000:1}}
{code:java}
DEBUG [NettyStreaming-Outbound-/x.x.x.x.7000:11] 2025-01-31 05:36:32,393 CassandraCompressedStreamWriter.java:62 - [Stream #0d829ec0-df95-11ef-9b5a-81b717c52836] Start streaming file /var/lib/cassandra/data/keyspace1/table1-212345705bc011e895a6cbf409872089/nb-16524-big-Data.db to /x.x.x.x:7000, repairedAt = 0, totalSize = 102241274
{code}
 * [This is the commit|https://github.com/apache/cassandra/commit/be1f050bc8c0cd695a42952e3fc84625ad48d83a#diff-1075097b8848b3055c2e17cdff45207c7889df8ec1eeae360a88f33a9168bc3f] which introduced this change to the task execution, in CASSANDRA-16925.
 * To prove this, I ran the two experiments below. In both cases the CPU spike I saw earlier is not observed.
 ** set {{-Dcassandra.streaming.session.parallelTransfers=1}}
 ** compile the C* code with {{fileTransferExecutor.setCorePoolSize(1);}}
 * IIUC, setting {{-Dcassandra.streaming.session.parallelTransfers=1}} will always limit the pool to 1 and never let it grow. Is that the correct way to handle this? In 4.0.12 the max is set to the number of cores, so the pool could grow.
 * I have also set the property {{streaming_connections_per_host: 1}}, which (I could be wrong) does not appear to be honored in 4.1.x. I could not confirm this; I only suspect it because 4.1 is creating 16 streaming sessions.
 * I ran another experiment by adding {{fileTransferExecutor.setCorePoolSize(MAX_PARALLEL_TRANSFERS);}} in 4.0.12, and I see the exact same behavior in 4.0: the CPU spikes during streaming.

Is it a bug that 4.1 tries to utilize all available processors, or is the expectation that the user should set {{-Dcassandra.streaming.session.parallelTransfers=1}}, which was not required in 4.0.x? If this is expected, we should default it to 1 and let the user increase it as needed.

[~dcapwell] I am using open source Apache Cassandra, but to run some of these tests I modified the source code and built it.
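For reference, the difference between the two pool constructions discussed above can be reproduced outside Cassandra. The following is a minimal standalone sketch, not Cassandra code: the class name {{PoolGrowthDemo}}, the helper {{threadsStarted}}, and the value 4 standing in for the processor count are made up for illustration. It relies on the documented JDK behaviour that a {{ThreadPoolExecutor}} starts threads beyond {{corePoolSize}} only when its work queue rejects a task, which an unbounded {{LinkedBlockingQueue}} never does. It also assumes that the 4.1 executor behaves like a pool whose core size equals its maximum, which is what the {{setCorePoolSize(MAX_PARALLEL_TRANSFERS)}} experiment emulates.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PoolGrowthDemo
{
    /**
     * Submits {@code tasks} blocking tasks to a pool backed by an unbounded
     * queue and returns the largest number of worker threads it ever started.
     */
    static int threadsStarted(int corePoolSize, int maxPoolSize, int tasks)
    {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(corePoolSize, maxPoolSize,
                                                         1L, TimeUnit.SECONDS,
                                                         new LinkedBlockingQueue<>());
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < tasks; i++)
        {
            // Each task parks until released, so every worker stays busy
            // while the remaining tasks are submitted.
            pool.execute(() -> {
                try
                {
                    release.await();
                }
                catch (InterruptedException e)
                {
                    Thread.currentThread().interrupt();
                }
            });
        }
        int started = pool.getLargestPoolSize();
        release.countDown();
        pool.shutdown();
        try
        {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
        }
        return started;
    }

    public static void main(String[] args)
    {
        int max = 4; // hypothetical stand-in for MAX_PARALLEL_TRANSFERS
        // Same shape as the 4.0 constructor call: core 1, max N, unbounded queue.
        System.out.println("core=1 max=" + max + " -> " + threadsStarted(1, max, 16)); // core=1 max=4 -> 1
        // Core raised to max, as in the setCorePoolSize(MAX_PARALLEL_TRANSFERS) experiment.
        System.out.println("core=" + max + " max=" + max + " -> " + threadsStarted(max, max, 16)); // core=4 max=4 -> 4
    }
}
```

With a core size of 1 the remaining tasks wait in the queue behind a single worker, which would be consistent with only the {{:1}} thread appearing in the 4.0 logs. With the core size equal to the maximum, one thread is started per submitted task until the limit is reached.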
> CPU Spikes during the Streaming of data
> ---------------------------------------
>
>                 Key: CASSANDRA-20571
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20571
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Consistency/Streaming
>            Reporter: Jai Bheemsen Rao Dhanwada
>            Priority: Normal
>             Fix For: 4.1.x, 5.0.x, 5.x
>
>         Attachments: async_profiler_cpu.html
>
>
> Hello Team,
> We are seeing an issue where there is a huge spike in CPU on the node which
> is streaming data (adding a new node, replacing a node, or running a
> nodetool rebuild). Essentially, any time streaming is involved the CPU
> spike is very large. This does not happen in all clusters, but we
> occasionally see this issue on a specific cluster.
>
> C* version: 4.1.6 (> 4.1.0)
> Schema: All the tables use counter data types.
> CPU Cores: 16
>
> The same workloads and cluster types do not show this behavior with the
> 4.0.x version of Cassandra, hence we suspect something changed in 4.1.6.
> Looking at the top threads, it's mostly the StreamDeserialize + compaction.
> {code:java}
> top - 17:01:29 up 18:42,  2 users,  load average: 51.75, 13.61, 4.79
> Threads: 741 total,  54 running, 687 sleeping,   0 stopped,   0 zombie
> %Cpu(s): 91.5 us,  4.9 sy,  0.0 ni,  1.4 id,  0.7 wa,  1.1 hi,  0.4 si,  0.0 st
> MiB Mem :  31176.5 total,   8762.5 free,  11028.0 used,  11386.0 buff/cache
> MiB Swap:      0.0 total,      0.0 free,      0.0 used.  19334.3 avail Mem
>
>     PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
>  305763 xxxxxx    20   0   18.6g   9.8g 446524 R  30.8  32.2   0:04.69 Stream-Deserial
>  305815 xxxxxx    20   0   18.6g   9.8g 446524 R  28.6  32.2   0:04.81 Stream-Deserial
>  300600 xxxxxx    20   0   18.6g   9.8g 446524 R  27.9  32.2   0:04.73 CompactionExecu
>  305678 xxxxxx    20   0   18.6g   9.8g 446524 R  27.9  32.2   0:03.98 Stream-Deserial
>  305602 xxxxxx    20   0   18.6g   9.8g 446524 R  27.6  32.2   0:04.65 Stream-Deserial
>  305563 xxxxxx    20   0   18.6g   9.8g 446524 R  27.3  32.2   0:04.02 CompactionExecu
>  305687 xxxxxx    20   0   18.6g   9.8g 446524 R  26.9  32.2   0:04.28 Stream-Deserial
>  305707 xxxxxx    20   0   18.6g   9.8g 446524 S  26.9  32.2   0:04.29 Stream-Deserial
>  305714 xxxxxx    20   0   18.6g   9.8g 446524 R  26.9  32.2   0:04.91 Stream-Deserial
>  305569 xxxxxx    20   0   18.6g   9.8g 446524 R  26.6  32.2   0:05.69 Stream-Deserial
>  305771 xxxxxx    20   0   18.6g   9.8g 446524 R  26.6  32.2   0:03.99 Stream-Deserial
>  305817 xxxxxx    20   0   18.6g   9.8g 446524 R  26.3  32.2   0:03.79 Stream-Deserial
>  305566 xxxxxx    20   0   18.6g   9.8g 446524 R  26.0  32.2   0:04.64 CompactionExecu
> {code}
> Our initial hypothesis was that streaming_stats plays a role here, based on
> https://issues.apache.org/jira/browse/CASSANDRA-18110. However, we set
> streaming_stats: false and still see a spike in CPU. Once streaming is
> complete, the cluster returns to a normal state with no CPU spike, but we
> would like to understand what is causing the huge CPU spikes. I have
> attached a profile captured during the CPU spike.
> Please let me know if you need any other details.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org