Re: Cassandra 4.1 compaction thread no longer low priority (cpu nice)

Bowen Song via user Thu, 22 Feb 2024 14:18:39 -0800

On the IO scheduler point, cfq WAS the only scheduler supporting IOpriorities (such as ionice) shipped by default with the Linux kernel,but that has changed since bfq and mq-deadline were added to the Linuxkernel. Both bfq and mq-deadline supports IO priority, as documentedhere: https://docs.kernel.org/block/ioprio.html


On 22/02/2024 19:39, Dmitry Konstantinov wrote:

Hi all,
I was not participating in the changes but I analyzed the questionsome time ago from another side.There were also changes related to -XX:ThreadPriorityPolicy JVMoption. When you set a thread priority for a Java thread it does notmean it is always propagated as a native OS thread priority. Topropagate the priority you should use -XX:ThreadPriorityPolicy and-XX:JavaPriorityN_To_OSPriorityJVM options, but there is an issue withthem because JVM wants to be executed under root toset -XX:ThreadPriorityPolicy=1 which enables the priorities usage. Ahack was invented long time ago to workaround it by setting-XX:ThreadPriorityPolicy=42 value (any value not equal to 0 or 1) andbypass the not so needed and annoying grants validation logic (seehttp://tech.stolsvik.com/2010/01/linux-java-thread-priorities-workaround.htmlfor more details about).It worked for Java 8 but then there was a change in Java 9 aboutadding extra validation for JVM option values (JEP 245: Validate JVMCommand-Line Flag Arguments -https://bugs.openjdk.org/browse/JDK-8059557) and the hack stopped towork and started to cause JVM failure with a validation error. As areaction to it - the flag was removed from Cassandra JVM configurationfiles in https://issues.apache.org/jira/browse/CASSANDRA-13107. Afterit the lower priority value for compaction threads have not had anyactual effect.The interesting story is that the JVM logic has been changed tosupport the ability to set -XX:ThreadPriorityPolicy=1 for non-rootusers in Java 13 (https://bugs.openjdk.org/browse/JDK-8215962) and thechange was backported to Java 11 as well(https://bugs.openjdk.org/browse/JDK-8217494).So, from this point of view I think it would be nice to return backthe ability to set thread priority for compaction threads. At the sametime I would not expect too much improvement by enabling it.
P.S. There was also an idea about using ionice(https://issues.apache.org/jira/browse/CASSANDRA-9946) but the currentLinux IO schedulers do not take that into account anymore. It lookslike the only scheduler that supported ionice was CFQ(https://issues.apache.org/jira/browse/CASSANDRA-9946?focusedCommentId=14648616&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14648616<https://issues.apache.org/jira/browse/CASSANDRA-9946?focusedCommentId=14648616&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14648616>)and it was deprecated and removed since Linux kernel 5.x(https://github.com/torvalds/linux/commit/f382fb0bcef4c37dc049e9f6963e3baf204d815c).
Regards,
Dmitry
On Thu, 22 Feb 2024 at 15:30, Bowen Song via user<user@cassandra.apache.org> wrote:
    Hi Pierre,

    Is there anything stopping you from using the
    compaction_throughput
    
<https://github.com/apache/cassandra/blob/f9e033f519c14596da4dc954875756a69aea4e78/conf/cassandra.yaml#L989>
    option in the cassandra.yaml file to manage the performance impact
    of compaction operations?

    With thread priority, there's a failure scenario on busy nodes
    when the read operations uses too much CPU. If the compaction
    thread has lower priority, it does not get enough CPU time to run,
    and SSTable files will build up, causing read to become slower and
    more expensive, which in turn result in compaction gets even less
    CPU time. At the end, one of the following three will happen:

      * the node becomes too slow and most queries time out
      * the Java process crashes due to too many open files or OOM
        because JVM GC can't keep up
      * the filesystem run out of free space or inodes

    However, I'm unsure whether the compaction thread priority was
    intentionally removed from 4.1.0. Someone familiar with this
    matter may be able to answer that.

    Cheers,
    Bowen


    On 22/02/2024 13:26, Pierre Fersing wrote:
    Hello all,

    I've recently upgraded to Cassandra 4.1 and see a change in
    compaction behavior that seems unwanted:

    * With Cassandra 3.11 compaction was run by thread in low
    priority and thus using CPU nice (visible using top) (I believe
    Cassandra 4.0 also had this behavior)

    * With Cassandra 4.1, compactions are no longer run as low
    priority thread (compaction now use "normal" CPU).

    This means that when the server had limited CPU, Cassandra
    compaction now compete for the CPU with other process (probably
    including Cassandra itself) that need CPU. When it was using CPU
    nice, the compaction only competed for CPU with other lower
    priority process which was great as it leaves CPU available for
    processes that need to kept small response time (like an API used
    by human).

    Is it wanted to lose this feature in Cassandra 4.1 or was it just
    a forget during re-write of compaction executor ? Should I open a
    bug to re-introduce this feature in Cassandra ?


    I've done few searches, and:

    * I believe compaction used CPU nice because the compactor
    executor was created with minimal priority:
    
https://github.com/apache/cassandra/blob/cassandra-3.11.16/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L1906
    
<https://github.com/apache/cassandra/blob/cassandra-3.11.16/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L1906>

    * I think it was dropped by commit
    
https://github.com/apache/cassandra/commit/be1f050bc8c0cd695a42952e3fc84625ad48d83a
    
<https://github.com/apache/cassandra/commit/be1f050bc8c0cd695a42952e3fc84625ad48d83a>

    * It looks doable to set the thread priority with new executor, I
    think adding ".withThreadPriority(Thread.MIN_PRIORITY)" when
    using executorFactory in
    
https://github.com/apache/cassandra/blob/77a3e0e818df3cce45a974ecc977ad61bdcace47/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L2028
    
<https://github.com/apache/cassandra/blob/77a3e0e818df3cce45a974ecc977ad61bdcace47/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L2028>should
    do it.


    Did I miss a reason to no longer use low priority threads for
    compaction ? Should I open a bug for re-adding this feature /
    submit a PR ?

    Regards,

    Pierre Fersing
--
Dmitry Konstantinov

Re: Cassandra 4.1 compaction thread no longer low priority (cpu nice)

Reply via email to