Thank you for highlighting this, it looks like I need to refresh my
knowledge about IO schedulers :-)

Cheers,
Dmitry

On Thu, 22 Feb 2024 at 22:18, Bowen Song via user <user@cassandra.apache.org>
wrote:

> On the IO scheduler point, cfq WAS the only scheduler supporting IO
> priorities (such as ionice) shipped by default with the Linux kernel, but
> that has changed since bfq and mq-deadline were added to the Linux kernel.
> Both bfq and mq-deadline supports IO priority, as documented here:
> https://docs.kernel.org/block/ioprio.html
>
>
> On 22/02/2024 19:39, Dmitry Konstantinov wrote:
>
> Hi all,
>
> I was not participating in the changes but I analyzed the question some
> time ago from another side.
> There were also changes related to -XX:ThreadPriorityPolicy JVM option.
> When you set a thread priority for a Java thread it does not mean it is
> always propagated as a native OS thread priority. To propagate the priority
> you should use  -XX:ThreadPriorityPolicy and -XX:JavaPriorityN_To_OSPriority
>  JVM options, but there is an issue with them because JVM wants to be
> executed under root to set -XX:ThreadPriorityPolicy=1 which enables the
> priorities usage. A hack was invented long time ago to workaround it by
> setting -XX:ThreadPriorityPolicy=42 value (any value not equal to 0 or 1)
> and bypass the not so needed and annoying grants validation logic (see
> http://tech.stolsvik.com/2010/01/linux-java-thread-priorities-workaround.html
> for more details about).
> It worked for Java 8 but then there was a change in Java 9 about adding
> extra validation for JVM option values (JEP 245: Validate JVM Command-Line
> Flag Arguments  - https://bugs.openjdk.org/browse/JDK-8059557) and the
> hack stopped to work and started to cause JVM failure with a validation
> error. As a reaction to it - the flag was removed from Cassandra JVM
> configuration files in
> https://issues.apache.org/jira/browse/CASSANDRA-13107. After it the lower
> priority value for compaction threads have not had any actual effect.
> The interesting story is that the JVM logic has been changed to support
> the ability to set -XX:ThreadPriorityPolicy=1 for non-root users in Java 13
> (https://bugs.openjdk.org/browse/JDK-8215962) and the change was
> backported to Java 11 as well (https://bugs.openjdk.org/browse/JDK-8217494
> ).
> So, from this point of view I think it would be nice to return back the
> ability to set thread priority for compaction threads. At the same time I
> would not expect too much improvement by enabling it.
>
> P.S. There was also an idea about using ionice (
> https://issues.apache.org/jira/browse/CASSANDRA-9946) but the current
> Linux IO schedulers do not take that into account anymore. It looks like
> the only scheduler that supported ionice was CFQ (
> https://issues.apache.org/jira/browse/CASSANDRA-9946?focusedCommentId=14648616&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14648616)
> and it was deprecated and removed since Linux kernel 5.x (
> https://github.com/torvalds/linux/commit/f382fb0bcef4c37dc049e9f6963e3baf204d815c
> ).
>
> Regards,
> Dmitry
>
>
> On Thu, 22 Feb 2024 at 15:30, Bowen Song via user <
> user@cassandra.apache.org> wrote:
>
>> Hi Pierre,
>>
>> Is there anything stopping you from using the compaction_throughput
>> <https://github.com/apache/cassandra/blob/f9e033f519c14596da4dc954875756a69aea4e78/conf/cassandra.yaml#L989>
>> option in the cassandra.yaml file to manage the performance impact of
>> compaction operations?
>>
>> With thread priority, there's a failure scenario on busy nodes when the
>> read operations uses too much CPU. If the compaction thread has lower
>> priority, it does not get enough CPU time to run, and SSTable files will
>> build up, causing read to become slower and more expensive, which in turn
>> result in compaction gets even less CPU time. At the end, one of the
>> following three will happen:
>>
>>    - the node becomes too slow and most queries time out
>>    - the Java process crashes due to too many open files or OOM because
>>    JVM GC can't keep up
>>    - the filesystem run out of free space or inodes
>>
>> However, I'm unsure whether the compaction thread priority was
>> intentionally removed from 4.1.0. Someone familiar with this matter may be
>> able to answer that.
>>
>> Cheers,
>> Bowen
>>
>>
>> On 22/02/2024 13:26, Pierre Fersing wrote:
>>
>> Hello all,
>>
>> I've recently upgraded to Cassandra 4.1 and see a change in compaction
>> behavior that seems unwanted:
>>
>> * With Cassandra 3.11 compaction was run by thread in low priority and
>> thus using CPU nice (visible using top) (I believe Cassandra 4.0 also had
>> this behavior)
>>
>> * With Cassandra 4.1, compactions are no longer run as low priority
>> thread (compaction now use "normal" CPU).
>>
>> This means that when the server had limited CPU, Cassandra compaction now
>> compete for the CPU with other process (probably including Cassandra
>> itself) that need CPU. When it was using CPU nice, the compaction only
>> competed for CPU with other lower priority process which was great as it
>> leaves CPU available for processes that need to kept small response time
>> (like an API used by human).
>>
>> Is it wanted to lose this feature in Cassandra 4.1 or was it just a
>> forget during re-write of compaction executor ? Should I open a bug to
>> re-introduce this feature in Cassandra ?
>>
>>
>> I've done few searches, and:
>>
>> * I believe compaction used CPU nice because the compactor executor was
>> created with minimal priority:
>> https://github.com/apache/cassandra/blob/cassandra-3.11.16/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L1906
>>
>> * I think it was dropped by commit
>> https://github.com/apache/cassandra/commit/be1f050bc8c0cd695a42952e3fc84625ad48d83a
>>
>> * It looks doable to set the thread priority with new executor, I think
>> adding ".withThreadPriority(Thread.MIN_PRIORITY)" when using
>> executorFactory in
>> https://github.com/apache/cassandra/blob/77a3e0e818df3cce45a974ecc977ad61bdcace47/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L2028
>> should do it.
>>
>>
>> Did I miss a reason to no longer use low priority threads for compaction
>> ? Should I open a bug for re-adding this feature / submit a PR ?
>>
>> Regards,
>>
>> Pierre Fersing
>>
>>
>
> --
> Dmitry Konstantinov
>
>

-- 
Dmitry Konstantinov

Reply via email to