[
https://issues.apache.org/jira/browse/CASSANDRA-18305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17732490#comment-17732490
]
Stefan Miklosovic commented on CASSANDRA-18305:
-----------------------------------------------
After looking more closely at it, I do not think that the 'compaction
throughput ratio' is computed correctly, or that the figure even makes sense.
I think the original idea behind this was that if a node is compacting
SSTables, a percentage would be shown, e.g. around 20% if it compacts at
roughly 12 MBps against the default 64 MBps limit. However, if you look into
CompactionManager, there is:
{code}
private final RateLimiter compactionRateLimiter = RateLimiter.create(Double.MAX_VALUE);
{code}
And then we get the rate like:
{code}
public double getCompactionRate()
{
return compactionRateLimiter.getRate();
}
{code}
This should be changed to
{code}
public double getCompactionRate()
{
return getRateLimiter().getRate();
}
{code}
because getRateLimiter() sets the rate based on how DatabaseDescriptor is
configured:
{code}
public RateLimiter getRateLimiter()
{
setRate(DatabaseDescriptor.getCompactionThroughputMbPerSec());
return compactionRateLimiter;
}
{code}
and then
{code}
public void setRate(final double throughPutMbPerSec)
{
double throughput = throughPutMbPerSec * 1024.0 * 1024.0;
// if throughput is set to 0, throttling is disabled
if (throughput == 0 || StorageService.instance.isBootstrapMode())
throughput = Double.MAX_VALUE;
if (compactionRateLimiter.getRate() != throughput)
compactionRateLimiter.setRate(throughput);
}
{code}
Secondly, once this is done, what have we actually obtained by calling
CompactionManagerMBean's `getCompactionRate()`? We just got the maximum
throughput allowed.
But this does not mean that we got _the actual throughput_; we only got the
configured maximum. So further computing a percentage from it makes no sense,
because we would always get 100%.
In the patch, this:
{code}
double configured = probe.getCompactionThroughput();
double actual = probe.getCompactionRate() / (1024 * 1024);
{code}
produces basically the same figure twice, and neither value will change over
time, regardless of how "fast" we compact. As of now I do not know of any way
to get the "realtime compaction speed", so I think it is better if we omit
this completely.
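To illustrate why the ratio degenerates, here is a minimal, self-contained sketch. The two methods are stand-ins for the NodeProbe calls, not real Cassandra code, and the 64 MBps value is just the default from cassandra.yaml:

{code}
public class RatioSketch
{
    // Stand-in for probe.getCompactionThroughput(): the configured limit in MBps.
    static double getCompactionThroughput() { return 64.0; }

    // Stand-in for probe.getCompactionRate() after the fix: the limiter's rate is
    // merely the configured limit converted to bytes per second, not a measured speed.
    static double getCompactionRate() { return getCompactionThroughput() * 1024.0 * 1024.0; }

    public static void main(String[] args)
    {
        double configured = getCompactionThroughput();
        double actual = getCompactionRate() / (1024 * 1024);
        // "actual" equals "configured" by construction, so the ratio is always 100%,
        // no matter how fast compaction is really running.
        System.out.println(100.0 * actual / configured);
    }
}
{code}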
There are also these metrics which are not covered currently:
BytesCompacted
CompactionsAborted
CompactionsReduced
SSTablesDroppedFromCompaction
The BytesCompacted metric tracks the total number of bytes the Cassandra node
has compacted since it started. We should probably convert it to something
human-readable instead of raw bytes, which are hard to interpret at a glance.
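A small helper along these lines would do (a sketch only; Cassandra's existing FileUtils.stringifyFileSize may well be reusable for this instead):

{code}
import java.util.Locale;

public class BytesFormat
{
    // Hypothetical helper: formats a raw byte count into binary units, similar
    // to what the sample output above already shows ("data compacted 8.99 KiB").
    static String humanReadable(long bytes)
    {
        String[] units = { "bytes", "KiB", "MiB", "GiB", "TiB" };
        double value = bytes;
        int unit = 0;
        while (value >= 1024 && unit < units.length - 1)
        {
            value /= 1024;
            unit++;
        }
        return String.format(Locale.ROOT, "%.2f %s", value, units[unit]);
    }

    public static void main(String[] args)
    {
        System.out.println(humanReadable(9206L)); // prints "8.99 KiB"
    }
}
{code}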
CompactionsAborted is self-explanatory.
CompactionsReduced and SSTablesDroppedFromCompaction are interesting metrics.
They are set in CompactionTask.buildCompactionCandidatesForAvailableDiskSpace.
That method checks whether there is enough disk space for the compaction.
CompactionsReduced is incremented when the method determines that some
SSTables have to be skipped from a compaction because the compacted output
would not fit on disk.
The second metric, SSTablesDroppedFromCompaction, is a counter which adds up
the number of SSTables left out of a particular compaction because they would
not fit on disk.
This all should be in the output too.
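Roughly, the reduction behaviour described above can be sketched as follows. The names and the "drop the largest SSTable first" policy are assumptions for illustration, not the actual logic of buildCompactionCandidatesForAvailableDiskSpace:

{code}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class ReductionSketch
{
    // Illustrative counters mirroring the two metrics discussed above.
    static long compactionsReduced = 0;
    static long sstablesDroppedFromCompaction = 0;

    static List<Long> reduceForAvailableSpace(List<Long> sstableSizes, long availableBytes)
    {
        List<Long> candidates = new ArrayList<>(sstableSizes);
        candidates.sort(Comparator.reverseOrder());
        long estimated = candidates.stream().mapToLong(Long::longValue).sum();
        int dropped = 0;
        // Drop candidates until the estimated compacted output fits on disk.
        while (estimated > availableBytes && !candidates.isEmpty())
        {
            estimated -= candidates.remove(0); // remove the largest remaining SSTable
            dropped++;
        }
        if (dropped > 0)
        {
            compactionsReduced++;                     // one compaction was reduced ...
            sstablesDroppedFromCompaction += dropped; // ... by this many SSTables
        }
        return candidates;
    }

    public static void main(String[] args)
    {
        // 180 bytes of input but only 90 available: the 100-byte SSTable is dropped.
        System.out.println(reduceForAvailableSpace(List.of(100L, 50L, 30L), 90L));
    }
}
{code}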
CompactionsAborted, CompactionsReduced and SSTablesDroppedFromCompaction were
not added in NodeProbe.getCompactionMetric, so I added them there too.
So, right now, it looks like this:
{code}
$ ./bin/nodetool compactionstats
concurrent compactors 2
pending compaction tasks 0
ks tb 3
ks tb2 5
compactions completed 51
data compacted 8.99 KiB
compactions aborted 0
compactions reduced 0
sstables dropped from compaction 0
minute rate 0.60/second
5 minute rate 0.60/second
15 minute rate 0.60/second
mean rate 0.42/second
compaction throughput (MBps) 64.0
{code}
I pushed the changes to the same PR (1) as this commit (2).
[[email protected]] feel free to take it again and just tweak the
tests. I know it might be a little bit frustrating that we are returning to
this, but it is how it is ... Please tell us if you are OK with this; if you
are busy, we can help you with that too.
(1) https://github.com/apache/cassandra/pull/2393/files
(2) https://github.com/apache/cassandra/pull/2393/commits/02b128c1f17f5880ced21007389366385a52ecf8
> Enhance nodetool compactionstats with existing MBean metrics
> ------------------------------------------------------------
>
> Key: CASSANDRA-18305
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18305
> Project: Cassandra
> Issue Type: Improvement
> Components: Tool/nodetool
> Reporter: Brad Schoening
> Assignee: Manish Ghildiyal
> Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.x
>
> Time Spent: 2.5h
> Remaining Estimate: 0h
>
> Nodetool compactionstats reports only on active compactions; if nothing is
> active, you see only:
> {quote}$nodetool compactionstats
> pending tasks: 0
> {quote}
> but in the MBean Compaction/TotalCompactionsCompleted there are recent
> statistics in events/second for:
> * Count
> * FifteenMinuteRate
> * FiveMinuteRate
> * MeanRate
> * OneMinuteRate
> 1) It would be useful to see in addition:
> {quote}pending tasks: 0
> compactions completed: 20
> 1 minute rate: 0/second
> 5 minute rate: 2.3/second
> 15 minute rate: 4.6/second
> {quote}
> 2) Since compaction_throughput_mb_per_sec is a throttling parameter in
> cassandra.yaml (default 64 MBps), it would be nice to show the actual
> compaction throughput and be able to observe if you're close to the limit.
> I.e.,
> {quote}compaction throughput 13.2 MBps / 16 MBps (82.5%)
> {quote}
> 3) for completeness, compactionstats should list the number of concurrent
> compactors configured; perhaps simply add to the existing 'pending tasks' line:
> {quote}4 concurrent compactors, 0 pending tasks
> {quote}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)