[ 
https://issues.apache.org/jira/browse/CASSANDRA-20250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17925435#comment-17925435
 ] 

Dmitry Konstantinov commented on CASSANDRA-20250:
-------------------------------------------------

I also tried to emulate the impact of CPU cache misses in JMH. LongAdder is 
affected more, as [~benedict] predicted:
{code:java}
[java] Benchmark                          (metricsCount)          (type)  Mode  Cnt     Score     Error  Units
[java] ThreadLocalMetricsBench.increment              50       LongAdder  avgt   16  4197.469 ± 317.674  ns/op
[java] ThreadLocalMetricsBench.increment              50    LazySetArray  avgt   16   988.841 ±   7.683  ns/op
[java] ThreadLocalMetricsBench.increment              50  PiggybackArray  avgt   16   973.498 ±   6.913  ns/op
[java] ThreadLocalMetricsBench.increment             100       LongAdder  avgt   16  7523.716 ± 130.284  ns/op
[java] ThreadLocalMetricsBench.increment             100    LazySetArray  avgt   16  1760.691 ±  42.937  ns/op
[java] ThreadLocalMetricsBench.increment             100  PiggybackArray  avgt   16  1663.640 ±  11.116  ns/op
{code}
{code:java}
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(iterations = 4, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 8, time = 2, timeUnit = TimeUnit.SECONDS)
@Fork(value = 2,
      jvmArgsAppend = { "-Djmh.executor=CUSTOM",
                        "-Djmh.executor.class=org.apache.cassandra.test.microbench.FastThreadExecutor"})
@Threads(4)
@State(Scope.Benchmark)
public class ThreadLocalMetricsBench
{
    @Param({"LongAdder", "LazySetArray", "PiggybackArray"})
    private String type;

    @Param({"50", "100"})
    private int metricsCount;

    private List<CounterMetric> counterMetrics;


    @Setup(Level.Trial)
    public void setup() throws Throwable
    {
        counterMetrics = new ArrayList<>(metricsCount);
        for (int i = 0; i < metricsCount; i++)
        {
            CounterMetric counterMetric;
            switch (type)
            {
                case "LongAdder":
                    counterMetric = new LongAdderCounter();
                    break;
                case "LazySetArray":
                    counterMetric = LazySetArrayThreadLocalMetrics.createCounter();
                    break;
                case "PiggybackArray":
                    counterMetric = PiggybackArrayThreadLocalMetrics.createCounter();
                    break;
                default:
                    throw new UnsupportedOperationException();
            }
            counterMetrics.add(counterMetric);
        }
    }

    private final AtomicLongArray anotherMemory = new AtomicLongArray(256 * 1024);

    @Setup(Level.Invocation)
    public void polluteCpuCaches() {
        for (int i = 0; i < anotherMemory.length(); i++)
            anotherMemory.incrementAndGet(i);
    }

    @Benchmark
    public void increment() {
        for (CounterMetric counterMetric : counterMetrics)
            counterMetric.inc();
    }
}
{code}

> Provide the ability to disable specific metrics collection
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-20250
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20250
>             Project: Apache Cassandra
>          Issue Type: New Feature
>          Components: Observability/Metrics
>            Reporter: Dmitry Konstantinov
>            Assignee: Dmitry Konstantinov
>            Priority: Normal
>         Attachments: 5.1_profile_cpu.html, 
> 5.1_profile_cpu_without_metrics.html, async_profiler_cpu_profiles.zip, 
> cpu_profile_insert.html, jmh-result.json, vmstat.log, 
> vmstat_without_metrics.log
>
>
> Cassandra collects a lot of metrics. Many of them are collected per table, 
> so the number of metric instances is multiplied by the number of tables. On 
> the one hand this gives better observability; on the other hand metrics are 
> not free, and there is overhead associated with them:
> 1) CPU overhead: under a simple CPU-bound load I already see about 5.5% of 
> total CPU spent on metrics in CPU flame graphs for a read load and 11% for a 
> write load. 
> Example: [^cpu_profile_insert.html] (search for the "codahale" pattern). The 
> flame graph was captured using the async-profiler build 
> async-profiler-3.0-29ee888-linux-x64.
> 2) Memory overhead: we spend memory on the entities used to aggregate 
> metrics, such as LongAdders and reservoirs, plus on MBeans (String 
> concatenation within object names is a major cause: a new String is created 
> for each table+metric name combination).
>  
> The idea of this ticket is to allow an operator to configure a list of 
> disabled metrics in cassandra.yaml, like:
> {code:java}
> disabled_metrics:
>     - metric_a
>     - metric_b
> {code}
> From an implementation point of view I see two possible approaches (which 
> can be combined):
>  # Generic: when a metric is being registered and it is listed in 
> disabled_metrics, we do not publish it via JMX and we provide a no-op 
> implementation of the metric object (such as a histogram) for it.
> Logging analogy: the log level check within a log method.
>  # Specialized: for some metrics, calculating the value is itself not free 
> and introduces overhead. In such cases it would be useful for the specific 
> logic to check via an API (like isMetricEnabled) whether the calculation is 
> needed. Example of such a metric: 
> ClientRequestSizeMetrics.recordRowAndColumnCountMetrics
> Logging analogy: an explicit 'if (isDebugEnabled())' condition used when a 
> message parameter is expensive.
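
The two approaches described in the ticket could be sketched roughly as follows. This is only an illustration under assumptions: SketchCounter, SketchMetricRegistry and SketchUsage are hypothetical names invented for the example and are not Cassandra or Dropwizard classes; only isMetricEnabled and the disabled_metrics list come from the ticket text.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.LongSupplier;

// Hypothetical counter abstraction (assumption, not an existing API).
interface SketchCounter
{
    void inc(long n);
    long getCount();
    default void inc() { inc(1); }
}

// Regular counter backed by a LongAdder.
final class LongAdderSketchCounter implements SketchCounter
{
    private final LongAdder adder = new LongAdder();
    public void inc(long n) { adder.add(n); }
    public long getCount() { return adder.sum(); }
}

// Generic approach: a shared no-op object returned for disabled metrics.
final class NoopSketchCounter implements SketchCounter
{
    static final NoopSketchCounter INSTANCE = new NoopSketchCounter();
    private NoopSketchCounter() {}
    public void inc(long n) { /* intentionally does nothing */ }
    public long getCount() { return 0L; }
}

// Hypothetical registry that knows the configured disabled_metrics list.
final class SketchMetricRegistry
{
    private final Set<String> disabled;
    private final ConcurrentMap<String, SketchCounter> counters = new ConcurrentHashMap<>();

    SketchMetricRegistry(Set<String> disabledMetrics)
    {
        this.disabled = new HashSet<>(disabledMetrics);
    }

    boolean isMetricEnabled(String name)
    {
        return !disabled.contains(name);
    }

    SketchCounter counter(String name)
    {
        if (!isMetricEnabled(name))
            return NoopSketchCounter.INSTANCE; // nothing is registered or published
        return counters.computeIfAbsent(name, n -> new LongAdderSketchCounter());
    }
}

final class SketchUsage
{
    // Specialized approach: skip the expensive calculation itself when the
    // metric is disabled, like an explicit isDebugEnabled() guard in logging.
    static boolean recordIfEnabled(SketchMetricRegistry registry, String name, LongSupplier expensiveValue)
    {
        if (!registry.isMetricEnabled(name))
            return false; // the supplier is never invoked
        registry.counter(name).inc(expensiveValue.getAsLong());
        return true;
    }
}
```

With the generic approach, call sites keep calling inc() unconditionally and pay only for an empty method call on a disabled metric. The specialized guard is worth adding only where computing the recorded value is itself costly.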



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
