[
https://issues.apache.org/jira/browse/SOLR-16273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Smiley resolved SOLR-16273.
---------------------------------
Fix Version/s: 9.3
Resolution: Fixed
Thanks Matthew Biscocho for contributing!
> Prometheus Metric Exporter is very slow when collecting large amounts of
> sample data
> ------------------------------------------------------------------------------------
>
> Key: SOLR-16273
> URL: https://issues.apache.org/jira/browse/SOLR-16273
> Project: Solr
> Issue Type: Improvement
> Components: contrib - prometheus-exporter
> Affects Versions: 8.6.3, 9.0
> Reporter: Fa Ming
> Priority: Critical
> Fix For: 9.3
>
> Time Spent: 1h 40m
> Remaining Estimate: 0h
>
> I have a Solr cluster with 300 collections and use the Prometheus Metric
> Exporter to collect cluster information, but it takes 2 minutes to get the
> data each time. The `jstack` output is as follows:
> {code:}
> "solr-exporter-collectors-1-thread-2" #21 prio=5 os_prio=0 tid=0x00007fcef8009000 nid=0x45208 runnable [0x00007fcf16470000]
>    java.lang.Thread.State: RUNNABLE
>     at io.prometheus.client.Collector$MetricFamilySamples$Sample.equals(Collector.java:95)
>     at java.util.ArrayList.indexOf(ArrayList.java:323)
>     at java.util.ArrayList.contains(ArrayList.java:306)
>     at org.apache.solr.prometheus.collector.MetricSamples.addSampleIfMetricExists(MetricSamples.java:50)
>     at org.apache.solr.prometheus.collector.MetricSamples.addAll(MetricSamples.java:60)
>     at org.apache.solr.prometheus.collector.MetricsCollector.lambda$collect$0(MetricsCollector.java:38)
>     at org.apache.solr.prometheus.collector.MetricsCollector$$Lambda$127/68757342.accept(Unknown Source)
>     at java.util.HashMap.forEach(HashMap.java:1291)
>     at org.apache.solr.prometheus.collector.MetricsCollector.collect(MetricsCollector.java:38)
>     at org.apache.solr.prometheus.collector.SchedulerMetricsCollector.lambda$collectMetrics$0(SchedulerMetricsCollector.java:91)
>     at org.apache.solr.prometheus.collector.SchedulerMetricsCollector$$Lambda$75/817493591.get(Unknown Source)
>     at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
>     at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:212)
>     at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$39/351002168.run(Unknown Source)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:750)
> {code}
>
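> {*}The "contains" method takes 90% of the execution time{*}.
>
> Looking at the MetricSamples.java code, each "sample" is deduplicated before
> being added to "sampleFamily.samples". As the stack trace shows, the check
> goes through ArrayList.contains, which scans the list and calls Sample.equals
> on each element. Once "sampleFamily.samples" reaches 20,000 entries,
> "sampleFamily.samples.contains" becomes very inefficient.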
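
To illustrate the cost described above, here is a minimal, self-contained sketch that contrasts a `List.contains` check before every add with hash-based deduplication. It is not the Solr code and not necessarily the change that was committed for this issue: the `Sample` class, the `SampleDedupSketch` class and both method names are hypothetical stand-ins. The default size is kept small so the demo finishes quickly; pass 20000 as an argument to match the size mentioned in the report.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

public class SampleDedupSketch {

    /** Hypothetical stand-in for a metric sample; not the Prometheus client class. */
    public static final class Sample {
        final String name;
        final List<String> labelValues;

        public Sample(String name, List<String> labelValues) {
            this.name = name;
            this.labelValues = labelValues;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Sample)) return false;
            Sample other = (Sample) o;
            return name.equals(other.name) && labelValues.equals(other.labelValues);
        }

        @Override
        public int hashCode() {
            // Must be consistent with equals for hash-based collections to work.
            return Objects.hash(name, labelValues);
        }
    }

    /** Pattern seen in the stack trace: a linear scan before every add, O(n^2) overall. */
    public static List<Sample> dedupWithList(List<Sample> incoming) {
        List<Sample> result = new ArrayList<>();
        for (Sample s : incoming) {
            if (!result.contains(s)) {
                result.add(s);
            }
        }
        return result;
    }

    /** Assumed alternative: a LinkedHashSet gives expected O(1) lookups and keeps insertion order. */
    public static List<Sample> dedupWithSet(List<Sample> incoming) {
        Set<Sample> seen = new LinkedHashSet<>(incoming);
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        int unique = args.length > 0 ? Integer.parseInt(args[0]) : 5_000;

        // Every sample appears twice, so half of the input is duplicates.
        List<Sample> incoming = new ArrayList<>();
        for (int pass = 0; pass < 2; pass++) {
            for (int i = 0; i < unique; i++) {
                incoming.add(new Sample("metric", Collections.singletonList("core" + i)));
            }
        }

        long t0 = System.nanoTime();
        int listSize = dedupWithList(incoming).size();
        long t1 = System.nanoTime();
        int setSize = dedupWithSet(incoming).size();
        long t2 = System.nanoTime();

        System.out.printf("list: %d samples in %d ms%n", listSize, (t1 - t0) / 1_000_000);
        System.out.printf("set:  %d samples in %d ms%n", setSize, (t2 - t1) / 1_000_000);
    }
}
```

Both methods return the same samples in the same order; only the lookup cost differs. The set-based variant relies on `hashCode` being consistent with `equals`.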