Hi,

Starting with Solr 7.0 all JMX metrics are actually internally driven by the 
metrics API - JMX (or Prometheus) is just a way of exposing them.

I agree that we need more documentation on metrics - contributions are welcome 
:)

Regarding your specific examples (btw. our mailing lists aggressively strip all 
attachments - your graphs didn’t make it):

* time-based counters are expressed in nanoseconds. This is just the unit of
the value, not necessarily its precision. In this specific example,
`ADMIN./admin/collections.totalTime` (and similarly named metrics for all other
request handlers) represents the total elapsed time spent processing requests.
* time-based histograms are expressed in milliseconds where this is indicated
by the “_ms” suffix.
* 1-, 5- and 15-min rates represent an exponentially weighted moving average 
over that time window, expressed in events/second.
* handlerStart is initialised with System.currentTimeMillis() (milliseconds
since the Unix epoch) when this request handler instance is first created.
* details on GC, memory buffer pools, and similar JVM metrics are documented in 
JDK documentation on Management Beans. For example:
https://docs.oracle.com/javase/7/docs/api/java/lang/management/GarbageCollectorMXBean.html?is-external=true
* “A latency of 1mil”: no idea what that is; I don’t think the Solr API uses
this abbreviation anywhere.
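
To make the units above concrete, here is a minimal Python sketch of the conversions you would apply before graphing. It is only an illustration, not part of Solr: the function names are made up, the request counts passed to `mean_request_ms` are hypothetical (it assumes you also scrape a request counter alongside `totalTime`), and the GC `time` values are assumed to be cumulative milliseconds, as the JDK documentation describes for `GarbageCollectorMXBean.getCollectionTime()`.

```python
# Sketch only: unit conversions for the kinds of values quoted in this thread.
# Function names are illustrative and are not part of any Solr API.

NANOS_PER_SECOND = 1_000_000_000
NANOS_PER_MILLI = 1_000_000


def nanos_to_seconds(nanos):
    """Convert a nanosecond counter such as totalTime to seconds."""
    return nanos / NANOS_PER_SECOND


def mean_request_ms(total_ns_prev, total_ns_now, requests_prev, requests_now):
    """Mean time per request in ms between two scrapes.

    Assumes both inputs are cumulative counters. Returns None when no new
    requests were seen (or when a counter was reset by a restart).
    """
    delta_requests = requests_now - requests_prev
    delta_ns = total_ns_now - total_ns_prev
    if delta_requests <= 0 or delta_ns < 0:
        return None
    return delta_ns / delta_requests / NANOS_PER_MILLI


def mean_gc_pause_ms(gc_time_ms, gc_count):
    """Mean collection time in ms, assuming gc_time_ms is cumulative ms."""
    if gc_count <= 0:
        return None
    return gc_time_ms / gc_count


if __name__ == "__main__":
    # Value quoted in the original question.
    print(nanos_to_seconds(6715327903), "seconds spent in /admin/collections")
    # gc.ParNew.time and gc.ParNew.count from the original question.
    print(mean_gc_pause_ms(884173, 16759), "ms per ParNew collection")
```

Since `totalTime` is cumulative, graphing it raw mostly shows a steadily growing line; the difference between two scrapes divided by the difference in request counts is usually the more readable number.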

Hope this helps.

—

Andrzej Białecki

> On 7 Oct 2019, at 13:41, Emir Arnautović <emir.arnauto...@sematext.com> wrote:
> 
> Hi Richard,
> We collect metrics via JMX rather than the API, but I believe the two expose
> the same values (I did not verify this in the code). You can see how we turned
> those metrics into reports/charts, or even use our agent to send data to Prometheus:
> https://github.com/sematext/sematext-agent-integrations/tree/master/solr 
> 
> You can also see some links to Solr metric related blog posts in this repo. 
> If you find out that managing your own monitoring stack is overwhelming, you 
> can try our Solr integration.
> 
> HTH,
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> 
> 
> 
>> On 7 Oct 2019, at 12:40, Richard Goodman <richa...@brandwatch.com> wrote:
>> 
>> Hi there,
>> 
>> I'm currently working on using the prometheus exporter to provide some 
>> detailed insights for our Solr Cloud clusters.
>> 
>> Using the provided template killed our prometheus server, as well as the
>> exporter, due to the size of our clusters (each cluster is around 96 nodes,
>> ~300 collections with 3-way replication and 16 shards), so you can imagine
>> the amount of data that comes through /admin/metrics when it is not
>> filtered down first.
>> 
>> I've begun writing my own template to reduce the amount of data being
>> requested. It's working fine, and I'm starting to build some nice
>> graphs in Grafana.
>> 
>> The only difficulty is that I'm struggling to find decent documentation
>> on the metrics themselves. I have been using these resources:
>> metrics reporting - metrics-api
>> <https://lucene.apache.org/solr/guide/7_7/metrics-reporting.html#metrics-api>
>> and monitoring solr with prometheus and grafana
>> <https://lucene.apache.org/solr/guide/7_7/monitoring-solr-with-prometheus-and-grafana.html>
>> but there is a lack of information on most metrics.
>> 
>> For example:
>> "ADMIN./admin/collections.totalTime":6715327903,
>> I understand this is a counter; however, I'm not sure what unit it is
>> expressed in when displaying it, for example:
>> 
>> 
>> 
>> A latency of 1mil: I'm not sure if this means milliseconds, million, etc.
>> Another example would be the GC metrics:
>>      "gc.ConcurrentMarkSweep.count":7,
>>      "gc.ConcurrentMarkSweep.time":1247,
>>      "gc.ParNew.count":16759,
>>      "gc.ParNew.time":884173,
>> Which when displayed, doesn't give the clearest insight as to what the unit 
>> is:
>> 
>> 
>> If anyone has any advice / guidance, that would be greatly appreciated. If
>> there isn't documentation for the API, then this is also something I'd
>> look into helping to contribute.
>> 
>> Thanks,
>> -- 
>> Richard Goodman
> 
