Re: Metrics API - Documentation

Richard Goodman Tue, 15 Oct 2019 08:33:47 -0700

Many thanks both for your responses, they've been helpful.

@Andrzej - Sorry I wasn't clear about the "A latency of 1mil" example; I
wasn't aware the image wouldn't come through. Following your bullet points
helped me present a better unit of measurement on the axis.

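For anyone reading the archive later, the unit handling described in
Andrzej's bullet points (quoted below) can be sketched roughly as follows.
This is only an illustration: the function names are hypothetical, and it
assumes that the gc.*.time values are passed through unchanged from
GarbageCollectorMXBean.getCollectionTime(), which the JDK documents as
milliseconds. The sample numbers are the ones from my original question.

```python
# Illustrative helpers only; these names are hypothetical, not part of Solr.

NANOS_PER_SECOND = 1_000_000_000
MILLIS_PER_SECOND = 1_000


def nanos_to_seconds(nanos: int) -> float:
    """Convert a nanosecond total (e.g. a handler's totalTime) to seconds."""
    return nanos / NANOS_PER_SECOND


def millis_to_seconds(millis: int) -> float:
    """Convert a millisecond total (assumed unit of gc.*.time) to seconds."""
    return millis / MILLIS_PER_SECOND


def mean_gc_time_ms(total_time_ms: int, count: int) -> float:
    """Average milliseconds per collection; 0.0 when no collections ran."""
    return total_time_ms / count if count else 0.0


# Sample values copied from the original question in this thread.
sample = {
    "ADMIN./admin/collections.totalTime": 6715327903,
    "gc.ParNew.count": 16759,
    "gc.ParNew.time": 884173,
}

total_seconds = nanos_to_seconds(sample["ADMIN./admin/collections.totalTime"])
gc_seconds = millis_to_seconds(sample["gc.ParNew.time"])
gc_mean_ms = mean_gc_time_ms(sample["gc.ParNew.time"], sample["gc.ParNew.count"])

print(f"collections handler total time: {total_seconds:.3f} s")
print(f"ParNew accumulated time: {gc_seconds:.3f} s")
print(f"ParNew mean time per collection: {gc_mean_ms:.1f} ms")
```

With the sample values this works out to roughly 6.7 seconds of total
handler time and about 53 ms per ParNew collection.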

With regard to contributing, I would absolutely love to help; I'm just not
sure what the correct direction is. Are the web page source code and
contributions kept in the apache-lucene repository?

Thanks,


On Tue, 8 Oct 2019 at 11:04, Andrzej Białecki <a...@getopt.org> wrote:

> Hi,
>
> Starting with Solr 7.0 all JMX metrics are actually internally driven by
> the metrics API - JMX (or Prometheus) is just a way of exposing them.
>
> I agree that we need more documentation on metrics - contributions are
> welcome :)
>
> Regarding your specific examples (btw. our mailing lists aggressively
> strip all attachments - your graphs didn’t make it):
>
> * time units in time-based counters are in nanoseconds. This is just a
> unit of value, not necessarily precision. In this specific example
> `ADMIN./admin/collections.totalTime` (and similarly named metrics for all
> other request handlers) represents the total elapsed time spent processing
> requests.
> * time-based histograms are expressed in milliseconds, where it is
> indicated by the “_ms” suffix.
> * 1-, 5- and 15-min rates represent an exponentially weighted moving
> average over that time window, expressed in events/second.
> * handlerStart is initialised with System.currentTimeMillis() when this
> instance of request handler is first created.
> * details on GC, memory buffer pools, and similar JVM metrics are
> documented in JDK documentation on Management Beans. For example:
>
> https://docs.oracle.com/javase/7/docs/api/java/lang/management/GarbageCollectorMXBean.html?is-external=true
> * “A latency of 1mil” - no idea what that is, I don’t think Solr API uses
> this abbreviation anywhere.
>
> Hope this helps.
>
> —
>
> Andrzej Białecki
>
> > On 7 Oct 2019, at 13:41, Emir Arnautović <emir.arnauto...@sematext.com> wrote:
> >
> > Hi Richard,
> > We do not use the API to collect metrics; we use JMX, but I believe the
> > two expose the same metrics (I did not verify this in the code). You can
> > see how we mapped those metrics to reports/charts, or even use our agent
> > to send data to Prometheus:
> > https://github.com/sematext/sematext-agent-integrations/tree/master/solr
> >
> > You can also see some links to Solr metric related blog posts in this
> > repo. If you find out that managing your own monitoring stack is
> > overwhelming, you can try our Solr integration.
> >
> > HTH,
> > Emir
> > --
> > Monitoring - Log Management - Alerting - Anomaly Detection
> > Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> >
> >
> >
> >> On 7 Oct 2019, at 12:40, Richard Goodman <richa...@brandwatch.com> wrote:
> >>
> >> Hi there,
> >>
> >> I'm currently working on using the Prometheus exporter to provide some
> >> detailed insights for our Solr Cloud clusters.
> >>
> >> Using the provided template killed both our Prometheus server and the
> >> exporter because of the size of our clusters (each cluster is around 96
> >> nodes, ~300 collections with 3-way replication and 16 shards), so you
> >> can imagine the amount of data that comes through /admin/metrics when
> >> it is not filtered down first.
> >>
> >> I've begun writing my own template to reduce the amount of data being
> >> requested, and it's working fine; I'm starting to build some nice
> >> graphs in Grafana.
> >>
> >> The only difficulty I'm having is finding decent documentation on the
> >> metrics themselves. I was using the resources
> >> metrics reporting - metrics-api
> >> <https://lucene.apache.org/solr/guide/7_7/metrics-reporting.html#metrics-api>
> >> and monitoring solr with prometheus and grafana
> >> <https://lucene.apache.org/solr/guide/7_7/monitoring-solr-with-prometheus-and-grafana.html>,
> >> but there is a lack of information on most metrics.
> >>
> >> For example:
> >> "ADMIN./admin/collections.totalTime":6715327903,
> >> I understand this is a counter; however, I'm not sure what unit it
> >> would be represented in when displaying it, for example:
> >>
> >>
> >>
> >> A latency of 1mil, not sure if this means milliseconds, million, etc.
> >> Another example would be the GC metrics:
> >>      "gc.ConcurrentMarkSweep.count":7,
> >>      "gc.ConcurrentMarkSweep.time":1247,
> >>      "gc.ParNew.count":16759,
> >>      "gc.ParNew.time":884173,
> >> Which, when displayed, doesn't give the clearest insight as to what the
> >> unit is:
> >>
> >>
> >> If anyone has any advice / guidance, that would be greatly appreciated.
> >> If there isn't documentation for the API, then this is also something
> >> I'd look into helping to contribute.
> >>
> >> Thanks,
> >> --
> >> Richard Goodman
> >
>
>

-- 

Richard Goodman    |    Data Infrastructure engineer

richa...@brandwatch.com


NEW YORK   | BOSTON   | BRIGHTON   | LONDON   | BERLIN |   STUTTGART |
PARIS   | SINGAPORE | SYDNEY

<https://www.brandwatch.com/blog/digital-consumer-intelligence/>
