Thanks Benno, this has come up before, mainly in the context of reducing
the cost of computing / serving / processing large numbers of metrics.

However, in that use case, a single prefix wasn't sufficient because the
user would be interested in the subset of the metrics that they're using
for graphs and alerts, and these probably will not share a non-empty
prefix. So, the thinking was that multiple prefixes would be needed (which
doesn't work in the case of path parameters). The thinking was also to
avoid the alternative of wildcard patterns to start with (e.g.
/master/frameworks/*/tasks_running).
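
To make the multiple-prefix idea concrete, here is a minimal sketch in
Python (not libprocess code). It assumes the `prefix` query parameter from
your alternative 1 may simply be repeated; that repetition, the helper name
`filter_metrics`, the sample values, and the `allocator/example_gauge`
metric are all made up for illustration.

```python
from urllib.parse import urlsplit, parse_qs


def filter_metrics(snapshot, url):
    """Return the metrics whose key starts with any requested prefix.

    Hypothetical helper: assumes `prefix` may appear several times in the
    query string, e.g. ?prefix=master/tasks&prefix=allocator/
    """
    query = parse_qs(urlsplit(url).query)
    # Drop leading slashes so "/master/cpu" matches keys like "master/cpu...".
    prefixes = tuple(p.lstrip("/") for p in query.get("prefix", []))
    if not prefixes:
        return dict(snapshot)  # no filter requested: return everything
    # str.startswith accepts a tuple and matches if any prefix matches.
    return {k: v for k, v in snapshot.items() if k.startswith(prefixes)}


# Sample data; the values and the allocator key are invented.
snapshot = {
    "master/tasks_killed": 3,
    "master/messages_launch_tasks": 12,
    "allocator/example_gauge": 0,
}
url = "/metrics/snapshot?prefix=master/tasks&prefix=allocator/"
print(filter_metrics(snapshot, url))
```

The two requested prefixes share no common non-empty prefix, which is the
case a single path parameter can't express.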

To give some context on why "/snapshot" is there: originally when the
metrics library was implemented, it was envisioned that there might be
multiple endpoints to read the data (e.g. "/snapshot" for current values,
"/history" for historical timeseries, etc). In retrospect I don't
think we will support anything other than "give me the current
values", so attempting to get rid of the "/snapshot" suffix sounds good.
But, this is orthogonal to whether a prefix path parameter or query
parameter is added, no?

On Thu, Mar 14, 2019 at 10:03 PM Benno Evers <bev...@mesosphere.com> wrote:

> Hi all,
>
> while this proposal/idea is a very small change code-wise, it would be
> employing libprocess HTTP routing logic in an afaik unprecedented way, so I
> wanted to open this up for discussion.
>
> # Motivation
>
> Currently, the only way to access libprocess metrics is via the
> `metrics/snapshot` endpoint, which returns the current values of all
> installed metrics.
>
> If the caller is only interested in a specific metric, or a subset of the
> metrics, this is wasteful in two ways: First the process has to do extra
> work to collect these metrics, and second the caller has to do extra work
> to filter out the unneeded metrics.
>
> # Proposal
> I'm proposing to allow the `/metrics/` endpoint to be followed by
> an arbitrary path. The returned JSON object will contain only
> those metrics whose key begins with the specified path:
>
>     `/metrics` -> Return all metrics
>     `/metrics/master/messages` -> Return all metrics beginning with
> `master/messages`, e.g. `master/messages_launch_tasks`, etc.
>
> A proof of concept implementation can be found here:
> https://reviews.apache.org/r/70211
>
> # Discussion
> The current naming convention for metrics, e.g. `master/tasks_killed`,
> suggests to the casual observer that metrics are stored and accessible in a
> hierarchical manner. Using a prefix filter allows users to filter certain
> parts of the metrics as if they were indeed hierarchical, while still
> allowing libprocess to use a flat namespace for all metric names
> internally.
>
> The method of access, using the url path directly instead of a query
> parameter, is unusual but it has the advantage that, in my observations, it
> matches what people intuitively try to do anyways when they want to access
> a subset of metrics.
>
> One other drawback is that all other routes of the MetricsProcess will
> shadow the corresponding filter value, e.g. right now it would not be
> possible to return all metrics whose names begin with 'snapshot/'.
>
> # Alternatives
> 1) Add a `prefix` parameter to the `snapshot` endpoint, i.e.
>
>     `/metrics/snapshot?prefix=/master/cpu`
>
> This is more in line with how we classically do libprocess endpoints, but
> from a UI perspective it's hard to discover: Many people, including some
> Mesos developers, already have trouble remembering to append `/snapshot` to
> get the metrics, so requiring them to memorize an additional parameter does not
> seem nice.
>
> 2) Move the dynamic prefix under some other endpoint `/values`, i.e.
>
>     `/metrics/values/master/messages`
>
> This has the main disadvantage that /values (with empty filter) and
> /snapshot will return exactly the same data, begging the question why both
> are needed.
>
>
> What do you think? I'm looking forward to hearing your thoughts, ideas, etc.
>
> Best regards,
> --
> Benno Evers
> Software Engineer, Mesosphere
>
