Re: [systemd-devel] Thoughts about storing unit/job statistics

2019-12-05 Thread Philip Withnall
On Thu, 2019-11-28 at 09:32 +0100, Lennart Poettering wrote:
> On Wed, 27.11.19 14:26, Philip Withnall (phi...@tecnocode.co.uk) wrote:
> 
> > > > If I were to implement this as a separate daemon, it would need
> > > > to be active all the time, listening to
> > > > UnitNew/UnitRemoved/JobNew/JobRemoved signals from systemd. That
> > > > seems like a waste of a process. Let’s call this problem 0.
> > > 
> > > This data is already collected and written to the journal anyway
> > > if you turn on the various XYZAccounting= properties for your
> > > services. Then use the invocation ID of the service and the
> > > MESSAGE_ID=ae8f7b866b0347b9af31fe1c80b127c0 match in journalctl
> > > to find these.
> > 
> > We’re interested in the wall clock time that a unit/scope was
> > active, not the CPU time, so I suspect we’d have to add another
> > message along the same lines.
> 
> Just add another structured field to the existing message. The
> message already contains IO/CPU/IP/… stats, hence adding more time
> stats definitely makes sense.

OK, I’ll put a merge request together for that sometime soon, since
that looks easy, useful and self-contained.

> > > A long-standing TODO item in systemd was to have some form of
> > > metrics collector, that may be turned on and that writes a
> > > time-keyed ring buffer of metrics collected per service to disk,
> > > from the data collected via cgroup attributes, bpf and so on. But
> > > so far no one has found the time to do it. It probably should be
> > > decoupled from PID 1 in some form, so that PID 1 only pings it
> > > whenever a new cgroup shall be watched but the collecting/writing
> > > of the data points is done entirely separately from it.
> > 
> > Would the idea with that be that it uses the journal, or not? Is
> > there a task in GitHub for it?
> 
> In my current thinking it would be similar to journald in many ways,
> but not be journald, since the data is differently structured
> (i.e. not keyed by arbitrary fields but keyed by time, just
> time-based ring buffers). The idea would be to maintain ring buffers
> in /run/ and /var/ similar to how journald does it, and have
> "systemd-metricsd" pull at its own pace metrics from the various
> cgroups/bpf tables/… and write them to these buffers. Apps could then
> mmap them and pull the data out either instantly (if they are located
> in /run/) or after substantial latency (if they are located in /var/)
> depending on the use case.
> 
> Ideally we wouldn't even come up with our own file format for these
> ring buffers, and just use what is already established, but afaiu
> there's no established standard for time series ring buffer files so
> far, hence I figure we need to come up with our own. I mean, after
> all the intention is not to process this data ourselves but have
> other tools do that.
> 
> There's #10229.

Thanks for the info and reference. I’ll continue to ponder which
approach I’ll go with, based on the time/effort required to solve the
immediate problem for the desktop parental controls feature (which is
quite a small subset of what a full ring buffer store of unit
statistics would provide).

Ta,
Philip


Re: [systemd-devel] Thoughts about storing unit/job statistics

2019-12-02 Thread Dan Nicholson
On Thu, Nov 28, 2019 at 1:32 AM Lennart Poettering wrote:
>
> Ideally we wouldn't even come up with our own file format for these
> ring buffers, and just use what is already established, but afaiu
> there's no established standard for time series ring buffer files so
> far, hence I figure we need to come up with our own. I mean, after all
> the intention is not to process this data ourselves but have other
> tools do that.

A classic and still widely used format for this is an RRD file as
popularized by RRDtool. You can see some description at
https://oss.oetiker.ch/rrdtool/tut/rrdtutorial.en.html and
https://en.wikipedia.org/wiki/RRDtool.

Another one we came across at Endless was the Whisper database used by
Graphite. It's a fixed-size database format similar in design to RRD. See
https://graphite.readthedocs.io/en/latest/whisper.html.

A more recent and popular tool is Prometheus. See
https://prometheus.io/docs/prometheus/latest/storage/ for how it
stores data on disk. It appears that the actual on-disk format is
documented at
https://github.com/prometheus/prometheus/blob/master/tsdb/docs/format/README.md.
Alternatively, you could potentially set up systemd as a Prometheus
exporter (https://prometheus.io/docs/instrumenting/exporters/) that
Prometheus pulls from. That obviously makes systemd metrics not as
usable out of the box, and more suited to server usage than desktop
usage.
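
For a rough idea of what that means, an exporter just serves plain-text
lines like these over HTTP for Prometheus to scrape (the metric and
label names here are invented for illustration):

    # TYPE systemd_unit_cpu_seconds_total counter
    systemd_unit_cpu_seconds_total{unit="foo.service"} 42.7
    # TYPE systemd_unit_ip_ingress_bytes_total counter
    systemd_unit_ip_ingress_bytes_total{unit="foo.service"} 1048576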

Re: [systemd-devel] Thoughts about storing unit/job statistics

2019-11-28 Thread Lennart Poettering
On Wed, 27.11.19 14:26, Philip Withnall (phi...@tecnocode.co.uk) wrote:

> > > If I were to implement this as a separate daemon, it would need
> > > to be active all the time, listening to
> > > UnitNew/UnitRemoved/JobNew/JobRemoved signals from systemd. That
> > > seems like a waste of a process. Let’s call this problem 0.
> >
> > This data is already collected and written to the journal anyway if
> > you turn on the various XYZAccounting= properties for your
> > services. Then use the invocation ID of the service and the
> > MESSAGE_ID=ae8f7b866b0347b9af31fe1c80b127c0 match in journalctl to
> > find these.
>
> We’re interested in the wall clock time that a unit/scope was active,
> not the CPU time, so I suspect we’d have to add another message along
> the same lines.

Just add another structured field to the existing message. The message
already contains IO/CPU/IP/… stats, hence adding more time stats
definitely makes sense.
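
For illustration, the record in question looks roughly like the
following (apart from MESSAGE_ID the field names below are indicative
only, check the actual message definition in the systemd sources), and
the proposal is simply to attach one more field to it:

    MESSAGE_ID=ae8f7b866b0347b9af31fe1c80b127c0
    UNIT=foo.service
    INVOCATION_ID=…
    CPU_USAGE_NSEC=…
    WALL_CLOCK_USEC=…   (a hypothetical new field for wall-clock active time)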

> > > One approach would be to store this data in the journal, but
> > > (problems 1-3):
> > >  1. We can’t control how long the journal data is around for, or
> > > even if it’s set to persist.
> >
> > You can pull the data from the journal at your own pace, always
> > keeping the cursor you last read from around so that you don't lose
> > messages.
>
> Yes and no. The distro or admin could set the journal up to be non-
> persistent, in which case we’d need to pull the data from it before
> `systemd-journald` stops. That could work as long as we could make sure
> the pulls happen at all the right times.

Well, if people want to shoot themselves in the foot they can of
course, not sure why you should care...

> > A long-standing TODO item in systemd was to have some form of
> > metrics collector, that may be turned on and that writes a
> > time-keyed ring buffer of metrics collected per service to disk,
> > from the data collected via cgroup attributes, bpf and so on. But
> > so far no one has found the time to do it. It probably should be
> > decoupled from PID 1 in some form, so that PID 1 only pings it
> > whenever a new cgroup shall be watched but the collecting/writing
> > of the data points is done entirely separately from it.
>
> Would the idea with that be that it uses the journal, or not? Is there
> a task in GitHub for it?

In my current thinking it would be similar to journald in many ways,
but not be journald, since the data is differently structured
(i.e. not keyed by arbitrary fields but keyed by time, just time-based
ring buffers). The idea would be to maintain ring buffers in /run/ and
/var/ similar to how journald does it, and have "systemd-metricsd"
pull at its own pace metrics from the various cgroups/bpf tables/… and
write them to these buffers. Apps could then mmap them and pull the
data out either instantly (if they are located in /run/) or after
substantial latency (if they are located in /var/) depending on the
use case.
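
Purely as a sketch of what such a mmap-able, time-keyed ring buffer
could look like (nothing below exists yet; every name is made up for
illustration):

    /* Hypothetical layout of a per-unit metrics ring buffer, e.g.
     * /run/systemd/metrics/foo.service.ring. A reader mmap()s the file
     * read-only and walks backwards from write_index. */
    #include <stdint.h>

    struct metrics_sample {
            uint64_t realtime_usec;     /* CLOCK_REALTIME timestamp of this sample */
            uint64_t cpu_usage_nsec;    /* cumulative CPU time of the cgroup */
            uint64_t ip_ingress_bytes;  /* cumulative network counters */
            uint64_t ip_egress_bytes;
    };

    struct metrics_ring {
            uint8_t  signature[8];            /* magic + format version */
            uint64_t n_slots;                 /* fixed number of sample slots */
            uint64_t write_index;             /* next slot to overwrite, modulo n_slots */
            struct metrics_sample samples[];  /* the ring itself */
    };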

Ideally we wouldn't even come up with our own file format for these
ring buffers, and just use what is already established, but afaiu
there's no established standard for time series ring buffer files so
far, hence I figure we need to come up with our own. I mean, after all
the intention is not to process this data ourselves but have other
tools do that.

There's #10229.

Lennart

--
Lennart Poettering, Berlin

Re: [systemd-devel] Thoughts about storing unit/job statistics

2019-11-27 Thread Philip Withnall
Hey,

On Tue, 2019-11-19 at 10:26 +0100, Lennart Poettering wrote:
> On Fri, 08.11.19 11:15, Philip Withnall (phi...@tecnocode.co.uk) wrote:
> 
> > Hello all,
> > 
> > As part of work on a GNOME feature for monitoring how often the
> > user uses applications (for example, to let them know that they
> > spent 4 hours in the Slack app, or 17 hours playing games), I’m
> > trying to work out the best way to store data like that.
> >
> > If we assume the system is using systemd user sessions, then an
> > application being run is actually a unit being started, and so the
> > data we want to store is actually the duration of each unit run.
> >
> > A related issue is that of storing network usage data per unit, to
> > allow the user to see which apps have been using the most data over
> > the last (say) month.
> 
> This data can already be tracked by systemd for you, just set
> IPAccounting=yes for the service. Alas this only works for system
> services currently, since the bpf/cgroup2 logic this relies on is
> only accessible to privileged processes. (Fixing that would probably
> mean introducing a tiny privileged service that takes a cgroup fd as
> input, and then installs the correct bpf program and bpf table into
> it, returning the fds for these two objects. Happy to take a patch
> for that.)

Yes, the plan was already to use IPAccounting=yes, although I hadn’t
realised it only worked for system services. That’s good to know, and
will be a bit of a stumbling block we’ll have to fix before adding
support for network data. The usage duration data is what we want to
focus on first, though, so the BPF helper can wait.

> > If I were to implement this as a separate daemon, it would need to
> > be active all the time, listening to
> > UnitNew/UnitRemoved/JobNew/JobRemoved signals from systemd. That
> > seems like a waste of a process. Let’s call this problem 0.
> 
> This data is already collected and written to the journal anyway if
> you turn on the various XYZAccounting= properties for your
> services. Then use the invocation ID of the service and the
> MESSAGE_ID=ae8f7b866b0347b9af31fe1c80b127c0 match in journalctl to
> find these.

We’re interested in the wall clock time that a unit/scope was active,
not the CPU time, so I suspect we’d have to add another message along
the same lines.

> > One approach would be to store this data in the journal, but
> > (problems 1-3):
> >  1. We can’t control how long the journal data is around for, or
> > even if it’s set to persist.
> 
> You can pull the data from the journal at your own pace, always
> keeping the cursor you last read from around so that you don't lose
> messages.

Yes and no. The distro or admin could set the journal up to be non-
persistent, in which case we’d need to pull the data from it before
`systemd-journald` stops. That could work as long as we could make sure
the pulls happen at all the right times.

> > So I have two questions:
> >  1. Does this seem like the kind of functionality which should go
> > into the journal, if it was modified to address problems 1-3 above?
> >  1a. If not, do you have any suggestions for how to implement it so
> > that problem 0 above is not an issue, i.e. we don’t have to keep a
> > daemon running all the time just to record a small chunk of data
> > once every few minutes?
> >  2. Does this seem like a subset of a larger bit of functionality,
> > storing metrics about units and jobs for later analysis, which
> > might be interesting to non-desktop users of systemd?
> 
> A long-standing TODO item in systemd was to have some form of metrics
> collector, that may be turned on and that writes a time-keyed ring
> buffer of metrics collected per service to disk, from the data
> collected via cgroup attributes, bpf and so on. But so far no one has
> found the time to do it. It probably should be decoupled from PID 1
> in some form, so that PID 1 only pings it whenever a new cgroup shall
> be watched but the collecting/writing of the data points is done
> entirely separately from it.

Would the idea with that be that it uses the journal, or not? Is there
a task in GitHub for it?

Philip


Re: [systemd-devel] Thoughts about storing unit/job statistics

2019-11-19 Thread Lennart Poettering
On Fri, 08.11.19 11:15, Philip Withnall (phi...@tecnocode.co.uk) wrote:

> Hello all,
>
> As part of work on a GNOME feature for monitoring how often the user
> uses applications (for example, to let them know that they spent 4
> hours in the Slack app, or 17 hours playing games), I’m trying to work
> out the best way to store data like that.
>
> If we assume the system is using systemd user sessions, then an
> application being run is actually a unit being started, and so the data
> we want to store is actually the duration of each unit run.
>
> A related issue is that of storing network usage data per unit, to
> allow the user to see which apps have been using the most data over the
> last (say) month.

This data can already be tracked by systemd for you, just set
IPAccounting=yes for the service. Alas this only works for system
services currently, since the bpf/cgroup2 logic this relies on is only
accessible to privileged processes. (Fixing that would probably mean
introducing a tiny privileged service that takes a cgroup fd as input,
and then installs the correct bpf program and bpf table into it,
returning the fds for these two objects. Happy to take a patch for
that.)
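
(For completeness, turning the accounting on is just a matter of the
per-unit accounting directives, e.g. via a drop-in; the unit name and
path below are only an example:)

    # /etc/systemd/system/foo.service.d/accounting.conf
    [Service]
    CPUAccounting=yes
    MemoryAccounting=yes
    IOAccounting=yes
    IPAccounting=yes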

> If I were to implement this as a separate daemon, it would need to be
> active all the time, listening to UnitNew/UnitRemoved/JobNew/JobRemoved
> signals from systemd. That seems like a waste of a process. Let’s call
> this problem 0.

This data is already collected and written to the journal anyway if
you turn on the various XYZAccounting= properties for your
services. Then use the invocation ID of the service and the
MESSAGE_ID=ae8f7b866b0347b9af31fe1c80b127c0 match in journalctl to
find these.
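
For example (the exact set of fields on that record is best checked
with -o verbose on a live system, and it can be narrowed further per
unit if needed):

    journalctl MESSAGE_ID=ae8f7b866b0347b9af31fe1c80b127c0 -o verbose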

> One approach would be to store this data in the journal, but (problems
> 1-3):
>  1. We can’t control how long the journal data is around for, or even
> if it’s set to persist.

You can pull the data from the journal at your own pace, always
keeping the cursor you last read from around so that you don't lose
messages.
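
For example, journalctl can even keep that cursor for you between
invocations (the state file path here is made up):

    journalctl MESSAGE_ID=ae8f7b866b0347b9af31fe1c80b127c0 \
        --cursor-file=/var/lib/my-collector/journal.cursor -o json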

>  2. This data couldn’t be stored separately (for example, in a separate
> journal file) so that the user could delete it all together and
> separately from the rest of the journal. (To reset their usage data.)
>  3. Querying it from the journal would mean filtering and iterating
> through everything else in the journal, which is not going to be the
> fastest (although it probably wouldn’t be too bad, and we would be
> appending a lot more often than we would be querying).

Well, the format is indexed. It shouldn't be that bad.

> So I have two questions:
>  1. Does this seem like the kind of functionality which should go into
> the journal, if it was modified to address problems 1-3 above?
>  1a. If not, do you have any suggestions for how to implement it so
> that problem 0 above is not an issue, i.e. we don’t have to keep a
> daemon running all the time just to record a small chunk of data once
> every few minutes?
>  2. Does this seem like a subset of a larger bit of functionality,
> storing metrics about units and jobs for later analysis, which might be
> interesting to non-desktop users of systemd?

A long-standing TODO item in systemd was to have some form of metrics
collector, that may be turned on and that writes a time-keyed ring
buffer of metrics collected per service to disk, from the data
collected via cgroup attributes, bpf and so on. But so far no one has
found the time to do it. It probably should be decoupled from PID 1 in
some form, so that PID 1 only pings it whenever a new cgroup shall be
watched but the collecting/writing of the data points is done entirely
separately from it.

Lennart

--
Lennart Poettering, Berlin

[systemd-devel] Thoughts about storing unit/job statistics

2019-11-08 Thread Philip Withnall
Hello all,

As part of work on a GNOME feature for monitoring how often the user
uses applications (for example, to let them know that they spent 4
hours in the Slack app, or 17 hours playing games), I’m trying to work
out the best way to store data like that.

If we assume the system is using systemd user sessions, then an
application being run is actually a unit being started, and so the data
we want to store is actually the duration of each unit run.

A related issue is that of storing network usage data per unit, to
allow the user to see which apps have been using the most data over the
last (say) month.

If I were to implement this as a separate daemon, it would need to be
active all the time, listening to UnitNew/UnitRemoved/JobNew/JobRemoved
signals from systemd. That seems like a waste of a process. Let’s call
this problem 0.

One approach would be to store this data in the journal, but (problems
1-3):
 1. We can’t control how long the journal data is around for, or even
if it’s set to persist.
 2. This data couldn’t be stored separately (for example, in a separate
journal file) so that the user could delete it all together and
separately from the rest of the journal. (To reset their usage data.)
 3. Querying it from the journal would mean filtering and iterating
through everything else in the journal, which is not going to be the
fastest (although it probably wouldn’t be too bad, and we would be
appending a lot more often than we would be querying).

So I have two questions:
 1. Does this seem like the kind of functionality which should go into
the journal, if it was modified to address problems 1-3 above?
 1a. If not, do you have any suggestions for how to implement it so
that problem 0 above is not an issue, i.e. we don’t have to keep a
daemon running all the time just to record a small chunk of data once
every few minutes?
 2. Does this seem like a subset of a larger bit of functionality,
storing metrics about units and jobs for later analysis, which might be
interesting to non-desktop users of systemd?

Thanks,
Philip
