Re: [systemd-devel] Thoughts about storing unit/job statistics
On Thu, 2019-11-28 at 09:32 +0100, Lennart Poettering wrote:
> On Mi, 27.11.19 14:26, Philip Withnall (phi...@tecnocode.co.uk) wrote:
> > > > If I were to implement this as a separate daemon, it would need
> > > > to be active all the time, listening to
> > > > UnitNew/UnitRemoved/JobNew/JobRemoved signals from systemd. That
> > > > seems like a waste of a process. Let’s call this problem 0.
> > >
> > > This data is already collected and written to the journal anyway
> > > if you turn on the various XYZAccounting= properties for your
> > > services. Then use the invocation ID of the service and the
> > > MESSAGE_ID=ae8f7b866b0347b9af31fe1c80b127c0 match in journalctl
> > > to find these.
> >
> > We’re interested in the wall clock time that a unit/scope was
> > active, not the CPU time, so I suspect we’d have to add another
> > message along the same lines.
>
> Just add another structured field to the existing message. The
> message already contains IO/CPU/IP/… stats, hence adding more time
> stats definitely makes sense.

OK, I’ll put a merge request together for that sometime soon, since
that looks easy, useful and self-contained.

> > > A long-standing TODO item in systemd was to have some form of
> > > metrics collector, that may be turned on and that writes a
> > > time-keyed ring buffer of metrics collected per service to disk,
> > > from the data collected via cgroup attributes, bpf and so on. But
> > > so far no one has found the time to do it. It probably should be
> > > decoupled from PID 1 in some form, so that PID 1 only pings it
> > > whenever a new cgroup shall be watched but the collecting/writing
> > > of the data points is done entirely separate from it.
> >
> > Would the idea with that be that it uses the journal, or not? Is
> > there a task in GitHub for it?
>
> In my current thinking it would be similar to journald in many ways,
> but not be journald, since the data is differently structured
> (i.e. not keyed by arbitrary fields but keyed by time, just
> time-based ring buffers). The idea would be to maintain ring buffers
> in /run/ and /var/ similar to how journald does it, and have
> "systemd-metricsd" pull at its own pace metrics from the various
> cgroups/bpf tables/… and write them to these buffers. Apps could then
> mmap them and pull the data out either instantly (if they are located
> in /run/) or after substantial latency (if they are located in
> /var/), depending on the use case.
>
> Ideally we wouldn't even come up with our own file format for these
> ring buffers, and just use what is already established, but afaiu
> there's no established standard for time series ring buffer files so
> far, hence I figure we need to come up with our own. I mean, after
> all, the intention is not to process this data ourselves but have
> other tools do that.
>
> There's #10229.

Thanks for the info and reference. I’ll continue to ponder which
approach I’ll go with, based on the time/effort required to solve the
immediate problem for the desktop parental controls feature (which is
quite a small subset of what a full ring buffer store of unit
statistics would provide).

Ta,
Philip

___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] Thoughts about storing unit/job statistics
On Thu, Nov 28, 2019 at 1:32 AM Lennart Poettering wrote:
>
> Ideally we wouldn't even come up with our own file format for these
> ring buffers, and just use what is already established, but afaiu
> there's no established standard for time series ring buffer files so
> far, hence I figure we need to come up with our own. I mean, after all
> the intention is not to process this data ourselves but have other
> tools do that.

A classic and still widely used format for this is an RRD file as
popularized by RRDtool. You can see some description at
https://oss.oetiker.ch/rrdtool/tut/rrdtutorial.en.html and
https://en.wikipedia.org/wiki/RRDtool.

Another one we came across at Endless was a whisper database used by
Graphite. It's a fixed-size database format similar to RRD. See
https://graphite.readthedocs.io/en/latest/whisper.html.

A more recent and popular tool is Prometheus. See
https://prometheus.io/docs/prometheus/latest/storage/ for how it
stores data on disk. It appears that the actual on-disk format is
documented at
https://github.com/prometheus/prometheus/blob/master/tsdb/docs/format/README.md.

Alternatively, you could potentially set up systemd as a Prometheus
exporter (https://prometheus.io/docs/instrumenting/exporters/) that
Prometheus pulls from. That obviously makes systemd metrics not as
usable out of the box, and more suited to server usage than desktop
usage.
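To make the round-robin idea concrete, here is a toy sketch of the fixed-slot scheme that RRD and whisper files are built on (the step, slot count and record layout here are invented for illustration; neither tool's actual on-disk format looks like this):

```python
import struct
import time

# Toy round-robin time-series store: a fixed number of slots, one per
# step interval, so storage never grows. This mimics the *idea* behind
# RRD/whisper files, not their real formats.
STEP = 60                       # seconds covered by one slot
SLOTS = 1440                    # one day of minutely data
RECORD = struct.Struct("<qd")   # (timestamp, value)

def make_archive() -> bytearray:
    # All-zero slots count as empty (timestamp 0 never matches a step).
    return bytearray(RECORD.size * SLOTS)

def update(buf: bytearray, ts: int, value: float) -> None:
    slot = (ts // STEP) % SLOTS  # the timestamp alone decides the slot
    RECORD.pack_into(buf, slot * RECORD.size, ts, value)

def fetch(buf: bytearray, ts: int):
    slot = (ts // STEP) % SLOTS
    stored_ts, value = RECORD.unpack_from(buf, slot * RECORD.size)
    # A slot may hold stale data from a previous wrap-around; check
    # that the stored sample really belongs to the requested step.
    return value if stored_ts // STEP == ts // STEP else None

buf = make_archive()
now = int(time.time())
update(buf, now, 42.0)
print(fetch(buf, now))          # prints 42.0
print(fetch(buf, now - STEP))   # empty slot, prints None
```

Consolidation (averaging old samples into coarser archives) is the part RRD adds on top of this and is omitted here.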
Re: [systemd-devel] Thoughts about storing unit/job statistics
On Mi, 27.11.19 14:26, Philip Withnall (phi...@tecnocode.co.uk) wrote:

> > > If I were to implement this as a separate daemon, it would need to
> > > be active all the time, listening to
> > > UnitNew/UnitRemoved/JobNew/JobRemoved signals from systemd. That
> > > seems like a waste of a process. Let’s call this problem 0.
> >
> > This data is already collected and written to the journal anyway if
> > you turn on the various XYZAccounting= properties for your
> > services. Then use the invocation ID of the service and the
> > MESSAGE_ID=ae8f7b866b0347b9af31fe1c80b127c0 match in journalctl to
> > find these.
>
> We’re interested in the wall clock time that a unit/scope was active,
> not the CPU time, so I suspect we’d have to add another message along
> the same lines.

Just add another structured field to the existing message. The message
already contains IO/CPU/IP/… stats, hence adding more time stats
definitely makes sense.

> > > One approach would be to store this data in the journal, but
> > > (problems 1-3):
> > > 1. We can’t control how long the journal data is around for, or
> > > even if it’s set to persist.
> >
> > You can pull the data from the journal at your own pace, always
> > keeping the cursor you last read from around so that you don't lose
> > messages.
>
> Yes and no. The distro or admin could set the journal up to be
> non-persistent, in which case we’d need to pull the data from it
> before `systemd-journald` stops. That could work as long as we could
> make sure the pulls happen at all the right times.

Well, if people want to shoot themselves in the foot they can of
course; not sure why you should care...

> > A long-standing TODO item in systemd was to have some form of
> > metrics collector, that may be turned on and that writes a
> > time-keyed ring buffer of metrics collected per service to disk,
> > from the data collected via cgroup attributes, bpf and so on. But
> > so far no one has found the time to do it. It probably should be
> > decoupled from PID 1 in some form, so that PID 1 only pings it
> > whenever a new cgroup shall be watched but the collecting/writing
> > of the data points is done entirely separate from it.
>
> Would the idea with that be that it uses the journal, or not? Is
> there a task in GitHub for it?

In my current thinking it would be similar to journald in many ways,
but not be journald, since the data is differently structured (i.e. not
keyed by arbitrary fields but keyed by time, just time-based ring
buffers). The idea would be to maintain ring buffers in /run/ and /var/
similar to how journald does it, and have "systemd-metricsd" pull at
its own pace metrics from the various cgroups/bpf tables/… and write
them to these buffers. Apps could then mmap them and pull the data out
either instantly (if they are located in /run/) or after substantial
latency (if they are located in /var/), depending on the use case.

Ideally we wouldn't even come up with our own file format for these
ring buffers, and just use what is already established, but afaiu
there's no established standard for time series ring buffer files so
far, hence I figure we need to come up with our own. I mean, after all,
the intention is not to process this data ourselves but have other
tools do that.

There's #10229.

Lennart

--
Lennart Poettering, Berlin
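A minimal sketch of what such a time-keyed, mmap-able ring buffer file could look like. The header and record layout here are assumptions made up for illustration, not a proposed systemd format (which, per the email above, does not exist yet):

```python
import mmap
import os
import struct
import tempfile
import time

# Time-keyed ring buffer in a file: a tiny header (next write index,
# capacity) followed by fixed-size (timestamp, value) records. A writer
# appends and wraps around; readers mmap the file and pull samples out.
HEADER = struct.Struct("<QQ")   # (write_index, capacity)
RECORD = struct.Struct("<qd")   # (timestamp, value)
CAPACITY = 8                    # tiny, to show the wrap-around

path = os.path.join(tempfile.mkdtemp(), "metrics.ring")
with open(path, "wb") as f:
    f.write(HEADER.pack(0, CAPACITY))
    f.write(b"\0" * RECORD.size * CAPACITY)

def append(value: float) -> None:
    """Writer side: store one sample, overwriting the oldest when full."""
    with open(path, "r+b") as f:
        mm = mmap.mmap(f.fileno(), 0)
        idx, cap = HEADER.unpack_from(mm, 0)
        off = HEADER.size + (idx % cap) * RECORD.size
        RECORD.pack_into(mm, off, int(time.time()), value)
        HEADER.pack_into(mm, 0, idx + 1, cap)
        mm.close()

def read_all() -> list:
    """Reader side: mmap the file and return samples, oldest first."""
    with open(path, "rb") as f:
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        idx, cap = HEADER.unpack_from(mm, 0)
        first = max(0, idx - cap)   # oldest record still present
        out = []
        for i in range(first, idx):
            off = HEADER.size + (i % cap) * RECORD.size
            out.append(RECORD.unpack_from(mm, off))
        mm.close()
        return out

for v in range(10):   # 10 samples into 8 slots: the oldest two are gone
    append(float(v))
print([v for _, v in read_all()])   # prints [2.0, 3.0, ..., 9.0]
```

A real implementation would additionally need atomicity between the header update and the record write so that concurrent readers never see a torn sample; that concern is skipped here.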
Re: [systemd-devel] Thoughts about storing unit/job statistics
Hey,

On Tue, 2019-11-19 at 10:26 +0100, Lennart Poettering wrote:
> On Fr, 08.11.19 11:15, Philip Withnall (phi...@tecnocode.co.uk)
> wrote:
> > Hello all,
> >
> > As part of work on a GNOME feature for monitoring how often the
> > user uses applications (for example, to let them know that they
> > spent 4 hours in the Slack app, or 17 hours playing games), I’m
> > trying to work out the best way to store data like that.
> >
> > If we assume the system is using systemd user sessions, then an
> > application being run is actually a unit being started, and so the
> > data we want to store is actually the duration of each unit run.
> >
> > A related issue is that of storing network usage data per unit, to
> > allow the user to see which apps have been using the most data over
> > the last (say) month.
>
> This data can already be tracked by systemd for you, just set
> IPAccounting=yes for the service. Alas this only works for system
> services currently, since the bpf/cgroup2 logic this relies on is
> only accessible to privileged processes. (Fixing that would probably
> mean introducing a tiny privileged service that takes a cgroup fd as
> input, and then installs the correct bpf program and bpf table into
> it, returning the fds for these two objects. Happy to take a patch
> for that.)

Yes, the plan was already to use IPAccounting=yes, although I hadn’t
realised it only worked for system services. That’s good to know, and
will be a bit of a stumbling block we’ll have to fix before adding
support for network data. The usage duration data is what we want to
focus on first, though, so the BPF helper can wait.

> > If I were to implement this as a separate daemon, it would need to
> > be active all the time, listening to
> > UnitNew/UnitRemoved/JobNew/JobRemoved signals from systemd. That
> > seems like a waste of a process. Let’s call this problem 0.
>
> This data is already collected and written to the journal anyway if
> you turn on the various XYZAccounting= properties for your
> services. Then use the invocation ID of the service and the
> MESSAGE_ID=ae8f7b866b0347b9af31fe1c80b127c0 match in journalctl to
> find these.

We’re interested in the wall clock time that a unit/scope was active,
not the CPU time, so I suspect we’d have to add another message along
the same lines.

> > One approach would be to store this data in the journal, but
> > (problems 1-3):
> > 1. We can’t control how long the journal data is around for, or
> > even if it’s set to persist.
>
> You can pull the data from the journal at your own pace, always
> keeping the cursor you last read from around so that you don't lose
> messages.

Yes and no. The distro or admin could set the journal up to be
non-persistent, in which case we’d need to pull the data from it before
`systemd-journald` stops. That could work as long as we could make sure
the pulls happen at all the right times.

> > So I have two questions:
> > 1. Does this seem like the kind of functionality which should go
> > into the journal, if it was modified to address problems 1-3 above?
> > 1a. If not, do you have any suggestions for how to implement it so
> > that problem 0 above is not an issue, i.e. we don’t have to keep a
> > daemon running all the time just to record a small chunk of data
> > once every few minutes?
> > 2. Does this seem like a subset of a larger bit of functionality,
> > storing metrics about units and jobs for later analysis, which
> > might be interesting to non-desktop users of systemd?
>
> A long-standing TODO item in systemd was to have some form of metrics
> collector, that may be turned on and that writes a time-keyed ring
> buffer of metrics collected per service to disk, from the data
> collected via cgroup attributes, bpf and so on. But so far no one has
> found the time to do it. It probably should be decoupled from PID 1
> in some form, so that PID 1 only pings it whenever a new cgroup shall
> be watched but the collecting/writing of the data points is done
> entirely separate from it.

Would the idea with that be that it uses the journal, or not? Is there
a task in GitHub for it?

Philip
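For reference, turning on the accounting properties discussed here is a per-service drop-in. The directives are real systemd unit options; the unit name and drop-in path are examples:

```ini
# /etc/systemd/system/example.service.d/accounting.conf (example path)
[Service]
CPUAccounting=yes
IOAccounting=yes
IPAccounting=yes
```

With these set, the per-invocation resource summary (the MESSAGE_ID mentioned above) carries the corresponding fields when the unit stops.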
Re: [systemd-devel] Thoughts about storing unit/job statistics
On Fr, 08.11.19 11:15, Philip Withnall (phi...@tecnocode.co.uk) wrote:

> Hello all,
>
> As part of work on a GNOME feature for monitoring how often the user
> uses applications (for example, to let them know that they spent 4
> hours in the Slack app, or 17 hours playing games), I’m trying to
> work out the best way to store data like that.
>
> If we assume the system is using systemd user sessions, then an
> application being run is actually a unit being started, and so the
> data we want to store is actually the duration of each unit run.
>
> A related issue is that of storing network usage data per unit, to
> allow the user to see which apps have been using the most data over
> the last (say) month.

This data can already be tracked by systemd for you, just set
IPAccounting=yes for the service. Alas this only works for system
services currently, since the bpf/cgroup2 logic this relies on is only
accessible to privileged processes. (Fixing that would probably mean
introducing a tiny privileged service that takes a cgroup fd as input,
and then installs the correct bpf program and bpf table into it,
returning the fds for these two objects. Happy to take a patch for
that.)

> If I were to implement this as a separate daemon, it would need to be
> active all the time, listening to
> UnitNew/UnitRemoved/JobNew/JobRemoved signals from systemd. That
> seems like a waste of a process. Let’s call this problem 0.

This data is already collected and written to the journal anyway if you
turn on the various XYZAccounting= properties for your services. Then
use the invocation ID of the service and the
MESSAGE_ID=ae8f7b866b0347b9af31fe1c80b127c0 match in journalctl to find
these.

> One approach would be to store this data in the journal, but
> (problems 1-3):
> 1. We can’t control how long the journal data is around for, or even
> if it’s set to persist.

You can pull the data from the journal at your own pace, always keeping
the cursor you last read from around so that you don't lose messages.

> 2. This data couldn’t be stored separately (for example, in a
> separate journal file) so that the user could delete it all together
> and separately from the rest of the journal. (To reset their usage
> data.)
> 3. Querying it from the journal would mean filtering and iterating
> through everything else in the journal, which is not going to be the
> fastest (although it probably wouldn’t be too bad, and we would be
> appending a lot more often than we would be querying).

Well, the format is indexed. It shouldn't be that bad.

> So I have two questions:
> 1. Does this seem like the kind of functionality which should go into
> the journal, if it was modified to address problems 1-3 above?
> 1a. If not, do you have any suggestions for how to implement it so
> that problem 0 above is not an issue, i.e. we don’t have to keep a
> daemon running all the time just to record a small chunk of data once
> every few minutes?
> 2. Does this seem like a subset of a larger bit of functionality,
> storing metrics about units and jobs for later analysis, which might
> be interesting to non-desktop users of systemd?

A long-standing TODO item in systemd was to have some form of metrics
collector, that may be turned on and that writes a time-keyed ring
buffer of metrics collected per service to disk, from the data
collected via cgroup attributes, bpf and so on. But so far no one has
found the time to do it. It probably should be decoupled from PID 1 in
some form, so that PID 1 only pings it whenever a new cgroup shall be
watched but the collecting/writing of the data points is done entirely
separate from it.

Lennart

--
Lennart Poettering, Berlin
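The cursor-keeping pattern described here can be sketched generically. In this toy version a byte offset into a plain JSON-lines file stands in for the journal's opaque cursor string; with the real journal, `journalctl --cursor-file=` performs the same bookkeeping:

```python
import json
import os
import tempfile

# Consume new records at our own pace, persisting a "cursor" between
# runs so no records are lost. A byte offset into a JSON-lines file
# stands in for the journal's opaque cursor string.
state_dir = tempfile.mkdtemp()
log_path = os.path.join(state_dir, "journal.jsonl")
cursor_path = os.path.join(state_dir, "cursor")

def append_record(record: dict) -> None:
    with open(log_path, "a") as f:
        f.write(json.dumps(record) + "\n")

def pull_new_records() -> list:
    """Read everything after the stored cursor, then advance it."""
    cursor = 0
    if os.path.exists(cursor_path):
        with open(cursor_path) as cf:
            cursor = int(cf.read())
    records = []
    with open(log_path) as f:
        f.seek(cursor)
        for line in f:
            records.append(json.loads(line))
        cursor = f.tell()
    with open(cursor_path, "w") as cf:   # persist for the next run
        cf.write(str(cursor))
    return records

append_record({"UNIT": "foo.service", "CPU_USAGE_NSEC": 100})
print(len(pull_new_records()))   # prints 1: first pull sees the record
append_record({"UNIT": "bar.service", "CPU_USAGE_NSEC": 200})
print(len(pull_new_records()))   # prints 1: only what's past the cursor
```

The point, as in the email, is that the consumer never re-reads old records and never misses new ones, regardless of how often it runs.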
[systemd-devel] Thoughts about storing unit/job statistics
Hello all,

As part of work on a GNOME feature for monitoring how often the user
uses applications (for example, to let them know that they spent 4
hours in the Slack app, or 17 hours playing games), I’m trying to work
out the best way to store data like that.

If we assume the system is using systemd user sessions, then an
application being run is actually a unit being started, and so the data
we want to store is actually the duration of each unit run.

A related issue is that of storing network usage data per unit, to
allow the user to see which apps have been using the most data over the
last (say) month.

If I were to implement this as a separate daemon, it would need to be
active all the time, listening to UnitNew/UnitRemoved/JobNew/JobRemoved
signals from systemd. That seems like a waste of a process. Let’s call
this problem 0.

One approach would be to store this data in the journal, but (problems
1-3):
1. We can’t control how long the journal data is around for, or even if
it’s set to persist.
2. This data couldn’t be stored separately (for example, in a separate
journal file) so that the user could delete it all together and
separately from the rest of the journal. (To reset their usage data.)
3. Querying it from the journal would mean filtering and iterating
through everything else in the journal, which is not going to be the
fastest (although it probably wouldn’t be too bad, and we would be
appending a lot more often than we would be querying).

So I have two questions:
1. Does this seem like the kind of functionality which should go into
the journal, if it was modified to address problems 1-3 above?
1a. If not, do you have any suggestions for how to implement it so that
problem 0 above is not an issue, i.e. we don’t have to keep a daemon
running all the time just to record a small chunk of data once every
few minutes?
2. Does this seem like a subset of a larger bit of functionality,
storing metrics about units and jobs for later analysis, which might be
interesting to non-desktop users of systemd?

Thanks,
Philip
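The core data being discussed, wall-clock duration per unit run queried over a window, is small; a sketch of the storage and query side (the unit names and the JSON-lines schema are invented for illustration, not GNOME's actual design):

```python
import json
import os
import tempfile
import time

# One record per unit run; query total wall-clock usage per unit over
# a window. Unit names and schema are invented for illustration.
db_path = os.path.join(tempfile.mkdtemp(), "unit-runs.jsonl")

def record_run(unit: str, started: float, stopped: float) -> None:
    with open(db_path, "a") as f:
        f.write(json.dumps({"unit": unit, "started": started,
                            "stopped": stopped}) + "\n")

def usage_since(cutoff: float) -> dict:
    """Seconds of wall-clock activity per unit since `cutoff`."""
    totals: dict = {}
    with open(db_path) as f:
        for line in f:
            run = json.loads(line)
            # Count only the part of the run inside the window.
            start = max(run["started"], cutoff)
            if run["stopped"] > start:
                totals[run["unit"]] = (totals.get(run["unit"], 0.0)
                                       + run["stopped"] - start)
    return totals

now = time.time()
record_run("app-slack.scope", now - 4 * 3600, now)   # 4 h today
record_run("app-game.scope",                         # 1 h, 40 days ago
           now - 40 * 86400, now - 40 * 86400 + 3600)
month_ago = now - 30 * 86400
print(usage_since(month_ago))   # only the Slack run is in the window
```

Clipping each run to the window (rather than filtering whole runs) also handles sessions that straddle the cutoff, which matters for long-running scopes.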