Re: [systemd-devel] journald disk space usage

2017-02-28 Thread Bill Lipa
Thank you for the response.  I was hoping that the metadata would
compress better because it's almost identical between rows in my
application.  99% of the rows are going to be from the same unit.


On Tue, Feb 28, 2017 at 7:56 AM, Lennart Poettering wrote:
>
> The journal generates substantially more data, simply because we
> collect a lot of implicit metadata for each log event. This data is
> usually not compressed (we only compress individually large fields,
> and usually fields are not individually large). The implicit metadata
> means we roughly collect 10x as much data and store that away.
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] journald disk space usage

2017-02-28 Thread Lennart Poettering
On Mon, 27.02.17 16:18, Bill Lipa (d...@masterleep.com) wrote:

> Hello,
> 
> I have a Rails application that produces quite a bit of log output -
> about 500MB per day, maybe 3-4 million lines.  Currently this is going
> into a normal file with daily rotation.
> 
> I tried dumping this into journald via STDOUT so that I could see
> everything in one place.  On a standard Google Cloud Platform
> instance, this used about 10% extra CPU.  I was willing to live with
> that, but more of a problem was the rapid increase in storage used for
> the log.  It was growing at about 10x the rate of a flat file for the
> 2 hours I ran the experiment.  That is, after 2 hours, the usage
> reported by 'sudo journalctl --disk-usage' was over 400MB, which is
> not much less than I would normally see for an entire day's worth of
> logging.
> 
> I am wondering if this is to be expected due to journald's extra
> functionality and complexity, or does this seem incorrect?  I'm using
> systemd 229 on Ubuntu 16.04.

The journal generates substantially more data, simply because we
collect a lot of implicit metadata for each log event. This data is
usually not compressed (we only compress individually large fields,
and usually fields are not individually large). The implicit metadata
means we roughly collect 10x as much data and store that away. This is
easy to verify:

journalctl -n 1000 | wc -c

vs.

journalctl -n 1000 -o verbose | wc -c

The first command outputs the journal data in syslog compatible
format, thus lacking all metadata. The second command uses "verbose"
output mode, which includes all metadata. We output that for the 1000
most recent log events. On my system this yields 101993 and 971434.
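
As a rough check of the 10x figure, the ratio of those two byte counts
can be computed directly. This is only a sketch: the `ratio` helper is
a hypothetical name, not part of journalctl, and the inputs are simply
the two counts quoted above.

```shell
# Hypothetical helper: print verbose_bytes / plain_bytes to two decimals.
ratio() {
    awk -v plain="$1" -v verbose="$2" 'BEGIN { printf "%.2f\n", verbose / plain }'
}

# Byte counts quoted above for the 1000 most recent entries.
ratio 101993 971434    # prints 9.52

# To measure a live system instead (the result will differ per machine):
# ratio "$(journalctl -n 1000 | wc -c)" "$(journalctl -n 1000 -o verbose | wc -c)"
```

The ratio measures text output size, not on-disk journal size, so it
is an approximation of the storage overhead rather than an exact figure.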

If you are not interested in the metadata and systemd's indexing you
can of course turn off journald's storage and use something
non-indexed that carries no metadata, such as rsyslog or so.
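
A minimal sketch of such a setup, assuming a syslog daemon that
listens on the syslog socket (whether rsyslog on a given distribution
does that, or reads the journal files instead, needs to be checked
locally). Both options are documented in journald.conf(5):

```ini
# /etc/systemd/journald.conf (sketch; restart systemd-journald afterwards)
[Journal]
# Keep no journal files; received log data is not stored by journald.
Storage=none
# Still pass messages on to a traditional syslog daemon's socket.
ForwardToSyslog=yes
```

With Storage=none, journalctl has nothing to query, so this trades
the indexed view for the smaller flat-file footprint.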

Also note that beyond the mere metadata we also tend to collect more
data, simply because we also hook into audit, and every service's
stdout/stderr, as well as early boot logging, which syslog
traditionally didn't do at this level.

Lennart

-- 
Lennart Poettering, Red Hat


[systemd-devel] journald disk space usage

2017-02-27 Thread Bill Lipa
Hello,

I have a Rails application that produces quite a bit of log output -
about 500MB per day, maybe 3-4 million lines.  Currently this is going
into a normal file with daily rotation.

I tried dumping this into journald via STDOUT so that I could see
everything in one place.  On a standard Google Cloud Platform
instance, this used about 10% extra CPU.  I was willing to live with
that, but more of a problem was the rapid increase in storage used for
the log.  It was growing at about 10x the rate of a flat file for the
2 hours I ran the experiment.  That is, after 2 hours, the usage
reported by 'sudo journalctl --disk-usage' was over 400MB, which is
not much less than I would normally see for an entire day's worth of
logging.
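
The extrapolation behind that estimate can be sketched as follows. It
assumes a constant logging rate over the day, which real traffic may
not have, and `growth` is a hypothetical helper name. Since the
reported usage was "over 400MB", the result is a lower bound.

```shell
# Hypothetical helper: growth JOURNAL_MB HOURS FLAT_MB_PER_DAY
# Extrapolates journal growth to 24 hours and compares it with the flat file.
growth() {
    awk -v journal_mb="$1" -v hours="$2" -v flat_mb_per_day="$3" 'BEGIN {
        per_day = journal_mb / hours * 24
        printf "%d MB/day, %.1fx the flat file\n", per_day, per_day / flat_mb_per_day
    }'
}

growth 400 2 500    # prints: 4800 MB/day, 9.6x the flat file
```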

I am wondering if this is to be expected due to journald's extra
functionality and complexity, or does this seem incorrect?  I'm using
systemd 229 on Ubuntu 16.04.

Thank you,
Bill Lipa