Hi,

On a relatively average journal it can take a looooooooong time to page through all the data collected.
With data stored from 5th August to 25th September, running "journalctl" and pressing G in less to jump to the end takes several minutes before the end of the messages is reached, with journalctl at 100% CPU for most of that time. This is on a slightly older systemd (approximately the 195 version used in Suse, i.e. with a shitton of backported patches!), but even on my regular-use laptop (with SSD) and logs dating back to Jun 18th it takes ~1m 30s to do the same. That is with something approximating 2.5m lines of output to be paged through (systemd debugging is on!).

With this kind of performance it's kind of a hard sell, although I'm not really sure if I should *expect* any better performance, and I appreciate that restricting the query to specific date ranges or particular services can reduce the amount of data returned and thus speed things up dramatically (there's a sketch of what I mean at the end of this mail).

So I guess my question is: is it basically unrealistic to expect better performance from a simple "just output everything" operation like this? Sadly this is exactly the type of operation a typical user who is used to syslog would try with journalctl, and so they won't see the benefits. Any thoughts on this?

HDD (systemd 195+patches):

[root@marley ~]# du -sh /var/log/journal/
1.5G    /var/log/journal/
[root@marley ~]# date; journalctl | wc -l; date
Wed 25 Sep 11:39:00 BST 2013
1957295
Wed 25 Sep 11:42:16 BST 2013

SSD (systemd 207):

[root@jimmy ~]# du -sh /var/log/journal/
2.0G    /var/log/journal/
[root@jimmy ~]# date; journalctl | wc -l; date
Wed 25 Sep 11:40:18 BST 2013
2391076
Wed 25 Sep 11:42:10 BST 2013

And just for some plain-text comparisons on the older, HDD machine:

[root@marley ~]# date; journalctl >/home/journal; date
Wed 25 Sep 11:50:41 BST 2013
Wed 25 Sep 11:53:59 BST 2013
[root@marley ~]# wc -l /home/journal
1957527 /home/journal
[root@marley ~]# date; cat /home/journal >/dev/null; date
Wed 25 Sep 11:54:49 BST 2013
Wed 25 Sep 11:54:50 BST 2013
[root@marley ~]# date; cat /home/journal | gzip >/home/journal.gz; date
Wed 25 Sep 11:55:23 BST 2013
Wed 25 Sep 11:55:28 BST 2013
[root@marley ~]# date; zcat /home/journal.gz >/dev/null; date
Wed 25 Sep 11:55:50 BST 2013
Wed 25 Sep 11:55:51 BST 2013
[root@marley ~]# date; cat /home/journal | xz >/home/journal.xz; date
Wed 25 Sep 11:56:15 BST 2013
Wed 25 Sep 11:58:12 BST 2013
[root@marley ~]# date; xzcat /home/journal.xz >/dev/null; date
Wed 25 Sep 12:01:25 BST 2013
Wed 25 Sep 12:01:27 BST 2013
[root@marley ~]# ls -lh /home/journal*
-rw-r--r-- 1 root root 244M Sep 25 11:53 /home/journal
-rw-r--r-- 1 root root  17M Sep 25 11:55 /home/journal.gz
-rw-r--r-- 1 root root 9.8M Sep 25 11:58 /home/journal.xz

So: 2 seconds to page through 9.8M of compressed plain-text logs, versus roughly 2 minutes 30 seconds to page through 1.5GB of journal-based logs to produce the same result... (and I know the plain-text files here will be "hot" in the cache, which gives them a slightly unfair advantage, but that would factor into real-world usage too).

Now, of course I do know that in the journalctl case there is more to look at: perhaps some old journals that are analysed but ultimately never used because they are corrupt or something, plus a whole bunch of other data that never makes it into the syslog-style output. But forgetting features and looking at the raw numbers, as I said above, it's a hard sell on the surface!

Is there something wrong here? Are my numbers unrealistic? Is this pointing at a larger problem with my setup/usage?
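For reference, the kind of narrowing I mean above is roughly sketched below. The unit name and dates are made-up examples rather than anything from the boxes above, and I believe -u, --since and --until are all available on both the 195-ish and 207 installs, though I haven't double-checked the older one:

  journalctl -u sshd.service                               (messages from one unit only)
  journalctl --since "2013-09-20" --until "2013-09-25"     (messages from a date range only)
  journalctl -u sshd.service --since "2013-09-20"          (both combined)

Any of these brings the data set down from "everything since August" to something much less painful to page through, but it's not what a syslog-habituated user reaches for first.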
Col

--
Colin Guthrie
gmane(at)colin.guthr.ie
http://colin.guthr.ie/

Day Job:
  Tribalogic Limited http://www.tribalogic.net/
Open Source:
  Mageia Contributor http://www.mageia.org/
  PulseAudio Hacker http://www.pulseaudio.org/
  Trac Hacker http://trac.edgewall.org/