Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-11 Thread Vito Caputo
On Thu, Feb 11, 2021 at 09:19:07AM -0500, Phillip Susi wrote: > > Phillip Susi writes: > > > Wait, what do you mean the inode nr changes? I thought the whole point > > of the block donating thing was that you get a contiguous set of blocks > > in the new file, then transfer those blocks back to

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-11 Thread Phillip Susi
Colin Guthrie writes: > I think the defaults are more complex than just "each journal file can > grow to 128M" no? Not as far as I can see. > I mean there is SystemMaxUse= which defaults to 10% of the partition on > which journal files live (this is for all journal files, not just the >

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-11 Thread Colin Guthrie
Phillip Susi wrote on 11/02/2021 16:29: Colin Guthrie writes: Are those journal files suffixed with a ~. Only ~ suffixed journals represent a dirty journal file (i.e. from an unexpected shutdown). Nope. Journals rotate for other reason too (e.g. user request, overall space requirements

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-11 Thread Phillip Susi
Colin Guthrie writes: > Are those journal files suffixed with a ~. Only ~ suffixed journals > represent a dirty journal file (i.e. from an unexpected shutdown). Nope. > Journals rotate for other reason too (e.g. user request, overall space > requirements etc.) which might explain this

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-11 Thread Colin Guthrie
Phillip Susi wrote on 11/02/2021 14:19: Looking at the archived journals though, I wonder why am I seeing so many unwritten areas? Just the last extent of this file has nearly 4 mb that were never written to. This system has never had an unexpected shutdown. Attached is the extent map. Are

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-11 Thread Phillip Susi
Phillip Susi writes: > Wait, what do you mean the inode nr changes? I thought the whole point > of the block donating thing was that you get a contiguous set of blocks > in the new file, then transfer those blocks back to the old inode so > that the inode number and timestamps of the file don't

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-10 Thread Phillip Susi
Lennart Poettering writes: > inode, and then donate the old blocks over. This means the inode nr > changes, which is something I don't like. Semantically it's only > marginally better than just creating a new file from scratch. Wait, what do you mean the inode nr changes? I thought the whole

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-10 Thread Lennart Poettering
On Di, 09.02.21 10:17, Phillip Susi (ph...@thesusis.net) wrote: > > Chris Murphy writes: > > > And I agree 8MB isn't a big deal. Does anyone complain about journal > > fragmentation on ext4 or xfs? If not, then we come full circle to my > > second email in the thread which is don't defragment

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-10 Thread Lennart Poettering
On Mo, 08.02.21 22:13, Chris Murphy (li...@colorremedies.com) wrote: > On Mon, Feb 8, 2021 at 7:56 AM Phillip Susi wrote: > > > > > > Chris Murphy writes: > > > > >> It sounds like you are arguing that it is better to do the wrong thing > > >> on all SSDs rather than do the right thing on ones

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-09 Thread Phillip Susi
Chris Murphy writes: > And I agree 8MB isn't a big deal. Does anyone complain about journal > fragmentation on ext4 or xfs? If not, then we come full circle to my > second email in the thread which is don't defragment when nodatacow, > only defragment when datacow. Or use BTRFS_IOC_DEFRAG_RANGE

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-08 Thread Chris Murphy
On Mon, Feb 8, 2021 at 8:20 AM Phillip Susi wrote: > > > Chris Murphy writes: > > > I showed that the archived journals have way more fragmentation than > > active journals. And the fragments in active journals are > > insignificant, and can even be reduced by fully allocating the journal > >

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-08 Thread Chris Murphy
On Mon, Feb 8, 2021 at 7:56 AM Phillip Susi wrote: > > > Chris Murphy writes: > > >> It sounds like you are arguing that it is better to do the wrong thing > >> on all SSDs rather than do the right thing on ones that aren't broken. > > > > No I'm suggesting there isn't currently a way to isolate

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-08 Thread Lennart Poettering
On Mo, 08.02.21 10:09, Phillip Susi (ph...@thesusis.net) wrote: > That's a fair point: if btrfs isn't any worse than other filesystems, > then why is it the only one that gets a defrag? As answered elsewhere: 1. only btrfs has a cow mode, where fragmentation is through the roof for randomly

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-08 Thread Lennart Poettering
On Sa, 06.02.21 12:51, Chris Murphy (li...@colorremedies.com) wrote: > The original commit description only mentions COW, it doesn't mention > being predicated on nodatacow. In effect commit > f27a386430cc7a27ebd06899d93310fb3bd4cee7 is obviated by commit >

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-08 Thread Lennart Poettering
On Sa, 06.02.21 19:47, Chris Murphy (li...@colorremedies.com) wrote: > On Fri, Feb 5, 2021 at 8:23 AM Phillip Susi wrote: > > > Chris Murphy writes: > > > > > But it gets worse. The way systemd-journald is submitting the journals > > > for defragmentation is making them more fragmented

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-08 Thread Lennart Poettering
On Fr, 05.02.21 17:44, Chris Murphy (li...@colorremedies.com) wrote: > On Fri, Feb 5, 2021 at 3:55 PM Lennart Poettering > wrote: > > > > On Fr, 05.02.21 20:58, Maksim Fomin (ma...@fomin.one) wrote: > > > > > > You know, we issue the btrfs ioctl, under the assumption that if the > > > > file is

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-08 Thread Phillip Susi
Chris Murphy writes: > I showed that the archived journals have way more fragmentation than > active journals. And the fragments in active journals are > insignificant, and can even be reduced by fully allocating the journal Then clearly this is a problem with btrfs: it absolutely should not

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-08 Thread Phillip Susi
Chris Murphy writes: >> It sounds like you are arguing that it is better to do the wrong thing >> on all SSDs rather than do the right thing on ones that aren't broken. > > No I'm suggesting there isn't currently a way to isolate > defragmentation to just HDDs. Yes, but it sounded like you

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-06 Thread Andrei Borzenkov
06.02.2021 00:33, Phillip Susi wrote: > > Lennart Poettering writes: > >> journalctl gives you one long continuous log stream, joining everything >> available, archived or not into one big interleaved stream. > > If you ask for everything, yes... but if you run journalctl -b then > shouldn't it

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-06 Thread Chris Murphy
On Fri, Feb 5, 2021 at 8:23 AM Phillip Susi wrote: > Chris Murphy writes: > > > But it gets worse. The way systemd-journald is submitting the journals > > for defragmentation is making them more fragmented than just leaving > > them alone. > > Wait, doesn't it just create a new file, fallocate

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-06 Thread Vito Caputo
On Fri, Feb 05, 2021 at 05:44:03PM -0700, Chris Murphy wrote: > On Fri, Feb 5, 2021 at 3:55 PM Lennart Poettering > wrote: > > > > On Fr, 05.02.21 20:58, Maksim Fomin (ma...@fomin.one) wrote: > > > > > > You know, we issue the btrfs ioctl, under the assumption that if the > > > > file is already

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-06 Thread Chris Murphy
More data points. 1. An ext4 file system with a 112M system.journal, it has 15 extents. From FIEMAP we can pretty much see it's really made from 14 8MB extents, consistent with multiple appends. And it's the exact same behavior seen on Btrfs with nodatacow journals.
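
The figures in the snippet above can be checked with a little arithmetic. The sketch below only restates the numbers quoted in the message (a 112M journal, 8MB extents); the variable names are illustrative and do not come from the thread or from systemd.

```python
# Figures quoted in the message: a 112M journal made of 8MB extents.
JOURNAL_MB = 112
EXTENT_MB = 8

# How many whole 8MB extents fit, and what is left over.
full_extents = JOURNAL_MB // EXTENT_MB
remainder_mb = JOURNAL_MB % EXTENT_MB

print(f"{full_extents} extents of {EXTENT_MB}MB, {remainder_mb}MB left over")
```

Fourteen 8MB extents account for the whole 112M file, which matches the "14 8MB extents" reading of the FIEMAP output and the pattern of repeated fixed-size appends.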

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-05 Thread Chris Murphy
On Fri, Feb 5, 2021 at 3:55 PM Lennart Poettering wrote: > > On Fr, 05.02.21 20:58, Maksim Fomin (ma...@fomin.one) wrote: > > > > You know, we issue the btrfs ioctl, under the assumption that if the > > > file is already perfectly defragmented it's a NOP. Are you suggesting > > > it isn't a NOP

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-05 Thread Lennart Poettering
On Fr, 05.02.21 20:58, Maksim Fomin (ma...@fomin.one) wrote: > > You know, we issue the btrfs ioctl, under the assumption that if the > > file is already perfectly defragmented it's a NOP. Are you suggesting > > it isn't a NOP in that case? > > So, what is the reason for defragmenting journal is

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-05 Thread Lennart Poettering
On Fr, 05.02.21 16:16, Phillip Susi (ph...@thesusis.net) wrote: > > Lennart Poettering writes: > > > Nope. We always interleave stuff. We currently open all journal files > > in parallel. The system one and the per-user ones, the current ones > > and the archived ones. > > Wait... every time you

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-05 Thread Phillip Susi
Lennart Poettering writes: > journalctl gives you one long continuous log stream, joining everything > available, archived or not into one big interleaved stream. If you ask for everything, yes... but if you run journalctl -b then shouldn't it only read back until it finds the start of the

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-05 Thread Lennart Poettering
On Fr, 05.02.21 20:43, Dave Howorth (syst...@howorth.org.uk) wrote: > 128 MB files, and I might allocate an extra MB or two for overhead, I > don't know. So when it first starts there'll be 128 MB allocated and > 384 MB free. In stable state there'll be 512 MB allocated and nothing > free. One

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-05 Thread Phillip Susi
Maksim Fomin writes: > I would say it depends on whether defragmentation issues are feature > of btrfs. As Chris mentioned, if root fs is snapshotted, > 'defragmenting' the journal can actually increase fragmentation. This > is an example when the problem is caused by a feature (not a bug) in >

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-05 Thread Phillip Susi
Dave Howorth writes: > PS I'm subscribed to the list. I don't need a copy. FYI, rather than ask others to go out of their way when replying to you, you should configure your mail client to set the Reply-To: header to point to the mailing list address so that other people's mail clients do what

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-05 Thread Phillip Susi
Lennart Poettering writes: > Nope. We always interleave stuff. We currently open all journal files > in parallel. The system one and the per-user ones, the current ones > and the archived ones. Wait... every time you look at the journal at all, it has to read back through ALL of the archived

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-05 Thread Maksim Fomin
‐‐‐ Original Message ‐‐‐ On Friday, February 5, 2021 3:23 PM, Lennart Poettering wrote: > On Do, 04.02.21 12:51, Chris Murphy (li...@colorremedies.com) wrote: > > > On Thu, Feb 4, 2021 at 6:49 AM Lennart Poettering > > lenn...@poettering.net wrote: > > > > > You want to optimize write

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-05 Thread Dave Howorth
On Fri, 5 Feb 2021 17:44:14 +0100 Lennart Poettering wrote: > On Fr, 05.02.21 16:06, Dave Howorth (syst...@howorth.org.uk) wrote: > > > On Fri, 5 Feb 2021 16:23:02 +0100 > > Lennart Poettering wrote: > > > I don't think that makes much sense: we rotate and start new > > > files for a

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-05 Thread Lennart Poettering
On Fr, 05.02.21 16:06, Dave Howorth (syst...@howorth.org.uk) wrote: > On Fri, 5 Feb 2021 16:23:02 +0100 > Lennart Poettering wrote: > > I don't think that makes much sense: we rotate and start new files for > > a multitude of reasons, such as size overrun, time jumps, abnormal > > shutdown and

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-05 Thread Dave Howorth
On Fri, 5 Feb 2021 16:23:02 +0100 Lennart Poettering wrote: > I don't think that makes much sense: we rotate and start new files for > a multitude of reasons, such as size overrun, time jumps, abnormal > shutdown and so on. If we'd always leave a fully allocated file around > people would hate

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-05 Thread Lennart Poettering
On Fr, 05.02.21 10:24, Phillip Susi (ph...@thesusis.net) wrote: > > Lennart Poettering writes: > > > You are focussing only on the one-time iops generated during archival, > > and are ignoring the extra latency during access that fragmented files > > cost. Show me that the iops reduction during

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-05 Thread Phillip Susi
Lennart Poettering writes: > You are focussing only on the one-time iops generated during archival, > and are ignoring the extra latency during access that fragmented files > cost. Show me that the iops reduction during the one-time operation > matters and the extra latency during access

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-05 Thread Phillip Susi
Chris Murphy writes: > But it gets worse. The way systemd-journald is submitting the journals > for defragmentation is making them more fragmented than just leaving > them alone. Wait, doesn't it just create a new file, fallocate the whole thing, copy the contents, and delete the original?
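
For readers unfamiliar with the copy-based approach the question above describes (create a new file, fallocate it, copy the contents, replace the original), here is a minimal userspace sketch. It illustrates that idea only; it is not what journald or the btrfs defrag ioctl does, and the function name `rewrite_contiguously` is hypothetical. Whether the preallocated blocks end up contiguous is up to the filesystem.

```python
import os
import shutil
import tempfile


def rewrite_contiguously(path):
    """Copy `path` into a freshly preallocated file, then replace the original.

    Hypothetical helper for illustration; returns the file size in bytes.
    """
    size = os.path.getsize(path)
    directory = os.path.dirname(os.path.abspath(path))
    # Create the new file on the same filesystem so os.replace() is a rename.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as dst, open(path, "rb") as src:
            if size > 0:
                # Ask the filesystem to reserve the full size up front.
                os.posix_fallocate(dst.fileno(), 0, size)
            shutil.copyfileobj(src, dst)
            dst.flush()
            os.fsync(dst.fileno())
        # Atomically swap the new file in place of the old one.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return size
```

Note that the replaced file is a new inode with new timestamps, which is the drawback raised elsewhere in the thread about approaches that do not keep the original inode.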

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-05 Thread Lennart Poettering
On Do, 04.02.21 12:51, Chris Murphy (li...@colorremedies.com) wrote: > On Thu, Feb 4, 2021 at 6:49 AM Lennart Poettering > wrote: > > > You want to optimize write patterns I understand, i.e. minimize > > iops. Hence start with profiling iops, i.e. what defrag actually costs > > and then weigh

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-04 Thread Chris Murphy
On Thu, Feb 4, 2021 at 6:49 AM Lennart Poettering wrote: > You want to optimize write patterns I understand, i.e. minimize > iops. Hence start with profiling iops, i.e. what defrag actually costs > and then weigh that against the reduced access time when accessing the > files. In particular on

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-04 Thread Phillip Susi
Lennart Poettering writes: > Well, at least on my system here there are still like 20 fragments per > file. That's not nothin? In a 100 mb file? It could be better, but I very much doubt you're going to notice a difference after defragmenting that. I may be the nut that rescued the old ext2

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-04 Thread Lennart Poettering
On Mi, 03.02.21 23:11, Chris Murphy (li...@colorremedies.com) wrote: > On Wed, Feb 3, 2021 at 9:46 AM Lennart Poettering > wrote: > > > > Performance is terrible if cow is used on journal files while we write > > them. > > I've done it for a year on NVMe. The latency is so low, it doesn't >

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-04 Thread Lennart Poettering
On Mi, 03.02.21 22:51, Chris Murphy (li...@colorremedies.com) wrote: > > > Since systemd-journald sets nodatacow on /var/log/journal the journals > > > don't really fragment much. I typically see 2-4 extents for the life > > > of the journal, depending on how many times it's grown, in what looks

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-03 Thread Chris Murphy
On Wed, Feb 3, 2021 at 9:46 AM Lennart Poettering wrote: > > Performance is terrible if cow is used on journal files while we write > them. I've done it for a year on NVMe. The latency is so low, it doesn't matter. > It would be great if we could turn datacow back on once the files are >

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-03 Thread Chris Murphy
On Wed, Feb 3, 2021 at 9:41 AM Lennart Poettering wrote: > > On Di, 05.01.21 10:04, Chris Murphy (li...@colorremedies.com) wrote: > > > f27a386430cc7a27ebd06899d93310fb3bd4cee7 > > journald: whenever we rotate a file, btrfs defrag it > > > > Since systemd-journald sets nodatacow on

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-03 Thread Lennart Poettering
On Di, 26.01.21 21:00, Chris Murphy (li...@colorremedies.com) wrote: > On Tue, Jan 5, 2021 at 10:04 AM Chris Murphy wrote: > > > > f27a386430cc7a27ebd06899d93310fb3bd4cee7 > > journald: whenever we rotate a file, btrfs defrag it > > > > Since systemd-journald sets nodatacow on

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-02-03 Thread Lennart Poettering
On Di, 05.01.21 10:04, Chris Murphy (li...@colorremedies.com) wrote: > f27a386430cc7a27ebd06899d93310fb3bd4cee7 > journald: whenever we rotate a file, btrfs defrag it > > Since systemd-journald sets nodatacow on /var/log/journal the journals > don't really fragment much. I typically see 2-4

Re: [systemd-devel] consider dropping defrag of journals on btrfs

2021-01-26 Thread Chris Murphy
On Tue, Jan 5, 2021 at 10:04 AM Chris Murphy wrote: > > f27a386430cc7a27ebd06899d93310fb3bd4cee7 > journald: whenever we rotate a file, btrfs defrag it > > Since systemd-journald sets nodatacow on /var/log/journal the journals > don't really fragment much. I typically see 2-4 extents for the

[systemd-devel] consider dropping defrag of journals on btrfs

2021-01-05 Thread Chris Murphy
f27a386430cc7a27ebd06899d93310fb3bd4cee7 journald: whenever we rotate a file, btrfs defrag it Since systemd-journald sets nodatacow on /var/log/journal the journals don't really fragment much. I typically see 2-4 extents for the life of the journal, depending on how many times it's grown, in