On Sat, 14.06.14 09:52, Goffredo Baroncelli (kreij...@libero.it) wrote:

> > Which effectively means that by the time the 8 MiB is filled, each 4 KiB
> > block has been rewritten to a new location and is now an extent unto
> > itself. So now that 8 MiB is composed of 2048 new extents, each one a
> > single 4 KiB block in size.
>
> Several people pointed fallocate as the problem. But I don't
> understand the reason.

BTW, the reason we use fallocate() in journald is not about trying to
optimize anything. It's only used for one reason: to avoid SIGBUS on
disk/quota full, since we actually write everything to the files using
mmap(). I mean, writing things with mmap() is always problematic, and
handling write errors is awfully difficult, but there are at least two
of the most common reasons for failure that we'd like to protect
against in advance, under the assumption that disk/quota full will be
reported immediately by the fallocate(), and the mmap writes later on
will then necessarily succeed.

I am not really following why this trips up btrfs, though. I am not
sure I understand why it breaks btrfs COW behaviour. I mean,
fallocate() isn't necessarily supposed to write anything really, it's
mostly about allocating disk space in advance. I would claim that
journald's usage of it is very much in line with the reason it exists
in the first place...

Anyway, happy to change these things around if necessary, but first I'd
like to have a very good explanation of why fallocate() wouldn't be the
right thing to invoke here, and a suggestion for what we should do
instead to cover this use case...

Lennart

-- 
Lennart Poettering, Red Hat
_______________________________________________
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel
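
For readers following along, the pattern described above can be sketched
roughly as below. This is only an illustration and not journald's actual
code: reserve_and_map() and store_record() are hypothetical names, and the
sketch uses posix_fallocate() rather than the raw fallocate() call. The
idea is that a full disk is reported as an ordinary error at reservation
time instead of as SIGBUS during a later store into the mapping. Whether
that reservation really guarantees the later writes on a copy-on-write
file system is the open question in this thread.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not taken from journald): reserve `size` bytes for
 * `fd` up front, then map the file for writing.
 * Returns the mapping, or NULL with errno set. */
static void *reserve_and_map(int fd, size_t size)
{
    /* posix_fallocate() returns the error number directly and does not
     * set errno. Running out of space (ENOSPC) is reported here. */
    int r = posix_fallocate(fd, 0, (off_t) size);
    if (r != 0) {
        errno = r;
        return NULL;
    }

    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: a plain memory store into the mapping. There is no
 * return value to check here, which is why the space is reserved first.
 * The caller must keep offset + len within the mapped size. */
static void store_record(void *map, size_t offset, const void *data, size_t len)
{
    memcpy((char *) map + offset, data, len);
}
```

A caller would open the file, call reserve_and_map(fd, size) once, treat a
NULL return as "disk full or similar", and only then write records with
store_record().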