On Fri, 2010-11-05 at 23:18:47 +0100, Philipp Kern wrote: > On Fri, Oct 22, 2010 at 11:35:54AM +0200, Guillem Jover wrote: > > Those are related to the fsync()/sync() changes in dpkg from some time > > ago, the patches would: > > > > 1) Switch back from sync() to fsync() before rename() (while keeping > > the sync() code around for the benefit of other distributions > > that might not want to switch just yet). So to avoid unrelated > > I/O when there's background work being done for example. This > > hack also only works on Linux where sync() is synchronous, so > > it would unify that code path for all dpkg supported platforms. > > > > Bug: #588339 > > > > <http://git.hadrons.org/?p=debian/dpkg/dpkg.git;a=commitdiff;h=87740373> > > I do have to say that I'm not overly happy about this change this late in the > freeze.
My idea was to get those on the planned 1.15.9, but after the sudden freeze and the initial reactions to the first unblock request I didn't feel like asking for any more changes. Anyway, the blame would be obviously on me in case these do not get accepted due to the delay. > I suppose this resembles what we currently have in Lenny? No, lenny's dpkg does not do sync() nor fsync() before renames on extracted files from archives. dpkg 1.15.6 started doing fsync() for extracted files. dpkg 1.15.7.2 switched from fsync() to sync() on Linux. > What are the implications for ext4 and btrfs? They are going to be slower with this change. > Why was it changed in the first place? I wrote about this in <http://lwn.net/Articles/392599/>. Extremely short summary: because fsync() was considered too slow on ext4 (#578635). > > 2) Add a new --force-unsafe-io to disable those fsync() for people > > using certain file systems where the only option is to choose > > between reliability or acceptable performance. Or in cases where > > the data is cached and it might not be important to lose it. > > > > Bug: #584254 > > > > <http://git.hadrons.org/?p=debian/dpkg/dpkg.git;a=commitdiff;h=0484cd7d> > > So even when you don't do sync() anymore you still want to disable those > fsyncs()? Given the performance degradation the first patch will introduce on at least ext4, users might want to disable fsync()s. Arguably they seem to want to do that now even with sync(), as people are already using hacks like eatmydata to disable them, the problem is that this will disable *all* syncs, including the ones for the database among others. If the file system gets corrupted you might still have a chance of recovering, if the databases do, you most probably need either backups or a reinstall. > Granted, it's a self-contained new feature. What's the use case > for squeeze? d-i I heared, but I'm not sure that they'd use it in this > cycle. And tmpfs users, or who? Other users are buildds using fresh unpacked snapshots for each build session. ext4 users who would prefer faster dpkg runs rather than data safety. The problem with sync() is that even if it gives better performance than fsync()s on ext4, it's just the wrong thing to use, sync() affects all mount points, so if you happen to have for example a server with lots of writes on say a /srv mount point, and the admin decides to do a large install/upgrade, it will force all that data down to all mount points, for each archive unpacked. The target for this is ext4 users mainly, the syncs were included mostly to protect them from data loss. They are also the ones suffering the I/O performance degradation. I don't think it's fair to make users not suffering such problems suffer unrelated issues like the ones imposed by sync(). As such I think it should be them who have to decide between performance or data loss. Also I found it interesting that the ext4 developers were complaining about userspace developers not doing the right thing regarding atomic renames (which would include fsync()) during the huge zero-length discussions, to then state that fsync() is obviously too slow and heavy handed and thus should better not be used. <https://bugzilla.kernel.org/show_bug.cgi?id=15910> Ultimately I just think this is a problem with ext4. Regardless of that, I'll try to come up with a better, portable alternative to improve the performance for 1.16.x if possible. Finally, while accepting patch 2) alone might make sense, accepting 1) alone would not. thanks, guillem -- To UNSUBSCRIBE, email to [email protected] with a subject of "unsubscribe". Trouble? Contact [email protected] Archive: http://lists.debian.org/[email protected]

