Hi! [ Colin, CCing you as I'm not sure you follow the list. ]

On Fri, 12 Mar 2010 15:57:28 +0000, Colin Watson wrote:
> I'm worried about the syncing changes though;
> apparently they're *really* *really* pessimal on some systems, e.g. ext4
> with data=ordered (which considers rename() as a barrier itself so the
> fsync() isn't necessary in that configuration). Scott James Remnant
> reported that it took over an hour to unpack a linux-headers-* package!

Ugh. :/ That's probably going to be really bad on buildds...

> I don't know what the right answer is here. On the one hand, not
> fsyncing kills reliability on some systems; on the other hand, fsyncing
> kills performance on other systems.

Yeah, it's a bit sad though that the changes made to ext4 to accommodate
misbehaving applications penalize so heavily the ones that try to do the
right thing.

On Fri, 2010-03-12 at 10:19:50 +0100, Raphael Hertzog wrote:
> On Fri, 12 Mar 2010, Sven Joachim wrote:
> > The decision to immediately fsync() all files written to disk has a
> > detrimental influence on dpkg's performance, especially when unpacking
> > large packages. On my system which has a 2.5" hard disk with an ext4
> > filesystem, installing emacs23-common (containing 2123 files) with a hot
> > cache takes 76 seconds, almost all of which is spent during unpacking.
> > With dpkg 1.15.5.6, it takes 5.3 seconds, including processing three
> > triggers. This is really painful. :-(
>
> Not as bad here with ext3 but still worrying, taking gnome-icon-theme
> (6534 files) I get ~3 seconds for 1.15.5.6 and 12 seconds for 1.15.6.
>
> That's still a 300% increase.

Right, I didn't see much degradation on ext3, or I'd probably have
considered making alternative changes instead. I'll be testing on a
slower box with ext3 to see how it behaves there though.

> Removing the single fsync() added in tarobject() completely restores the
> original performance. Adding a single sync() after the whole unpack has
> way less impact (1 or 2 seconds more).
> Other possibility would be to use the loop afterwards to reopen all
> installed files and call fsync() on them. The disadvantage of sync() is
> obviously when unrelated disk activity happens in parallel to dpkg, it
> will have to wait more due to this.

Neither of those is a good replacement, as the fsync() must be done
before the rename(): we want the guarantee that there's always a valid
file in place in case of a crash, either the old or the new, which dpkg
should be able to discern and roll back if needed on reexecution.

Doing a sync afterwards might only guarantee that the package is not
wrongly marked as properly installed if there's a system crash, but
that's it. And in such a case there's a high probability the files will
be zero-length, which would be pretty bad, for example, for Essential
packages. In addition, POSIX does not guarantee that sync() will wait
until the writes have finished (only Linux seems to be doing that
though).

A possible solution could be to do the unpack for all files in a
package, just leaving the new files as file.dpkg-new, doing neither the
fsync() nor the replace, and then in one pass afterwards fsync() and do
the "atomic" (except for dirs etc) replace. I guess this might improve
the situation a bit for packages with lots of files, but I'm not sure
how much.

regards,
guillem
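
To make the two-pass idea above concrete, here is a minimal sketch in C. It is not dpkg's actual code: the function names (write_new, sync_and_replace, unpack_deferred) are hypothetical, file contents are passed as plain strings purely for illustration, and directories, symlinks, permissions and rollback are ignored. Pass 1 writes every file as <path>.dpkg-new without syncing; pass 2 reopens each one, calls fsync() and only then rename()s it over the final path, so the fsync-before-rename ordering is preserved.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Build "<path>.dpkg-new" in buf; returns 0 on success, -1 if it does not fit. */
static int new_name(char *buf, size_t size, const char *path)
{
    int n = snprintf(buf, size, "%s.dpkg-new", path);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Pass 1: write the data to "<path>.dpkg-new", deliberately without fsync(). */
static int write_new(const char *path, const char *data, size_t len)
{
    char tmp[4096];
    if (new_name(tmp, sizeof(tmp), path) < 0)
        return -1;
    int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    while (len > 0) {
        ssize_t n = write(fd, data, len);
        if (n < 0) {
            close(fd);
            return -1;
        }
        data += n;
        len -= (size_t)n;
    }
    return close(fd);
}

/* Pass 2: flush "<path>.dpkg-new" to disk, then rename() it over path. */
static int sync_and_replace(const char *path)
{
    char tmp[4096];
    if (new_name(tmp, sizeof(tmp), path) < 0)
        return -1;
    int fd = open(tmp, O_WRONLY);   /* reopen; no O_TRUNC, contents untouched */
    if (fd < 0)
        return -1;
    if (fsync(fd) < 0) {
        close(fd);
        return -1;
    }
    if (close(fd) < 0)
        return -1;
    return rename(tmp, path);       /* only after the data has been synced */
}

/* Unpack n files in two passes: write everything first, then sync and replace. */
static int unpack_deferred(const char *const paths[],
                           const char *const data[], size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (write_new(paths[i], data[i], strlen(data[i])) < 0)
            return -1;
    for (size_t i = 0; i < n; i++)
        if (sync_and_replace(paths[i]) < 0)
            return -1;
    return 0;
}
```

Whether batching the fsync() calls like this actually helps on ext4 is exactly the open question; the sketch only shows that the ordering guarantee can be kept while moving the syncs out of the write loop.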

