On 14.02.2012 10:57, Joerg Schilling wrote:
> Florian Philipp <[email protected]> wrote:
>
>>> Even if the i-nodes are sequential on-disk, there's no reason to think
>>> that the data blocks associated with the inodes are in any particular
>>> order with respect to the i-nodes themselves.
>>
>> You could probably find the intended order by using debugfs (at least
>> for ext*). The following command should output the first physical block
>> of every file:
>> find /var/db/portage/ -type f -printf 'bmap <%i> 0\n' | sudo debugfs
>> /dev/mapper/vg-portage
>
> This kind of order is not important for copy speed.
>
> Copy speed is dominated by write speed and write speed is dominated by seeks
> that are a result of keeping meta data up to date.
>
> Jörg
>

I cannot verify that hypothesis. Test setup:
1x 7200rpm 2.5" HDD
/var/db/portage is my portage tree, ext4
/dev/mapper/vg-portage is its block device
/tmp is ext4

First test --- copy whole tree just with `cpio` (performance tested and
similar to `cp -a`):

$ echo 1 >/proc/sys/vm/drop_caches
$ time find /var/db/portage/ -type f -print0 |
    cpio -p0 --make-directories /tmp/portage/

real    11m52.657s
user    0m1.848s
sys     0m19.802s

Second test --- Sort by starting physical block number:

$ echo 1 >/proc/sys/vm/drop_caches
$ FIFO=/tmp/$(uuidgen).fifo
$ mkfifo "$FIFO"
$ time find /var/db/portage/ -type f \
    -fprintf "$FIFO" 'bmap <%i> 0\n' -print0 |
    tr '\n\0' '\0\n' | paste <(
      debugfs -f "$FIFO" /dev/mapper/vg-portage |
      grep -E '^[[:digit:]]+') - |
    sort -k 1,1n | cut -f 2- | tr '\n\0' '\0\n' |
    cpio -p0 --make-directories /tmp/portage/
$ unlink "$FIFO"

real    2m8.400s
user    0m1.888s
sys     0m15.417s

Using `xargs -0 cat >/dev/null` instead of `cpio` yields 9m27.745s and
1m11.087s, respectively.

Some comments on the sorting script:
- Using a fifo instead of a pipe for issuing commands to debugfs is faster.
- In case it is not obvious, the two `tr` commands are there because
  `paste` and `cut` cannot handle zero-terminated lines, but file names
  might contain line breaks.
- `grep` is there because `debugfs` echoes all commands. Filtering out
  every odd-numbered line should also work.
- A production-ready script should probably use `join` instead of `paste`
  to deal with read errors of `debugfs` (for example if files are removed
  between `find` and `debugfs`). Currently, such errors lead to
  misaligned output.

BTW: I wanted to test it with `star -copy` but this resulted in buffer
overflows similar to these:
http://permalink.gmane.org/gmane.comp.archivers.star.user/752

Regards,
Florian Philipp
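
PS: A rough sketch of the `join` idea from the comments on the sorting script, not something I have benchmarked. It assumes that `debugfs -f` echoes each command as a line of the form `debugfs: bmap <INODE> 0` directly before the answer, and, unlike the script used in the tests, that file names contain no line breaks. The function names `inode_blocks` and `order_by_block` are made up for this illustration.

```shell
#!/bin/sh
# Sketch only. Assumption: debugfs echoes "debugfs: bmap <INODE> 0"
# before each answer; names containing newlines are not handled.

# Read debugfs output on stdin, print "inode<TAB>block" lines.
# A command that gets no numeric answer (read error, vanished file)
# produces no line instead of shifting all following rows.
inode_blocks() {
    awk '
        /^debugfs: bmap </ { gsub(/[<>]/, "", $3); inode = $3; next }
        /^[0-9]+$/ && inode != "" { print inode "\t" $1; inode = "" }
    '
}

# order_by_block BLOCKS_FILE PATHS_FILE
# Both files hold "inode<TAB>value" lines. Prints the paths ordered by
# physical block; inodes missing from either file are dropped by join.
order_by_block() {
    tab=$(printf '\t')
    tmp=$(mktemp -d) || return 1
    # join needs both inputs sorted (as text) on the join field
    LC_ALL=C sort -t "$tab" -k 1,1 "$1" > "$tmp/blocks"
    LC_ALL=C sort -t "$tab" -k 1,1 "$2" > "$tmp/paths"
    LC_ALL=C join -t "$tab" "$tmp/blocks" "$tmp/paths" |
        sort -t "$tab" -k 2,2n | cut -f 3-
    rm -rf "$tmp"
}
```

The paths file could come from something like `find /var/db/portage/ -type f -printf '%i\t%p\n'`, and the blocks file from piping the `debugfs` output through `inode_blocks`. Because rows are matched by inode number rather than by position, a failed `bmap` only loses that one file from the ordering.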

