Mikhael Goikhman <[EMAIL PROTECTED]> writes:

> % revision=archzoom--devel--0--patch-300
> % cd `tla library-find $revision`/..
> % tar cf - --exclude $revision/,,patch-set --exclude $revision/,,index \
>   --exclude $revision/,,index-by-name $revision | gzip -9 >$revision.tar.gz
> % du -s --block-size=1 $revision
> % ls -s --block-size=1 $revision.tar.gz
> 3403776 archzoom--devel--0--patch-300
>  163840 archzoom--devel--0--patch-300.tar.gz
>
> The ratio is 21.  There is a small, but increasing gain when compared
> with earlier revisions (18), in particular because {arch} contains a
> lot of small files that are compressed nicely.  Probably better than
> hardlinking.
You're comparing the size of a *single* revision directory against
tar+gz.  This doesn't make much sense since, by definition, the hard
link trick saves space *across* several revisions.

> Please don't forget that a hardlink costs more than 0,

Can you elaborate on that?

> and also that for every merged external revision there are at least 2
> more files, in {arch} and ,,patch-log/, and possibly new subdirs too
> (not hardlink-able).

Right.

> For me (and for du/rm) it is not the size, but number of inodes that
> is more important, so this very CPU expensive solution would not solve
> much.

There are several good papers on the topic [0,1,2].  I'm pretty
confident that hard linking plus gzip'ing individual files would yield
a better compression ratio than keeping several whole revision
tarballs, *when* several consecutive revisions are kept (see the rough
sketch below).

Thanks,
Ludovic.

[0] http://ssrc.cse.ucsc.edu/Papers/you-mss04.pdf
[1] http://ssrc.cse.ucsc.edu/Papers/you-icde05.pdf
[2] http://www.usenix.org/events/usenix04/tech/general/full_papers/kulkarni/kulkarni_html/paper.html
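
P.S.  To make the comparison a bit more concrete, here is a rough
sketch of the kind of per-file hard linking I have in mind.  The
"patch-299" directory name is only hypothetical (I'm reusing your
example revision), and this is merely meant to illustrate the space
accounting; tla itself would of course not read a library mangled this
way once files are gzip'ed.

    # Assumed layout: two sibling revision directories in the library.
    old=archzoom--devel--0--patch-299   # hypothetical previous revision
    new=archzoom--devel--0--patch-300

    cd "$new" || exit 1

    # Hard-link every file that is byte-identical to the previous
    # revision, so unchanged files are stored only once across the two.
    find . -type f -print | while IFS= read -r f; do
        if [ -f "../$old/$f" ] && cmp -s "$f" "../$old/$f"; then
            ln -f "../$old/$f" "$f"
        fi
    done

    # Files that remain unique to this revision (link count 1) could
    # then be compressed individually.
    find . -type f -links 1 -exec gzip -9n {} +

    # du counts each hard-linked file only once when both trees are
    # given in the same invocation, so this shows the combined cost.
    cd .. && du -sc --block-size=1 "$old" "$new"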