On Tue, 13 May 2014 07:55:56 +0200 Ulrich Mueller wrote:
> >>>>> On Tue, 13 May 2014, Andrew Savchenko wrote:
>
> > Please consider that by default du shows block size, not byte size.
> > Than means that if file is actually 1234 bytes large, without -b it
> > will be still accounted for 4096 bytes on 4K-block filesystem.
>
> This raises another question, namely if files with <= 4096 bytes size
> should be compressed at all? Portage already has a fixed size limit of
> 128 bytes (see bug 169260), but maybe this could be made configurable.

Without a doubt this limit should be configurable, because defaults
that are fine for one setup may harm another.

If we try to cover all possible cases, some filesystems may benefit
even from compressing very small files (e.g. from 140 to 100 bytes),
because they pack multiple small files into the same block (tail
packing). ReiserFS is a good example, but there may be others.

If we consider the majority of users (and thus try to select reasonable
defaults), then from the point of view of disk usage plus decompression
overhead it is best to store a file compressed only if the compressed
file is at least one filesystem block smaller than the original. The
filesystem block size can be queried at runtime for any man, doc, or
similar directory in use, so this is doable. But this approach may
overcomplicate the implementation.

Best regards,
Andrew Savchenko
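
P.S. To make the block-based rule concrete, here is a rough Python
sketch. It is only an illustration, not anything Portage does today:
the function names are made up, and it reads "at least one filesystem
block smaller" as "occupies at least one block fewer". It also assumes
a filesystem without tail packing or inline data, so on something like
ReiserFS it would be too conservative.

```python
import os


def blocks_used(size_bytes, block_size):
    """Blocks needed to hold size_bytes (ceiling division)."""
    return (size_bytes + block_size - 1) // block_size


def worth_compressing(orig_size, comp_size, block_size):
    """True if the compressed file occupies at least one block fewer."""
    return blocks_used(comp_size, block_size) < blocks_used(orig_size, block_size)


def fs_block_size(directory):
    """Allocation block size of the filesystem holding directory.

    Uses os.statvfs (Unix only); f_frsize is the fundamental block size.
    """
    return os.statvfs(directory).f_frsize
```

With 4K blocks, a 1234-byte file that compresses to 600 bytes still
takes one block either way, so worth_compressing(1234, 600, 4096) is
False, while a 5000-byte file compressed to 3000 bytes drops from two
blocks to one and would be compressed.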
