retitle 536598 consider using a hash-directory layout or a filename map
severity 536598 wishlist
also sprach martin f krafft <madd...@debian.org> [2009.07.11.1542 +0200]:
> The problem is that rsync (or tar) fail to copy all entries in large
> directories (50,000+ entries), because apparently the directory
> index (dir_index feature of ext2/3) gets exhausted.

The problem was that the destination filesystem has a 1k block size,
since it was originally intended to be used as Maildir storage.
Theodore Tso explains in the thread [0] that the block size b (in
kilobytes) determines the size of the directory index:

  n = 200,000 × b³

which is 200,000 for 1k blocks, 1.6 million for 2k blocks, and 12.8
million for 4k blocks. I don’t know where the 200,000 constant comes
from.

0. http://www.linux-archive.org/ext3-users/90496-ext3_dx_add_entry-directory-index-full.html

In my case, using 4k (or even just 2k) fixed the issue. Nevertheless,

> Anyway, the problem is a function of encfs, which inflates the
> filenames. Notably, the problem occurs with block-encrypting
> filenames, *as well as* stream encryption.
>
> Arguably, encfs might simply not be usable for this use-case, but on
> the other hand I think that it wouldn't be too hard to solve this
> problem, for instance by hashing each directory transparently.
>
> A trivial implementation might be the following: since encrypted
> filenames seem to be made up of letters, digits, and some special
> characters, let's assume the set of possible characters is
> 26+10+6==42. It would already help if each directory had 42
> single-letter/digit subdirectories and files would be sorted into
> those accordingly.
>
> An alternative might be to store all files in a giant 3-4-level
> directory hash structure and to maintain an (encrypted) database of
> filename -> hashed file mappings. In --reverse mode, this database
> would have to be virtual and simulated by the encfs code.

I think these two options are still worth considering, especially in
the light of #536752.

-- 
 .''`.   martin f. krafft <madd...@d.o>      Related projects:
: :' :  proud Debian developer               http://debiansystem.info
`. `'`   http://people.debian.org/~madduck    http://vcs-pkg.org
  `-  Debian - when you have better things to do than fixing systems
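
For reference, the arithmetic behind the three figures quoted from the
thread can be reproduced with a few lines of Python. This is only a
sketch that takes the n = 200,000 × b³ estimate at face value; the
function name dir_index_capacity is made up for illustration and is not
part of any ext3 tool.

```python
def dir_index_capacity(block_kib: int) -> int:
    """Rough upper bound on entries in one ext3 dir_index directory.

    Assumption: uses the n = 200,000 * b^3 rule of thumb quoted from the
    thread, with b the block size in kilobytes. The constant is taken
    as given, not derived here.
    """
    return 200_000 * block_kib ** 3


if __name__ == "__main__":
    # Print the estimate for the three block sizes discussed above.
    for b in (1, 2, 4):
        print(f"{b}k blocks: about {dir_index_capacity(b):,} entries")
```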
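
To make the two layouts from the quoted proposal a bit more concrete,
here is a rough Python sketch of how an already-encrypted filename
could be mapped to a bucketed path. Nothing here is encfs code: the
function names, the choice of SHA-256, and the two-hex-digit directory
names are all assumptions for illustration. Note also that the second
function keeps the encrypted name as the leaf, so it needs no mapping
database; that is a simplification of the proposal, which stores files
under hashed names and looks them up through an encrypted database.

```python
import hashlib
from pathlib import PurePosixPath


def bucket_by_first_char(enc_name: str) -> PurePosixPath:
    """Option 1: one subdirectory per leading character of the name."""
    if not enc_name:
        raise ValueError("empty filename")
    return PurePosixPath(enc_name[0]) / enc_name


def bucket_by_hash(enc_name: str, levels: int = 3) -> PurePosixPath:
    """Option 2 (simplified): a `levels`-deep tree keyed on a hash.

    Each level is named after two hex digits of the SHA-256 digest of
    the encrypted name, giving 256 possible subdirectories per level.
    """
    if not enc_name:
        raise ValueError("empty filename")
    digest = hashlib.sha256(enc_name.encode("utf-8")).hexdigest()
    parts = [digest[2 * i:2 * i + 2] for i in range(levels)]
    return PurePosixPath(*parts) / enc_name


if __name__ == "__main__":
    name = "aXyZ,example-encrypted-name"  # hypothetical encrypted name
    print(bucket_by_first_char(name))
    print(bucket_by_hash(name))
```

With three levels of 256 buckets each, even a few million files would
leave only a handful of entries per leaf directory, well below the
1k-block limit discussed above.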