
On 8/9/2012 9:50 AM, John Simpson wrote:
> On 2012-08-08, at 2132, Eric Shubert wrote:
>> #define MAX_USERS_PER_LEVEL 100
>> ...
>> In an ext3 environment, it could be set (by the admin) to 30000 (ext3 
>> supports 32000 subdirectories), and with ext4 it could be set to 60000 (ext4 
>> supports 64000). These settings would for the most part disable hashed 
>> directories, while still allowing hashes should the filesystem limits be 
>> approached. Of course, a default value in dir_control could still be 100, 
>> which would maintain former behavior. If this were done, the 
>> --disable-users-big-dir option should probably be changed to 
>> --allow-single-digit-users as well. ;)
>> Please let me know what the prospects of such changes are. If it doesn't 
>> look like anything that might ever happen in this area, I just may patch the 
>> vauth.h file to be 30000 and call it done.
> The filesystem's limit on how many entries can exist in a directory is not 
> the only issue... the other issue is performance.
> On most filesystems (including ext2/3/4), in order to find a particular file 
> within a directory, the kernel has to do a linear search on the contents. It 
> can take longer to do a linear search across 30K items than it does to search 
> through 100 entries, open a new directory, and do a second search through 100 
> entries. This isn't an issue for filesystems which implement directories as 
> binary trees instead of linear lists.
> Personally, I don't build servers without both hashing options enabled. The 
> hashing doesn't affect small machines (or small domains) because it doesn't 
> kick in until a certain number of domains or mailboxes exist. And if the 
> server becomes busy after the fact, the hashing code kicks in when needed and 
> keeps mailbox access from being slow.
> The scripts that I write which access the mailboxes all use "vdominfo" or 
> "vuserinfo" (or the qmail virtualdomains and users/assign files, and the 
> domain's vpasswd.cdb file) to locate the directories, rather than making 
> assumptions about where a particular domain or mailbox might be on the disk. 
> This way I'm using the same exact method that qmail uses to deliver mail, so 
> I know I'm ending up in the right place.
> If I'm not mistaken, the limitation on single-character mailbox names has 
> something to do with how the hashing is implemented. The hash directories all 
> have single-digit or single-letter names, and if a mailbox exists with the 
> same name, it causes problems (or at least confusion). Personally, I always
> thought they should have given the hash directories names which aren't used
> in SMTP addresses, like ",0" or ",a", but that's not how it was originally
> written.
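
To put rough numbers on the performance point, here is a minimal sketch of
the worst-case lookup cost under the linear-scan assumption described above.
The function names are made up for illustration and are not part of any
real code base; a filesystem that indexes its directories would not follow
this model.

```c
#include <stddef.h>

/* Worst-case name comparisons to find one entry in a directory that is
 * scanned linearly (the assumption made in the quoted text). */
static size_t flat_lookup_cost(size_t entries)
{
    return entries;
}

/* Worst case for a two-level layout: scan the parent to find the hash
 * directory, then scan that directory to find the mailbox.
 * Hypothetical helper, for illustration only. */
static size_t hashed_lookup_cost(size_t hash_dirs, size_t users_per_dir)
{
    return hash_dirs + users_per_dir;
}
```

With the figures from the thread, flat_lookup_cost(30000) is 30000
comparisons, while hashed_lookup_cost(100, 100) is 200, plus the cost of
opening the second directory.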

John has basically said everything I was going to :)  The only thing I would
add is that versions 5.4.32 and 5.4.33 both include changes that re-populate
old hash directories that have been made lighter by user deletion.  It's the
"backfill" feature.
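
As a rough illustration of the backfill idea only (this is not the actual
implementation, and pick_dir_with_backfill is a hypothetical name), the
sketch below assumes the per-directory user counts are already known and
picks the earliest hash directory that has dropped below the limit:

```c
#include <stddef.h>

#define MAX_USERS_PER_LEVEL 100  /* value quoted earlier in this thread */

/* Hypothetical helper, not a real API: return the index of the first
 * hash directory that still has room. If every directory is full,
 * return n so the caller knows a new directory is needed. */
static size_t pick_dir_with_backfill(const unsigned *user_counts, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (user_counts[i] < MAX_USERS_PER_LEVEL)
            return i;
    }
    return n;
}
```

For counts of {100, 97, 100, 12} this picks index 1, so a directory thinned
out by deletions gets refilled before a newer one grows further.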
-- 
    Matt Brookings <m...@inter7.com>       GnuPG Key 5F3258AD
    Software developer                     Systems technician
    Inter7 Internet Technologies, Inc.     (815)776-9465