On 2012-08-08, at 2132, Eric Shubert wrote:
> 
> #define MAX_USERS_PER_LEVEL 100
> ...
> 
> In an ext3 environment, it could be set (by the admin) to 30000 (ext3 
> supports 32000 subdirectories), and with ext4 it could be set to 60000 (ext4 
> supports 64000). These settings would for the most part disable hashed 
> directories, while still allowing hashes should the filesystem limits be 
> approached. Of course, a default value in dir_control could still be 100, 
> which would maintain former behavior. If this were done, the 
> --disable-users-big-dir option should probably be changed to 
> --allow-single-digit-users as well. ;)
> 
> Please let me know what the prospects of such changes are. If it doesn't look 
> like anything that might ever happen in this area, I just may patch the 
> vauth.h file to be 30000 and call it done.

The filesystem's limit on how many entries can exist in a directory is not the 
only issue... the other issue is performance.

On many filesystems (including ext2, and ext3/ext4 when the dir_index feature 
is turned off), in order to find a particular file within a directory, the 
kernel has to do a linear search on the contents. It can take longer to do a 
linear search across 30K items than it does to search through 100 entries, 
open a new directory, and do a second search through 100 entries. This isn't 
an issue for filesystems which index their directories with hashes or trees 
(as ext3/ext4 do when dir_index is enabled) instead of keeping linear lists.
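
To put rough numbers on that comparison, here is a minimal sketch in C. It 
assumes a pure linear scan with the target entry equally likely to be anywhere 
in the directory, and it ignores caching and the cost of opening the second 
directory. The function names are made up for illustration and are not part of 
vpopmail.

```c
#include <stddef.h>

/* Average number of entries examined when linearly scanning a directory
 * of n entries for a name that is present: (n + 1) / 2. */
double avg_linear_probes(size_t n)
{
    return (double)(n + 1) / 2.0;
}

/* Average number of entries examined with a two-level layout: scan a
 * top-level directory of `top` entries, then a subdirectory of `sub`
 * entries. Opening the subdirectory is NOT counted here. */
double avg_two_level_probes(size_t top, size_t sub)
{
    return avg_linear_probes(top) + avg_linear_probes(sub);
}
```

With these assumptions, avg_linear_probes(30000) comes to 15000.5 entries, 
while avg_two_level_probes(100, 100) comes to 101. The real gap depends on the 
filesystem and on what is already cached, so treat these as an illustration of 
the shape of the problem rather than a measurement.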

Personally, I don't build servers without both hashing options enabled. The 
hashing doesn't affect small machines (or small domains) because it doesn't 
kick in until a certain number of domains or mailboxes exist. And if the server 
becomes busy after the fact, the hashing code kicks in when needed and keeps 
mailbox access from being slow.

The scripts that I write which access the mailboxes all use "vdominfo" or 
"vuserinfo" (or the qmail virtualdomains and users/assign files, and the 
domain's vpasswd.cdb file) to locate the directories, rather than making 
assumptions about where a particular domain or mailbox might be on the disk. 
This way I'm using the same exact method that qmail uses to deliver mail, so I 
know I'm ending up in the right place.
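
As a sketch of that approach in C: ask vuserinfo for the directory instead of 
building the path by hand. This assumes vpopmail's programs are installed under 
/home/vpopmail/bin (adjust for your build) and that "vuserinfo -d" prints only 
the mailbox directory. The helper names first_line_of() and mailbox_dir() are 
made up for this example.

```c
#define _POSIX_C_SOURCE 200809L   /* for popen() and pclose() */
#include <stdio.h>
#include <string.h>

/* Run `cmd` through the shell and copy the first line of its output
 * (without the trailing newline) into `out`.
 * Returns 0 on success, -1 on any failure. */
int first_line_of(const char *cmd, char *out, size_t outlen)
{
    FILE *p;
    int got_line;

    if (outlen == 0)
        return -1;
    p = popen(cmd, "r");
    if (p == NULL)
        return -1;
    got_line = (fgets(out, (int)outlen, p) != NULL);
    if (pclose(p) != 0 || !got_line)
        return -1;
    out[strcspn(out, "\n")] = '\0';
    return 0;
}

/* Look up where vpopmail keeps the mailbox for `address`.
 * ASSUMPTION: vuserinfo lives in /home/vpopmail/bin.
 * The address is passed to the shell inside single quotes, so any
 * address containing a single quote is refused outright. */
int mailbox_dir(const char *address, char *out, size_t outlen)
{
    char cmd[512];
    int n;

    if (strchr(address, '\'') != NULL)
        return -1;
    n = snprintf(cmd, sizeof cmd,
                 "/home/vpopmail/bin/vuserinfo -d '%s'", address);
    if (n < 0 || (size_t)n >= sizeof cmd)
        return -1;
    return first_line_of(cmd, out, outlen);
}
```

A caller would pass something like mailbox_dir("user@example.com", buf, 
sizeof buf) and use buf only when the return value is 0. Whether the mailbox 
ended up in a hashed subdirectory or not, the caller never needs to know.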

If I'm not mistaken, the limitation on single-character mailbox names has 
something to do with how the hashing is implemented. The hash directories all 
have single-digit or single-letter names, and if a mailbox exists with the same 
name, it causes problems (or at least confusion). Personally, I always thought 
they should have given the hash directories names which aren't used in SMTP 
addresses, like ",0" or ",a", but that's not how it was originally written.
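
To illustrate the clash, here is a small sketch in C. It goes only by the 
description above (hash directories named with a single digit or letter) and 
is not taken from the vpopmail source. Both function names are hypothetical.

```c
#include <ctype.h>
#include <string.h>

/* A mailbox name can be mistaken for a hash directory if it is exactly
 * one character long and that character is a digit or a letter. */
int could_clash_with_hash_dir(const char *mailbox)
{
    return strlen(mailbox) == 1 && isalnum((unsigned char)mailbox[0]);
}

/* The alternative naming suggested above: a comma followed by one digit
 * or letter, such as ",0" or ",a". */
int is_comma_prefixed_hash_dir(const char *entry)
{
    return strlen(entry) == 2 && entry[0] == ','
        && isalnum((unsigned char)entry[1]);
}
```

Under the current scheme a mailbox called "a" trips the first check. Under the 
comma-prefixed scheme a directory entry like ",a" is recognisable as a hash 
directory at a glance, and an ordinary mailbox called "a" no longer looks like 
one.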

--------------------------------------------------------
| John M. Simpson  --  KG4ZOW  --  Programmer At Large |
| http://www.jms1.net/                 <j...@jms1.net> |
--------------------------------------------------------
