[vchkpw] Delays in vdelivermail to large default domain

2004-03-04 Thread Japheth Cleaver
Hi there,

I had a big giant email planned here, but as I was writing it I narrowed 
down the scope of the problem we're having to a recursive stat call (I 
think) in vdelivermail.c.

First, some background on the setup:

I'm in the process of migrating a 12 GB, ~5000-user 
sendmail/aliases/virtualuser system to a qmail/vpopmail one, using MySQL as 
the backend, and one problem is holding me up.

We've got a cluster of 3 delivery machines, with a /vpopmail partition 
shared over NFS. The NFS server is also the MySQL DB server that hosts the 
backend. /vpopmail is on a 3Ware RAID 10 array running ReiserFS. (We've 
tried both the default mount options and noatime/notail.)

All the 800 or so virtual domains are empty (save for the postmaster 
account) and filled with .qmail-vuser files that forward to 
[EMAIL PROTECTED]. When a vpopmail user is created at one of those 
domains, delivery happens instantaneously. Delivering to any vpopmail user 
at the default domain results in vdelivermail hanging for 2-10 minutes 
before finally delivering the message.

vuserinfo -d [EMAIL PROTECTED] works fine, which led me to 
believe it was not a MySQL table problem (we're not using many_domains).

The vdelivermail hang occurs whether delivering directly on the NFS server 
or on one of the cluster servers (though the length of the delay varies 
unpredictably), which leads me to think that it's not an NFS problem. 
Standard NFS reads/writes are fine.

Additionally, copying files into and out of users' Maildirs manually works 
fine, and squirrelmail and courier-imap are handling the situation fine as 
well.

Attempted delivery to nonexistent addresses gives a failure message 
immediately.

Manual testing was done with a line like below, to verify it wasn't 
anything else in qmail:

cat /vpopmail/testing/samplemail.txt | env EXT=cleaver \
    HOST=defaultdomain.com vdelivermail '' bounce-no-mailbox
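
To put a number on the hang, the same manual test can be wrapped in a small timer. This is a minimal Python sketch, not part of vpopmail; the helper name timed_delivery is made up for illustration, and it only assumes that the command reads the message on stdin and takes EXT and HOST from the environment, as in the line above.

```python
import os
import subprocess
import time


def timed_delivery(argv, message_path, extra_env):
    """Run argv with the message on stdin; return (exit_code, seconds)."""
    # Start from the current environment and add the delivery variables.
    env = dict(os.environ)
    env.update(extra_env)
    start = time.monotonic()
    with open(message_path, "rb") as msg:
        proc = subprocess.run(argv, stdin=msg, env=env)
    elapsed = time.monotonic() - start
    return proc.returncode, elapsed
```

Calling timed_delivery(["vdelivermail", "", "bounce-no-mailbox"], "/vpopmail/testing/samplemail.txt", {"EXT": "cleaver", "HOST": "defaultdomain.com"}) would mirror the manual test and report how long the delivery took.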



Okay, as I was writing the above message, I decided to strace the running 
vdelivermail process and discovered that vdelivermail was looping here:

stat64(/etc/vpopmail/domains/defaultdomain.com/5/charlenes/Maildir//new/1078418383.M015727P2293.haku.defaultdomain.com, 
{st_mode=S_IFREG|0644, st_size=11180, ...}) = 0
stat64(/etc/vpopmail/domains/defaultdomain.com/5/charlenes/Maildir//new/1078418397.M208677P5866.haku.defaultdomain.com, 
{st_mode=S_IFREG|0644, st_size=2123, ...}) = 0
stat64(/etc/vpopmail/domains/defaultdomain.com/5/charlenes/Maildir//new/1078418401.M185492P7109.haku.defaultdomain.com, 
{st_m

 [later]
stat64(/etc/vpopmail/domains/defaultdomain.com/E/gary/Maildir//new/1078419549.M564758P6609.haku.defaultdomain.com, 
{st_mode=S_IFREG|0644, st_size=2744, ...}) = 0
stat64(/etc/vpopmail/domains/defaultdomain.com/E/gary/Maildir//new/1078419549.M438602P6573.haku.defaultdomain.com, 
{st_mode=S

It appears to be stat'ing every single message for every user underneath the 
default domain's directory(!). Given that there is about 12 GB of mail 
that's being transferred over in the test systems (before we go live), that 
would explain the long delay. As it gets cached by NFS or the local disk 
array, the time the stats take varies.

Any ideas on why it might be doing this? I'm looking over count_dir in 
vdelivermail.c right now and not seeing it. =(

Sincerely,
Japheth J.C. Cleaver


Re: [vchkpw] Delays in vdelivermail to large default domain

2004-03-04 Thread Tom Collins
On Mar 4, 2004, at 1:36 PM, Japheth Cleaver wrote:
> It appears to be stating every single message in every user underneath 
> the default domain's directory(!). Given that there is about 12 GB of 
> mail that's being transferred over in the test systems (before we go 
> live), that would explain the long delay. As it gets cached by NFS or 
> the local disk array, the time the stats take vary.
Maybe domain quotas were turned on, and it's trying to see how much 
space is used?
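
For a sense of why a domain-wide usage check could stall delivery, here is a minimal Python sketch of a brute-force usage scan. It is an illustration only and not vpopmail's actual code (which is C); the function name domain_usage is made up for this example. It issues one stat per message file under a domain directory, the same access pattern the strace output shows.

```python
import os
import stat


def domain_usage(domain_dir):
    """Return (file_count, total_bytes) for regular files under domain_dir."""
    count = 0
    total = 0
    for root, _dirs, files in os.walk(domain_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                st = os.lstat(path)  # one stat call per message file
            except OSError:
                continue  # file moved or removed while we were scanning
            if stat.S_ISREG(st.st_mode):
                count += 1
                total += st.st_size
    return count, total
```

Pointed at a domain directory such as the /etc/vpopmail/domains/defaultdomain.com path in the strace output, a scan like this costs one lstat per stored message, so its running time grows with the total number of messages in the domain rather than with the size of the recipient's mailbox.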

--
Tom Collins  -  [EMAIL PROTECTED]
QmailAdmin: http://qmailadmin.sf.net/  Vpopmail: http://vpopmail.sf.net/
Info on the Sniffter hand-held Network Tester: http://sniffter.com/


Re: [vchkpw] Delays in vdelivermail to large default domain

2004-03-04 Thread Japheth Cleaver
D'oh! That makes total sense, and I hadn't considered it at all. I've 
recompiled with --disable-domainquotas and things are delivering fine.

It might be worth putting a warning in the migration FAQ about long 
delivery times for people who move everything over to a single domain like 
this...

Thanks again!

-jc
