Peter van Dijk and markd wrote:
> > > Or, *duh*: the homedir check is in qmail-getpw. Since you've already
> > > modified it, modify it some more :)
> >
> > Right. But he may not actually have to check for the existence of HOME
> > currently, and in any event there is a timing window between qmail-getpw
> > and the invocation of qmail-local. So it may disappear after the check in
> > qmail-getpw.
>
> That's what I thought, I considered a race attack, but there is none.
> qmail-local *defers* on homedir failures. Only qmail-getpw actually
> *bounces* on homedir failures.
>
> He's using a *modified* qmail-getpw, not a rewritten one. The homedir
> check is probably just still in there.
>
> > Having said all that, qmail-local exits with a *temp* error if it cannot
> > stat the home directory, so I'm not sure what the exact problem is. If the
> > NFS home is gone, then this stat() should fail at some point and defer
> > the delivery.
>
> Yeah, that's because qmail-getpw does the bouncing.
Makes sense. Okay, so if I make qmail-getpw either not do a directory
check, or handle the results differently, then there shouldn't be any lost
or bounced email, even if the NFS mount happens to disappear between
qmail-getpw and qmail-local. Correct?
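To make sure I understand the "handle the results differently" option, here is a rough sketch of the decision I have in mind. It is written in Python only for illustration and is not qmail's actual code; the function name and the missing_is_temporary flag are made up, and the 100/111 values are borrowed from the hard/soft exit-code convention for delivery programs rather than from qmail-getpw itself.

```python
import errno
import os

# Illustrative exit codes (assumption: borrowed from the qmail-command
# convention, where 100 is a hard error and 111 is a soft error).
EXIT_OK = 0
EXIT_PERMANENT = 100   # bounce
EXIT_TEMPORARY = 111   # defer and retry later


def homedir_exit_code(path, missing_is_temporary=True, stat=os.stat):
    """Hypothetical helper: decide what to do based on a stat() of the homedir.

    With missing_is_temporary=True, a missing directory is deferred rather
    than bounced, since an unmounted NFS volume can look like "no such
    directory". The stat parameter exists only so the check can be faked.
    """
    try:
        stat(path)
    except OSError as err:
        if err.errno in (errno.ENOENT, errno.ENOTDIR) and not missing_is_temporary:
            return EXIT_PERMANENT
        # Everything else (I/O errors, timeouts, stale handles) is deferred.
        return EXIT_TEMPORARY
    return EXIT_OK


if __name__ == "__main__":
    # "/no/such/home" is a placeholder path for the demonstration.
    print(homedir_exit_code("/no/such/home"))
    print(homedir_exit_code("/no/such/home", missing_is_temporary=False))
```

The idea is that with the flag left at its default, nothing in the check can produce a bounce; the worst case is that mail sits in the queue until the mount comes back.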
> > The only general problem is that the NFS timeouts may clog the
> > concurrencylocal limits, but then if you have no homes, there's nothing
> > to deliver anyway.
>
> That depends. Where I work we have homedirs spread over about 40
> userservers, which means indeed one can be down while the others are up.
There will only be one server for user directories, at least to begin with.
So, yeah, hitting the concurrencylocal limit won't be an issue.
Michael Boyiaz's idea is a good one too. Sounds like it would make planned
outages easy to wade through.
Thanks for the input!
---Kris Kelley