Well;
People... (Dan?) Is there any particular reason why qmail-rspawn's default
behaviour is to mark failures as permanent, instead of temporary?
--
Here's why I ask... Our relays have an incoming interface and an outgoing
interface. They send mail back in if its recipient is a customer, and send it
out if not. Yesterday, one of the mail relays my ISP uses for customers
had a hardware problem: the "outgoing" NIC was malfunctioning. So, qmail-remote
crashed whenever it was called by qmail-rspawn.
According to this snippet of rspawn's code...
    switch(wait_exitcode(wstat))
    {
      case 0: break;
      case 111: substdio_puts(ss,"ZUnable to run qmail-remote.\n"); return;
      default: substdio_puts(ss,"DUnable to run qmail-remote.\n"); return;
    }
The default behaviour is to return "D" status (permanent failure)...
So, for a period of 13 hours, over 12k mails were lost (bounced to
postmaster), instead of remaining in queue. (needless to say, that was not a nice
thing to happen)
I've done a few tests (removing the exec bit from qmail-remote, and replacing
qmail-remote with a non-functioning binary) and confirmed all mails are bounced to
postmaster, instead of being queued.
Then, I changed the status reported by the default case from "D" to "Z"
(temporary failure), recompiled, and ran the same tests. This time, it went as
it should... mails were not delivered, but were kept in the queue. Once
qmail-remote was restored, I SIGALRM'ed qmail-send, and all of the test messages
were delivered.
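
For reference, the change amounts to something like the sketch below. It is a
standalone illustration and not the actual patch: rspawn_status is a
hypothetical helper name I'm using here so the mapping can be compiled on its
own, and it returns the status string instead of writing it with
substdio_puts.

```c
#include <stddef.h>

/* Hypothetical helper (not part of qmail): maps the exit code of
   qmail-remote to the status line reported by the modified switch.
   Returns NULL for exit code 0, where nothing is reported. */
const char *rspawn_status(int exitcode)
{
  switch (exitcode) {
    case 0:
      return NULL;                              /* qmail-remote ran */
    case 111:                                   /* already temporary */
    default:                                    /* was "D", now "Z" */
      return "ZUnable to run qmail-remote.\n";  /* keep mail in queue */
  }
}
```

With this mapping, any non-zero exit code leaves the message in the queue for
a later retry instead of bouncing it.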
So... Is there any reason why it should return "D"? Or is there any reason why
it shouldn't return "Z"?
Best regards;
Ricardo Cerqueira
+-------------------
| Ricardo Cerqueira
| PGP Key fingerprint - B7 05 13 CE 48 0A BF 1E 87 21 83 DB 28 DE 03 42
| Novis - Rede Técnica
| Pç. Duque Saldanha, 1, 7º E / 1050-094 Lisboa / Portugal