Problem 1:

Various qmail components check the return code of processes they spawn,
and use that value to decide whether the processes completed
successfully, or whether the failure was permanent or transient (zero
return code is always interpreted as a success, termination by a signal
is always interpreted as a transient failure).

source (executed program)        return code   interpretation

qmail.c (qmail-queue)            11..40,115     permanent f.
                                 other          transient f.
qmail-rspawn.c (qmail-remote)    111            transient f.
                                 other          permanent f. (!!!)
qmail-lspawn.c (qmail-local,     QLX_EXECHARD   permanent f.
                qmail-getpw)     other QLX_*    transient f.
                                 111,71,74,75   transient f.
                                 other          permanent f. (!!!)
qmail-local.c (|command)         100,64,65,70,  permanent f.
                                 76,77,78,112
                                 99             special
                                 other          transient f.

You can see that qmail-rspawn.c and qmail-lspawn.c interpret all unknown
errors as permanent. Unfortunately, many existing dynamic linker
implementations (see appendix A) call _exit() with some specific return
code when the dynamic linking fails (e.g. due to a momentary shortage of
free resources). qmail-[rl]spawn would interpret it as a permanent
failure and bounce the mail. If your karma is bad enough, and
qmail-queue keeps working even when qmail-remote and qmail-local keep 
failing in that "permanent" way, the bounce will turn into a
double-bounce, and the double-bounce will be bounced into oblivion. In
other words, qmail will discard the mail for something that looks a
lot like a "frivolous reason". I leave it up to you to decide how the
blame should be distributed between qmail and the dynamic linker.
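
The difference between the two default policies in the table can be sketched in a few lines of C. This is only an illustration of the table rows, not qmail's actual source: the function names spawn_style() and queue_style(), the enum, and the killed_by_signal flag are hypothetical.

```c
#include <assert.h>

enum verdict { SUCCESS, TRANSIENT, PERMANENT };

/* Hypothetical model of the qmail-rspawn.c row of the table:
 * only 111 is transient, every other nonzero code is permanent. */
enum verdict spawn_style(int killed_by_signal, int code)
{
    assert(code >= 0 && code <= 255);   /* exit codes fit in 8 bits */
    if (killed_by_signal) return TRANSIENT;
    if (code == 0) return SUCCESS;
    if (code == 111) return TRANSIENT;
    return PERMANENT;                   /* unknown codes land here */
}

/* Hypothetical model of the qmail.c row of the table:
 * 11..40 and 115 are permanent, every other nonzero code is transient. */
enum verdict queue_style(int killed_by_signal, int code)
{
    assert(code >= 0 && code <= 255);
    if (killed_by_signal) return TRANSIENT;
    if (code == 0) return SUCCESS;
    if ((code >= 11 && code <= 40) || code == 115) return PERMANENT;
    return TRANSIENT;                   /* unknown codes land here */
}
```

With a dynamic linker that calls _exit(127), spawn_style(0, 127) yields PERMANENT while queue_style(0, 127) yields TRANSIENT. The only difference is which branch serves as the default for codes nobody anticipated.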

Surprisingly, qmail.c and qmail-local.c are much more careful: everything
but the explicitly named values is interpreted as a transient failure
(this also means that a mistyped command after | in .qmail will probably
not cause bounces; see appendices B and C).
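
As a rough check of that claim, the qmail-local.c (|command) row of the table can be modelled as below and fed the shell exit codes from appendices B and C. The names command_style() and cmd_verdict are hypothetical, and the sketch covers normal exits only (not termination by a signal); it is not taken from qmail's source.

```c
#include <assert.h>

enum cmd_verdict { CMD_SUCCESS, CMD_TRANSIENT, CMD_PERMANENT, CMD_SPECIAL };

/* Hypothetical model of the qmail-local.c (|command) row of the table. */
enum cmd_verdict command_style(int code)
{
    assert(code >= 0 && code <= 255);   /* exit codes fit in 8 bits */
    switch (code) {
    case 0:
        return CMD_SUCCESS;
    case 99:
        return CMD_SPECIAL;
    case 100: case 64: case 65: case 70:
    case 76: case 77: case 78: case 112:
        return CMD_PERMANENT;
    default:
        return CMD_TRANSIENT;           /* includes 1, 2 and 127 */
    }
}
```

None of the codes listed in appendices B and C (1, 2, 127) is among the explicitly named permanent values, so all of them fall through to the transient default.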

(Well, qmail-queue's return codes (interpreted in qmail.c) indicating
a permanent failure collide with the codes used by the old Linux ld.so
but this is not very important because libc5 is really obsolete.)


Problem 2:

I think someone has already mentioned it: qmail interprets execve()
failure as a permanent delivery failure in many situations where it makes
no sense.

source (executed program)        interpretation of execve() failure

qmail.c (qmail-queue)            transient (_exit(120))
qmail-rspawn.c (qmail-remote)    determined by error_temp(errno)
qmail-lspawn.c (qmail-local)     determined by error_temp(errno)
qmail-lspawn.c (qmail-getpw)     transient (_exit(QLX_EXECPW))
qmail-local.c (|command)         transient (_exit(111))

error_temp(errno) checks whether the error is of a transient nature (i.e.
it is going to disappear soon without external intervention). But is a
missing (ENOENT) or corrupted (ENOEXEC) file, or a file with incorrect
permissions (EACCES), a good reason to start bouncing mail? (With a
"very descriptive" error message saying something like "Unable to run
qmail-remote".)
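
The two approaches in the table can be contrasted with a small sketch. Everything here is an assumption made for illustration: looks_transient() is a hypothetical stand-in for error_temp() with a deliberately short errno list (not qmail's actual list), and the exit codes 111 and 100 are merely chosen to match the "111 = transient, other = permanent" convention from Problem 1.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for error_temp(); NOT qmail's actual list. */
int looks_transient(int err)
{
    assert(err > 0);                    /* errno values are positive */
    switch (err) {
    case ENOMEM: case EAGAIN: case ENFILE: case EMFILE: case ETXTBSY:
        return 1;
    default:
        return 0;                       /* ENOENT, ENOEXEC, EACCES, ... */
    }
}

/* Policy A: let the errno classification pick the child's exit code
 * after a failed execve(). 111 and 100 are assumed values. */
int exit_code_by_errno(int err)
{
    return looks_transient(err) ? 111 : 100;
}

/* Policy B: treat every execve() failure as transient. */
int exit_code_always_transient(int err)
{
    (void)err;                          /* errno is deliberately ignored */
    return 111;
}
```

Under policy A, a missing or unreadable binary (ENOENT, EACCES) produces the permanent code and the mail bounces; under policy B the message stays in the queue until somebody repairs the installation.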


Appendix A: dynamic linker failure:

Linux ld-linux.so 1.9.9 (libc5 ld.so)    _exit(N) for 1 <= N <= approx. 20
Linux glibc 2.0.7, 2.1.3 ld-linux.so.2   _exit(127)
FreeBSD 3.4-STABLE ld-elf.so.1           _exit(1)
Solaris 2.5, 2.6 ld.so.1                 kill(getpid(), SIGKILL)     
HP-UX 10.20 dld.sl                       ??? _exit(12) (*)

(*) dld.sl(5) says the dynamic linker will abort (SIGABRT) when it
encounters a fatal error but the only failure I was able to induce
experimentally led to _exit(12).


Appendix B: sh -c '('  (syntax error)

GNU bash 1.14.7, 2.01.1       _exit(2)
FreeBSD 3.4-STABLE sh         _exit(2)
Solaris 2.6 sh                _exit(2)
HP-UX 10.20 sh                _exit(2)


Appendix C: sh -c blah  (nonexistent command)

GNU bash 1.14.7, 2.01.1       _exit(127)
FreeBSD 3.4-STABLE sh         _exit(127)
Solaris 2.6 sh                _exit(1)
HP-UX 10.20 sh                _exit(1)
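
Measurements like those in appendices B and C can be repeated on another system with a small helper along these lines. It is a minimal sketch, assuming a POSIX system with the shell at /bin/sh; the function name shell_exit_code() and the marker value 111 are hypothetical choices, not anything taken from qmail.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `sh -c cmd` and return the shell's exit code.
 * Returns -1 if fork() or waitpid() failed or the child was killed by
 * a signal, and 111 if /bin/sh itself could not be executed. */
int shell_exit_code(const char *cmd)
{
    int wstat;
    pid_t pid;

    assert(cmd != 0);
    pid = fork();
    if (pid == -1) return -1;
    if (pid == 0) {
        execl("/bin/sh", "sh", "-c", cmd, (char *)0);
        _exit(111);                     /* arbitrary marker: execl() failed */
    }
    if (waitpid(pid, &wstat, 0) == -1) return -1;
    if (!WIFEXITED(wstat)) return -1;
    return WEXITSTATUS(wstat);
}
```

Calling shell_exit_code("(") corresponds to appendix B and shell_exit_code("blah") to appendix C; as the tables show, the result depends on the shell in use.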


--Pavel Kankovsky aka Peak  [ Boycott Microsoft--http://www.vcnet.com/bms ]
"Resistance is futile. Open your source code and prepare for assimilation."
