On 5 Jan 2011, at 17:29, Richard Clayton wrote:

> -----Original message-----
> Subject:    [exim] exim dies on the interrupted system call
> To:         [email protected]
> Cc:         Каялайнен <[email protected]>,
>            [email protected]
> From:       David Woodhouse <[email protected]>
> Date:       Sat, 1 Jan 2011 00:43:01 +0000
> Message-ID: <[email protected]>
> 
> On Mon, 2010-12-27 at 21:39 -0500, Phil Pennock wrote:
>> This is a bug in Exim.  Looking at the code, I'm rather shocked that
>> it has never bitten us before now. 
> 
> It doesn't bite because most operating systems don't actually return
> short writes on a real file except on EOF. Even though POSIX permits
> them to.
> 
> (The case you've seen is actually returning -1 / EINTR rather than a
> short write where it writes fewer bytes than you asked, but that's just
> a special case of the same thing.)
> 
> In Linux we avoid doing short writes because we *know* a lot of
> userspace will break if we do that. Exim will not be the only program
> which breaks on the FreeBSD system in question.
> 
> But yes, strictly speaking it *is* a bug in Exim. There are a bunch of
> write() calls which we should wrap with our own function that loops
> until it's either written all it had to write, or got a *real* error.
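
For readers of the archive, the wrapper David describes might look roughly like the sketch below. It is only an illustration: the name write_all is hypothetical and is not an existing Exim function, and treating a zero-byte write as EIO is a choice made for this sketch rather than anything stated in the thread.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of Exim): keep calling write() until
 * all len bytes are written or a real error occurs.
 * Returns len on success, -1 on error with errno set. */
ssize_t write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t left = len;

    while (left > 0) {
        ssize_t n = write(fd, p, left);

        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before any data moved: retry */
            return -1;      /* a real error; errno was set by write() */
        }
        if (n == 0) {
            /* Assumption of this sketch: report a zero-byte write as an
             * error instead of spinning forever. */
            errno = EIO;
            return -1;
        }
        p += n;             /* short write: advance past what was written */
        left -= (size_t)n;
    }
    return (ssize_t)len;
}
```

A caller would then replace a bare write(fd, buf, len) with write_all(fd, buf, len) and treat any return value of -1 as a genuine failure.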

Hi David, et al,

As you observe, returning (-1, EINTR) is probably technically to spec and 
correct, but actually something you never want the file system to do. The only 
cases I'm aware of where FreeBSD file systems intentionally return EINTR are 
soft mounts of NFS, or in some rare edge cases, Coda (and maybe AFS by 
implication). As such, I'd consider it a bug if EINTR is getting returned from 
write(2) on a regular file in UFS2 -- and also a surprising one.

It would be worth tracking this down a bit more, since if such a bug does 
exist, we want to fix it. Is there any chance the write(2) is going to a FIFO 
in the file system, or even to a socket, rather than to a regular file? Could 
Exim have its file descriptors mixed up? Is Exim using threading, in which case 
we could be looking at a threading library bug?

(Normally, sleeps performed inside the file system on block I/O are 
non-interruptible, for all the reasons cited above.)

Robert
-- 
## List details at http://lists.exim.org/mailman/listinfo/exim-users 
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/
