Daniel Roethlisberger <[email protected]> 2009-07-10: > I guess it is easier to lock the queue file instead of fully > reopening and fixing the unlink() race.
I cleaned up my original hack, maybe this version is a tad more convincing :) -- Daniel Roethlisberger http://daniel.roe.ch/
>From c1d6aaa03e765d3f6d71b0e6300b303bf44a4200 Mon Sep 17 00:00:00 2001 From: Daniel Roethlisberger <[email protected]> Date: Fri, 10 Jul 2009 11:53:42 +0200 Subject: [PATCH] Fix queue file seek position races by locking Lock the per-message queue file during delivery to each recipient. When delivering mail to multiple recipients, dma(8) fork for each recipient and attempt parallel delivery. Separate delivery processes for multiple recipients of the same message all use the same FILE* pointing to the same file descriptor on the same file pointer and thus a common seek position. If a process lost the CPU during the delivery process, another delivery process for the same message will wreak havoc on the seek position of the shared queue file. This resulted either in silent corruption of the delivered message content or bounces due to apparent queue file corruption. --- libexec/dma/dma.c | 19 +++++++++++++++++++ 1 files changed, 19 insertions(+), 0 deletions(-) diff --git a/libexec/dma/dma.c b/libexec/dma/dma.c index 511b936..e520199 100644 --- a/libexec/dma/dma.c +++ b/libexec/dma/dma.c @@ -612,6 +612,7 @@ deliver(struct qitem *it) const char *errmsg = "unknown bounce reason"; struct timeval now; struct stat st; + struct flock fl; syslog(LOG_INFO, "%s: mail from=<%s> to=<%s>", it->queueid, it->sender, it->addr); @@ -620,11 +621,29 @@ retry: syslog(LOG_INFO, "%s: trying delivery", it->queueid); + bzero(&fl, sizeof(fl)); + fl.l_type = F_WRLCK; + fl.l_whence = SEEK_SET; + if (fcntl(fileno(it->queuef), F_SETLKW, &fl) == -1) { + syslog(LOG_ERR, "%s: failed to lock queue file: %m", + it->queueid); + goto bounce; + } + if (it->remote) error = deliver_remote(it, &errmsg); else error = deliver_local(it, &errmsg); + bzero(&fl, sizeof(fl)); + fl.l_type = F_UNLCK; + fl.l_whence = SEEK_SET; + if (fcntl(fileno(it->queuef), F_SETLKW, &fl) == -1) { + syslog(LOG_ERR, "%s: failed to unlock queue file: %m", + it->queueid); + /* let `error' decide whether we bounce or not */ + } + switch (error) { case 0: unlink(it->queuefn); -- 1.6.0.4
