Daniel Roethlisberger <[email protected]> 2009-07-10:
> I guess it is easier to lock the queue file instead of fully
> reopening and fixing the unlink() race.

I cleaned up my original hack, maybe this version is a tad more
convincing :)

-- 
Daniel Roethlisberger
http://daniel.roe.ch/
>From c1d6aaa03e765d3f6d71b0e6300b303bf44a4200 Mon Sep 17 00:00:00 2001
From: Daniel Roethlisberger <[email protected]>
Date: Fri, 10 Jul 2009 11:53:42 +0200
Subject: [PATCH] Fix queue file seek position races by locking

Lock the per-message queue file during delivery to each recipient.
When delivering mail to multiple recipients, dma(8) fork for each
recipient and attempt parallel delivery.  Separate delivery
processes for multiple recipients of the same message all use the
same FILE* pointing to the same file descriptor on the same file
pointer and thus a common seek position.  If a process lost the
CPU during the delivery process, another delivery process for the
same message will wreak havoc on the seek position of the shared
queue file.  This resulted either in silent corruption of the
delivered message content or bounces due to apparent queue file
corruption.
---
 libexec/dma/dma.c |   19 +++++++++++++++++++
 1 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/libexec/dma/dma.c b/libexec/dma/dma.c
index 511b936..e520199 100644
--- a/libexec/dma/dma.c
+++ b/libexec/dma/dma.c
@@ -612,6 +612,7 @@ deliver(struct qitem *it)
 	const char *errmsg = "unknown bounce reason";
 	struct timeval now;
 	struct stat st;
+	struct flock fl;
 
 	syslog(LOG_INFO, "%s: mail from=<%s> to=<%s>",
 	       it->queueid, it->sender, it->addr);
@@ -620,11 +621,29 @@ retry:
 	syslog(LOG_INFO, "%s: trying delivery",
 	       it->queueid);
 
+	bzero(&fl, sizeof(fl));
+	fl.l_type = F_WRLCK;
+	fl.l_whence = SEEK_SET;
+	if (fcntl(fileno(it->queuef), F_SETLKW, &fl) == -1) {
+		syslog(LOG_ERR, "%s: failed to lock queue file: %m",
+				it->queueid);
+		goto bounce;
+	}
+
 	if (it->remote)
 		error = deliver_remote(it, &errmsg);
 	else
 		error = deliver_local(it, &errmsg);
 
+	bzero(&fl, sizeof(fl));
+	fl.l_type = F_UNLCK;
+	fl.l_whence = SEEK_SET;
+	if (fcntl(fileno(it->queuef), F_SETLKW, &fl) == -1) {
+		syslog(LOG_ERR, "%s: failed to unlock queue file: %m",
+				it->queueid);
+		/* let `error' decide whether we bounce or not */
+	}
+
 	switch (error) {
 	case 0:
 		unlink(it->queuefn);
-- 
1.6.0.4

Reply via email to