On 1/26/2010 9:06 AM, Erik Sonn wrote:
Dear everyone,
I'm working on an antispam proxy that uses Postfix as its MTA. Postfix is
version 2.6.2-RC1 on an Ubuntu 8.04 LTS base system.
Preconditions:
* Postfix shall only accept mails addressed to valid (=existing)
recipients. To accomplish this, I'm using a regexp:/ map on
relay_recipient_maps (the specific file is called "usermaps").
* This usermaps file is automatically generated from an hourly cron-job,
fetching all valid email-addresses via LDAP (however, the Postfix
installation doesn't care about LDAP at all, this is autonomously done
by some perl script).
* The data gathered from LDAP is stuffed into a temporary file until
finished, and then "atomically" copied over the original usermaps
file, before Postfix is triggered to reload.
Problem:
* At very irregular intervals, varying in time and quantity, Postfix
refuses to accept mail because the recipient address is seemingly
unknown, although that specific mail address (it changes every time,
unpredictably) is correctly defined in the usermaps file. The
log messages look like this:
2010-01-26T15:10:29+01:00 hostmail postfix/smtpd[22884]: NOQUEUE:
reject: RCPT from smtp.citrix.com[66.165.176.89]: 550 5.1.1
<alexxxx...@xxxxxxx.de>: Recipient address rejected: User unknown in
relay recipient table; from=<no.repl...@citrix.com>
to=<alexxxxx...@xxxxxxxx.de> proto=ESMTP helo=<SMTP.CITRIX.COM>
* The hourly cron-job runs 24 times a day; on 1-4 of those runs,
Postfix logs the following message:
2010-01-26T08:57:25+01:00 hostmail postfix/smtpd[3398]: warning: regexp
map /etc/postfix/usermaps, line 2434: no closing regexp delimiter "/":
skipping this rule
The line number changes randomly each time, and I have made quite some
effort to make sure that the usermaps file is always complete,
syntactically correct and consistent. As you see, the log entry above is
timed "08:57:25" (the cron-job always begins fetching addresses via LDAP
at *:57).
Interestingly, my 'watch stat /etc/postfix/usermaps' shows this:
# Before the 08:57 cron-job touches usermaps
@Tue Jan 26 08:57:24 CET 2010
Access: 2010-01-26 07:57:24.000000000 +0100
Modify: 2010-01-26 07:57:22.000000000 +0100
Change: 2010-01-26 07:57:22.000000000 +0100
# After the 08:57 cron-job re-wrote usermaps, but Postfix hasn't read it
# yet
@Tue Jan 26 08:57:26 CET 2010
Access: 2010-01-26 08:57:25.000000000 +0100
Modify: 2010-01-26 08:57:25.000000000 +0100
Change: 2010-01-26 08:57:25.000000000 +0100
# After Postfix read the new usermaps after reloading
@Tue Jan 26 08:57:36 CET 2010
Access: 2010-01-26 08:57:35.000000000 +0100
Modify: 2010-01-26 08:57:25.000000000 +0100
Change: 2010-01-26 08:57:25.000000000 +0100
If you look at these times, the file is *read* by Postfix at 08:57:35,
but the log line above reports the warning at 08:57:25. How can this be?
The 10-second delay is due to an intentional sleep() between writing
the usermaps and reloading Postfix.
Moreover, when mails are rejected as described above, the *times* of these
rejects do not seem to correlate with the regexp warnings, nor do
the rejected recipient mail addresses. It seems like everything happens
quite randomly here.
What I've already checked:
* Generation of the usermaps file is OK and always succeeds. All addresses
are successfully fetched, and the file is written syntactically correct and
complete.
* I/O- and buffering-issues have been tested and shouldn't be the
problem (e.g. reloading Postfix while I/O buffer hasn't been flushed
yet).
* The basic Postfix configuration works perfectly and has never caused any
trouble. The usermaps issue seems to occur only when the usermaps file is
getting large (>1k lines; in this specific case, it's about 10k lines
long).
The installation runs on a virtualized platform, using XEN. Postfinger
output is attached. I should also mention that, for various reasons,
it's not *easily* possible for me to simply upgrade the Postfix version.
Thank you very much,
Erik
Postfix is reading a half-written file. A new smtpd process
started while the file copy was in progress.
Running "postfix reload" on a busy system is a killer for
performance. With 10k entries, you'll be much better off
using the hash: or cdb: file type since these file types
detect changes automatically with no need for a reload.[1]
Here's an example of how to fix your problem. Although this uses
hash:, the same basic idea (atomic move rather than copy) will
work with regexp: files.
http://www.postfix.org/DATABASE_README.html#safe_db
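To illustrate the "atomic move rather than copy" idea, here is a minimal
Python sketch of what the generator script could do. It is not taken from
the Postfix documentation. The function name atomic_write_lines and the
".usermaps." temp-file prefix are made up for this example, and it assumes
the temporary file lives in the same directory (and so on the same
filesystem) as the target, because rename is only atomic in that case.

```python
import os
import tempfile


def atomic_write_lines(path, lines):
    """Write lines to path so readers see the old or the new file, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target: a rename across
    # filesystems would fall back to a (non-atomic) copy.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".usermaps.")
    try:
        with os.fdopen(fd, "w") as tmp:
            for line in lines:
                tmp.write(line + "\n")
            # Push the data to disk before the file becomes visible.
            tmp.flush()
            os.fsync(tmp.fileno())
        # mkstemp creates the file with mode 0600; make it readable
        # for the Postfix daemons (adjust to local policy).
        os.chmod(tmp_path, 0o644)
        # rename(2) semantics: the target name switches to the new
        # file in one step.
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave a stray temp file behind on failure.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

A call might look like
atomic_write_lines("/etc/postfix/usermaps", ["/^user@example\\.com$/ OK"]),
where the address is a placeholder. A process that opens the map while the
script runs gets either the complete old file or the complete new one.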
It's probably not a good idea to run -RC level software long term.
-- Noel Jones
[1] half-baked workaround: include an empty hash: file along
with your regexp file in your config, then rebuild the hash:
file whenever the regexp file changes.
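A rough sketch of the workaround in [1], under some assumptions: the map
setting would list both files, for example
"relay_recipient_maps = regexp:/etc/postfix/usermaps, hash:/etc/postfix/usermaps_trigger",
where usermaps_trigger is a hypothetical name for the empty file. The
helper below (its name is also made up) rebuilds that hash: file with
postmap and is meant to run only after the regexp file has been fully
replaced. The run parameter exists only so the command can be swapped out
for testing.

```python
import subprocess


def rebuild_trigger_map(trigger_path, run=subprocess.run):
    """Rebuild the empty hash: map after the regexp file has been replaced."""
    # Make sure the source file exists; it stays empty on purpose.
    open(trigger_path, "a").close()
    # postmap regenerates trigger_path + ".db", which is the file
    # whose change the hash: table type picks up.
    return run(["postmap", "hash:" + trigger_path], check=True)
```

The cron-job would then replace usermaps first and call
rebuild_trigger_map("/etc/postfix/usermaps_trigger") afterwards, instead of
running "postfix reload".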