Wagner, Patrick:
> On 10.12.2016 22:12, Wietse Venema wrote:
> > So there are two possible explanations:
> >
> > 1) Your SMTP server was recently under overload (look for "STRESS"
> > in the maillog file). To avoid accepting unverified mail under
> > overload, remove the "unverified_recipient_defer_code = 250" setting.
> >
> > 2) Your address_verify_map database is corrupted. Remove the .db
> > file, and execute "postfix reload".
> >
> >     Wietse
> According to the log files the server wasn't in STRESS mode at this 
> point in time (about an hour before, it had entered and left STRESS mode 
> within 6 seconds), so this leaves a corrupted verify_cache.db.

That is incorrect. STRESS mode persists for at least 1000 seconds.

        if (serv->stress_param_val != 0) {
            now = event_time();
            if (serv->busy_warn_time < now - 1000) {
                serv->busy_warn_time = now;
                msg_warn("service \"%s\" (%s) has reached its process limit \"%d

The STRESS warning is logged at 1000-second intervals. All this is
done to avoid spamming the logfile and flapping the service as the
load fluctuates.
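
To illustrate only the rate limit in the fragment above, here is a
simplified, self-contained sketch. It is not the Postfix master
daemon code: SERVICE_STATE, maybe_warn_busy() and WARN_INTERVAL are
made-up names, and the caller passes in the time instead of calling
event_time().

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

#define WARN_INTERVAL	1000		/* seconds, as in the fragment above */

/* Hypothetical stand-in for the per-service state. */
typedef struct {
    time_t  busy_warn_time;		/* when the warning was last logged */
} SERVICE_STATE;

/* Returns 1 if a warning was logged, 0 if it was suppressed. */
static int maybe_warn_busy(SERVICE_STATE *serv, time_t now)
{
    assert(serv != 0);
    if (serv->busy_warn_time < now - WARN_INTERVAL) {
        serv->busy_warn_time = now;
        fprintf(stderr, "warning: service has reached its process limit\n");
        return (1);
    }
    return (0);
}
```

With this logic, a second warning cannot appear within 1000 seconds
of the first one, no matter how often the limit is reached in
between.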

> I've removed the database, reloaded postfix, and a test mail to this 
> non-existent address has correctly detected and honored the 
> undeliverable status and I got the NOQUEUE: reject line that I expected.

So it was a corrupted database, which means that (the information
you read) differs from (the information you wrote).

> Apparently the probe couldn't update the address status to 
> "undeliverable" result in the DB - the address in question was actually 

Postfix DOES NOT IGNORE database write errors.  The database wrote
the information somewhere, but it could not find the information
later.

> Apparently the probe couldn't update the address status to 
> "undeliverable" result in the DB - the address in question was actually 
> perfectly valid until November 22nd, so still within 
> address_verify_positive_expire_time = 31d , but not 
> address_verify_positive_refresh_time = 7d, which is why postfix kicked 
> off the probe every time someone tried to send a mail to this recipient.

A "bad" refresh does not destroy a "good" cache entry.

            /*
             * Robustness: don't allow a failed probe to clobber an OK
             * address before it expires. The failed probe is ignored so that
             * the address will be re-probed upon the next query. As long as
             * some probes succeed the address will remain cached as OK.
             */
            if (addr_status == DEL_RCPT_STAT_OK
                || (raw_data = dict_cache_lookup(verify_map, STR(addr))) == 0
                || STATUS_FROM_RAW_ENTRY(raw_data) != DEL_RCPT_STAT_OK) {
                probed = 0;
                updated = (long) time((time_t *) 0);
                verify_make_entry(buf, addr_status, probed, updated, STR(text));
                if (msg_verbose)
                    msg_info("PUT %s status=%d probed=%ld updated=%ld text=%s",
                        STR(addr), addr_status, probed, updated, STR(text));
                dict_cache_update(verify_map, STR(addr), STR(buf));
            }
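
The condition above can be reduced to a small stand-alone sketch.
This is an illustration under assumptions, not the verify(8) source:
CACHE_ENTRY, apply_probe_result() and the STAT_* values are
hypothetical, and the cache is a single in-memory entry instead of
a dict_cache table.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical status codes; not the Postfix DEL_RCPT_STAT_* values. */
enum { STAT_OK = 0, STAT_BOUNCE = 2 };

/* Hypothetical one-address cache entry. */
typedef struct {
    int     present;			/* 0 if the address is not cached */
    int     status;			/* last stored probe status */
    time_t  updated;			/* when the entry was last written */
} CACHE_ENTRY;

/* Returns 1 if the entry was written, 0 if the probe result was ignored. */
static int apply_probe_result(CACHE_ENTRY *entry, int probe_status, time_t now)
{
    assert(entry != 0);

    /*
     * Write when the probe succeeded, when nothing is cached yet, or when
     * the cached status is not OK. A failed probe never replaces an OK
     * entry.
     */
    if (probe_status == STAT_OK
        || !entry->present
        || entry->status != STAT_OK) {
        entry->present = 1;
        entry->status = probe_status;
        entry->updated = now;
        return (1);
    }
    return (0);
}
```

In this sketch a failed probe against a cached OK address returns 0
and leaves the entry untouched, so the address is probed again on
the next query.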

> There's no way to monitor for this kind of corruption then, as I've got 
> no messages in my log telling me that the verify service was unable to 
> update certain database entries?

The update did happen. So the question is: how would you detect that
what you wrote has disappeared?  Note that this corruption may
happen while other transactions are made to the database.

If you can reliably detect that information disappears after an
arbitrary number of unrelated transactions, then there may be a
Turing award waiting for you.

        Wietse
