Mats Luspa:
> Hello!
> 
> Thank you for the exhausting explanation of the problem.
> 
> Here you got the requested information about the system:
> root@outgoingmail-2:~# uname -a
> Linux outgoingmail-2 3.16.0-29-generic #39-Ubuntu SMP Mon Dec 15  
> 22:27:29 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

What is the name of the Ubuntu distribution?

Can you test mail to a non-existent address using a non-container
MTA?  Containers involve a complex blend of resource management
tweaks in order to achieve isolation.

I just tested Postfix 2.11.4 on a 3.17 kernel (Fedora Core 20) and
the bounce service works as expected.

Feb 14 13:45:44 fc20 postfix/smtp[8288]: 29891200686: 
to=<sjklhs...@porcupine.org>, relay=spike.porcupine.org[168.100.189.2]:25, 
delay=116, delays=110/0/5.9/0.04, dsn=5.1.1, status=bounced (host 
spike.porcupine.org[168.100.189.2] said: 550 5.1.1 <sjklhs...@porcupine.org>: 
Recipient address rejected: User unknown (in reply to RCPT TO command))
Feb 14 13:45:44 fc20 postfix/cleanup[8286]: 27A8420068D: 
message-id=<20150214184544.27a84200...@fc20.porcupine.org>
Feb 14 13:45:44 fc20 postfix/bounce[8289]: 29891200686: sender non-delivery 
notification: 27A8420068D
Feb 14 13:45:44 fc20 postfix/qmgr[8285]: 27A8420068D: from=<>, size=2466, 
nrcpt=1 (queue active)
Feb 14 13:45:44 fc20 postfix/qmgr[8285]: 29891200686: removed
Feb 14 13:45:44 fc20 postfix/local[8290]: 27A8420068D: 
to=<wie...@fc20.porcupine.org>, relay=local, delay=0.07, 
delays=0.02/0.01/0/0.04, dsn=2.0.0, status=sent (delivered to mailbox)
Feb 14 13:45:44 fc20 postfix/qmgr[8285]: 27A8420068D: removed

        Wietse

> root@outgoingmail-2:~# postconf mail_version
> mail_version = 2.11.1
> 
> I've installed postfix from the Ubuntu utopic package.
> 
> More insformation about the system is that it is a linux container.  
> But I don't think that has any effect to the software.
> 
> It's no problem for me to let you get access to the system to debug if  
> you would prefer that. If you are familiar with linux-containers I  
> could also make a clone and you could get it.
> I could also get the source code from the package, compile and add the  
> debugging code that you mentioned in the mail.
> 
> Mail me what method that you want to use.
> 
> /Regards Mats
> 
> Quoting Wietse Venema <wie...@porcupine.org>:
> 
> > Thanks for the "smtp -v/relay -v" logging.  Your logging confirms
> > that there is a bogus error talking to your bounce daemon.
> >
> > Although Postfix detects the bogus error, unfortunately it produces
> > no informative logging for this particular error.
> >
> > Questions:
> >
> > - What is the output from "uname -a"? Postfix has bug workarounds
> > for MacOSX and Solaris, but maybe they improved their software and
> > we need to add a new workaround.
> >
> > - What is the output from "postconf mail_version"? maybe you are
> > running a Postfix version that does not have all the known OS bug
> > workarounds.
> >
> > - Did you build Postfix from source code, or is this a binary
> > distribution? If you built from source I will ask you to modify the
> > source for better logging. If you installed a binary distribution,
> > I can set up a VM with the same OS and Postfix distribution and do
> > the experiments myself.
> >
> > The remainder of this email is annotated logging that shows problems
> > talking to the bounce daemon, in both the Postfix SMTP client and
> > in the queue manager. The logging also illustrates that is a bogus
> > error because the bounce service requests actually succeed.
> >
> > First, the Postfix SMTP client receives a 550 reply to an RCPT TO command,
> > and attempts to report the delivery error to the bounce service (this
> > appends the error report to a file /var/spool/postfix/48E68BF5F).
> >
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: <  
> > mail.telia.com[62.20.233.128]:25: 550 RCPT  
> > TO:<nonexistingm...@telia.com> User unknown
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]:  
> > connect to subsystem private/bounce
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr nrequest = 0
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr flags = 0
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr queue_id = 48E68BF5F
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr original_recipient = nonexistingm...@telia.com
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr recipient = nonexistingm...@telia.com
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr offset = 751
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr dsn_orig_rcpt = rfc822;nonexistingm...@telia.com
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr notify_flags = 0
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr status = 5.0.0
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr diag_type = smtp
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr diag_text = 550 RCPT TO:<nonexistingm...@telia.com> User unknown
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr mta_type = dns
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr mta_mname = mail.telia.com
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr action = failed
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr reason = host mail.telia.com[62.20.233.128] said: 550 RCPT  
> > TO:<nonexistingm...@telia.com> User unknown (in reply to RCPT TO  
> > command)
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]:  
> > connect to subsystem private/defer
> >
> > After "send attr reason" the bounce daemon is supposed to reply
> > with a status (success or failure). But the logging shows that the
> > Postfix SMTP client does not attempt to receive that response.
> > Instead it has decided that the bounce request could not be sent,
> > and makes a connection to the defer service to report that error.
> >
> > As we will see later the bounce request did actually succeed.
> >
> > Here is the function that sends the bounce request and receives the
> > status reply.
> >
> > int     mail_command_client(const char *class, const char *name,...)
> > {
> >     va_list ap;
> >     VSTREAM *stream;
> >     int     status;
> >
> >     /*
> >      * Talk a little protocol with the specified service.
> >      */
> >     if ((stream = mail_connect(class, name, BLOCKING)) == 0)
> >         return (-1);
> >     va_start(ap, name);
> >     status = attr_vprint(stream, ATTR_FLAG_NONE, ap);
> >     va_end(ap);
> >     if (status != 0
> >         || attr_scan(stream, ATTR_FLAG_STRICT,
> >                      RECV_ATTR_INT(MAIL_ATTR_STATUS, &status), 0) != 1)
> >         status = -1;
> >     (void) vstream_fclose(stream);
> >     return (status);
> > }
> >
> > Apparently, attr_vprint (an alias for attr_vprint0) returns that
> > the request could not be sent, OR attr_scan (an alias for attr_scan0)
> > fails to receive the reply.
> >
> > Here is the attr_vprint0 code fragment:
> >
> > int     attr_vprint0(VSTREAM *fp, int flags, va_list ap)
> > {
> >     ...
> >     /*
> >      * Iterate over all (type, name, value) triples, and produce output on
> >      * the fly.
> >      */
> >     while ((attr_type = va_arg(ap, int)) != ATTR_TYPE_END) {
> >         switch (attr_type) {
> >         ... code that logs "send attr xxx = yyy" and that uses
> >         vstream_fprintf() or vstream_fwrite() depending on the
> >         attribute type...
> >     }
> >     if ((flags & ATTR_FLAG_MORE) == 0)
> >         VSTREAM_PUTC('\0', fp);
> >     return (vstream_ferror(fp));
> > }
> >
> > The vstream_ferror() at the end implements the error check on exit
> > from the attr_vprint0 function.
> >
> > Now it is extremely unlikely that vstream_ferror() reports an error
> > on a newly-opened stream that has only a few bytes queued for output
> > and that has never been flushed.
> >
> > So let's have a look at attr_scan0() and see if this function
> > can return without trying to read any input.
> >
> > int     attr_vscan0(VSTREAM *fp, int flags, va_list ap)
> > {
> >     ...declarations...
> >
> >     /*
> >      * EOF check.
> >      */
> >     if ((ch = VSTREAM_GETC(fp)) == VSTREAM_EOF)
> >         return (0);
> >     vstream_ungetc(fp, ch);
> >
> >     ...after this comes code that logs "wanted attribute: <name of
> >     attribute>" and that attempts to receive the attribute, and
> >     reports "premature EOF" or other read errors...
> >
> > So the more likely explanation is that VSTREAM_GETC() reports an
> > error. That error happens either as the queued output with bounce
> > request is auto-flushed, or it waits for the socket to become
> > readable.
> >
> > This activity is logged with "-v -v -v", but that may actually slow
> > down the program enough to affect kernel bug race-condition errors
> > and change the result, just like program tracing does.
> >
> > At this point I am pretty sure that we are looking at a kernel bug
> > with UNIX-domain sockets, but I don't want to debug kernel bugs
> > over email. I can debug a system that runs in a local VM.
> >
> > Debugging should be straightforward: mail_command_client() is used
> > only for bounce daemon requests (including defer and trace requests)
> > so this function can be ripped apart, and the fault will be with
> > one of the following steps:
> >
> >     write-wait for the socket to become writable
> >
> >     write the queued bounce server request
> >
> >     read-wait for the socket to become readable
> >
> > The remainder of the log shows that the bounce request succeeded anyway.
> >
> > First, the defer service request to report the bounce service error:
> >
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]:  
> > connect to subsystem private/defer
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr nrequest = 0
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr flags = 0
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr queue_id = 48E68BF5F
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr original_recipient = nonexistingm...@telia.com
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr recipient = nonexistingm...@telia.com
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr offset = 751
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr dsn_orig_rcpt = rfc822;nonexistingm...@telia.com
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr notify_flags = 0
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr status = 4.3.0
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr diag_type = smtp
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr diag_text = 550 RCPT TO:<nonexistingm...@telia.com> User unknown
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr mta_type = dns
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr mta_mname = mail.telia.com
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr action = delayed
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr reason = bounce or trace service failure
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]:  
> > 48E68BF5F: to=<nonexistingm...@telia.com>,  
> > relay=mail.telia.com[62.20.233.128]:25, delay=5.3,  
> > delays=0.16/0/0.06/5.1, dsn=4.3.0, status=deferred (bounce or trace  
> > service failure)
> >
> > Then some code to update the fast flush log if applicable (it is not):
> >
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]:  
> > flush_add: site telia.com id 48E68BF5F
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]:  
> > match_hostname: telia.com ~? irf.se
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]:  
> > match_list_match: telia.com: no match
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]:  
> > flush_add: site telia.com id 48E68BF5F status 4
> >
> > The last part of the outbound SMTP session:
> >
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: >  
> > mail.telia.com[62.20.233.128]:25: RSET
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]:  
> > smtp_stream_setup: maxtime=20 enable_deadline=0
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: <  
> > mail.telia.com[62.20.233.128]:25: 250 RSET
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: >  
> > mail.telia.com[62.20.233.128]:25: QUIT
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]:  
> > name_mask: resource
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]:  
> > name_mask: software
> >
> > The report from SMTP client to queue manager that delivery was deferred:
> >
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]:  
> > deliver_request_final: send: "" -1
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr status =
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr diag_type =
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr diag_text =
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr mta_type =
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr mta_mname =
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr action =
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr reason =
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr status = 4294967295
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]:  
> > master_notify: status 1
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]:  
> > connection closed
> >
> > And that is the end of the SMTP client's attempt to deliver the  
> > message with queue ID 48E68BF5F.
> >
> > And now for the surprise: the bounce daemon did record the request
> > to bounce the message with queue ID 48E68BF5F. The queue manager
> > sees that this file exists and asks the bounce daemon to create a
> > non-delivery notification for the message with queue ID 87282BF62.
> >
> > And although the queue manager reports an error talking to the  
> > bounce service,
> > the bounce daemon creates a delivery status notification with queue  
> > ID 87282BF62.
> >
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/cleanup[12867]:  
> > 87282BF62: message-id=<20150214081828.87282b...@outgoingmail-2.irf.se>
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/bounce[12876]:  
> > 48E68BF5F: sender non-delivery notification: 87282BF62
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/qmgr[12816]:  
> > 87282BF62: from=<>, size=2552, nrcpt=1 (queue active)
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/qmgr[12816]:  
> > 48E68BF5F: status=deferred (bounce failed)
> >
> > The next fragment is the SMTP client delivering the delivery status
> > notification with queue ID 87282BF62.
> >
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]:  
> > connection established
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]:  
> > master_notify: status 0
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]:  
> > deliver_request_initial: send initial status
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send  
> > attr status = 0
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: smtp  
> > socket: wanted attribute: flags
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: input  
> > attribute name: flags
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: input  
> > attribute value: 3
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: smtp  
> > socket: wanted attribute: queue_name
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: input  
> > attribute name: queue_name
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: input  
> > attribute value: active
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: smtp  
> > socket: wanted attribute: queue_id
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: input  
> > attribute name: queue_id
> > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: input  
> > attribute value: 87282BF62
> >
> > This is then delivered.
> >
> 
> 
> 
> -- 
> Mats Luspa
> Phone: +46 (0)980 79 022
> Institutet f?r rymdfysik               Fax: +46 (0)980 79 050
> Swedish Institute of Space Physics      email: ma...@irf.se
> Visiting/Delivery address: Rymdcampus 1, SE-981 92 Kiruna
> Postal address: Box 812, SE-981 28 Kiruna
> --
> PGP Public Key: http://www.irf.se/~matsl/pgpmatsl.asc
> 
> 

Reply via email to