I had a little debugging session with awolk@ over at #openbsd-daily.  His
smtpd would over time end up with hung sessions that never time out.

The problem is related to the data_io path's congestion control, which
may pause the session.  While the session is paused, the io system does
not wait for read events, so the session has no chance to time out
until it is resumed.

If the pause happens when a full message is just about to pass through
the data_io path, the session is never resumed, even though there is
obviously no more congestion and the session should be reading more
input from the client again.

A debug trace excerpt shows the course of events:

smtp: 0xe54baa1e000: IO_DATAIN <io:0xe551d0d5000 fd=25 to=300000 fl=R ib=16839 ob=0>
debug: smtp: 0xe54baa1e000: filter congestion: pausing session
smtp: 0xe54baa1e000 (data): IO_LOWAT <io:0xe55ac8e8c00 fd=32 to=-1 fl=W ib=0 ob=0>
debug: smtp: 0xe54baa1e000: data io done (259146 bytes)
debug: 0xe54baa1e000: end of message, msgflags=0x0000
smtp: 0xe54baa1e000: >>> 250 2.0.0: 4f36f9e3 Message accepted for delivery
smtp: 0xe54baa1e000: STATE_BODY -> STATE_HELO
smtp: 0xe54baa1e000: IO_LOWAT <io:0xe551d0d5000 fd=25 to=300000 fl=W,F_PI ib=0 ob=0>

From this point on, session 0xe54baa1e000 and its io 0xe551d0d5000
(which has the pause_in flag) are never seen again in the trace, and
fstat shows a corresponding connection to smtpd that never goes away.

The proposed fix is to always resume the session if the data_io path
hits the low water mark.
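
To make the difference concrete, here is a small stand-alone model of
the two IO_LOWAT branches.  It is only a sketch: struct toy_session and
the toy_* functions are made up for illustration and are not smtpd
code.  The fields merely stand in for s->tx->dataeom,
io_queued(s->tx->oev) and io_paused(s->io, IO_IN).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the session state; not smtpd's real types. */
struct toy_session {
	bool dataeom;	/* end of message seen (models s->tx->dataeom) */
	int  queued;	/* bytes left on the data path (models io_queued()) */
	bool paused_in;	/* client io paused for input (models io_paused()) */
	bool done;	/* message handed off (models smtp_data_io_done()) */
};

/* Old shape: the resume is only reachable when the message is not done. */
void
toy_lowat_old(struct toy_session *s)
{
	if (s->dataeom && s->queued == 0) {
		s->done = true;
	} else if (s->paused_in) {
		s->paused_in = false;
	}
}

/* New shape: the resume check runs whether or not the message is done. */
void
toy_lowat_new(struct toy_session *s)
{
	if (s->dataeom && s->queued == 0)
		s->done = true;

	if (s->paused_in)
		s->paused_in = false;
}

/*
 * Deliver one low-water-mark event to a session that was paused just as
 * the full message finished draining, and report whether the session is
 * still paused afterwards.
 */
bool
toy_stuck_after_lowat(void (*handler)(struct toy_session *))
{
	struct toy_session s = {
		.dataeom = true, .queued = 0, .paused_in = true, .done = false
	};

	assert(handler != NULL);
	handler(&s);
	return s.paused_in;
}
```

In this model toy_stuck_after_lowat(toy_lowat_old) reports the session
as still paused, which matches the trace above, while
toy_stuck_after_lowat(toy_lowat_new) does not.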

Mr. Wolk tested this diff against smtpd on 6.1 as well as against a
-current version of smtpd (compiled on the same system running 6.1).


Index: usr.sbin/smtpd/smtp_session.c
===================================================================
RCS file: /cvs/src/usr.sbin/smtpd/smtp_session.c,v
retrieving revision 1.303
diff -u -p -r1.303 smtp_session.c
--- usr.sbin/smtpd/smtp_session.c       17 May 2017 14:00:06 -0000      1.303
+++ usr.sbin/smtpd/smtp_session.c       15 Jun 2017 20:28:12 -0000
@@ -1474,9 +1474,10 @@ smtp_data_io(struct io *io, int evt, voi
                break;
 
        case IO_LOWAT:
-               if (s->tx->dataeom && io_queued(s->tx->oev) == 0) {
+               if (s->tx->dataeom && io_queued(s->tx->oev) == 0)
                        smtp_data_io_done(s);
-               } else if (io_paused(s->io, IO_IN)) {
+
+               if (io_paused(s->io, IO_IN)) {
                        log_debug("debug: smtp: %p: filter congestion over: resuming session", s);
                        io_resume(s->io, IO_IN);
                }

