Victor Duchovni escribió:
On Mon, Dec 15, 2008 at 05:53:24PM -0700, Jes?s Manuel Loaiza Vidal wrote:

Here is the log with smtpd_tls_loglevel = 4 (with a retry just after the error) and the corresponding pcap file:
postfix-2.txt <http://www.ich.edu.mx/attachments/postfix-2.txt>
tcp-2.cap <http://www.ich.edu.mx/attachments/tcp-2.cap>

It is still a bit of a mystery. The client (PCAP trace) sends data that
the server does not appear to read (SSL_read() returns -1). The server
responds to the client's packet with RST, but did not previously send a
FIN, which suggests that the socket was closed after the client's data
arrived, though of course it is possible that the application close
logic was initiated before that.

Sadly, we need to make more measurements to distinguish between the
following possibilities:

    - Buggy kernel, erroneously reporting EOF on a socket.

    - Buggy OpenSSL library. Please report which version you are using,
      as Gentoo builds from source consider posting the entries for
      the most recent released version from OpenSSL CHANGES file
      or corresponding Gentoo file.

    - Buggy compiler (generates broken Kernel, C-library, OpenSSL or Postfix
      code).

    - Erratic hardware (memory, CPU, ...)

The following patch should provide more detailed debugging information
(with the TLS loglevel still set to 4). Please build a patched 2.5.5
and provide a new set of logs.

Index: src/tls/tls_bio_ops.c
--- src/tls/tls_bio_ops.c       10 Jan 2008 03:00:29 -0000      1.1.1.2
+++ src/tls/tls_bio_ops.c       16 Dec 2008 07:34:55 -0000
@@ -188,12 +188,17 @@
            want_write = sizeof(buffer);
        from_bio = BIO_read(network_bio, buffer, want_write);
+ msg_info("%s fd=%d want_write=%d from_bio=%d",
+                myname, fd, want_write, from_bio);
+
        /*
         * Write the complete buffer contents to the network.
         */
        for (write_pos = 0; write_pos < from_bio; /* see below */ ) {
-           if (timeout > 0 && write_wait(fd, timeout) < 0)
+           if (timeout > 0 && write_wait(fd, timeout) < 0) {
+               msg_info("%s fd=%d write timeout=%d", myname, fd, timeout);
                return (-1);
+           }
            num_write = write(fd, buffer + write_pos, from_bio - write_pos);
            if (num_write <= 0) {
                if ((num_write < 0) && (timeout > 0) && (errno == EAGAIN)) {
@@ -217,12 +222,19 @@
     while ((want_read = BIO_ctrl_get_read_request(network_bio)) > 0) {
        if (want_read > sizeof(buffer))
            want_read = sizeof(buffer);
-       if (timeout > 0 && read_wait(fd, timeout) < 0)
+
+       msg_info("%s fd=%d want_read=%d", myname, fd, want_read);
+
+       if (timeout > 0 && read_wait(fd, timeout) < 0) {
+           msg_info("%s fd=%d read timeout=%d", myname, fd, timeout);
            return (-1);
+       }
        num_read = read(fd, buffer, want_read);
-       if (num_read == 0)
+       if (num_read == 0) {
+           msg_info("%s fd=%d EOF", myname, fd);
            /* FIX 200412 Cannot return a zero read count. */
            return (-1);
+       }
        if (num_read < 0) {
            if ((num_read < 0) && (timeout > 0) && (errno == EAGAIN)) {
                msg_warn("read() returns EAGAIN on a readable file 
descriptor!");
@@ -307,6 +319,9 @@
        }
 #endif
+ if (TLScontext->log_level >= 4)
+           msg_info("TLS I/O status %d error %d", status, err);
+
        /*
         * Find out if we must retry the operation and/or if there is pending
         * network I/O.
@@ -324,8 +339,11 @@
        case SSL_ERROR_WANT_READ:
            biop_retval =
                network_biopair_interop(fd, timeout, TLScontext->network_bio);
-           if (biop_retval < 0)
+           if (biop_retval < 0) {
+               if (TLScontext->log_level >= 4)
+                   msg_info("TLS bio I/O error %d", biop_retval);
                return (-1);                    /* network read/write error */
+           }
            break;
        default:
            retval = status;
Index: src/tls/tls_stream.c
--- src/tls/tls_stream.c        10 Jan 2008 03:00:34 -0000      1.1.1.2
+++ src/tls/tls_stream.c        16 Dec 2008 07:17:10 -0000
@@ -82,6 +82,8 @@
     if (ret > 0 && TLScontext->log_level >= 4)
        msg_info("Read %ld chars: %.*s",
                 (long) ret, (int) (ret > 40 ? 40 : ret), (char *) buf);
+    else if (ret <= 0 && TLScontext->log_level >= 4)
+       msg_info("Read %ld chars", (long) ret);
     return (ret);
 }

I'm using OpenSSL 0.9.8i with USE flags 'kerberos zlib'
OpenSSL changelog <http://www.ich.edu.mx/attachments/openssl-ChangeLog>

Here are the log and pcap file with the patch:

Mail log <http://www.ich.edu.mx/attachments/postfix-3.txt>
PCAP file <http://www.ich.edu.mx/attachments/tcp-3.cap>

Reply via email to