Greetings.

The following four commits are some of the changes that have been made
to the tty layer since kernel version 3.11:

1) f95499c3030fe1bfad57745f2db1959c5b43dca8
    n_tty: Don't wait for buffer work in read() loop

2) f8747d4a466ab2cafe56112c51b3379f9fdb7a12
    tty: Fix pty master read() after slave closes

3) 52bce7f8d4fc633c9a9d0646eef58ba6ae9a3b73
    pty, n_tty: Simplify input processing on final close

4) 1a48632ffed61352a7810ce089dc5a8bcd505a60
    pty: Fix input race when closing

Commit "4)" corrected an issue whereby EIO could be prematurely
returned on a read of one end of a master/slave pty pair after the
other had been completely closed.  Yet, I would argue that EAGAIN
should not be returned either when there actually is data to be
returned.  This whether or not the other end has been completely
closed.

Indeed, the previous code (before commit "1)") checked the other end
of the pty pair for any data before returning EAGAIN.  This mimics the
behaviour of other System-V variants (Solaris, AIX, etc.) in this
regard and ensured that EAGAIN really did mean no data was available
at the time of the call.

Portable OpenSSH, since release 4.6p1 in 2007, relies on this being
the case and has been broken since commit "1)" introduced spurious
EAGAIN returns (i.e. as of 3.12 kernels).  The scenario at hand is
as follows.

After sshd has been SIGCHLD'ed about the shell's termination, it
continues to read the master pty until an error occurs.  This error
will be EIO if no process has the slave pty open.  Otherwise (for
example when the shell spawned long-running processes in the
background before terminating), that error is expected to be EAGAIN.
sshd cannot continue to read until an EIO in all cases, because doing
so causes the session to hang until all processes have closed the
slave pty, which is not the desired behaviour.  Thus a spurious EAGAIN
return causes sshd to lose data, whether or not the slave pty is
completely closed.

I've been using the following script to reproduce the problem.  It
loops until the issue is detected.

        #! /bin/bash

        LOG=sshlog-`date "+%F.%T"`

        touch ${LOG}

        while test -z "`grep -n '^Connection' ${LOG} | grep -v '0:Connection'`"
        do
         ssh -p 22 -tt root@localhost \
          '/bin/bash -c "/bin/ping -c4 8.8.8.8"' 2>&1 | \
          tee -a ${LOG}
        done

It should be noted that the problem is extremely rare, but still
occurs, on real hardware.  This bug is easier to replicate in a
virtual machine such as those that can be created using Google Cloud.

The patch below is a suggested fix.  It was developed using a 4.3.0
kernel and should apply, modulo fuzz, to any release >= 4.0.5.  My
suggested fix is modeled after commit "2)" mentionned above.  Given
commit "2)" was later reworked by commit "3)", I fully expect my fix
to be reworked as well.

I volunteer to backport the fix this ends up being to any stable
release >= 3.12 deemed needed.

Please Reply-To-All.

Thanks and have a great day.

Marc.

Reported-by: Volth <open...@volth.com>
BugLink: https://bugzilla.mindrot.org/show_bug.cgi?id=52
BugLink: https://bugzilla.mindrot.org/show_bug.cgi?id=2492
Signed-off-by: Marc Aurele La France <t...@tuyoix.net>

--- a/drivers/tty/n_tty.c
+++ a/drivers/tty/n_tty.c
@@ -2281,20 +2281,26 @@ static ssize_t n_tty_read(struct tty_struct *tty, 
struct file *file,
                        if (!timeout)
                                break;
                        if (file->f_flags & O_NONBLOCK) {
-                               retval = -EAGAIN;
-                               break;
-                       }
-                       if (signal_pending(current)) {
-                               retval = -ERESTARTSYS;
-                               break;
-                       }
-                       up_read(&tty->termios_rwsem);
+                               up_read(&tty->termios_rwsem);
+                               flush_work(&tty->port->buf.work);
+                               down_read(&tty->termios_rwsem);
+                               if (!input_available_p(tty, 0)) {
+                                       retval = -EAGAIN;
+                                       break;
+                               }
+                       } else {
+                               if (signal_pending(current)) {
+                                       retval = -ERESTARTSYS;
+                                       break;
+                               }
+                               up_read(&tty->termios_rwsem);

-                       timeout = wait_woken(&wait, TASK_INTERRUPTIBLE,
-                                            timeout);
+                               timeout = wait_woken(&wait, TASK_INTERRUPTIBLE,
+                                                    timeout);

-                       down_read(&tty->termios_rwsem);
-                       continue;
+                               down_read(&tty->termios_rwsem);
+                               continue;
+                       }
                }

                if (ldata->icanon && !L_EXTPROC(tty)) {
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to