>Synopsis:      ftp segfaults when aborting a stalled download
>Category:      ftp
>Environment:
        System      : OpenBSD 6.2
        Details     : OpenBSD 6.2-current (GENERIC.MP) #267: Sat Dec  9 
19:08:28 MST 2017
                         
[email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP

        Architecture: OpenBSD.amd64
        Machine     : amd64
>Description:
When downloading large-ish files (~100MB) over mobile data, my
ftp sessions often end up stalled.  When I then kill the process
with Ctrl-C, this happens:

        nasu$ ftp speedtest.tele2.net:/1000GB.zip # bit of an exaggeration...
        Trying 90.130.70.73...
        Connected to speedtest.tele2.net.
        220 (vsFTPd 2.3.5)
        331 Please specify the password.
        230 Login successful.
        Remote system type is UNIX.
        Using binary mode to transfer files.
        200 Switching to Binary mode.
        Retrieving /1000GB.zip
        local: 1000GB.zip remote: 1000GB.zip
        150 Opening BINARY mode data connection for 1000GB.zip (1073741824000 
bytes).
          0% |                                                  |   184 MB  - 
stalled -^C
        receive aborted
        waiting for remote to finish abort.
        ftp: abort: Broken pipe
        Segmentation fault (core dumped)

This is exactly the same issue as reported here in 2016:
https://marc.info/?l=openbsd-bugs&m=145941236019090

The problem is due to the lostpeer() SIGPIPE handler defined
(and installed) in main.c.  If the control connection has been
cut due to the stall, then the following line in ftp.c:

        2059    if (send(fileno(cout), buf, 3, MSG_OOB) != 3)

will trigger a SIGPIPE, and lostpeer() will dutifully fclose() and set
cout to NULL.  Unfortunately it's used again just two lines later:

        2061    fprintf(cout, "%cABOR\r\n", DM);

causing a NULL pointer dereference, and thus a segfault.
        
>How-To-Repeat:
If you don't have a spotty internet connection on which to test this,
it's pretty easy to reproduce in gdb(1):

        $ gdb /usr/obj/usr.bin/ftp/ftp # need to have compiled with debug 
symbols
        GNU gdb 6.3
        Copyright 2004 Free Software Foundation, Inc.
        GDB is free software, covered by the GNU General Public License, and 
you are
        welcome to change it and/or distribute copies of it under certain 
conditions.
        Type "show copying" to see the conditions.
        There is absolutely no warranty for GDB.  Type "show warranty" for 
details.
        This GDB was configured as "amd64-unknown-openbsd6.2"...
        (gdb) br ftp.c:2059 # line which would trigger the SIGPIPE naturally
        Breakpoint 1 at 0xe756: file /usr/src/usr.bin/ftp/ftp.c, line 2059.
        (gdb) run speedtest.tele2.net:/1000GB.zip # download some file
        Starting program: /usr/obj/usr.bin/ftp/ftp 
speedtest.tele2.net:/1000GB.zip
        Breakpoint 1 at 0x1df7cd00e756: file /usr/src/usr.bin/ftp/ftp.c, line 
2059.
        Trying 90.130.70.73...
        Connected to speedtest.tele2.net.
        220 (vsFTPd 2.3.5)
        331 Please specify the password.
        230 Login successful.
        Remote system type is UNIX.
        Using binary mode to transfer files.
        200 Switching to Binary mode.
        Retrieving /1000GB.zip
        local: 1000GB.zip remote: 1000GB.zip
        150 Opening BINARY mode data connection for 1000GB.zip (1073741824000 
bytes).
           0% |                                                  |  1534 KB 
567:55:14 ETA^C # go to gdb command mode
        Program received signal SIGINT, Interrupt.
        _thread_sys_read () at -:3
        3       -: No such file or directory.
                in -
        Current language:  auto; currently asm
        (gdb) signal SIGINT # interrupt the download
        Continuing with signal SIGINT.

        Program received signal SIGINT, Interrupt.
        _thread_sys_read () at -:3
        3       in -
        (gdb) signal SIGINT # this needs to be done twice for gdb reasons, I 
think?
        Continuing with signal SIGINT.
          0% |                                                  |  2290 KB 
380:25:53 ETA
        receive aborted
        waiting for remote to finish abort.

        Breakpoint 1, abort_remote (din=0x1dfa7fdcf880)
            at /usr/src/usr.bin/ftp/ftp.c:2059
        2059            if (send(fileno(cout), buf, 3, MSG_OOB) != 3)
        Current language:  auto; currently minimal
        (gdb) signal SIGPIPE # manually trigger a sigpipe
        Continuing with signal SIGPIPE.

        Breakpoint 1, abort_remote (din=0x1dfa7fdcf880)
            at /usr/src/usr.bin/ftp/ftp.c:2059
        2059            if (send(fileno(cout), buf, 3, MSG_OOB) != 3)
        (gdb) continue # and continue on to the segfault.
        Continuing.

        Program received signal SIGSEGV, Segmentation fault.
        0x00001df7cd00e774 in abort_remote (din=0x1dfa7fdcf880)
            at /usr/src/usr.bin/ftp/ftp.c:2059
        2059            if (send(fileno(cout), buf, 3, MSG_OOB) != 3)
        (gdb) continue
        Continuing.

        Program terminated with signal SIGSEGV, Segmentation fault.
        The program no longer exists.

>Fix:
One could just tell the offending send() call not to raise SIGPIPE:

diff --git usr.bin/ftp/ftp.c usr.bin/ftp/ftp.c
index 8ac659d61..b22a14944 100644
--- usr.bin/ftp/ftp.c
+++ usr.bin/ftp/ftp.c
@@ -2056,7 +2056,7 @@ abort_remote(FILE *din)
         * after urgent byte rather than before as is protocol now
         */
        snprintf(buf, sizeof buf, "%c%c%c", IAC, IP, IAC);
-       if (send(fileno(cout), buf, 3, MSG_OOB) != 3)
+       if (send(fileno(cout), buf, 3, MSG_OOB|MSG_NOSIGNAL) != 3)
                warn("abort");
        fprintf(cout, "%cABOR\r\n", DM);
        (void)fflush(cout);

However, ftp.c contains a few fprintf()/fputc() calls to cout (including
one in the very snippet above) immediately followed by a fflush(cout),
which could trigger the same crash if the fprintf()/fputc() call chooses
to flush its buffers to the disconnected socket.  I expect this can't
actually happen in the current implementation, since the constant
fflush()ing keeps the write buffer very near empty, but I don't know if
one should rely on that.  But then again, I don't know how this should
be fixed if you don't...

Reply via email to