>Synopsis: ftp segfaults when aborting a stalled download
>Category: ftp
>Environment:
System : OpenBSD 6.2
Details : OpenBSD 6.2-current (GENERIC.MP) #267: Sat Dec 9
19:08:28 MST 2017
[email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
Architecture: OpenBSD.amd64
Machine : amd64
>Description:
When downloading large-ish files (~100MB) over mobile data, my
ftp sessions often end up stalled. When I then kill the process
with Ctrl-C, this happens:
nasu$ ftp speedtest.tele2.net:/1000GB.zip # bit of an exaggeration...
Trying 90.130.70.73...
Connected to speedtest.tele2.net.
220 (vsFTPd 2.3.5)
331 Please specify the password.
230 Login successful.
Remote system type is UNIX.
Using binary mode to transfer files.
200 Switching to Binary mode.
Retrieving /1000GB.zip
local: 1000GB.zip remote: 1000GB.zip
150 Opening BINARY mode data connection for 1000GB.zip (1073741824000
bytes).
0% | | 184 MB -
stalled -^C
receive aborted
waiting for remote to finish abort.
ftp: abort: Broken pipe
Segmentation fault (core dumped)
This is exactly the same issue as reported here in 2016:
https://marc.info/?l=openbsd-bugs&m=145941236019090
The problem is due to the lostpeer() SIGPIPE handler defined
(and installed) in main.c. If the control connection has been
cut due to the stall, then the following line in ftp.c:
2059 if (send(fileno(cout), buf, 3, MSG_OOB) != 3)
will trigger a SIGPIPE, and lostpeer() will dutifully fclose() and set
cout to NULL. Unfortunately it's used again just two lines later:
2061 fprintf(cout, "%cABOR\r\n", DM);
causing a NULL pointer dereference, and thus a segfault.
>How-To-Repeat:
If you don't have a spotty internet connection on which to test this,
it's pretty easy to reproduce in gdb(1):
$ gdb /usr/obj/usr.bin/ftp/ftp # need to have compiled with debug
symbols
GNU gdb 6.3
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and
you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for
details.
This GDB was configured as "amd64-unknown-openbsd6.2"...
(gdb) br ftp.c:2059 # line which would trigger the SIGPIPE naturally
Breakpoint 1 at 0xe756: file /usr/src/usr.bin/ftp/ftp.c, line 2059.
(gdb) run speedtest.tele2.net:/1000GB.zip # download some file
Starting program: /usr/obj/usr.bin/ftp/ftp
speedtest.tele2.net:/1000GB.zip
Breakpoint 1 at 0x1df7cd00e756: file /usr/src/usr.bin/ftp/ftp.c, line
2059.
Trying 90.130.70.73...
Connected to speedtest.tele2.net.
220 (vsFTPd 2.3.5)
331 Please specify the password.
230 Login successful.
Remote system type is UNIX.
Using binary mode to transfer files.
200 Switching to Binary mode.
Retrieving /1000GB.zip
local: 1000GB.zip remote: 1000GB.zip
150 Opening BINARY mode data connection for 1000GB.zip (1073741824000
bytes).
0% | | 1534 KB
567:55:14 ETA^C # go to gdb command mode
Program received signal SIGINT, Interrupt.
_thread_sys_read () at -:3
3 -: No such file or directory.
in -
Current language: auto; currently asm
(gdb) signal SIGINT # interrupt the download
Continuing with signal SIGINT.
Program received signal SIGINT, Interrupt.
_thread_sys_read () at -:3
3 in -
(gdb) signal SIGINT # this needs to be done twice for gdb reasons, I
think?
Continuing with signal SIGINT.
0% | | 2290 KB
380:25:53 ETA
receive aborted
waiting for remote to finish abort.
Breakpoint 1, abort_remote (din=0x1dfa7fdcf880)
at /usr/src/usr.bin/ftp/ftp.c:2059
2059 if (send(fileno(cout), buf, 3, MSG_OOB) != 3)
Current language: auto; currently minimal
(gdb) signal SIGPIPE # manually trigger a sigpipe
Continuing with signal SIGPIPE.
Breakpoint 1, abort_remote (din=0x1dfa7fdcf880)
at /usr/src/usr.bin/ftp/ftp.c:2059
2059 if (send(fileno(cout), buf, 3, MSG_OOB) != 3)
(gdb) continue # and continue on to the segfault.
Continuing.
Program received signal SIGSEGV, Segmentation fault.
0x00001df7cd00e774 in abort_remote (din=0x1dfa7fdcf880)
at /usr/src/usr.bin/ftp/ftp.c:2059
2059 if (send(fileno(cout), buf, 3, MSG_OOB) != 3)
(gdb) continue
Continuing.
Program terminated with signal SIGSEGV, Segmentation fault.
The program no longer exists.
>Fix:
One could just tell the offending send() call not to raise SIGPIPE:
diff --git usr.bin/ftp/ftp.c usr.bin/ftp/ftp.c
index 8ac659d61..b22a14944 100644
--- usr.bin/ftp/ftp.c
+++ usr.bin/ftp/ftp.c
@@ -2056,7 +2056,7 @@ abort_remote(FILE *din)
* after urgent byte rather than before as is protocol now
*/
snprintf(buf, sizeof buf, "%c%c%c", IAC, IP, IAC);
- if (send(fileno(cout), buf, 3, MSG_OOB) != 3)
+ if (send(fileno(cout), buf, 3, MSG_OOB|MSG_NOSIGNAL) != 3)
warn("abort");
fprintf(cout, "%cABOR\r\n", DM);
(void)fflush(cout);
However, ftp.c contains a few fprintf()/fputc() calls to cout (including
one in the very snippet above) immediately followed by a fflush(cout),
which could trigger the same crash if the fprintf()/fputc() call chooses
to flush its buffers to the disconnected socket. I expect this can't
actually happen in the current implementation, since the constant
fflush()ing keeps the write buffer very near empty, but I don't know if
one should rely on that. But then again, I don't know how this should
be fixed if you don't...