Hello

I'm seeing an issue that an ssh session hangs after generating a
series of ctrl-c keystrokes.

Environment description:
---------------------------------------------------
- multiple board chassis, with network connectivity from the
controller board (non-linux).
- the slave boards:
   * are running Linux (3.12.37)
   * have a PPC CPU (P4080)
   * busybox version 1.25 (but I also tested 1.26.2)
   * dropbear version: v2016.74
- SSH access to the slave board is done by relaying a TCP connection
from the controller board
    to the slave. So there is a small application running on the
controller board, which reads a
    socket on the IP stack and forwards this TCP stream to the slave
board (and visa-versa).

Test scenario:
---------------------------------------------------
- open an ssh session to the slave board (via the TCP relay).
- execute "while true; do find /; done". So this command keeps on
sending a lot of data.
- Repeat the 2 steps above 2 times so that in total 3 sessions are running.
- Press control-c continuously in one of the sessions and observe that
the "sh" process is blocked.

Issue cannot be reproduced each time, but 1 out of 3 times and the
probability increases with the
amount of sessions opened in parallel.
If I try the same scenario on another system with direct IP
connectivity to Linux (so no tcp_relay),
the issue is not observed. So it might be that the tcp_relay (and the
delay/buffering it introduces)
has an impact.


Analysis of the problem:
---------------------------------------------------
Once the session is blocked I execute:

# pstree -p 2066
dropbear(2066)---sh(2078)

When pressing enter in the ssh session I see for dropbear:
# strace -p 2066
strace: Process 2066 attached
_newselect(8, [3 5 7], [], NULL, {3516, 826061}) = 1 (in [5], left
{3512, 787390})
clock_gettime(0x6 /* CLOCK_??? */, {327, 955324187}) = 0
read(5, ";\235\21\332\365\210T\200X}\230\"\306.\363\221", 16) = 16
read(5, 
"\2\345\252\274\24Y\253\21\316>}\266\fU\20259\324\254Tu\3534\0238bMXzV\274\270",
32) = 32
clock_gettime(0x6 /* CLOCK_??? */, {327, 955324187}) = 0
writev(7, [{iov_base="\r", iov_len=1}], 1) = 1
clock_gettime(0x6 /* CLOCK_??? */, {327, 956324197}) = 0
_newselect(8, [3 5 7], [], NULL, {3600, 0}) = 1 (in [7], left {3599, 999987})
clock_gettime(0x6 /* CLOCK_??? */, {327, 956324197}) = 0
read(7, "\r\n", 16375)                  = 2
clock_gettime(0x6 /* CLOCK_??? */, {327, 956324197}) = 0
writev(5, 
[{iov_base="\231\310\271\315\354\243\342\271\22,\325Tj\n\356\345\"t\332d\205\317.\213\376\200\274h\201\347$\324"...,
iov_len=48}], 1) = 48
clock_gettime(0x6 /* CLOCK_??? */, {327, 957324207}) = 0
_newselect(8, [3 5 7], [], NULL, {3600, 0}^Cstrace: Process 2066 detached

While the sh process is not printing any additional traces. So this
process is completely blocked:
/isam/slot_default/run # strace -p 2078
strace: Process 2078 attached
futex(0xffed598, FUTEX_WAIT_PRIVATE, 2, NULL


Connecting a debugger to the system (sh pid 2078) shows that the only
thread the process has is blocked
on a mutex in the C library.

(gdb) info threads
  Id   Target Id         Frame
* 1    Thread 2078       0x1003d0ec in putprompt (s=<optimized out>)
at shell/ash.c:2455
(gdb) bt
#0  0x0ff5c708 in __lll_lock_wait_private (futex=0xffed598
<main_arena>) at ../nptl/sysdeps/unix/sysv/linux/lowlevellock.c:31
#1  0x0fef07a8 in *__GI___libc_free (mem=<optimized out>) at malloc.c:3714
#2  0x1003d0ec in putprompt (s=<optimized out>) at shell/ash.c:2455
#3  setprompt_if (do_set=<optimized out>, whichprompt=<optimized out>)
at shell/ash.c:2501
#4  0x1003d448 in parsecmd (interact=<optimized out>) at shell/ash.c:12074
#5  0x1004100c in cmdloop (top=<optimized out>) at shell/ash.c:12215
#6  0x10042730 in ash_main (argc=<optimized out>, argv=<optimized
out>) at shell/ash.c:13350
#7  0x10004468 in run_applet_no_and_exit (applet_no=<optimized out>,
argv=0xbf93b5d4) at libbb/appletlib.c:879
#8  0x1000482c in run_applet_and_exit (name=0xbf93bf3e "sh",
argv=0xbf93b5d4) at libbb/appletlib.c:893
#9  0x100048d0 in main (argc=<optimized out>, argv=0xbf93b5d4) at
libbb/appletlib.c:971
(gdb) frame 1
#1  0x0fef07a8 in *__GI___libc_free (mem=<optimized out>) at malloc.c:3714
3714      (void)mutex_lock(&ar_ptr->mutex);
(gdb) p ar_ptr
$6 = (mstate) 0xffed598 <main_arena>
(gdb) p *ar_ptr
$7 = {
  mutex = 2,
  flags = 0,
  fastbinsY = {0x100d8f78, 0x0, 0x0, 0x0, 0x0, 0x100cc5c8, 0x0, 0x0, 0x0, 0x0},
  top = 0x100dc3f8,
  last_remainder = 0x100d9040,
  bins = {0x100d8f88, 0x100d8f88, 0xffed5d0 <main_arena+56>, 0xffed5d0
<main_arena+56>, 0x100cafd0, 0x100cafd0, 0xffed5e0 <main_arena+72>,
0xffed5e0 <main_arena+72>, 0xffed5e8 <main_arena+80>, 0xffed5e8
<main_arena+80>, 0xffed5f0 <main_arena+88>,
    0xffed5f0 <main_arena+88>, 0xffed5f8 <main_arena+96>, 0xffed5f8
<main_arena+96>, 0xffed600 <main_arena+104>, 0xffed600
<main_arena+104>, 0xffed608 <main_arena+112>, 0xffed608
<main_arena+112>, 0xffed610 <main_arena+120>, 0xffed610
<main_arena+120>,
    0xffed618 <main_arena+128>, 0xffed618 <main_arena+128>, 0xffed620
<main_arena+136>, 0xffed620 <main_arena+136>, 0xffed628
<main_arena+144>, 0xffed628 <main_arena+144>, 0xffed630
<main_arena+152>, 0xffed630 <main_arena+152>, 0xffed638
<main_arena+160>,
    0xffed638 <main_arena+160>, 0xffed640 <main_arena+168>, 0xffed640
<main_arena+168>, 0xffed648 <main_arena+176>, 0xffed648
<main_arena+176>, 0x100cc600, 0x100cc600, 0xffed658 <main_arena+192>,
0xffed658 <main_arena+192>, 0xffed660 <main_arena+200>,
    0xffed660 <main_arena+200>, 0xffed668 <main_arena+208>, 0xffed668
<main_arena+208>, 0xffed670 <main_arena+216>, 0xffed670
<main_arena+216>, 0xffed678 <main_arena+224>, 0xffed678
<main_arena+224>, 0xffed680 <main_arena+232>, 0xffed680
<main_arena+232>,
    0xffed688 <main_arena+240>, 0xffed688 <main_arena+240>, 0xffed690
<main_arena+248>, 0xffed690 <main_arena+248>, 0xffed698
<main_arena+256>, 0xffed698 <main_arena+256>, 0xffed6a0
<main_arena+264>, 0xffed6a0 <main_arena+264>, 0xffed6a8
<main_arena+272>,
    0xffed6a8 <main_arena+272>, 0xffed6b0 <main_arena+280>, 0xffed6b0
<main_arena+280>, 0xffed6b8 <main_arena+288>, 0xffed6b8
<main_arena+288>, 0xffed6c0 <main_arena+296>, 0xffed6c0
<main_arena+296>, 0xffed6c8 <main_arena+304>, 0xffed6c8
<main_arena+304>,
    0xffed6d0 <main_arena+312>, 0xffed6d0 <main_arena+312>, 0xffed6d8
<main_arena+320>, 0xffed6d8 <main_arena+320>, 0xffed6e0
<main_arena+328>, 0xffed6e0 <main_arena+328>, 0xffed6e8
<main_arena+336>, 0xffed6e8 <main_arena+336>, 0xffed6f0
<main_arena+344>,
    0xffed6f0 <main_arena+344>, 0xffed6f8 <main_arena+352>, 0xffed6f8
<main_arena+352>, 0xffed700 <main_arena+360>, 0xffed700
<main_arena+360>, 0xffed708 <main_arena+368>, 0xffed708
<main_arena+368>, 0xffed710 <main_arena+376>, 0xffed710
<main_arena+376>,
    0xffed718 <main_arena+384>, 0xffed718 <main_arena+384>, 0xffed720
<main_arena+392>, 0xffed720 <main_arena+392>, 0xffed728
<main_arena+400>, 0xffed728 <main_arena+400>, 0xffed730
<main_arena+408>, 0xffed730 <main_arena+408>, 0xffed738
<main_arena+416>,
    0xffed738 <main_arena+416>, 0xffed740 <main_arena+424>, 0xffed740
<main_arena+424>, 0xffed748 <main_arena+432>, 0xffed748
<main_arena+432>, 0xffed750 <main_arena+440>, 0xffed750
<main_arena+440>, 0xffed758 <main_arena+448>, 0xffed758
<main_arena+448>,
    0xffed760 <main_arena+456>, 0xffed760 <main_arena+456>, 0xffed768
<main_arena+464>, 0xffed768 <main_arena+464>, 0xffed770
<main_arena+472>, 0xffed770 <main_arena+472>, 0xffed778
<main_arena+480>, 0xffed778 <main_arena+480>, 0xffed780
<main_arena+488>,
    0xffed780 <main_arena+488>, 0xffed788 <main_arena+496>, 0xffed788
<main_arena+496>, 0xffed790 <main_arena+504>, 0xffed790
<main_arena+504>, 0xffed798 <main_arena+512>, 0xffed798
<main_arena+512>, 0xffed7a0 <main_arena+520>, 0xffed7a0
<main_arena+520>,
    0xffed7a8 <main_arena+528>, 0xffed7a8 <main_arena+528>, 0xffed7b0
<main_arena+536>, 0xffed7b0 <main_arena+536>, 0xffed7b8
<main_arena+544>, 0xffed7b8 <main_arena+544>, 0x100d9580, 0x100d9580,
0xffed7c8 <main_arena+560>, 0xffed7c8 <main_arena+560>,
    0xffed7d0 <main_arena+568>, 0xffed7d0 <main_arena+568>, 0xffed7d8
<main_arena+576>, 0xffed7d8 <main_arena+576>, 0xffed7e0
<main_arena+584>, 0xffed7e0 <main_arena+584>, 0xffed7e8
<main_arena+592>, 0xffed7e8 <main_arena+592>, 0xffed7f0
<main_arena+600>,
    0xffed7f0 <main_arena+600>, 0xffed7f8 <main_arena+608>, 0xffed7f8
<main_arena+608>, 0xffed800 <main_arena+616>, 0xffed800
<main_arena+616>, 0xffed808 <main_arena+624>, 0xffed808
<main_arena+624>, 0xffed810 <main_arena+632>, 0xffed810
<main_arena+632>,
    0xffed818 <main_arena+640>, 0xffed818 <main_arena+640>, 0xffed820
<main_arena+648>, 0xffed820 <main_arena+648>, 0xffed828
<main_arena+656>, 0xffed828 <main_arena+656>, 0xffed830
<main_arena+664>, 0xffed830 <main_arena+664>, 0xffed838
<main_arena+672>,
    0xffed838 <main_arena+672>, 0x100cb958, 0x100cb958, 0xffed848
<main_arena+688>, 0xffed848 <main_arena+688>, 0xffed850
<main_arena+696>, 0xffed850 <main_arena+696>, 0xffed858
<main_arena+704>, 0xffed858 <main_arena+704>, 0xffed860
<main_arena+712>,
    0xffed860 <main_arena+712>, 0xffed868 <main_arena+720>, 0xffed868
<main_arena+720>, 0xffed870 <main_arena+728>, 0xffed870
<main_arena+728>, 0xffed878 <main_arena+736>, 0xffed878
<main_arena+736>, 0xffed880 <main_arena+744>, 0xffed880
<main_arena+744>,
    0xffed888 <main_arena+752>, 0xffed888 <main_arena+752>, 0xffed890
<main_arena+760>, 0xffed890 <main_arena+760>, 0xffed898
<main_arena+768>, 0xffed898 <main_arena+768>, 0xffed8a0
<main_arena+776>, 0xffed8a0 <main_arena+776>, 0xffed8a8
<main_arena+784>,
    0xffed8a8 <main_arena+784>, 0xffed8b0 <main_arena+792>, 0xffed8b0
<main_arena+792>, 0xffed8b8 <main_arena+800>, 0xffed8b8
<main_arena+800>, 0xffed8c0 <main_arena+808>, 0xffed8c0
<main_arena+808>, 0xffed8c8 <main_arena+816>, 0xffed8c8
<main_arena+816>,
    0xffed8d0 <main_arena+824>, 0xffed8d0 <main_arena+824>, 0xffed8d8
<main_arena+832>, 0xffed8d8 <main_arena+832>, 0xffed8e0
<main_arena+840>, 0xffed8e0 <main_arena+840>...},
  binmap = {33816716, 1431568384, 66049, 0},
  next = 0xb7d00010,
  system_mem = 135168,
  max_system_mem = 135168
}
(gdb) frame 0
(gdb) l
26
27      void
28      __lll_lock_wait_private (int *futex)
29      {
30        if (*futex == 2)
31          lll_futex_wait (futex, 2, LLL_PRIVATE);
32
33        while (atomic_exchange_acq (futex, 2) != 0)
34          lll_futex_wait (futex, 2, LLL_PRIVATE);
35      }


If I use version 1.26.2 of busybox, the ctrl-c much less responsive.
In most cases the find loop just continues.
The only way to stop the sessions is to kill it from a different session.

The above traces/debug information was taken/collected with the 1.25
version of busybox.

My feeling is that tcp relay function (and it buffering) is causing
this bug to become visible.
Dropbear/sh will receive more than one Ctrl-c I think.
If I increase the priority of the tcp_relay function the reproduction
rate is lower.

I also tried to reproduce the issue on a "directly connected" system
by suspending the dropbear process
(using kill -SIGSTOP / -SIGCONT) but I was not able to reproduce the problem.

Any help would be appreciated.

Best regards
Ronny
_______________________________________________
busybox mailing list
[email protected]
http://lists.busybox.net/mailman/listinfo/busybox

Reply via email to