Re: tty crash in Linux 4.6
Mikulas Patocka writes:

> On Thu, 22 Mar 2018, Greg Kroah-Hartman wrote:
>
>> On Fri, Mar 23, 2018 at 12:48:06AM +1100, Daniel Axtens wrote:
>> > Hi,
>> >
>> > >> This patch works, I've had no tty crashes since applying it.
>> > >>
>> > >> I've seen that you haven't sent this patch yet to Linux-4.7-rc and
>> > >> Linux-4.6-stable. Will you? Or did you create a different patch?
>> > >
>> > > We are hitting this now on powerpc. This patch never seemed to make
>> > > it upstream (drivers/tty/tty_ldisc.c hasn't been touched in 1 year).
>> >
>> > I seem to be hitting this too on a kernel that has the 4.6 changes
>> > backported to 4.4.
>> >
>> > Has there been any further progress on getting this accepted?
>>
>> Can you try applying 28b0f8a6962a ("tty: make n_tty_read() always abort
>> if hangup is in progress") to see if that helps out or not?

Sorry for the delay in getting the test results; as with Mikulas,
28b0f8a6962a does not help.

Regards,
Daniel

>>
>> thanks,
>>
>> greg k-h
>
> It doesn't help. I get the same crash as before.
>
> Mikulas
Re: tty crash in Linux 4.6
On Thu, 22 Mar 2018, Greg Kroah-Hartman wrote:

> On Fri, Mar 23, 2018 at 12:48:06AM +1100, Daniel Axtens wrote:
> > Hi,
> >
> > >> This patch works, I've had no tty crashes since applying it.
> > >>
> > >> I've seen that you haven't sent this patch yet to Linux-4.7-rc and
> > >> Linux-4.6-stable. Will you? Or did you create a different patch?
> > >
> > > We are hitting this now on powerpc. This patch never seemed to make
> > > it upstream (drivers/tty/tty_ldisc.c hasn't been touched in 1 year).
> >
> > I seem to be hitting this too on a kernel that has the 4.6 changes
> > backported to 4.4.
> >
> > Has there been any further progress on getting this accepted?
>
> Can you try applying 28b0f8a6962a ("tty: make n_tty_read() always abort
> if hangup is in progress") to see if that helps out or not?
>
> thanks,
>
> greg k-h

It doesn't help. I get the same crash as before.

Mikulas
Re: tty crash in Linux 4.6
On Fri, Mar 23, 2018 at 12:48:06AM +1100, Daniel Axtens wrote:
> Hi,
>
> >> This patch works, I've had no tty crashes since applying it.
> >>
> >> I've seen that you haven't sent this patch yet to Linux-4.7-rc and
> >> Linux-4.6-stable. Will you? Or did you create a different patch?
> >
> > We are hitting this now on powerpc. This patch never seemed to make
> > it upstream (drivers/tty/tty_ldisc.c hasn't been touched in 1 year).
>
> I seem to be hitting this too on a kernel that has the 4.6 changes
> backported to 4.4.
>
> Has there been any further progress on getting this accepted?

Can you try applying 28b0f8a6962a ("tty: make n_tty_read() always abort
if hangup is in progress") to see if that helps out or not?

thanks,

greg k-h
Re: tty crash in Linux 4.6
Hi,

>> This patch works, I've had no tty crashes since applying it.
>>
>> I've seen that you haven't sent this patch yet to Linux-4.7-rc and
>> Linux-4.6-stable. Will you? Or did you create a different patch?
>
> We are hitting this now on powerpc. This patch never seemed to make
> it upstream (drivers/tty/tty_ldisc.c hasn't been touched in 1 year).

I seem to be hitting this too on a kernel that has the 4.6 changes
backported to 4.4.

Has there been any further progress on getting this accepted?

Regards,
Daniel

>
> Peter, can we take this patch as is, or do you have an updated version?
>
> Mikey
>
>> Mikulas
>>
>>
>> On Tue, 17 May 2016, Peter Hurley wrote:
>>
>> > On 05/17/2016 08:57 AM, Peter Hurley wrote:
>> > > On 05/16/2016 04:36 PM, Peter Hurley wrote:
>> > >> > Hi Mikulas,
>> > >> >
>> > >> > On 05/16/2016 01:12 PM, Mikulas Patocka wrote:
>> > >>> >> Hi
>> > >>> >>
>> > >>> >> In the kernel 4.6 I get crashes in the tty layer. I can reproduce
>> > >>> >> the
>> > >>> >> crash by logging into the machine with ssh and typing before the
>> > >>> >> prompt
>> > >>> >> appears.
>> > >> >
>> > >> > Thanks for the report.
>> > >> > I tried to reproduce this a number of times on different machines
>> > >> > with no luck.
>> > >
>> > > I was able to reproduce this crash with a test jig.
>> > > The patch below fixed it, but I'm testing a better patch now, which
>> > > I'll get to you asap.
>> >
>> > --- >% ---
>> > Subject: [PATCH] tty: Fix ldisc crash on reopened tty
>> >
>> > If the tty has been hungup, the ldisc instance may have been destroyed.
>> > Continued input to the tty will be ignored as long as the ldisc instance
>> > is not visible to the flush_to_ldisc kworker. However, when the tty
>> > is reopened and a new ldisc instance is created, the flush_to_ldisc
>> > kworker can obtain an ldisc reference before the new ldisc is
>> > completely initialized. This will likely crash:
>> >
>> > BUG: unable to handle kernel paging request at 2260
>> > IP: [] n_tty_receive_buf_common+0x6d/0xb80
>> > PGD 2ab581067 PUD 290c11067 PMD 0
>> > Oops: [#1] PREEMPT SMP
>> > Modules linked in: nls_iso8859_1 ip6table_filter [.]
>> > CPU: 2 PID: 103 Comm: kworker/u16:1 Not tainted 4.6.0-rc7+wip-xeon+debug
>> > #rc7+wip
>> > Hardware name: Dell Inc. Precision WorkStation T5400 /0RW203, BIOS A11
>> > 04/30/2012
>> > Workqueue: events_unbound flush_to_ldisc
>> > task: 8802ad16d100 ti: 8802ad31c000 task.ti: 8802ad31c000
>> > RIP: 0010:[] []
>> > n_tty_receive_buf_common+0x6d/0xb80
>> > RSP: 0018:8802ad31fc70 EFLAGS: 00010296
>> > RAX: RBX: 8802aaddd800 RCX: 0001
>> > RDX: RSI: 810db48f RDI: 0246
>> > RBP: 8802ad31fd08 R08: R09: 0001
>> > R10: 8802aadddb28 R11: 0001 R12: 8800ba6da808
>> > R13: 8802ad18be80 R14: 8800ba6da858 R15: 8800ba6da800
>> > FS: () GS:8802b0a0()
>> > knlGS:
>> > CS: 0010 DS: ES: CR0: 80050033
>> > CR2: 2260 CR3: 00028ee5d000 CR4: 06e0
>> > Stack:
>> > 81531219 8802aadddab8 8802aae0 8802aa78
>> > 0001 8800ba6da858 8800ba6da860 8802ad31fd30
>> > 81885f78 81531219 0002
>> > Call Trace:
>> > [] ? flush_to_ldisc+0x49/0xd0
>> > [] ? mutex_lock_nested+0x2c8/0x430
>> > [] ? flush_to_ldisc+0x49/0xd0
>> > [] n_tty_receive_buf2+0x14/0x20
>> > [] tty_ldisc_receive_buf+0x22/0x50
>> > [] flush_to_ldisc+0xbe/0xd0
>> > [] process_one_work+0x1ed/0x6e0
>> > [] ? process_one_work+0x16f/0x6e0
>> > [] worker_thread+0x4e/0x490
>> > [] ? process_one_work+0x6e0/0x6e0
>> > [] kthread+0xf2/0x110
>> > [] ? preempt_count_sub+0x4c/0x80
>> > [] ret_from_fork+0x22/0x50
>> > [] ? kthread_create_on_node+0x220/0x220
>> > Code: ff ff e8 27 a0 35 00 48 8d 83 78 05 00 00 c7 45 c0 00 00 00 00 48
>> > 89 45 80 48
>> >8d 83 e0 05 00 00 48 89 85 78 ff ff ff 48 8b 45 b8 <48> 8b b8 60 22
>> > 00 00 48
>> >8b 30 89 f8 8b 8b 88 04 00 00 29 f0 8d
>> > RIP [] n_tty_receive_buf_common+0x6d/0xb80
>> > RSP
>> > CR2: 2260
>> >
>> > Ensure the kworker cannot obtain the ldisc reference until the new ldisc
>> > is completely initialized.
>> >
>> > Fixes: 892d1fa7eaae ("tty: Destroy ldisc instance on hangup")
>> > Reported-by: Mikulas Patocka
>> > Signed-off-by: Peter Hurley
>> > ---
>> > drivers/tty/tty_ldisc.c | 11 ++-
>> > 1 file changed, 6 insertions(+), 5 deletions(-)
>> >
>> > diff --git a/drivers/tty/tty_ldisc.c b/drivers/tty/tty_ldisc.c
>> > index cdd063f..bda0c85 100644
>> > --- a/drivers/tty/tty_ldisc.c
>> > +++ b/drivers/tty/tty_ldisc.c
>> > @@ -669,16 +669,17 @@ int tty_ldisc_reinit(struct tty_struct *tty, int
>> > disc)
>> >  		tty_ldisc_put(tty->ldisc);
>> >
Re: tty crash in Linux 4.6
> This patch works, I've had no tty crashes since applying it.
>
> I've seen that you haven't sent this patch yet to Linux-4.7-rc and
> Linux-4.6-stable. Will you? Or did you create a different patch?

We are hitting this now on powerpc. This patch never seemed to make
it upstream (drivers/tty/tty_ldisc.c hasn't been touched in 1 year).

Peter, can we take this patch as is, or do you have an updated version?

Mikey

> Mikulas
>
>
> On Tue, 17 May 2016, Peter Hurley wrote:
>
> > On 05/17/2016 08:57 AM, Peter Hurley wrote:
> > > On 05/16/2016 04:36 PM, Peter Hurley wrote:
> > >> > Hi Mikulas,
> > >> >
> > >> > On 05/16/2016 01:12 PM, Mikulas Patocka wrote:
> > >>> >> Hi
> > >>> >>
> > >>> >> In the kernel 4.6 I get crashes in the tty layer. I can reproduce the
> > >>> >> crash by logging into the machine with ssh and typing before the
> > >>> >> prompt
> > >>> >> appears.
> > >> >
> > >> > Thanks for the report.
> > >> > I tried to reproduce this a number of times on different machines
> > >> > with no luck.
> > >
> > > I was able to reproduce this crash with a test jig.
> > > The patch below fixed it, but I'm testing a better patch now, which
> > > I'll get to you asap.
> >
> > --- >% ---
> > Subject: [PATCH] tty: Fix ldisc crash on reopened tty
> >
> > If the tty has been hungup, the ldisc instance may have been destroyed.
> > Continued input to the tty will be ignored as long as the ldisc instance
> > is not visible to the flush_to_ldisc kworker. However, when the tty
> > is reopened and a new ldisc instance is created, the flush_to_ldisc
> > kworker can obtain an ldisc reference before the new ldisc is
> > completely initialized. This will likely crash:
> >
> > BUG: unable to handle kernel paging request at 2260
> > IP: [] n_tty_receive_buf_common+0x6d/0xb80
> > PGD 2ab581067 PUD 290c11067 PMD 0
> > Oops: [#1] PREEMPT SMP
> > Modules linked in: nls_iso8859_1 ip6table_filter [.]
> > CPU: 2 PID: 103 Comm: kworker/u16:1 Not tainted 4.6.0-rc7+wip-xeon+debug
> > #rc7+wip
> > Hardware name: Dell Inc. Precision WorkStation T5400 /0RW203, BIOS A11
> > 04/30/2012
> > Workqueue: events_unbound flush_to_ldisc
> > task: 8802ad16d100 ti: 8802ad31c000 task.ti: 8802ad31c000
> > RIP: 0010:[] []
> > n_tty_receive_buf_common+0x6d/0xb80
> > RSP: 0018:8802ad31fc70 EFLAGS: 00010296
> > RAX: RBX: 8802aaddd800 RCX: 0001
> > RDX: RSI: 810db48f RDI: 0246
> > RBP: 8802ad31fd08 R08: R09: 0001
> > R10: 8802aadddb28 R11: 0001 R12: 8800ba6da808
> > R13: 8802ad18be80 R14: 8800ba6da858 R15: 8800ba6da800
> > FS: () GS:8802b0a0()
> > knlGS:
> > CS: 0010 DS: ES: CR0: 80050033
> > CR2: 2260 CR3: 00028ee5d000 CR4: 06e0
> > Stack:
> > 81531219 8802aadddab8 8802aae0 8802aa78
> > 0001 8800ba6da858 8800ba6da860 8802ad31fd30
> > 81885f78 81531219 0002
> > Call Trace:
> > [] ? flush_to_ldisc+0x49/0xd0
> > [] ? mutex_lock_nested+0x2c8/0x430
> > [] ? flush_to_ldisc+0x49/0xd0
> > [] n_tty_receive_buf2+0x14/0x20
> > [] tty_ldisc_receive_buf+0x22/0x50
> > [] flush_to_ldisc+0xbe/0xd0
> > [] process_one_work+0x1ed/0x6e0
> > [] ? process_one_work+0x16f/0x6e0
> > [] worker_thread+0x4e/0x490
> > [] ? process_one_work+0x6e0/0x6e0
> > [] kthread+0xf2/0x110
> > [] ? preempt_count_sub+0x4c/0x80
> > [] ret_from_fork+0x22/0x50
> > [] ? kthread_create_on_node+0x220/0x220
> > Code: ff ff e8 27 a0 35 00 48 8d 83 78 05 00 00 c7 45 c0 00 00 00 00 48 89
> > 45 80 48
> >8d 83 e0 05 00 00 48 89 85 78 ff ff ff 48 8b 45 b8 <48> 8b b8 60 22
> > 00 00 48
> >8b 30 89 f8 8b 8b 88 04 00 00 29 f0 8d
> > RIP [] n_tty_receive_buf_common+0x6d/0xb80
> > RSP
> > CR2: 2260
> >
> > Ensure the kworker cannot obtain the ldisc reference until the new ldisc
> > is completely initialized.
> >
> > Fixes: 892d1fa7eaae ("tty: Destroy ldisc instance on hangup")
> > Reported-by: Mikulas Patocka
> > Signed-off-by: Peter Hurley
> > ---
> > drivers/tty/tty_ldisc.c | 11 ++-
> > 1 file changed, 6 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/tty/tty_ldisc.c b/drivers/tty/tty_ldisc.c
> > index cdd063f..bda0c85 100644
> > --- a/drivers/tty/tty_ldisc.c
> > +++ b/drivers/tty/tty_ldisc.c
> > @@ -669,16 +669,17 @@ int tty_ldisc_reinit(struct tty_struct *tty, int disc)
> >  		tty_ldisc_put(tty->ldisc);
> >  	}
> >
> > -	/* switch the line discipline */
> > -	tty->ldisc = ld;
> >  	tty_set_termios_ldisc(tty, disc);
> > -	retval = tty_ldisc_open(tty, tty->ldisc);
> > +	retval = tty_ldisc_open(tty, ld);
> >  	if (retval) {
> >  		if (!WARN_ON(disc == N_TTY)) {
> > -
Re: tty crash in Linux 4.6
Hi

This patch works, I've had no tty crashes since applying it.

I've seen that you haven't sent this patch yet to Linux-4.7-rc and
Linux-4.6-stable. Will you? Or did you create a different patch?

Mikulas


On Tue, 17 May 2016, Peter Hurley wrote:

> On 05/17/2016 08:57 AM, Peter Hurley wrote:
> > On 05/16/2016 04:36 PM, Peter Hurley wrote:
> >> > Hi Mikulas,
> >> >
> >> > On 05/16/2016 01:12 PM, Mikulas Patocka wrote:
> >>> >> Hi
> >>> >>
> >>> >> In the kernel 4.6 I get crashes in the tty layer. I can reproduce the
> >>> >> crash by logging into the machine with ssh and typing before the
> >>> >> prompt
> >>> >> appears.
> >> >
> >> > Thanks for the report.
> >> > I tried to reproduce this a number of times on different machines
> >> > with no luck.
> >
> > I was able to reproduce this crash with a test jig.
> > The patch below fixed it, but I'm testing a better patch now, which
> > I'll get to you asap.
>
> --- >% ---
> Subject: [PATCH] tty: Fix ldisc crash on reopened tty
>
> If the tty has been hungup, the ldisc instance may have been destroyed.
> Continued input to the tty will be ignored as long as the ldisc instance
> is not visible to the flush_to_ldisc kworker. However, when the tty
> is reopened and a new ldisc instance is created, the flush_to_ldisc
> kworker can obtain an ldisc reference before the new ldisc is
> completely initialized. This will likely crash:
>
> BUG: unable to handle kernel paging request at 2260
> IP: [] n_tty_receive_buf_common+0x6d/0xb80
> PGD 2ab581067 PUD 290c11067 PMD 0
> Oops: [#1] PREEMPT SMP
> Modules linked in: nls_iso8859_1 ip6table_filter [.]
> CPU: 2 PID: 103 Comm: kworker/u16:1 Not tainted 4.6.0-rc7+wip-xeon+debug
> #rc7+wip
> Hardware name: Dell Inc. Precision WorkStation T5400 /0RW203, BIOS A11
> 04/30/2012
> Workqueue: events_unbound flush_to_ldisc
> task: 8802ad16d100 ti: 8802ad31c000 task.ti: 8802ad31c000
> RIP: 0010:[] []
> n_tty_receive_buf_common+0x6d/0xb80
> RSP: 0018:8802ad31fc70 EFLAGS: 00010296
> RAX: RBX: 8802aaddd800 RCX: 0001
> RDX: RSI: 810db48f RDI: 0246
> RBP: 8802ad31fd08 R08: R09: 0001
> R10: 8802aadddb28 R11: 0001 R12: 8800ba6da808
> R13: 8802ad18be80 R14: 8800ba6da858 R15: 8800ba6da800
> FS: () GS:8802b0a0() knlGS:
> CS: 0010 DS: ES: CR0: 80050033
> CR2: 2260 CR3: 00028ee5d000 CR4: 06e0
> Stack:
> 81531219 8802aadddab8 8802aae0 8802aa78
> 0001 8800ba6da858 8800ba6da860 8802ad31fd30
> 81885f78 81531219 0002
> Call Trace:
> [] ? flush_to_ldisc+0x49/0xd0
> [] ? mutex_lock_nested+0x2c8/0x430
> [] ? flush_to_ldisc+0x49/0xd0
> [] n_tty_receive_buf2+0x14/0x20
> [] tty_ldisc_receive_buf+0x22/0x50
> [] flush_to_ldisc+0xbe/0xd0
> [] process_one_work+0x1ed/0x6e0
> [] ? process_one_work+0x16f/0x6e0
> [] worker_thread+0x4e/0x490
> [] ? process_one_work+0x6e0/0x6e0
> [] kthread+0xf2/0x110
> [] ? preempt_count_sub+0x4c/0x80
> [] ret_from_fork+0x22/0x50
> [] ? kthread_create_on_node+0x220/0x220
> Code: ff ff e8 27 a0 35 00 48 8d 83 78 05 00 00 c7 45 c0 00 00 00 00 48 89
> 45 80 48
>8d 83 e0 05 00 00 48 89 85 78 ff ff ff 48 8b 45 b8 <48> 8b b8 60 22 00
> 00 48
>8b 30 89 f8 8b 8b 88 04 00 00 29 f0 8d
> RIP [] n_tty_receive_buf_common+0x6d/0xb80
> RSP
> CR2: 2260
>
> Ensure the kworker cannot obtain the ldisc reference until the new ldisc
> is completely initialized.
>
> Fixes: 892d1fa7eaae ("tty: Destroy ldisc instance on hangup")
> Reported-by: Mikulas Patocka
> Signed-off-by: Peter Hurley
> ---
> drivers/tty/tty_ldisc.c | 11 ++-
> 1 file changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/tty/tty_ldisc.c b/drivers/tty/tty_ldisc.c
> index cdd063f..bda0c85 100644
> --- a/drivers/tty/tty_ldisc.c
> +++ b/drivers/tty/tty_ldisc.c
> @@ -669,16 +669,17 @@ int tty_ldisc_reinit(struct tty_struct *tty, int disc)
>  		tty_ldisc_put(tty->ldisc);
>  	}
>
> -	/* switch the line discipline */
> -	tty->ldisc = ld;
>  	tty_set_termios_ldisc(tty, disc);
> -	retval = tty_ldisc_open(tty, tty->ldisc);
> +	retval = tty_ldisc_open(tty, ld);
>  	if (retval) {
>  		if (!WARN_ON(disc == N_TTY)) {
> -			tty_ldisc_put(tty->ldisc);
> -			tty->ldisc = NULL;
> +			tty_ldisc_put(ld);
> +			ld = NULL;
>  		}
>  	}
> +
> +	/* switch the line discipline */
> +	smp_store_release(&tty->ldisc, ld);
>  	return retval;
>  }
>
> --
> 2.8.2
>
Re: tty crash in Linux 4.6
On Tue, 17 May 2016, Peter Hurley wrote:

> On 05/17/2016 08:57 AM, Peter Hurley wrote:
> > On 05/16/2016 04:36 PM, Peter Hurley wrote:
> >> > Hi Mikulas,
> >> >
> >> > On 05/16/2016 01:12 PM, Mikulas Patocka wrote:
> >>> >> Hi
> >>> >>
> >>> >> In the kernel 4.6 I get crashes in the tty layer. I can reproduce the
> >>> >> crash by logging into the machine with ssh and typing before the
> >>> >> prompt
> >>> >> appears.
> >> >
> >> > Thanks for the report.
> >> > I tried to reproduce this a number of times on different machines
> >> > with no luck.
> >
> > I was able to reproduce this crash with a test jig.
> > The patch below fixed it, but I'm testing a better patch now, which
> > I'll get to you asap.
>
> --- >% ---

Hi

I confirm that this patch fixes it. (your previous patch also fixed it).

Mikulas

> Subject: [PATCH] tty: Fix ldisc crash on reopened tty
>
> If the tty has been hungup, the ldisc instance may have been destroyed.
> Continued input to the tty will be ignored as long as the ldisc instance
> is not visible to the flush_to_ldisc kworker. However, when the tty
> is reopened and a new ldisc instance is created, the flush_to_ldisc
> kworker can obtain an ldisc reference before the new ldisc is
> completely initialized. This will likely crash:
>
> BUG: unable to handle kernel paging request at 2260
> IP: [] n_tty_receive_buf_common+0x6d/0xb80
> PGD 2ab581067 PUD 290c11067 PMD 0
> Oops: [#1] PREEMPT SMP
> Modules linked in: nls_iso8859_1 ip6table_filter [.]
> CPU: 2 PID: 103 Comm: kworker/u16:1 Not tainted 4.6.0-rc7+wip-xeon+debug
> #rc7+wip
> Hardware name: Dell Inc. Precision WorkStation T5400 /0RW203, BIOS A11
> 04/30/2012
> Workqueue: events_unbound flush_to_ldisc
> task: 8802ad16d100 ti: 8802ad31c000 task.ti: 8802ad31c000
> RIP: 0010:[] []
> n_tty_receive_buf_common+0x6d/0xb80
> RSP: 0018:8802ad31fc70 EFLAGS: 00010296
> RAX: RBX: 8802aaddd800 RCX: 0001
> RDX: RSI: 810db48f RDI: 0246
> RBP: 8802ad31fd08 R08: R09: 0001
> R10: 8802aadddb28 R11: 0001 R12: 8800ba6da808
> R13: 8802ad18be80 R14: 8800ba6da858 R15: 8800ba6da800
> FS: () GS:8802b0a0() knlGS:
> CS: 0010 DS: ES: CR0: 80050033
> CR2: 2260 CR3: 00028ee5d000 CR4: 06e0
> Stack:
> 81531219 8802aadddab8 8802aae0 8802aa78
> 0001 8800ba6da858 8800ba6da860 8802ad31fd30
> 81885f78 81531219 0002
> Call Trace:
> [] ? flush_to_ldisc+0x49/0xd0
> [] ? mutex_lock_nested+0x2c8/0x430
> [] ? flush_to_ldisc+0x49/0xd0
> [] n_tty_receive_buf2+0x14/0x20
> [] tty_ldisc_receive_buf+0x22/0x50
> [] flush_to_ldisc+0xbe/0xd0
> [] process_one_work+0x1ed/0x6e0
> [] ? process_one_work+0x16f/0x6e0
> [] worker_thread+0x4e/0x490
> [] ? process_one_work+0x6e0/0x6e0
> [] kthread+0xf2/0x110
> [] ? preempt_count_sub+0x4c/0x80
> [] ret_from_fork+0x22/0x50
> [] ? kthread_create_on_node+0x220/0x220
> Code: ff ff e8 27 a0 35 00 48 8d 83 78 05 00 00 c7 45 c0 00 00 00 00 48 89
> 45 80 48
>8d 83 e0 05 00 00 48 89 85 78 ff ff ff 48 8b 45 b8 <48> 8b b8 60 22 00
> 00 48
>8b 30 89 f8 8b 8b 88 04 00 00 29 f0 8d
> RIP [] n_tty_receive_buf_common+0x6d/0xb80
> RSP
> CR2: 2260
>
> Ensure the kworker cannot obtain the ldisc reference until the new ldisc
> is completely initialized.
>
> Fixes: 892d1fa7eaae ("tty: Destroy ldisc instance on hangup")
> Reported-by: Mikulas Patocka
> Signed-off-by: Peter Hurley
> ---
> drivers/tty/tty_ldisc.c | 11 ++-
> 1 file changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/tty/tty_ldisc.c b/drivers/tty/tty_ldisc.c
> index cdd063f..bda0c85 100644
> --- a/drivers/tty/tty_ldisc.c
> +++ b/drivers/tty/tty_ldisc.c
> @@ -669,16 +669,17 @@ int tty_ldisc_reinit(struct tty_struct *tty, int disc)
>  		tty_ldisc_put(tty->ldisc);
>  	}
>
> -	/* switch the line discipline */
> -	tty->ldisc = ld;
>  	tty_set_termios_ldisc(tty, disc);
> -	retval = tty_ldisc_open(tty, tty->ldisc);
> +	retval = tty_ldisc_open(tty, ld);
>  	if (retval) {
>  		if (!WARN_ON(disc == N_TTY)) {
> -			tty_ldisc_put(tty->ldisc);
> -			tty->ldisc = NULL;
> +			tty_ldisc_put(ld);
> +			ld = NULL;
>  		}
>  	}
> +
> +	/* switch the line discipline */
> +	smp_store_release(&tty->ldisc, ld);
>  	return retval;
>  }
>
> --
> 2.8.2
>
Re: tty crash in Linux 4.6
On 05/17/2016 08:57 AM, Peter Hurley wrote:
> On 05/16/2016 04:36 PM, Peter Hurley wrote:
>> > Hi Mikulas,
>> >
>> > On 05/16/2016 01:12 PM, Mikulas Patocka wrote:
>>> >> Hi
>>> >>
>>> >> In the kernel 4.6 I get crashes in the tty layer. I can reproduce the
>>> >> crash by logging into the machine with ssh and typing before the prompt
>>> >> appears.
>> >
>> > Thanks for the report.
>> > I tried to reproduce this a number of times on different machines
>> > with no luck.
>
> I was able to reproduce this crash with a test jig.
> The patch below fixed it, but I'm testing a better patch now, which
> I'll get to you asap.

--- >% ---
Subject: [PATCH] tty: Fix ldisc crash on reopened tty

If the tty has been hungup, the ldisc instance may have been destroyed.
Continued input to the tty will be ignored as long as the ldisc instance
is not visible to the flush_to_ldisc kworker. However, when the tty
is reopened and a new ldisc instance is created, the flush_to_ldisc
kworker can obtain an ldisc reference before the new ldisc is
completely initialized. This will likely crash:

 BUG: unable to handle kernel paging request at 2260
 IP: [] n_tty_receive_buf_common+0x6d/0xb80
 PGD 2ab581067 PUD 290c11067 PMD 0
 Oops: [#1] PREEMPT SMP
 Modules linked in: nls_iso8859_1 ip6table_filter [.]
 CPU: 2 PID: 103 Comm: kworker/u16:1 Not tainted 4.6.0-rc7+wip-xeon+debug #rc7+wip
 Hardware name: Dell Inc. Precision WorkStation T5400 /0RW203, BIOS A11 04/30/2012
 Workqueue: events_unbound flush_to_ldisc
 task: 8802ad16d100 ti: 8802ad31c000 task.ti: 8802ad31c000
 RIP: 0010:[] [] n_tty_receive_buf_common+0x6d/0xb80
 RSP: 0018:8802ad31fc70 EFLAGS: 00010296
 RAX: RBX: 8802aaddd800 RCX: 0001
 RDX: RSI: 810db48f RDI: 0246
 RBP: 8802ad31fd08 R08: R09: 0001
 R10: 8802aadddb28 R11: 0001 R12: 8800ba6da808
 R13: 8802ad18be80 R14: 8800ba6da858 R15: 8800ba6da800
 FS: () GS:8802b0a0() knlGS:
 CS: 0010 DS: ES: CR0: 80050033
 CR2: 2260 CR3: 00028ee5d000 CR4: 06e0
 Stack:
 81531219 8802aadddab8 8802aae0 8802aa78
 0001 8800ba6da858 8800ba6da860 8802ad31fd30
 81885f78 81531219 0002
 Call Trace:
 [] ? flush_to_ldisc+0x49/0xd0
 [] ? mutex_lock_nested+0x2c8/0x430
 [] ? flush_to_ldisc+0x49/0xd0
 [] n_tty_receive_buf2+0x14/0x20
 [] tty_ldisc_receive_buf+0x22/0x50
 [] flush_to_ldisc+0xbe/0xd0
 [] process_one_work+0x1ed/0x6e0
 [] ? process_one_work+0x16f/0x6e0
 [] worker_thread+0x4e/0x490
 [] ? process_one_work+0x6e0/0x6e0
 [] kthread+0xf2/0x110
 [] ? preempt_count_sub+0x4c/0x80
 [] ret_from_fork+0x22/0x50
 [] ? kthread_create_on_node+0x220/0x220
 Code: ff ff e8 27 a0 35 00 48 8d 83 78 05 00 00 c7 45 c0 00 00 00 00 48 89 45 80 48 8d 83 e0 05 00 00 48 89 85 78 ff ff ff 48 8b 45 b8 <48> 8b b8 60 22 00 00 48 8b 30 89 f8 8b 8b 88 04 00 00 29 f0 8d
 RIP [] n_tty_receive_buf_common+0x6d/0xb80
 RSP
 CR2: 2260

Ensure the kworker cannot obtain the ldisc reference until the new ldisc
is completely initialized.

Fixes: 892d1fa7eaae ("tty: Destroy ldisc instance on hangup")
Reported-by: Mikulas Patocka
Signed-off-by: Peter Hurley
---
 drivers/tty/tty_ldisc.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/tty/tty_ldisc.c b/drivers/tty/tty_ldisc.c
index cdd063f..bda0c85 100644
--- a/drivers/tty/tty_ldisc.c
+++ b/drivers/tty/tty_ldisc.c
@@ -669,16 +669,17 @@ int tty_ldisc_reinit(struct tty_struct *tty, int disc)
 		tty_ldisc_put(tty->ldisc);
 	}
 
-	/* switch the line discipline */
-	tty->ldisc = ld;
 	tty_set_termios_ldisc(tty, disc);
-	retval = tty_ldisc_open(tty, tty->ldisc);
+	retval = tty_ldisc_open(tty, ld);
 	if (retval) {
 		if (!WARN_ON(disc == N_TTY)) {
-			tty_ldisc_put(tty->ldisc);
-			tty->ldisc = NULL;
+			tty_ldisc_put(ld);
+			ld = NULL;
 		}
 	}
+
+	/* switch the line discipline */
+	smp_store_release(&tty->ldisc, ld);
 	return retval;
 }
 
-- 
2.8.2
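
The ordering this patch relies on (finish initializing the new ldisc, then
publish the pointer with a release store) can be sketched in user space with
C11 atomics. This is only an illustration under stated assumptions, not the
kernel code: every demo_* name is hypothetical, and the acquire load on the
reader side is an assumption of the sketch, since in the kernel the kworker
obtains its ldisc reference through the tty layer's own reference helpers
rather than a bare load.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel's ldisc and tty structures. */
struct demo_ldisc {
	int initialized;	/* set once demo_ldisc_open() has run */
	size_t read_tail;	/* state a consumer would dereference */
};

struct demo_tty {
	_Atomic(struct demo_ldisc *) ldisc;	/* NULL while nothing is published */
};

/* Finish setting up a new ldisc; returns 0 on success. */
static int demo_ldisc_open(struct demo_ldisc *ld)
{
	ld->read_tail = 0;
	ld->initialized = 1;
	return 0;
}

/*
 * Writer side: initialize first, publish last.
 * The sketch assumes tty->ldisc is NULL on entry (the old instance is gone).
 */
static int demo_ldisc_reinit(struct demo_tty *tty)
{
	struct demo_ldisc *ld = calloc(1, sizeof(*ld));
	int retval;

	if (!ld)
		return -1;

	retval = demo_ldisc_open(ld);
	if (retval) {
		free(ld);
		ld = NULL;
	}

	/* Release store: the writes above become visible before the pointer. */
	atomic_store_explicit(&tty->ldisc, ld, memory_order_release);
	return retval;
}

/*
 * Reader side: returns -1 when no ldisc is published (input is dropped),
 * otherwise the read_tail of a fully initialized ldisc.
 */
static long demo_receive(struct demo_tty *tty)
{
	struct demo_ldisc *ld =
		atomic_load_explicit(&tty->ldisc, memory_order_acquire);

	if (!ld)
		return -1;
	assert(ld->initialized);
	return (long)ld->read_tail;
}
```

With this ordering a reader sees either NULL or a pointer to a completely
initialized object; storing the pointer before demo_ldisc_open() would
reintroduce the window described in the commit message above.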
Re: tty crash in Linux 4.6
On 05/16/2016 04:36 PM, Peter Hurley wrote:
> Hi Mikulas,
>
> On 05/16/2016 01:12 PM, Mikulas Patocka wrote:
>> Hi
>>
>> In the kernel 4.6 I get crashes in the tty layer. I can reproduce the
>> crash by logging into the machine with ssh and typing before the prompt
>> appears.
>
> Thanks for the report.
> I tried to reproduce this a number of times on different machines
> with no luck.

I was able to reproduce this crash with a test jig.
The patch below fixed it, but I'm testing a better patch now, which
I'll get to you asap.

Regards,
Peter Hurley

>> The crash is caused by the pointer tty->disc_data being NULL in the
>> function n_tty_receive_buf_common. The crash happens on the statement
>> smp_load_acquire(&ldata->read_tail).
>>
>> Bisecting shows that the crashes are caused by the patch
>> 892d1fa7eaaed9d3c04954cb140c34ebc3393932 ("tty: Destroy ldisc instance on
>> hangup").
>
>
> Can you try the test patch below?
>
> Regards,
> Peter Hurley
>
>
>> Kernel Fault: Code=15 regs=7d9e0720 (Addr=2260)
>> CPU: 0 PID: 3319 Comm: kworker/u8:0 Not tainted 4.6.0 #1
>> Workqueue: events_unbound flush_to_ldisc
>> task: 7c25ea80 ti: 7d9e task.ti: 7d9e
>>
>> YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
>> PSW: 1100 Not tainted
>> r00-03 0804000f 4076cd10 40475fb4 7f761800
>> r04-07 40749510 0001 7f761800 7d9e0490
>> r08-11 7e722890 7da4ec00 7f763823
>> r12-15 7fc08ea8 7fc08c78 4080e080
>> r16-19 7fc08c00 0001 2260
>> r20-23 7f7618b0 7c25ea80 0001 0001
>> r24-27 080f 7f7618ac 40749510
>> r28-31 0001 7d9e0840 7d9e0720 0001
>> sr00-03 086c8800 086c8800
>> sr04-07
>>
>> IASQ: IAOQ: 40475fd4
>> 40475fd8
>> IIR: 0e6c00d5ISR: IOR: 2260
>> CPU:0 CR30: 7d9e CR31: ff87e7ffbc9e
>> ORIG_R28: 4080a180
>> IAOQ[0]: n_tty_receive_buf_common+0xb4/0xbe0
>> IAOQ[1]: n_tty_receive_buf_common+0xb8/0xbe0
>> RP(r2): n_tty_receive_buf_common+0x94/0xbe0
>> Backtrace:
>> [<40476b14>] n_tty_receive_buf2+0x14/0x20
>> [<4047a208>] tty_ldisc_receive_buf+0x30/0x90
>> [<4047a544>] flush_to_ldisc+0x144/0x1c8
>> [<402556bc>] process_one_work+0x1b4/0x460
>> [<40255bbc>] worker_thread+0x1e4/0x5e0
>> [<4025d454>] kthread+0x134/0x168
>
> --- >% ---
> diff --git a/drivers/tty/tty_ldisc.c b/drivers/tty/tty_ldisc.c
> index 68947f6..f271832 100644
> --- a/drivers/tty/tty_ldisc.c
> +++ b/drivers/tty/tty_ldisc.c
> @@ -653,7 +653,7 @@ static void tty_reset_termios(struct tty_struct *tty)
>   * Returns 0 if successful, otherwise error code < 0
>   */
>
> -int tty_ldisc_reinit(struct tty_struct *tty, int disc)
> +static int __tty_ldisc_reinit(struct tty_struct *tty, int disc)
>  {
>  	struct tty_ldisc *ld;
>  	int retval;
> @@ -682,6 +682,16 @@ int tty_ldisc_reinit(struct tty_struct *tty, int disc)
>  	return retval;
>  }
>
> +int tty_ldisc_reinit(struct tty_struct *tty, int disc)
> +{
> +	int retval;
> +
> +	tty_ldisc_lock(tty, MAX_SCHEDULE_TIMEOUT);
> +	retval = __tty_ldisc_reinit(tty, disc);
> +	tty_ldisc_unlock(tty);
> +	return retval;
> +}
> +
>  /**
>   * tty_ldisc_hangup - hangup ldisc reset
>   * @tty: tty being hung up
> @@ -732,8 +742,8 @@ void tty_ldisc_hangup(struct tty_struct *tty, bool reinit)
>
>  	if (tty->ldisc) {
>  		if (reinit) {
> -			if (tty_ldisc_reinit(tty, tty->termios.c_line) < 0)
> -				tty_ldisc_reinit(tty, N_TTY);
> +			if (__tty_ldisc_reinit(tty, tty->termios.c_line) < 0)
> +				__tty_ldisc_reinit(tty, N_TTY);
>  		} else
>  			tty_ldisc_kill(tty);
>  	}
>
Re: tty crash in Linux 4.6
Hi Mikulas,

On 05/16/2016 01:12 PM, Mikulas Patocka wrote:
> Hi
>
> In the kernel 4.6 I get crashes in the tty layer. I can reproduce the
> crash by logging into the machine with ssh and typing before the prompt
> appears.

Thanks for the report.
I tried to reproduce this a number of times on different machines
with no luck.

> The crash is caused by the pointer tty->disc_data being NULL in the
> function n_tty_receive_buf_common. The crash happens on the statement
> smp_load_acquire(&ldata->read_tail).
>
> Bisecting shows that the crashes are caused by the patch
> 892d1fa7eaaed9d3c04954cb140c34ebc3393932 ("tty: Destroy ldisc instance on
> hangup").

Can you try the test patch below?

Regards,
Peter Hurley

> Kernel Fault: Code=15 regs=7d9e0720 (Addr=2260)
> CPU: 0 PID: 3319 Comm: kworker/u8:0 Not tainted 4.6.0 #1
> Workqueue: events_unbound flush_to_ldisc
> task: 7c25ea80 ti: 7d9e task.ti: 7d9e
>
> YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
> PSW: 1100 Not tainted
> r00-03 0804000f 4076cd10 40475fb4 7f761800
> r04-07 40749510 0001 7f761800 7d9e0490
> r08-11 7e722890 7da4ec00 7f763823
> r12-15 7fc08ea8 7fc08c78 4080e080
> r16-19 7fc08c00 0001 2260
> r20-23 7f7618b0 7c25ea80 0001 0001
> r24-27 080f 7f7618ac 40749510
> r28-31 0001 7d9e0840 7d9e0720 0001
> sr00-03 086c8800 086c8800
> sr04-07
>
> IASQ: IAOQ: 40475fd4
> 40475fd8
> IIR: 0e6c00d5ISR: IOR: 2260
> CPU:0 CR30: 7d9e CR31: ff87e7ffbc9e
> ORIG_R28: 4080a180
> IAOQ[0]: n_tty_receive_buf_common+0xb4/0xbe0
> IAOQ[1]: n_tty_receive_buf_common+0xb8/0xbe0
> RP(r2): n_tty_receive_buf_common+0x94/0xbe0
> Backtrace:
> [<40476b14>] n_tty_receive_buf2+0x14/0x20
> [<4047a208>] tty_ldisc_receive_buf+0x30/0x90
> [<4047a544>] flush_to_ldisc+0x144/0x1c8
> [<402556bc>] process_one_work+0x1b4/0x460
> [<40255bbc>] worker_thread+0x1e4/0x5e0
> [<4025d454>] kthread+0x134/0x168

--- >% ---
diff --git a/drivers/tty/tty_ldisc.c b/drivers/tty/tty_ldisc.c
index 68947f6..f271832 100644
--- a/drivers/tty/tty_ldisc.c
+++ b/drivers/tty/tty_ldisc.c
@@ -653,7 +653,7 @@ static void tty_reset_termios(struct tty_struct *tty)
  * Returns 0 if successful, otherwise error code < 0
  */
 
-int tty_ldisc_reinit(struct tty_struct *tty, int disc)
+static int __tty_ldisc_reinit(struct tty_struct *tty, int disc)
 {
 	struct tty_ldisc *ld;
 	int retval;
@@ -682,6 +682,16 @@ int tty_ldisc_reinit(struct tty_struct *tty, int disc)
 	return retval;
 }
 
+int tty_ldisc_reinit(struct tty_struct *tty, int disc)
+{
+	int retval;
+
+	tty_ldisc_lock(tty, MAX_SCHEDULE_TIMEOUT);
+	retval = __tty_ldisc_reinit(tty, disc);
+	tty_ldisc_unlock(tty);
+	return retval;
+}
+
 /**
  * tty_ldisc_hangup - hangup ldisc reset
  * @tty: tty being hung up
@@ -732,8 +742,8 @@ void tty_ldisc_hangup(struct tty_struct *tty, bool reinit)
 
 	if (tty->ldisc) {
 		if (reinit) {
-			if (tty_ldisc_reinit(tty, tty->termios.c_line) < 0)
-				tty_ldisc_reinit(tty, N_TTY);
+			if (__tty_ldisc_reinit(tty, tty->termios.c_line) < 0)
+				__tty_ldisc_reinit(tty, N_TTY);
 		} else
 			tty_ldisc_kill(tty);
 	}
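
The report quoted in this message attributes the fault at address 2260 to
tty->disc_data being NULL when ldata->read_tail is read. That is what one
would expect if read_tail sits 0x2260 bytes into the structure, because
reading a member through a NULL base touches the member's offset. A minimal
sketch of the arithmetic follows; the struct layout is hypothetical, with only
the offset chosen to match the report, and it is not the real n_tty_data
definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout: the padding is sized only so that read_tail lands at
 * offset 0x2260, the fault address quoted in the report.
 */
struct demo_n_tty_data {
	unsigned char earlier_fields[0x2260];
	size_t read_tail;
};

/* Address the CPU would touch when reading base->read_tail. */
static uintptr_t demo_fault_address(uintptr_t base)
{
	return base + offsetof(struct demo_n_tty_data, read_tail);
}
```

Calling demo_fault_address(0) yields 0x2260, so a small nonzero fault address
like this one is a common sign of a member access through a NULL structure
pointer rather than a wild pointer.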