Re: [PATCH] do_signal_stop: use signal_group_exit()
Oleg Nesterov <[EMAIL PROTECTED]> writes: > >>From 2.6.25-rc2-mm1.bz2 patch: >> >> - .ioctl = tty_ioctl, >> + .unlocked_ioctl = tty_ioctl, > > and this is why this didn't happen before, I guess. Right. Does anyone know what kind of audit was made of the tty code to ensure everything would be fine? It it was pretty thorough and it was just this little corner case it makes sense to get the struct tty locking for pids correct. Otherwise since the tty layer is historically not especially good with it's locking. We should just revert the change above, and save attacking this until someone has time to do a thorough review. Eric -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] do_signal_stop: use signal_group_exit()
On Sun, 17 Feb 2008 21:11:14 MST, Eric W. Biederman said: > Oleg Nesterov <[EMAIL PROTECTED]> writes: > > On 02/16, Oleg Nesterov wrote: > >> On 02/15, Andrew Morton wrote: > >> > : BUG: unable to handle kernel paging request at 00200200 > >> > : Call Trace: > >> > : [] ? release_task+0x152/0x2e5 > >> > : [] ? do_wait+0x6c7/0xa1c > >> > : [] ? default_wake_function+0x0/0xe > >> > : [] ? sys_rt_sigaction+0x7a/0x98 > >> > : [] ? sys_wait4+0x8a/0xa1 > >> > : [] ? system_call_after_swapgs+0x7b/0x80 > Thanks. Looks like we need to grab a lock there. > At a quick skim I think we need the tty lock. *ping* - Any further activity on this one? I got bit by it as well on the very first attempted boot of 25-rc2-mm1, the instant it tried to leave single-user and go multi-user. pgpgv8e1NFpVN.pgp Description: PGP signature
Re: [PATCH] do_signal_stop: use signal_group_exit()
Oleg Nesterov <[EMAIL PROTECTED]> writes: > On 02/16, Oleg Nesterov wrote: >> >> On 02/15, Andrew Morton wrote: >> > >> > ug. On about the fourth boot with the current -mm lineup I hit: >> > >> > : BUG: unable to handle kernel paging request at 00200200 >> >> == LIST_POISON2 >> >> > : IP: [] free_pid+0x35/0x8e >> >> most probably == hlist_del_rcu(pid_chain) >> >> > : Call Trace: >> > : [] ? release_task+0x152/0x2e5 >> > : [] ? do_wait+0x6c7/0xa1c >> > : [] ? default_wake_function+0x0/0xe >> > : [] ? sys_rt_sigaction+0x7a/0x98 >> > : [] ? sys_wait4+0x8a/0xa1 >> > : [] ? system_call_after_swapgs+0x7b/0x80 >> >> (Can't understand why there is no detach_pid() in this stack trace, >> but it is the only possible caller of free_pid()). That is just because of tail call optimization. There is no need to return to detach_pid so we just jumped to free_pid. > >> So, detach_pid()->free_pid() hit an already unhashed pid. But this >> is not possible? >> >> This means we already did detach_pid(), but in that case the previous >> detach_pid() has set task->pids[].pid = NULL, and we should OOPS earlier, >> somewhere at "if (!hlist_empty(&pid->tasks[tmp]))". > > Yes, this is not possible. > > But what possible is: we have the unbalanced put_pid(pid) which frees the > live pid. The next alloc_pid() gets the same memory, and initializes all > pid->tasks[] lists. > > Now, if that pid was used as PIDTYPE_PGID/PIDTYPE_SID, the next detach_pid() > from this pgrp/sid sees that the pid is not used (all lists are hlist_empty), > frees this pid again, and the bug manifests itself this way. > > tiocspgrp: > > put_pid(real_tty->pgrp); > // -- WINDOW -- > real_tty->pgrp = get_pid(pgrp); > > When bash spawns the command, both parent and child do ioctl(TIOCSPGRP,child), > and it is possible that both do put_pid() on the same parent's pid. > > Damn, when you know what the bug is, the test case is trivial: > > $ while true; do perl -e0; done > > The kernel crashes. > >>From 2.6.25-rc2-mm1.bz2 patch: >> >> - .ioctl = tty_ioctl, >> + .unlocked_ioctl = tty_ioctl, > > and this is why this didn't happen before, I guess. > >> I'll try to think more about this, but I doubt very much I'll find the >> reason :( > > Ohhh... 7 (!!!) hours of hacking + some vodka did the trick. > > (Kamalesh, I think you hit the same bug). Thanks. Looks like we need to grab a lock there. At a quick skim I think we need the tty lock. Eric -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] do_signal_stop: use signal_group_exit()
On 02/16, Oleg Nesterov wrote: > > On 02/15, Andrew Morton wrote: > > > > ug. On about the fourth boot with the current -mm lineup I hit: > > > > : BUG: unable to handle kernel paging request at 00200200 > > == LIST_POISON2 > > > : IP: [] free_pid+0x35/0x8e > > most probably == hlist_del_rcu(pid_chain) > > > : Call Trace: > > : [] ? release_task+0x152/0x2e5 > > : [] ? do_wait+0x6c7/0xa1c > > : [] ? default_wake_function+0x0/0xe > > : [] ? sys_rt_sigaction+0x7a/0x98 > > : [] ? sys_wait4+0x8a/0xa1 > > : [] ? system_call_after_swapgs+0x7b/0x80 > > (Can't understand why there is no detach_pid() in this stack trace, > but it is the only possible caller of free_pid()). > > So, detach_pid()->free_pid() hit an already unhashed pid. But this > is not possible? > > This means we already did detach_pid(), but in that case the previous > detach_pid() has set task->pids[].pid = NULL, and we should OOPS earlier, > somewhere at "if (!hlist_empty(&pid->tasks[tmp]))". Yes, this is not possible. But what possible is: we have the unbalanced put_pid(pid) which frees the live pid. The next alloc_pid() gets the same memory, and initializes all pid->tasks[] lists. Now, if that pid was used as PIDTYPE_PGID/PIDTYPE_SID, the next detach_pid() from this pgrp/sid sees that the pid is not used (all lists are hlist_empty), frees this pid again, and the bug manifests itself this way. tiocspgrp: put_pid(real_tty->pgrp); // -- WINDOW -- real_tty->pgrp = get_pid(pgrp); When bash spawns the command, both parent and child do ioctl(TIOCSPGRP,child), and it is possible that both do put_pid() on the same parent's pid. Damn, when you know what the bug is, the test case is trivial: $ while true; do perl -e0; done The kernel crashes. >From 2.6.25-rc2-mm1.bz2 patch: > > - .ioctl = tty_ioctl, > + .unlocked_ioctl = tty_ioctl, and this is why this didn't happen before, I guess. > I'll try to think more about this, but I doubt very much I'll find the > reason :( Ohhh... 7 (!!!) hours of hacking + some vodka did the trick. (Kamalesh, I think you hit the same bug). Oleg. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] do_signal_stop: use signal_group_exit()
On 02/15, Andrew Morton wrote: > > ug. On about the fourth boot with the current -mm lineup I hit: > > : BUG: unable to handle kernel paging request at 00200200 == LIST_POISON2 > : IP: [] free_pid+0x35/0x8e most probably == hlist_del_rcu(pid_chain) > : PGD 2574cb067 PUD 257561067 PMD 0 > : Oops: 0002 [1] SMP > : last sysfs file: /sys/class/net/eth0/address > : CPU 2 > : Modules linked in: ipv6 dm_mirror dm_multipath dm_mod sbs sbshc dock > battery ac parport_pc lp parport snd_hda_intel snd_seq_dummy snd_seq_oss > snd_seq_midi_event snd_seq shpchp snd_seq_device sg snd_pcm_oss snd_mixer_oss > snd_pcm floppy snd_timer button i2c_i801 snd soundcore ide_cd_mod cdrom > serio_raw i2c_core snd_page_alloc pcspkr ehci_hcd ohci_hcd uhci_hcd > : Pid: 3132, comm: ifup-eth Not tainted 2.6.25-rc2-mm1 #5 > : RIP: 0010:[] [] free_pid+0x35/0x8e > : RSP: 0018:81025754de58 EFLAGS: 00010046 > : RAX: RBX: 81025f268840 RCX: 81025f263f08 > : RDX: 00200200 RSI: 0046 RDI: > : RBP: 81025f263ec0 R08: 81025f268b18 R09: 81025f268b08 > : R10: 81025f268b08 R11: R12: 810259853140 > : R13: 0c78 R14: R15: > : FS: 7f8f9ba7d6f0() GS:81025f16f0c0() knlGS: > : CS: 0010 DS: ES: CR0: 8005003b > : CR2: 00200200 CR3: 0002598d CR4: 06e0 > : DR0: DR1: DR2: > : DR3: DR6: 0ff0 DR7: 0400 > : Process ifup-eth (pid: 3132, threadinfo 81025754c000, task > 81025d467620) > : Stack: 81025f268b08 81025f268840 81025994b660 80237727 > : 81025994b660 81025994b660 80237f81 > : 05d0 810257561018 7fffa3aa9514 > : Call Trace: > : [] ? release_task+0x152/0x2e5 > : [] ? do_wait+0x6c7/0xa1c > : [] ? default_wake_function+0x0/0xe > : [] ? sys_rt_sigaction+0x7a/0x98 > : [] ? sys_wait4+0x8a/0xa1 > : [] ? system_call_after_swapgs+0x7b/0x80 (Can't understand why there is no detach_pid() in this stack trace, but it is the only possible caller of free_pid()). So, detach_pid()->free_pid() hit an already unhashed pid. But this is not possible? This means we already did detach_pid(), but in that case the previous detach_pid() has set task->pids[].pid = NULL, and we should OOPS earlier, somewhere at "if (!hlist_empty(&pid->tasks[tmp]))". > and I don't have a clue which patch caused it and I won't be near this > machine again for over a week. Definitely not this patch... I'll try to think more about this, but I doubt very much I'll find the reason :( Oleg. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] do_signal_stop: use signal_group_exit()
ug. On about the fourth boot with the current -mm lineup I hit: : security: permission sendto in class node not defined in policy : security: permission dccp_recv in class netif not defined in policy : security: permission dccp_send in class netif not defined in policy : security: permission ingress in class netif not defined in policy : security: permission egress in class netif not defined in policy : security: permission setfcap in class capability not defined in policy : security: permission forward_in in class packet not defined in policy : security: permission forward_out in class packet not defined in policy : SELinux: policy loaded with handle_unknown=deny : type=1403 audit(1203124656.152:3): policy loaded auid=4294967295 ses=4294967295 : BUG: unable to handle kernel paging request at 00200200 : IP: [] free_pid+0x35/0x8e : PGD 2574cb067 PUD 257561067 PMD 0 : Oops: 0002 [1] SMP : last sysfs file: /sys/class/net/eth0/address : CPU 2 : Modules linked in: ipv6 dm_mirror dm_multipath dm_mod sbs sbshc dock battery ac parport_pc lp parport snd_hda_intel snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq shpchp snd_seq_device sg snd_pcm_oss snd_mixer_oss snd_pcm floppy snd_timer button i2c_i801 snd soundcore ide_cd_mod cdrom serio_raw i2c_core snd_page_alloc pcspkr ehci_hcd ohci_hcd uhci_hcd : Pid: 3132, comm: ifup-eth Not tainted 2.6.25-rc2-mm1 #5 : RIP: 0010:[] [] free_pid+0x35/0x8e : RSP: 0018:81025754de58 EFLAGS: 00010046 : RAX: RBX: 81025f268840 RCX: 81025f263f08 : RDX: 00200200 RSI: 0046 RDI: : RBP: 81025f263ec0 R08: 81025f268b18 R09: 81025f268b08 : R10: 81025f268b08 R11: R12: 810259853140 : R13: 0c78 R14: R15: : FS: 7f8f9ba7d6f0() GS:81025f16f0c0() knlGS: : CS: 0010 DS: ES: CR0: 8005003b : CR2: 00200200 CR3: 0002598d CR4: 06e0 : DR0: DR1: DR2: : DR3: DR6: 0ff0 DR7: 0400 : Process ifup-eth (pid: 3132, threadinfo 81025754c000, task 81025d467620) : Stack: 81025f268b08 81025f268840 81025994b660 80237727 : 81025994b660 81025994b660 80237f81 : 05d0 810257561018 7fffa3aa9514 : Call Trace: : [] ? release_task+0x152/0x2e5 : [] ? do_wait+0x6c7/0xa1c : [] ? default_wake_function+0x0/0xe : [] ? sys_rt_sigaction+0x7a/0x98 : [] ? sys_wait4+0x8a/0xa1 : [] ? system_call_after_swapgs+0x7b/0x80 : : : Code: 80 53 41 51 e8 83 d4 28 00 31 ff 48 89 c6 eb 2e 48 63 c7 48 c1 e0 05 48 8d 44 28 40 48 8d 48 08 48 8b 40 08 48 8b 51 08 48 85 c0 <48> 89 02 74 04 48 89 50 08 48 c7 41 08 00 02 20 00 ff c7 3b 7d : RIP [] free_pid+0x35/0x8e : RSP : CR2: 00200200 : ---[ end trace efde415d3f801416 ]--- : BUG: sleeping function called from invalid context at kernel/rwsem.c:21 : in_atomic():0, irqs_disabled():1 : Pid: 3132, comm: ifup-eth Tainted: G D 2.6.25-rc2-mm1 #5 : : Call Trace: : [] down_read+0x15/0x23 : [] acct_collect+0x40/0x180 : [] do_exit+0x1f3/0x6ba : [] sync_regs+0x0/0x67 : [] do_page_fault+0x755/0x80e : [] error_exit+0x0/0x51 : [] free_pid+0x35/0x8e : [] free_pid+0x13/0x8e : [] release_task+0x152/0x2e5 : [] do_wait+0x6c7/0xa1c : [] default_wake_function+0x0/0xe : [] sys_rt_sigaction+0x7a/0x98 : [] sys_wait4+0x8a/0xa1 : [] system_call_after_swapgs+0x7b/0x80 : : eth0: no IPv6 routers present : and I don't have a clue which patch caused it and I won't be near this machine again for over a week. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] do_signal_stop: use signal_group_exit()
do_signal_stop() needs signal_group_exit() but checks sig->group_exit_task. This (optimization) is correct, SIGNAL_STOP_DEQUEUED and SIGNAL_GROUP_EXIT are mutually exclusive, but looks confusing. Use signal_group_exit(), this is not fastpath, the code clarity is more important. Signed-off-by: Oleg Nesterov <[EMAIL PROTECTED]> --- 25/kernel/signal.c~6_DO_SIGNAL_STOP 2008-02-15 16:59:17.0 +0300 +++ 25/kernel/signal.c 2008-02-15 20:34:32.0 +0300 @@ -1718,7 +1718,7 @@ static int do_signal_stop(int signr) struct task_struct *t; if (!likely(sig->flags & SIGNAL_STOP_DEQUEUED) || - unlikely(sig->group_exit_task)) + unlikely(signal_group_exit(sig))) return 0; /* * There is no group stop already in progress. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/