Re: [PATCH] do_signal_stop: use signal_group_exit()

2008-02-19 Thread Eric W. Biederman
Oleg Nesterov <[EMAIL PROTECTED]> writes:

>
>>From 2.6.25-rc2-mm1.bz2 patch:
>>
>> -   .ioctl  = tty_ioctl,
>> +   .unlocked_ioctl = tty_ioctl,
>
> and this is why this didn't happen before, I guess.

Right.  Does anyone know what kind of audit was made of the tty code
to ensure everything would be fine?

It it was pretty thorough and it was just this little corner case
it makes sense to get the struct tty locking for pids correct.

Otherwise since the tty layer is historically not especially good
with it's locking. We should just revert the change above, and save
attacking this until someone has time to do a thorough review.

Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] do_signal_stop: use signal_group_exit()

2008-02-19 Thread Valdis . Kletnieks
On Sun, 17 Feb 2008 21:11:14 MST, Eric W. Biederman said:
> Oleg Nesterov <[EMAIL PROTECTED]> writes:
> > On 02/16, Oleg Nesterov wrote:
> >> On 02/15, Andrew Morton wrote:
> >> > : BUG: unable to handle kernel paging request at 00200200

> >> > : Call Trace:
> >> > :  [] ? release_task+0x152/0x2e5
> >> > :  [] ? do_wait+0x6c7/0xa1c
> >> > :  [] ? default_wake_function+0x0/0xe
> >> > :  [] ? sys_rt_sigaction+0x7a/0x98
> >> > :  [] ? sys_wait4+0x8a/0xa1
> >> > :  [] ? system_call_after_swapgs+0x7b/0x80

> Thanks.  Looks like we need to grab a lock there.
> At a quick skim I think we need the tty lock.

*ping* - Any further activity on this one?  I got bit by it as well on
the very first attempted boot of 25-rc2-mm1, the instant it tried to leave
single-user and go multi-user.


pgpgv8e1NFpVN.pgp
Description: PGP signature


Re: [PATCH] do_signal_stop: use signal_group_exit()

2008-02-17 Thread Eric W. Biederman
Oleg Nesterov <[EMAIL PROTECTED]> writes:

> On 02/16, Oleg Nesterov wrote:
>>
>> On 02/15, Andrew Morton wrote:
>> >
>> > ug.  On about the fourth boot with the current -mm lineup I hit:
>> >
>> > : BUG: unable to handle kernel paging request at 00200200
>>
>> == LIST_POISON2
>> 
>> > : IP: [] free_pid+0x35/0x8e
>> 
>> most probably == hlist_del_rcu(pid_chain)
>> 
>> > : Call Trace:
>> > :  [] ? release_task+0x152/0x2e5
>> > :  [] ? do_wait+0x6c7/0xa1c
>> > :  [] ? default_wake_function+0x0/0xe
>> > :  [] ? sys_rt_sigaction+0x7a/0x98
>> > :  [] ? sys_wait4+0x8a/0xa1
>> > :  [] ? system_call_after_swapgs+0x7b/0x80
>> 
>> (Can't understand why there is no detach_pid() in this stack trace,
>>  but it is the only possible caller of free_pid()).

That is just because of tail call optimization.  There is no need
to return to detach_pid so we just jumped to free_pid.

>
>> So, detach_pid()->free_pid() hit an already unhashed pid. But this
>> is not possible?
>> 
>> This means we already did detach_pid(), but in that case the previous
>> detach_pid() has set task->pids[].pid = NULL, and we should OOPS earlier,
>> somewhere at "if (!hlist_empty(&pid->tasks[tmp]))".
>
> Yes, this is not possible.
>
> But what possible is: we have the unbalanced put_pid(pid) which frees the
> live pid. The next alloc_pid() gets the same memory, and initializes all
> pid->tasks[] lists.
>
> Now, if that pid was used as PIDTYPE_PGID/PIDTYPE_SID, the next detach_pid()
> from this pgrp/sid sees that the pid is not used (all lists are hlist_empty),
> frees this pid again, and the bug manifests itself this way.
>
>   tiocspgrp:
>
>   put_pid(real_tty->pgrp);
>   // -- WINDOW --
>   real_tty->pgrp = get_pid(pgrp);
>
> When bash spawns the command, both parent and child do ioctl(TIOCSPGRP,child),
> and it is possible that both do put_pid() on the same parent's pid.
>
> Damn, when you know what the bug is, the test case is trivial:
>
>   $ while true; do perl -e0; done
>
> The kernel crashes.
>
>>From 2.6.25-rc2-mm1.bz2 patch:
>>
>> -   .ioctl  = tty_ioctl,
>> +   .unlocked_ioctl = tty_ioctl,
>
> and this is why this didn't happen before, I guess.
>
>> I'll try to think more about this, but I doubt very much I'll find the
>> reason :(
>
> Ohhh... 7 (!!!) hours of hacking + some vodka did the trick.
>
> (Kamalesh, I think you hit the same bug).

Thanks.  Looks like we need to grab a lock there.
At a quick skim I think we need the tty lock.


Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] do_signal_stop: use signal_group_exit()

2008-02-17 Thread Oleg Nesterov
On 02/16, Oleg Nesterov wrote:
>
> On 02/15, Andrew Morton wrote:
> >
> > ug.  On about the fourth boot with the current -mm lineup I hit:
> >
> > : BUG: unable to handle kernel paging request at 00200200
>
> == LIST_POISON2
> 
> > : IP: [] free_pid+0x35/0x8e
> 
> most probably == hlist_del_rcu(pid_chain)
> 
> > : Call Trace:
> > :  [] ? release_task+0x152/0x2e5
> > :  [] ? do_wait+0x6c7/0xa1c
> > :  [] ? default_wake_function+0x0/0xe
> > :  [] ? sys_rt_sigaction+0x7a/0x98
> > :  [] ? sys_wait4+0x8a/0xa1
> > :  [] ? system_call_after_swapgs+0x7b/0x80
> 
> (Can't understand why there is no detach_pid() in this stack trace,
>  but it is the only possible caller of free_pid()).
> 
> So, detach_pid()->free_pid() hit an already unhashed pid. But this
> is not possible?
> 
> This means we already did detach_pid(), but in that case the previous
> detach_pid() has set task->pids[].pid = NULL, and we should OOPS earlier,
> somewhere at "if (!hlist_empty(&pid->tasks[tmp]))".

Yes, this is not possible.

But what possible is: we have the unbalanced put_pid(pid) which frees the
live pid. The next alloc_pid() gets the same memory, and initializes all
pid->tasks[] lists.

Now, if that pid was used as PIDTYPE_PGID/PIDTYPE_SID, the next detach_pid()
from this pgrp/sid sees that the pid is not used (all lists are hlist_empty),
frees this pid again, and the bug manifests itself this way.

tiocspgrp:

put_pid(real_tty->pgrp);
// -- WINDOW --
real_tty->pgrp = get_pid(pgrp);

When bash spawns the command, both parent and child do ioctl(TIOCSPGRP,child),
and it is possible that both do put_pid() on the same parent's pid.

Damn, when you know what the bug is, the test case is trivial:

$ while true; do perl -e0; done

The kernel crashes.

>From 2.6.25-rc2-mm1.bz2 patch:
>
> -   .ioctl  = tty_ioctl,
> +   .unlocked_ioctl = tty_ioctl,

and this is why this didn't happen before, I guess.

> I'll try to think more about this, but I doubt very much I'll find the
> reason :(

Ohhh... 7 (!!!) hours of hacking + some vodka did the trick.

(Kamalesh, I think you hit the same bug).

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] do_signal_stop: use signal_group_exit()

2008-02-16 Thread Oleg Nesterov
On 02/15, Andrew Morton wrote:
>
> ug.  On about the fourth boot with the current -mm lineup I hit:
>
> : BUG: unable to handle kernel paging request at 00200200
   
== LIST_POISON2

> : IP: [] free_pid+0x35/0x8e

most probably == hlist_del_rcu(pid_chain)

> : PGD 2574cb067 PUD 257561067 PMD 0 
> : Oops: 0002 [1] SMP 
> : last sysfs file: /sys/class/net/eth0/address
> : CPU 2 
> : Modules linked in: ipv6 dm_mirror dm_multipath dm_mod sbs sbshc dock 
> battery ac parport_pc lp parport snd_hda_intel snd_seq_dummy snd_seq_oss 
> snd_seq_midi_event snd_seq shpchp snd_seq_device sg snd_pcm_oss snd_mixer_oss 
> snd_pcm floppy snd_timer button i2c_i801 snd soundcore ide_cd_mod cdrom 
> serio_raw i2c_core snd_page_alloc pcspkr ehci_hcd ohci_hcd uhci_hcd
> : Pid: 3132, comm: ifup-eth Not tainted 2.6.25-rc2-mm1 #5
> : RIP: 0010:[]  [] free_pid+0x35/0x8e
> : RSP: 0018:81025754de58  EFLAGS: 00010046
> : RAX:  RBX: 81025f268840 RCX: 81025f263f08
> : RDX: 00200200 RSI: 0046 RDI: 
> : RBP: 81025f263ec0 R08: 81025f268b18 R09: 81025f268b08
> : R10: 81025f268b08 R11:  R12: 810259853140
> : R13: 0c78 R14:  R15: 
> : FS:  7f8f9ba7d6f0() GS:81025f16f0c0() knlGS:
> : CS:  0010 DS:  ES:  CR0: 8005003b
> : CR2: 00200200 CR3: 0002598d CR4: 06e0
> : DR0:  DR1:  DR2: 
> : DR3:  DR6: 0ff0 DR7: 0400
> : Process ifup-eth (pid: 3132, threadinfo 81025754c000, task 
> 81025d467620)
> : Stack:  81025f268b08 81025f268840 81025994b660 80237727
> :  81025994b660 81025994b660  80237f81
> :  05d0 810257561018  7fffa3aa9514
> : Call Trace:
> :  [] ? release_task+0x152/0x2e5
> :  [] ? do_wait+0x6c7/0xa1c
> :  [] ? default_wake_function+0x0/0xe
> :  [] ? sys_rt_sigaction+0x7a/0x98
> :  [] ? sys_wait4+0x8a/0xa1
> :  [] ? system_call_after_swapgs+0x7b/0x80

(Can't understand why there is no detach_pid() in this stack trace,
 but it is the only possible caller of free_pid()).

So, detach_pid()->free_pid() hit an already unhashed pid. But this
is not possible?

This means we already did detach_pid(), but in that case the previous
detach_pid() has set task->pids[].pid = NULL, and we should OOPS earlier,
somewhere at "if (!hlist_empty(&pid->tasks[tmp]))".

> and I don't have a clue which patch caused it and I won't be near this
> machine again for over a week.

Definitely not this patch...

I'll try to think more about this, but I doubt very much I'll find the
reason :(

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] do_signal_stop: use signal_group_exit()

2008-02-15 Thread Andrew Morton
ug.  On about the fourth boot with the current -mm lineup I hit:


: security:  permission sendto in class node not defined in policy
: security:  permission dccp_recv in class netif not defined in policy
: security:  permission dccp_send in class netif not defined in policy
: security:  permission ingress in class netif not defined in policy
: security:  permission egress in class netif not defined in policy
: security:  permission setfcap in class capability not defined in policy
: security:  permission forward_in in class packet not defined in policy
: security:  permission forward_out in class packet not defined in policy
: SELinux: policy loaded with handle_unknown=deny
: type=1403 audit(1203124656.152:3): policy loaded auid=4294967295 
ses=4294967295
: BUG: unable to handle kernel paging request at 00200200
: IP: [] free_pid+0x35/0x8e
: PGD 2574cb067 PUD 257561067 PMD 0 
: Oops: 0002 [1] SMP 
: last sysfs file: /sys/class/net/eth0/address
: CPU 2 
: Modules linked in: ipv6 dm_mirror dm_multipath dm_mod sbs sbshc dock battery 
ac parport_pc lp parport snd_hda_intel snd_seq_dummy snd_seq_oss 
snd_seq_midi_event snd_seq shpchp snd_seq_device sg snd_pcm_oss snd_mixer_oss 
snd_pcm floppy snd_timer button i2c_i801 snd soundcore ide_cd_mod cdrom 
serio_raw i2c_core snd_page_alloc pcspkr ehci_hcd ohci_hcd uhci_hcd
: Pid: 3132, comm: ifup-eth Not tainted 2.6.25-rc2-mm1 #5
: RIP: 0010:[]  [] free_pid+0x35/0x8e
: RSP: 0018:81025754de58  EFLAGS: 00010046
: RAX:  RBX: 81025f268840 RCX: 81025f263f08
: RDX: 00200200 RSI: 0046 RDI: 
: RBP: 81025f263ec0 R08: 81025f268b18 R09: 81025f268b08
: R10: 81025f268b08 R11:  R12: 810259853140
: R13: 0c78 R14:  R15: 
: FS:  7f8f9ba7d6f0() GS:81025f16f0c0() knlGS:
: CS:  0010 DS:  ES:  CR0: 8005003b
: CR2: 00200200 CR3: 0002598d CR4: 06e0
: DR0:  DR1:  DR2: 
: DR3:  DR6: 0ff0 DR7: 0400
: Process ifup-eth (pid: 3132, threadinfo 81025754c000, task 
81025d467620)
: Stack:  81025f268b08 81025f268840 81025994b660 80237727
:  81025994b660 81025994b660  80237f81
:  05d0 810257561018  7fffa3aa9514
: Call Trace:
:  [] ? release_task+0x152/0x2e5
:  [] ? do_wait+0x6c7/0xa1c
:  [] ? default_wake_function+0x0/0xe
:  [] ? sys_rt_sigaction+0x7a/0x98
:  [] ? sys_wait4+0x8a/0xa1
:  [] ? system_call_after_swapgs+0x7b/0x80
: 
: 
: Code: 80 53 41 51 e8 83 d4 28 00 31 ff 48 89 c6 eb 2e 48 63 c7 48 c1 e0 05 48 
8d 44 28 40 48 8d 48 08 48 8b 40 08 48 8b 51 08 48 85 c0 <48> 89 02 74 04 48 89 
50 08 48 c7 41 08 00 02 20 00 ff c7 3b 7d 
: RIP  [] free_pid+0x35/0x8e
:  RSP 
: CR2: 00200200
: ---[ end trace efde415d3f801416 ]---
: BUG: sleeping function called from invalid context at kernel/rwsem.c:21
: in_atomic():0, irqs_disabled():1
: Pid: 3132, comm: ifup-eth Tainted: G  D   2.6.25-rc2-mm1 #5
: 
: Call Trace:
:  [] down_read+0x15/0x23
:  [] acct_collect+0x40/0x180
:  [] do_exit+0x1f3/0x6ba
:  [] sync_regs+0x0/0x67
:  [] do_page_fault+0x755/0x80e
:  [] error_exit+0x0/0x51
:  [] free_pid+0x35/0x8e
:  [] free_pid+0x13/0x8e
:  [] release_task+0x152/0x2e5
:  [] do_wait+0x6c7/0xa1c
:  [] default_wake_function+0x0/0xe
:  [] sys_rt_sigaction+0x7a/0x98
:  [] sys_wait4+0x8a/0xa1
:  [] system_call_after_swapgs+0x7b/0x80
: 
: eth0: no IPv6 routers present
: 

and I don't have a clue which patch caused it and I won't be near this
machine again for over a week.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] do_signal_stop: use signal_group_exit()

2008-02-15 Thread Oleg Nesterov
do_signal_stop() needs signal_group_exit() but checks sig->group_exit_task.
This (optimization) is correct, SIGNAL_STOP_DEQUEUED and SIGNAL_GROUP_EXIT
are mutually exclusive, but looks confusing. Use signal_group_exit(), this
is not fastpath, the code clarity is more important.

Signed-off-by: Oleg Nesterov <[EMAIL PROTECTED]>

--- 25/kernel/signal.c~6_DO_SIGNAL_STOP 2008-02-15 16:59:17.0 +0300
+++ 25/kernel/signal.c  2008-02-15 20:34:32.0 +0300
@@ -1718,7 +1718,7 @@ static int do_signal_stop(int signr)
struct task_struct *t;
 
if (!likely(sig->flags & SIGNAL_STOP_DEQUEUED) ||
-   unlikely(sig->group_exit_task))
+   unlikely(signal_group_exit(sig)))
return 0;
/*
 * There is no group stop already in progress.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/