Re: [patch] Re: [RFC GIT PULL] scheduler fix for autogroups

2012-12-02 Thread Ingo Molnar

* Mike Galbraith <efa...@gmx.de> wrote:

> > Willing to write a changelog with the pointer to the actual 
> > oops that happens due to this issue?
> 
> I don't have a link, so reproduced/captured it.  With 
> systemd-sysvinit (bleh) installed, it's trivial to reproduce:
> 
> Add echo 0 > /proc/sys/kernel/sched_autogroup_enabled to /root/.bashrc
> (or wherever), boot box, type reboot, box explodes.
> 
> revert 800d4d30 sched, autogroup: Stop going ahead if autogroup is disabled
> 
> Between 8323f26ce and 800d4d30, autogroup is a wreck.  With both

Slightly decoded, for our human readers:

 8323f26ce342 ("sched: Fix race in task_group()")

:-)

> applied, all you have to do to crash a box is disable autogroup
> during boot up, then reboot.. boom, NULL pointer dereference due
> to 800d4d30 not allowing autogroup to move things, and 8323f26ce
> making that the only way to switch runqueues.
> 
> [  202.187747] BUG: unable to handle kernel NULL pointer dereference at   
> (null)
> [  202.191644] IP: [81063ac0] effective_load.isra.43+0x50/0x90
> [  202.191644] PGD 220a74067 PUD 220402067 PMD 0 
> [  202.191644] Oops:  [#1] SMP 
> [  202.191644] Modules linked in: nfs nfsd fscache lockd nfs_acl auth_rpcgss 
> sunrpc exportfs bridge stp cpufreq_conservative cpufreq_ondemand 
> cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf ext3 jbd fuse 
> nls_iso8859_1 snd_hda_codec_realtek nls_cp437 snd_hda_intel vfat fat 
> snd_hda_codec e1000e sr_mod snd_hwdep cdrom snd_pcm sg snd_timer usb_storage 
> snd firewire_ohci usb_libusual firewire_core soundcore uas snd_page_alloc 
> i2c_i801 coretemp edd microcode hid_generic button crc_itu_t ipv6 autofs4 
> ext4 mbcache jbd2 crc16 usbhid hid sd_mod uhci_hcd ahci libahci libata 
> rtc_cmos ehci_hcd scsi_mod thermal fan usbcore processor usb_common
> [  202.191644] CPU 0 
> [  202.191644] Pid: 7047, comm: systemd-user-se Not tainted 3.6.8-smp #7 
> MEDIONPC MS-7502/MS-7502
> [  202.191644] RIP: 0010:[81063ac0]  [81063ac0]
> effective_load.isra.43+0x50/0x90
> [  202.191644] RSP: 0018:880221ddfbd8  EFLAGS: 00010086
> [  202.191644] RAX: 0400 RBX: 88022621d880 RCX: 
> 
> [  202.191644] RDX:  RSI: 0002 RDI: 
> 880220a363a0
> [  202.191644] RBP: 880221ddfbd8 R08: 0400 R09: 
> 000115c0
> [  202.191644] R10:  R11: 0400 R12: 
> 8802214ed180
> [  202.191644] R13: 03fd R14:  R15: 
> 0003
> [  202.191644] FS:  7f174a81c7a0() GS:88022fc0() 
> knlGS:
> [  202.191644] CS:  0010 DS:  ES:  CR0: 80050033
> [  202.191644] CR2:  CR3: 000221fad000 CR4: 
> 07f0
> [  202.191644] DR0:  DR1:  DR2: 
> 
> [  202.191644] DR3:  DR6: 0ff0 DR7: 
> 0400
> [  202.191644] Process systemd-user-se (pid: 7047, threadinfo 
> 880221dde000, task 88022618b3a0)
> [  202.191644] Stack:
> [  202.191644]  880221ddfc88 81063d55 0400 
> 000115c0
> [  202.191644]  88022235c218 814ef9e8 ea00 
> 88022621d880
> [  202.191644]  880227007200 0003 0010 
> 00018f38
> [  202.191644] Call Trace:
> [  202.191644]  [81063d55] select_task_rq_fair+0x255/0x780
> [  202.191644]  [810607e6] try_to_wake_up+0x156/0x2c0
> [  202.191644]  [8106098b] wake_up_state+0xb/0x10
> [  202.191644]  [81044f88] signal_wake_up+0x28/0x40
> [  202.191644]  [81045406] complete_signal+0x1d6/0x250
> [  202.191644]  [810455f0] __send_signal+0x170/0x310
> [  202.191644]  [810457d0] send_signal+0x40/0x80
> [  202.191644]  [81046257] do_send_sig_info+0x47/0x90
> [  202.191644]  [8104649a] group_send_sig_info+0x4a/0x70
> [  202.191644]  [810465ba] kill_pid_info+0x3a/0x60
> [  202.191644]  [81047ac7] sys_kill+0x97/0x1a0
> [  202.191644]  [810ebc10] ? vfs_read+0x120/0x160
> [  202.191644]  [810ebc95] ? sys_read+0x45/0x90
> [  202.191644]  [8134bde2] system_call_fastpath+0x16/0x1b
> [  202.191644] Code: 49 0f af 41 50 31 d2 49 f7 f0 48 83 f8 01 48 0f 46 c6 48 
> 2b 07 48 8b bf 40 01 00 00 48 85 ff 74 3a 45 31 c0 48 8b 8f 50 01 00 00 <48> 
> 8b 11 4c 8b 89 80 00 00 00 49 89 d2 48 01 d0 45 8b 59 58 4c 
> [  202.191644] RIP  [81063ac0] effective_load.isra.43+0x50/0x90
> [  202.191644]  RSP 880221ddfbd8
> [  202.191644] CR2: 
> 
> Signed-off-by: Mike Galbraith <efa...@gmx.de>
> Cc: Yong Zhang <yong.zha...@gmail.com>
> Cc: sta...@vger.kernel.org

Thanks Mike!

Acked-by: Ingo Molnar <mi...@kernel.org>

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
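
The function entries in the oops above have the form symbol+offset/size: the faulting instruction sits 0x50 bytes into effective_load.isra.43, which is 0x90 bytes long. A minimal C sketch of splitting such an entry, assuming exactly that layout. The names struct oops_location and parse_oops_location() are hypothetical, chosen for illustration, and are not part of any kernel tool.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for one "symbol+offset/size" entry. */
struct oops_location {
    char symbol[128];      /* e.g. "effective_load.isra.43" */
    unsigned long offset;  /* bytes from the start of the function */
    unsigned long size;    /* total size of the function in bytes */
};

/*
 * Parse text such as "effective_load.isra.43+0x50/0x90".
 * Returns 0 on success and -1 if the text does not match the layout.
 */
int parse_oops_location(const char *text, struct oops_location *out)
{
    assert(text != NULL && out != NULL);

    /* %lx accepts an optional "0x" prefix, like strtoul() with base 16. */
    int fields = sscanf(text, " %127[^+ ]+%lx/%lx",
                        out->symbol, &out->offset, &out->size);
    return fields == 3 ? 0 : -1;
}
```

Given the offset, a disassembly of the function (for example from objdump on a matching vmlinux) can be used to find the instruction that dereferenced the NULL pointer.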


[patch] Re: [RFC GIT PULL] scheduler fix for autogroups

2012-12-02 Thread Mike Galbraith
On Sun, 2012-12-02 at 11:36 -0800, Linus Torvalds wrote: 
> On Sun, Dec 2, 2012 at 11:27 AM, Ingo Molnar <mi...@kernel.org> wrote:
> >
> > * Mike Galbraith <efa...@gmx.de> wrote:
> >
> >> On Sat, 2012-12-01 at 22:44 +0100, Ingo Molnar wrote:
> >>
> >> > Should we use some other file for that - or no file at all and
> >> > just emit a bootup printk for kernel hackers with a short
> >> > attention span?
> >>
> >> Or, whack the file and don't bother with a printk either.  If
> >> it's in your config, and your command line doesn't contain
> >> noautogroup, it's on, so the info is already present (until
> >> buffer gets full).  That makes for even fewer lines dedicated
> >> to dinky sideline feature.
> >>
> >> Or (as previously mentioned) just depreciate (or rip out) the
> >> whole thing since systemd is propagating everywhere anyway,
> >> and offers the same functionality.
> >>
> >> For 3.7, a revert of 800d4d30c8f2 would prevent the explosion
> >> when folks play with the now non-functional on/off switch
> >> (task groups are required to _always_ exist, that commit
> >> busted the autogroup assumption), so is perhaps a viable
> >> quickfix until autogroups fate is decided?
> >
> > Linus, which one would be your preference? I'm fine with the
> > first and third options - #2 that rips it all out looks like
> > a sad removal of an otherwise useful feature.
> 
> I suspect #3 is the best option right now - just revert 800d4d30c8f2.
> 
> Willing to write a changelog with the pointer to the actual oops that
> happens due to this issue?

I don't have a link, so reproduced/captured it.  With systemd-sysvinit
(bleh) installed, it's trivial to reproduce:

Add echo 0 > /proc/sys/kernel/sched_autogroup_enabled to /root/.bashrc
(or wherever), boot box, type reboot, box explodes.

revert 800d4d30 sched, autogroup: Stop going ahead if autogroup is disabled

Between 8323f26ce and 800d4d30, autogroup is a wreck.  With both
applied, all you have to do to crash a box is disable autogroup
during boot up, then reboot.. boom, NULL pointer dereference due
to 800d4d30 not allowing autogroup to move things, and 8323f26ce
making that the only way to switch runqueues.

[  202.187747] BUG: unable to handle kernel NULL pointer dereference at 
  (null)
[  202.191644] IP: [81063ac0] effective_load.isra.43+0x50/0x90
[  202.191644] PGD 220a74067 PUD 220402067 PMD 0 
[  202.191644] Oops:  [#1] SMP 
[  202.191644] Modules linked in: nfs nfsd fscache lockd nfs_acl auth_rpcgss 
sunrpc exportfs bridge stp cpufreq_conservative cpufreq_ondemand 
cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf ext3 jbd fuse 
nls_iso8859_1 snd_hda_codec_realtek nls_cp437 snd_hda_intel vfat fat 
snd_hda_codec e1000e sr_mod snd_hwdep cdrom snd_pcm sg snd_timer usb_storage 
snd firewire_ohci usb_libusual firewire_core soundcore uas snd_page_alloc 
i2c_i801 coretemp edd microcode hid_generic button crc_itu_t ipv6 autofs4 ext4 
mbcache jbd2 crc16 usbhid hid sd_mod uhci_hcd ahci libahci libata rtc_cmos 
ehci_hcd scsi_mod thermal fan usbcore processor usb_common
[  202.191644] CPU 0 
[  202.191644] Pid: 7047, comm: systemd-user-se Not tainted 3.6.8-smp #7 
MEDIONPC MS-7502/MS-7502
[  202.191644] RIP: 0010:[81063ac0]  [81063ac0]
effective_load.isra.43+0x50/0x90
[  202.191644] RSP: 0018:880221ddfbd8  EFLAGS: 00010086
[  202.191644] RAX: 0400 RBX: 88022621d880 RCX: 
[  202.191644] RDX:  RSI: 0002 RDI: 880220a363a0
[  202.191644] RBP: 880221ddfbd8 R08: 0400 R09: 000115c0
[  202.191644] R10:  R11: 0400 R12: 8802214ed180
[  202.191644] R13: 03fd R14:  R15: 0003
[  202.191644] FS:  7f174a81c7a0() GS:88022fc0() 
knlGS:
[  202.191644] CS:  0010 DS:  ES:  CR0: 80050033
[  202.191644] CR2:  CR3: 000221fad000 CR4: 07f0
[  202.191644] DR0:  DR1:  DR2: 
[  202.191644] DR3:  DR6: 0ff0 DR7: 0400
[  202.191644] Process systemd-user-se (pid: 7047, threadinfo 880221dde000, 
task 88022618b3a0)
[  202.191644] Stack:
[  202.191644]  880221ddfc88 81063d55 0400 
000115c0
[  202.191644]  88022235c218 814ef9e8 ea00 
88022621d880
[  202.191644]  880227007200 0003 0010 
00018f38
[  202.191644] Call Trace:
[  202.191644]  [81063d55] select_task_rq_fair+0x255/0x780
[  202.191644]  [810607e6] try_to_wake_up+0x156/0x2c0
[  202.191644]  [8106098b] wake_up_state+0xb/0x10
[  202.191644]  [81044f88] signal_wake_up+0x28/0x40
[  202.191644]  [81045406] complete_signal+0x1d6/0x250
[  202.191644]  [810455f0] __send_signal+0x170/0x310
[  202.191644]  [810457d0] send_signal+0x40/0x80
[  202.191644]  [81046257] do_send_sig_info+0x47/0x90
[  202.191644]  [8104649a] group_send_sig_info+0x4a/0x70
[  202.191644]  [810465ba] kill_pid_info+0x3a/0x60
[  202.191644]  [81047ac7] sys_kill+0x97/0x1a0
[  

Re: [RFC GIT PULL] scheduler fix for autogroups

2012-12-02 Thread Linus Torvalds
On Sun, Dec 2, 2012 at 11:27 AM, Ingo Molnar <mi...@kernel.org> wrote:
>
> * Mike Galbraith <efa...@gmx.de> wrote:
>
>> On Sat, 2012-12-01 at 22:44 +0100, Ingo Molnar wrote:
>>
>> > Should we use some other file for that - or no file at all and
>> > just emit a bootup printk for kernel hackers with a short
>> > attention span?
>>
>> Or, whack the file and don't bother with a printk either.  If
>> it's in your config, and your command line doesn't contain
>> noautogroup, it's on, so the info is already present (until
>> buffer gets full).  That makes for even fewer lines dedicated
>> to dinky sideline feature.
>>
>> Or (as previously mentioned) just depreciate (or rip out) the
>> whole thing since systemd is propagating everywhere anyway,
>> and offers the same functionality.
>>
>> For 3.7, a revert of 800d4d30c8f2 would prevent the explosion
>> when folks play with the now non-functional on/off switch
>> (task groups are required to _always_ exist, that commit
>> busted the autogroup assumption), so is perhaps a viable
>> quickfix until autogroups fate is decided?
>
> Linus, which one would be your preference? I'm fine with the
> first and third options - #2 that rips it all out looks like
> a sad removal of an otherwise useful feature.

I suspect #3 is the best option right now - just revert 800d4d30c8f2.

Willing to write a changelog with the pointer to the actual oops that
happens due to this issue?

   Linus


Re: [RFC GIT PULL] scheduler fix for autogroups

2012-12-02 Thread Ingo Molnar

* Mike Galbraith <efa...@gmx.de> wrote:

> On Sat, 2012-12-01 at 22:44 +0100, Ingo Molnar wrote:
> 
> > Should we use some other file for that - or no file at all and 
> > just emit a bootup printk for kernel hackers with a short 
> > attention span?
> 
> Or, whack the file and don't bother with a printk either.  If 
> it's in your config, and your command line doesn't contain 
> noautogroup, it's on, so the info is already present (until 
> buffer gets full).  That makes for even fewer lines dedicated 
> to dinky sideline feature.
> 
> Or (as previously mentioned) just depreciate (or rip out) the 
> whole thing since systemd is propagating everywhere anyway, 
> and offers the same functionality.
> 
> For 3.7, a revert of 800d4d30c8f2 would prevent the explosion 
> when folks play with the now non-functional on/off switch 
> (task groups are required to _always_ exist, that commit 
> busted the autogroup assumption), so is perhaps a viable 
> quickfix until autogroups fate is decided?

Linus, which one would be your preference? I'm fine with the 
first and third options - #2 that rips it all out looks like
a sad removal of an otherwise useful feature.

( The fourth option would be to fix the dynamic knobs - there's 
  no patch for that yet. )

Thanks,

Ingo


Re: [RFC GIT PULL] scheduler fix for autogroups

2012-12-02 Thread Mike Galbraith
On Sat, 2012-12-01 at 22:44 +0100, Ingo Molnar wrote:

> Should we use some other file for that - or no file at all and 
> just emit a bootup printk for kernel hackers with a short 
> attention span?

Or, whack the file and don't bother with a printk either.  If it's in
your config, and your command line doesn't contain noautogroup, it's on,
so the info is already present (until buffer gets full).  That makes for
even fewer lines dedicated to dinky sideline feature.

Or (as previously mentioned) just depreciate (or rip out) the whole
thing since systemd is propagating everywhere anyway, and offers the
same functionality.

For 3.7, a revert of 800d4d30c8f2 would prevent the explosion when folks
play with the now non-functional on/off switch (task groups are required
to _always_ exist, that commit busted the autogroup assumption), so is
perhaps a viable quickfix until autogroups fate is decided?

-Mike
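
Mike's remark that the state can be read off the command line can be checked from userspace. A minimal sketch in C, assuming /proc/cmdline is readable and the kernel was built with autogroup support (CONFIG_SCHED_AUTOGROUP). The helper names cmdline_has_token() and booted_with_noautogroup() are hypothetical, introduced only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* True if c ends a word on the command line (whitespace or end of string). */
static bool is_word_break(char c)
{
    return c == '\0' || c == ' ' || c == '\t' || c == '\n';
}

/*
 * Hypothetical helper: report whether `token` occurs in `cmdline` as a
 * whole whitespace-separated word. Pure string logic, no /proc access.
 */
bool cmdline_has_token(const char *cmdline, const char *token)
{
    assert(cmdline != NULL && token != NULL);

    size_t len = strlen(token);
    if (len == 0)
        return false;

    for (const char *p = strstr(cmdline, token); p != NULL;
         p = strstr(p + 1, token)) {
        bool starts_word = (p == cmdline) || is_word_break(p[-1]);
        if (starts_word && is_word_break(p[len]))
            return true;
    }
    return false;
}

/*
 * Hypothetical helper: 1 if the box was booted with "noautogroup",
 * 0 if not, -1 if /proc/cmdline could not be read.
 */
int booted_with_noautogroup(void)
{
    char buf[4096];
    FILE *f = fopen("/proc/cmdline", "r");
    if (f == NULL)
        return -1;

    size_t n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';

    return cmdline_has_token(buf, "noautogroup") ? 1 : 0;
}
```

A return value of 0 only says the switch was not given at boot; whether autogroup is compiled in at all still has to come from the kernel config.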



Re: [RFC GIT PULL] scheduler fix for autogroups

2012-12-01 Thread Mike Galbraith
On Sat, 2012-12-01 at 14:03 -0800, Linus Torvalds wrote: 
> On Sat, Dec 1, 2012 at 1:44 PM, Ingo Molnar <mi...@kernel.org> wrote:
> >
> > You are not missing anything. That flag is my fault not Mike's:
> > I booted the initial version of that patch but was unsure
> > whether autogroups was enabled - it's a pretty transparent
> > feature. So I figured that having that flag (but readonly) would
> > give us this information definitely.
> 
> So what's the advantage of it being read-only at all?
> 
> Since the flag is clearly *used*, make it read-write, and then all my
> objections go away (except for a slight worry that the dropping of
> /proc//autogroup_nice or whatever it is could break some odd
> system app, but I don't worry *too* much about that).
> 
> Disabling autogroup is clearly something people might want, since the
> code tests for it. So removing the flag entirely seems wrong too. But
> if it exists, it should be writable. No?

No, because turning autogroup off at runtime is what now makes boom.

With Peter's race fix in place, lazy movement (was noop on UP) is gone,
mandating that you either walk the box, moving all tasks when you flick
the switch, or you remove the ability to flick the switch other than
'off' at boot time.  You didn't like the original instant on/off switch,
and I didn't like the thought of making autogroup bigger to fix the
explosion either, so went for rip the switch and problematic /proc stuff
that really should have never existed in the first place out option.

-Mike



Re: [RFC GIT PULL] scheduler fix for autogroups

2012-12-01 Thread Linus Torvalds
On Sat, Dec 1, 2012 at 1:44 PM, Ingo Molnar <mi...@kernel.org> wrote:
>
> You are not missing anything. That flag is my fault not Mike's:
> I booted the initial version of that patch but was unsure
> whether autogroups was enabled - it's a pretty transparent
> feature. So I figured that having that flag (but readonly) would
> give us this information definitely.

So what's the advantage of it being read-only at all?

Since the flag is clearly *used*, make it read-write, and then all my
objections go away (except for a slight worry that the dropping of
/proc//autogroup_nice or whatever it is could break some odd
system app, but I don't worry *too* much about that).

Disabling autogroup is clearly something people might want, since the
code tests for it. So removing the flag entirely seems wrong too. But
if it exists, it should be writable. No?

 Linus


Re: [RFC GIT PULL] scheduler fix for autogroups

2012-12-01 Thread Ingo Molnar

* Ingo Molnar <mi...@kernel.org> wrote:

> You are not missing anything. That flag is my fault not 
> Mike's: I booted the initial version of that patch but was 
> unsure whether autogroups was enabled - it's a pretty 
> transparent feature. So I figured that having that flag (but 
> readonly) would give us this information definitely.

The other reason was that the original version of the patch also 
added a boot parameter - to enable/disable autogroups from the 
boot command line. With *that* configuration twist it made sense 
to present this information somewhere in /proc as well.

But then we got rid of the boot parameter to simplify the patch 
- which further reduced the sense of the 
/proc/sys/kernel/sched_autogroup_enabled flag - which now can 
only ever be 1.

Thanks,

Ingo
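
The flag discussed in this message can be read from userspace in a few lines. A minimal sketch in C, assuming a kernel built with autogroup support that exposes /proc/sys/kernel/sched_autogroup_enabled as a single 0 or 1. The function names parse_autogroup_flag() and read_autogroup_enabled() are hypothetical and only illustrate the idea.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: interpret the text of the sysctl file.
 * Returns 0 or 1 for a valid flag, -1 for anything else.
 */
int parse_autogroup_flag(const char *text)
{
    assert(text != NULL);

    char *end;
    long value = strtol(text, &end, 10);
    if (end == text)
        return -1;                  /* no digits at all */

    while (*end == ' ' || *end == '\t' || *end == '\n')
        end++;                      /* tolerate trailing whitespace */
    if (*end != '\0')
        return -1;                  /* unexpected trailing characters */

    if (value != 0 && value != 1)
        return -1;
    return (int)value;
}

/*
 * Hypothetical helper: read the flag from `path`, normally
 * "/proc/sys/kernel/sched_autogroup_enabled".
 * Returns 0, 1, or -1 if the file is missing or malformed.
 */
int read_autogroup_enabled(const char *path)
{
    char buf[32];
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    char *line = fgets(buf, sizeof(buf), f);
    fclose(f);
    if (line == NULL)
        return -1;

    return parse_autogroup_flag(buf);
}
```

The sketch only reads the file. Writing 0 to it is exactly what triggers the crash reported in this thread on affected kernels, so it is left out on purpose.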


Re: [RFC GIT PULL] scheduler fix for autogroups

2012-12-01 Thread Ingo Molnar

* Linus Torvalds wrote:

> On Sat, Dec 1, 2012 at 3:16 AM, Ingo Molnar <mi...@kernel.org> wrote:
> >
> > Please [RFC] pull the latest sched-urgent-for-linus git tree
> > from:
> 
> No. That patch is braindead. I wouldn't pull it even if it 
> wasn't this late.
> 
> Why the hell leave a read-only 'sched_autogroup_enabled' proc 
> file?
>
> What the f*ck is the point? It looks like the flag still 
> exists (we test it), but now there's no point to it, since you 
> can't change it.
> 
> What am I missing?

You are not missing anything. That flag is my fault not Mike's: 
I booted the initial version of that patch but was unsure 
whether autogroups was enabled - it's a pretty transparent 
feature. So I figured that having that flag (but readonly) would 
give us this information definitely.

So I suggested to Mike to keep that flag so that user-space is 
informed that autogroups is enabled. It seemed like a cute 
usability twist at that time, and there's existing precedent for 
it in /proc, but now I'm not so sure anymore...

Should we use some other file for that - or no file at all and 
just emit a bootup printk for kernel hackers with a short 
attention span?

Thanks,

Ingo


Re: [RFC GIT PULL] scheduler fix for autogroups

2012-12-01 Thread Linus Torvalds
On Sat, Dec 1, 2012 at 3:16 AM, Ingo Molnar  wrote:
>
> Please [RFC] pull the latest sched-urgent-for-linus git tree
> from:

No. That patch is braindead. I wouldn't pull it even if it wasn't this late.

Why the hell leave a read-only 'sched_autogroup_enabled' proc file?
What the f*ck is the point? It looks like the flag still exists (we
test it), but now there's no point to it, since you can't change it.

What am I missing?

Linus


Re: [RFC GIT PULL] scheduler fix for autogroups

2012-12-01 Thread Linus Torvalds
On Sat, Dec 1, 2012 at 1:44 PM, Ingo Molnar mi...@kernel.org wrote:
>
> You are not missing anything. That flag is my fault not Mike's:
> I booted the initial version of that patch but was unsure
> whether autogroups was enabled - it's a pretty transparent
> feature. So I figured that having that flag (but readonly) would
> give us this information definitely.

So what's the advantage of it being read-only at all?

Since the flag is clearly *used*, make it read-write, and then all my
objections go away (except for a slight worry that the dropping of
/proc/pid/autogroup_nice or whatever it is could break some odd
system app, but I don't worry *too* much about that).

Disabling autogroup is clearly something people might want, since the
code tests for it. So removing the flag entirely seems wrong too. But
if it exists, it should be writable. No?

 Linus
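
Whether the flag is read-only or read-write, the point argued above, can be seen from userspace through the file's permission bits. What follows is a rough sketch, assuming procfs reports the sysctl entry's mode (for example 0444 versus 0644) through stat(). The function name sysctl_access is hypothetical.

```python
import os
import stat


def sysctl_access(path: str) -> str:
    """Classify a sysctl file as 'missing', 'read-only' or 'read-write'.

    Looks only at the owner write bit of the file mode, so the answer does
    not depend on whether the caller happens to be root.
    """
    try:
        mode = stat.S_IMODE(os.stat(path).st_mode)
    except OSError:
        return "missing"
    return "read-write" if mode & stat.S_IWUSR else "read-only"


if __name__ == "__main__":
    print(sysctl_access("/proc/sys/kernel/sched_autogroup_enabled"))
```

The mode bits are checked instead of calling os.access(), because os.access() reports a file as writable for root even when no write bit is set.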


Re: [RFC GIT PULL] scheduler fix for autogroups

2012-12-01 Thread Mike Galbraith
On Sat, 2012-12-01 at 14:03 -0800, Linus Torvalds wrote:
> On Sat, Dec 1, 2012 at 1:44 PM, Ingo Molnar mi...@kernel.org wrote:
> >
> > You are not missing anything. That flag is my fault not Mike's:
> > I booted the initial version of that patch but was unsure
> > whether autogroups was enabled - it's a pretty transparent
> > feature. So I figured that having that flag (but readonly) would
> > give us this information definitely.
>
> So what's the advantage of it being read-only at all?
>
> Since the flag is clearly *used*, make it read-write, and then all my
> objections go away (except for a slight worry that the dropping of
> /proc/pid/autogroup_nice or whatever it is could break some odd
> system app, but I don't worry *too* much about that).
>
> Disabling autogroup is clearly something people might want, since the
> code tests for it. So removing the flag entirely seems wrong too. But
> if it exists, it should be writable. No?

No, because turning autogroup off at runtime is what now makes boom.

With Peter's race fix in place, lazy movement (was noop on UP) is gone,
mandating that you either walk the box, moving all tasks when you flick
the switch, or you remove the ability to flick the switch other than
'off' at boot time.  You didn't like the original instant on/off switch,
and I didn't like the thought of making autogroup bigger to fix the
explosion either, so I went for the option of ripping out the switch and
the problematic /proc stuff that really should never have existed in the
first place.

-Mike
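
The trade-off described above (walk the box and move every task when the switch is flicked, or do not allow the switch to be flicked at runtime) can be illustrated with a toy bookkeeping model in Python. This is not kernel code and does not reproduce the oops. It assumes only what the message says: once lazy movement is gone, a task changes group only when something explicitly moves it. All names (Task, Box, cached_group, flip_without_walk, flip_with_walk) are invented for the illustration.

```python
from dataclasses import dataclass
from typing import List

ROOT = "root"  # stand-in for the group tasks use when autogroup is off


@dataclass
class Task:
    pid: int
    autogroup: str          # per-session group the task uses when enabled
    cached_group: str = ""  # group the task is currently accounted in


def wanted_group(task: Task, enabled: bool) -> str:
    """Group the task should be in for the given switch position."""
    return task.autogroup if enabled else ROOT


class Box:
    def __init__(self, tasks: List[Task], enabled: bool = True) -> None:
        self.enabled = enabled
        self.tasks = list(tasks)
        for t in self.tasks:
            t.cached_group = wanted_group(t, enabled)

    def flip_without_walk(self, enabled: bool) -> None:
        # Only the switch changes; no task is moved, so cached groups go stale.
        self.enabled = enabled

    def flip_with_walk(self, enabled: bool) -> None:
        # "Walk the box": move every task at the moment the switch changes.
        self.enabled = enabled
        for t in self.tasks:
            t.cached_group = wanted_group(t, enabled)

    def stale_tasks(self) -> List[int]:
        """PIDs whose cached group disagrees with the current switch position."""
        return [t.pid for t in self.tasks
                if t.cached_group != wanted_group(t, self.enabled)]
```

In this model, flip_without_walk(False) leaves every task listed by stale_tasks(), while flip_with_walk(False) leaves none, at the cost of a pass over all tasks on each flip.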
