Re: [4.2] commit d59cfc09c32 (sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem) causes regression for libvirt/kvm

2015-09-16 Thread Paolo Bonzini
On 16/09/2015 16:16, Tejun Heo wrote: > On Wed, Sep 16, 2015 at 02:22:49PM +0200, Oleg Nesterov wrote: >>> > > If the revert isn't easy, I think backporting rcu_sync is the best bet. >> > >> > I leave this to Paul and Tejun... at least I think this is not v4.2 >> > material. > Will route

Re: [4.2] commit d59cfc09c32 (sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem) causes regression for libvirt/kvm

2015-09-16 Thread Tejun Heo
Hello, On Wed, Sep 16, 2015 at 02:22:49PM +0200, Oleg Nesterov wrote: > > If the revert isn't easy, I think backporting rcu_sync is the best bet. > > I leave this to Paul and Tejun... at least I think this is not v4.2 material. Will route reverts through cgroup branch. Should be pretty

Re: [4.2] commit d59cfc09c32 (sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem) causes regression for libvirt/kvm

2015-09-16 Thread Christian Borntraeger
On 16.09.2015 at 14:22, Oleg Nesterov wrote: >> The issue is that rcu_sync doesn't eliminate synchronize_sched, > > Yes, but it eliminates _expedited(). This is good, but otoh this means > that (say) individual __cgroup_procs_write() can take much more time. > However, it won't block the

Re: [4.2] commit d59cfc09c32 (sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem) causes regression for libvirt/kvm

2015-09-16 Thread Oleg Nesterov
On 09/16, Paolo Bonzini wrote: > > > On 16/09/2015 14:22, Oleg Nesterov wrote: > > > The issue is that rcu_sync doesn't eliminate synchronize_sched, > > > > Yes, but it eliminates _expedited(). This is good, but otoh this means > > that (say) individual __cgroup_procs_write() can take much more

Re: [4.2] commit d59cfc09c32 (sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem) causes regression for libvirt/kvm

2015-09-16 Thread Paolo Bonzini
On 16/09/2015 14:22, Oleg Nesterov wrote: > > The issue is that rcu_sync doesn't eliminate synchronize_sched, > > Yes, but it eliminates _expedited(). This is good, but otoh this means > that (say) individual __cgroup_procs_write() can take much more time. > However, it won't block the readers

Re: [4.2] commit d59cfc09c32 (sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem) causes regression for libvirt/kvm

2015-09-16 Thread Oleg Nesterov
On 09/16, Paolo Bonzini wrote: > > > On 16/09/2015 10:57, Christian Borntraeger wrote: > > On 16.09.2015 at 10:32, Paolo Bonzini wrote: > >> > >> > >> On 15/09/2015 19:38, Paul E. McKenney wrote: > >>> Excellent points! > >>> > >>> Other options in such situations include the following: > >>> >

Re: [4.2] commit d59cfc09c32 (sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem) causes regression for libvirt/kvm

2015-09-16 Thread Christian Borntraeger
On 16.09.2015 at 13:03, Tejun Heo wrote: > Hello, > > On Wed, Sep 16, 2015 at 12:58:00PM +0200, Christian Borntraeger wrote: >> FWIW, I added a printk to percpu_down_write. With KVM and uprobes disabled, >> just booting up a fedora20 gives me __6749__ percpu_down_write calls on 4.2. >> systemd

Re: [4.2] commit d59cfc09c32 (sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem) causes regression for libvirt/kvm

2015-09-16 Thread Tejun Heo
Hello, On Tue, Sep 15, 2015 at 09:35:47PM -0700, Paul E. McKenney wrote: > > > I am suggesting trying the options and seeing what works best, then > > > working to convince people as needed. > > > > Yeah, sure thing. Let's wait for Christian. > > Indeed. Is there enough benefit to risk

Re: [4.2] commit d59cfc09c32 (sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem) causes regression for libvirt/kvm

2015-09-16 Thread Tejun Heo
Hello, On Wed, Sep 16, 2015 at 12:58:00PM +0200, Christian Borntraeger wrote: > FWIW, I added a printk to percpu_down_write. With KVM and uprobes disabled, > just booting up a fedora20 gives me __6749__ percpu_down_write calls on 4.2. > systemd seems to do that for the processes. > > So a revert

Re: [4.2] commit d59cfc09c32 (sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem) causes regression for libvirt/kvm

2015-09-16 Thread Christian Borntraeger
On 16.09.2015 at 09:44, Christian Borntraeger wrote: > On 16.09.2015 at 03:24, Tejun Heo wrote: >> Hello, Paul. >> >> On Tue, Sep 15, 2015 at 04:38:18PM -0700, Paul E. McKenney wrote: >>> Well, the decision as to what is too big for -stable is owned by the >>> -stable maintainers, not by me. >>

Re: [4.2] commit d59cfc09c32 (sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem) causes regression for libvirt/kvm

2015-09-16 Thread Paolo Bonzini
On 16/09/2015 10:57, Christian Borntraeger wrote: > On 16.09.2015 at 10:32, Paolo Bonzini wrote: >> >> >> On 15/09/2015 19:38, Paul E. McKenney wrote: >>> Excellent points! >>> >>> Other options in such situations include the following: >>> >>> o Rework so that the code uses call_rcu*()

Re: [4.2] commit d59cfc09c32 (sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem) causes regression for libvirt/kvm

2015-09-16 Thread Christian Borntraeger
On 16.09.2015 at 10:32, Paolo Bonzini wrote: > > > On 15/09/2015 19:38, Paul E. McKenney wrote: >> Excellent points! >> >> Other options in such situations include the following: >> >> o Rework so that the code uses call_rcu*() instead of *_expedited(). >> >> o Maintain a per-task or

Re: [4.2] commit d59cfc09c32 (sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem) causes regression for libvirt/kvm

2015-09-16 Thread Paolo Bonzini
On 15/09/2015 19:38, Paul E. McKenney wrote: > Excellent points! > > Other options in such situations include the following: > > o Rework so that the code uses call_rcu*() instead of *_expedited(). > > o Maintain a per-task or per-CPU counter so that every so many >

Re: [4.2] commit d59cfc09c32 (sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem) causes regression for libvirt/kvm

2015-09-16 Thread Christian Borntraeger
On 16.09.2015 at 03:24, Tejun Heo wrote: > Hello, Paul. > > On Tue, Sep 15, 2015 at 04:38:18PM -0700, Paul E. McKenney wrote: >> Well, the decision as to what is too big for -stable is owned by the >> -stable maintainers, not by me. > > Is it tho? Usually the subsystem maintainer knows the

Re: [4.2] commit d59cfc09c32 (sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem) causes regression for libvirt/kvm

2015-09-15 Thread Paul E. McKenney
On Tue, Sep 15, 2015 at 09:24:15PM -0400, Tejun Heo wrote: > Hello, Paul. > > On Tue, Sep 15, 2015 at 04:38:18PM -0700, Paul E. McKenney wrote: > > Well, the decision as to what is too big for -stable is owned by the > > -stable maintainers, not by me. > > Is it tho? Usually the subsystem

Re: [4.2] commit d59cfc09c32 (sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem) causes regression for libvirt/kvm

2015-09-15 Thread Tejun Heo
Hello, Paul. On Tue, Sep 15, 2015 at 04:38:18PM -0700, Paul E. McKenney wrote: > Well, the decision as to what is too big for -stable is owned by the > -stable maintainers, not by me. Is it tho? Usually the subsystem maintainer knows the best and has most say in it. I was mostly curious

Re: [4.2] commit d59cfc09c32 (sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem) causes regression for libvirt/kvm

2015-09-15 Thread Paul E. McKenney
On Tue, Sep 15, 2015 at 06:28:11PM -0400, Tejun Heo wrote: > Hello, > > On Tue, Sep 15, 2015 at 02:38:30PM -0700, Paul E. McKenney wrote: > > I did take a shot at adding the rcu_sync stuff during this past merge > > window, but it did not converge quickly enough to make it. It looks > > quite

Re: [4.2] commit d59cfc09c32 (sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem) causes regression for libvirt/kvm

2015-09-15 Thread Tejun Heo
Hello, On Tue, Sep 15, 2015 at 02:38:30PM -0700, Paul E. McKenney wrote: > I did take a shot at adding the rcu_sync stuff during this past merge > window, but it did not converge quickly enough to make it. It looks > quite good for the next merge window. There have been changes in most > of the

Re: [4.2] commit d59cfc09c32 (sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem) causes regression for libvirt/kvm

2015-09-15 Thread Paul E. McKenney
On Tue, Sep 15, 2015 at 05:26:22PM -0400, Tejun Heo wrote: > Hello, > > On Tue, Sep 15, 2015 at 11:11:45PM +0200, Christian Borntraeger wrote: > > > In fact, I would say that any userspace-controlled call to *_expedited() > > > is a bug waiting to happen and a bad idea---because userspace can,

Re: [4.2] commit d59cfc09c32 (sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem) causes regression for libvirt/kvm

2015-09-15 Thread Tejun Heo
Hello, On Tue, Sep 15, 2015 at 11:11:45PM +0200, Christian Borntraeger wrote: > > In fact, I would say that any userspace-controlled call to *_expedited() > > is a bug waiting to happen and a bad idea---because userspace can, with > > little effort, end up calling it in a loop. > > Right. This

Re: [4.2] commit d59cfc09c32 (sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem) causes regression for libvirt/kvm

2015-09-15 Thread Christian Borntraeger
On 15.09.2015 at 18:42, Paolo Bonzini wrote: > > > On 15/09/2015 15:36, Christian Borntraeger wrote: >> I am wondering why the old code behaved in such fatal ways. Is there >> some interaction between waiting for a reschedule in the >> synchronize_sched writer and some fork code actually

Re: [4.2] commit d59cfc09c32 (sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem) causes regression for libvirt/kvm

2015-09-15 Thread Paul E. McKenney
On Tue, Sep 15, 2015 at 06:42:19PM +0200, Paolo Bonzini wrote: > > > On 15/09/2015 15:36, Christian Borntraeger wrote: > > I am wondering why the old code behaved in such fatal ways. Is there > > some interaction between waiting for a reschedule in the > > synchronize_sched writer and some fork

Re: [4.2] commit d59cfc09c32 (sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem) causes regression for libvirt/kvm

2015-09-15 Thread Paolo Bonzini
On 15/09/2015 15:36, Christian Borntraeger wrote: > I am wondering why the old code behaved in such fatal ways. Is there > some interaction between waiting for a reschedule in the > synchronize_sched writer and some fork code actually waiting for the > read side to get the lock together with

Re: [4.2] commit d59cfc09c32 (sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem) causes regression for libvirt/kvm

2015-09-15 Thread Tejun Heo
Hello, On Tue, Sep 15, 2015 at 03:36:34PM +0200, Christian Borntraeger wrote: > >> The problem seems to be that the newly used percpu_rwsem does a > >> rcu_synchronize_sched_expedited for all write downs/ups. > > > > Can you try: > > > >

Re: [4.2] commit d59cfc09c32 (sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem) causes regression for libvirt/kvm

2015-09-15 Thread Christian Borntraeger
On 15.09.2015 at 15:05, Peter Zijlstra wrote: > On Tue, Sep 15, 2015 at 02:05:14PM +0200, Christian Borntraeger wrote: >> Tejun, >> >> >> commit d59cfc09c32a2ae31f1c3bc2983a0cd79afb3f14 (sched, cgroup: replace >> signal_struct->group_rwsem with a global percpu_rwsem) causes some noticeable >>

Re: [4.2] commit d59cfc09c32 (sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem) causes regression for libvirt/kvm

2015-09-15 Thread Peter Zijlstra
On Tue, Sep 15, 2015 at 02:05:14PM +0200, Christian Borntraeger wrote: > Tejun, > > > commit d59cfc09c32a2ae31f1c3bc2983a0cd79afb3f14 (sched, cgroup: replace > signal_struct->group_rwsem with a global percpu_rwsem) causes some noticeable > hiccups when starting several kvm guests (which libvirt

[4.2] commit d59cfc09c32 (sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem) causes regression for libvirt/kvm

2015-09-15 Thread Christian Borntraeger
Tejun, commit d59cfc09c32a2ae31f1c3bc2983a0cd79afb3f14 (sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem) causes some noticeable hiccups when starting several kvm guests (which libvirt will move into cgroups - each vcpu thread and each i/o thread). When you now start