On 03/06/2015 01:02 PM, Sasha Levin wrote:
> I can go redo that again if you suspect that that commit is not the cause.
I took a closer look at the logs, and I'm seeing hangs that begin this way
as well:
[ 2298.020237] NMI watchdog: BUG: soft lockup - CPU#19 stuck for 23s! [trinity-c19:839]
On Fri, Mar 6, 2015 at 11:55 AM, Davidlohr Bueso wrote:
>>
>> - look up the vma in the vma lookup cache
>
> But you'd still need mmap_sem there to at least get the VMA's first
> value.
So my theory was that the vma cache is such a trivial data structure
that we could trivially make it be
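For context, a minimal sketch of the sort of lockless vmacache lookup being
floated here, assuming a per-mm sequence counter (mm_seq) and a lookup helper
(vma_cache_find) — both hypothetical names, not the kernel's actual vmacache
API — plus RCU-deferred VMA freeing for the dereference to be safe:

/*
 * Hypothetical sketch only: look up a VMA in the lookup cache without
 * taking mmap_sem, validating the snapshot against a per-mm seqcount.
 * mm->mm_seq and vma_cache_find() are made-up names for illustration.
 */
static struct vm_area_struct *speculative_vma_lookup(struct mm_struct *mm,
						     unsigned long addr)
{
	struct vm_area_struct *vma;
	unsigned int seq;

	do {
		seq = read_seqcount_begin(&mm->mm_seq);	/* hypothetical field */
		vma = vma_cache_find(mm, addr);		/* hypothetical helper */
		if (!vma || addr < vma->vm_start || addr >= vma->vm_end)
			return NULL;	/* miss: caller falls back to the mmap_sem path */
	} while (read_seqcount_retry(&mm->mm_seq, seq));

	return vma;	/* snapshot was consistent across the lookup */
}

A miss (or a torn read) simply falls back to the existing locked fault path,
which is what makes the fast path cheap to attempt.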
On Fri, 2015-03-06 at 11:55 -0800, Davidlohr Bueso wrote:
> On Fri, 2015-03-06 at 11:32 -0800, Linus Torvalds wrote:
>
> > IOW, I wonder if we could special-case the common non-IO
> > fault-handling path something along the lines of:
> >
> > - look up the vma in the vma lookup cache
>
> But
On Fri, 2015-03-06 at 11:32 -0800, Linus Torvalds wrote:
> IOW, I wonder if we could special-case the common non-IO
> fault-handling path something along the lines of:
>
> - look up the vma in the vma lookup cache
But you'd still need mmap_sem there to at least get the VMA's first
value.
> -
On Fri, 2015-03-06 at 11:32 -0800, Linus Torvalds wrote:
> Basically, to me, the whole "if a lock is so contended that we need to
> play locking games, then we should look at why we *use* the lock,
> rather than at the lock itself" is a religion.
Oh absolutely, I'm only mentioning the locking
On Fri, Mar 6, 2015 at 11:20 AM, Davidlohr Bueso wrote:
>
> I obviously agree with all those points, however fyi most of the testing
> on rwsems I do includes scaling address space ops stressing the
> mmap_sem, which is a real world concern. So while it does include
> microbenchmarks, it is not
On Fri, 2015-03-06 at 11:05 -0800, Linus Torvalds wrote:
> On Fri, Mar 6, 2015 at 10:57 AM, Jason Low wrote:
> >
> > Right, the can_spin_on_owner() was originally added to the mutex
> > spinning code for optimization purposes, particularly so that we can
> > avoid adding the spinner to the OSQ
On Fri, Mar 6, 2015 at 10:57 AM, Jason Low wrote:
>
> Right, the can_spin_on_owner() was originally added to the mutex
> spinning code for optimization purposes, particularly so that we can
> avoid adding the spinner to the OSQ only to find that it doesn't need to
> spin. This function needing to
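For readers following the thread, the check under discussion looks roughly
like the sketch below — simplified from memory of the mutex version, not
exact source. The idea is to peek at the owner's on_cpu flag under RCU
before paying the cost of joining the OSQ (optimistic spin queue):

/*
 * Simplified sketch of the mutex-style can_spin_on_owner() idea: only
 * join the OSQ if the current owner is actually running on a CPU,
 * since spinning on a sleeping owner is wasted work.
 */
static inline bool can_spin_on_owner(struct mutex *lock)
{
	struct task_struct *owner;
	bool ret = true;

	if (need_resched())
		return false;	/* we should yield, not spin */

	rcu_read_lock();
	owner = READ_ONCE(lock->owner);
	if (owner)
		ret = owner->on_cpu;	/* worth spinning only if owner runs */
	rcu_read_unlock();

	/*
	 * If lock->owner is NULL the lock may be about to be released,
	 * so queueing into the OSQ is likely to pay off.
	 */
	return ret;
}

The pre-check matters because queueing into the OSQ is itself a shared-memory
handshake, so a cheap racy read first can avoid that cost entirely.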
On Fri, 2015-03-06 at 09:19 -0800, Davidlohr Bueso wrote:
> On Fri, 2015-03-06 at 13:32 +0100, Ingo Molnar wrote:
> > * Sasha Levin wrote:
> >
> > > I've bisected this to "locking/rwsem: Check for active lock before
> > > bailing on spinning". Relevant parties Cc'ed.
> >
> > That would be:
On 03/06/2015 12:19 PM, Davidlohr Bueso wrote:
>> diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
>> > index 1c0d11e8ce34..e4ad019e23f5 100644
>> > --- a/kernel/locking/rwsem-xadd.c
>> > +++ b/kernel/locking/rwsem-xadd.c
>> > @@ -298,23 +298,30 @@ static inline bool
On Fri, 2015-03-06 at 13:32 +0100, Ingo Molnar wrote:
> * Sasha Levin wrote:
>
> > I've bisected this to "locking/rwsem: Check for active lock before bailing
> > on spinning". Relevant parties Cc'ed.
>
> That would be:
>
> 1a99367023f6 ("locking/rwsem: Check for active lock before bailing on spinning")
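For reference, what that commit did, as a paraphrased sketch from memory
rather than the verbatim diff: rwsem_can_spin_on_owner() was changed to
distinguish an owner-less rwsem in a transient state from one actively held
by readers, bailing on spinning only in the latter case. Roughly:

static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
{
	struct task_struct *owner;
	bool ret = true;

	if (need_resched())
		return false;

	rcu_read_lock();
	owner = READ_ONCE(sem->owner);
	if (!owner) {
		long count = READ_ONCE(sem->count);
		/*
		 * No owner, but the lock is active: reader(s) probably
		 * hold it, and there is no owner to spin on, so bail.
		 */
		if (count & RWSEM_ACTIVE_MASK)
			ret = false;
		goto done;
	}

	ret = owner->on_cpu;	/* spin only while the owner is running */
done:
	rcu_read_unlock();
	return ret;
}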
On 03/06/2015 09:45 AM, Sasha Levin wrote:
> On 03/06/2015 09:34 AM, Rafael David Tinoco wrote:
>> Are you sure about this? I have a core dump locked in the same place
>> (state machine for powering cpu down for the task swap) from a 3.13 (+
>> upstream patches) and this commit wasn't backported yet.
On 03/06/2015 09:34 AM, Rafael David Tinoco wrote:
> Are you sure about this? I have a core dump locked in the same place
> (state machine for powering cpu down for the task swap) from a 3.13 (+
> upstream patches) and this commit wasn't backported yet.
bisect took me to that same commit twice, and
Are you sure about this? I have a core dump locked in the same place
(state machine for powering cpu down for the task swap) from a 3.13 (+
upstream patches) and this commit wasn't backported yet.
-> multi_cpu_stop -> do { } while (curstate != MULTI_STOP_EXIT);
In my case, curstate is WAY
* Sasha Levin wrote:
> I've bisected this to "locking/rwsem: Check for active lock before bailing on
> spinning". Relevant parties Cc'ed.
That would be:
1a99367023f6 ("locking/rwsem: Check for active lock before bailing on
spinning")
attached below.
Thanks,
Ingo
I've bisected this to "locking/rwsem: Check for active lock before bailing on
spinning". Relevant parties Cc'ed.
Thanks,
Sasha
On 03/02/2015 02:45 AM, Sasha Levin wrote:
> Hi all,
>
> I'm seeing the following lockup pretty often while fuzzing with trinity:
>
> [ 880.960250] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 447s! [migration/1:14]
Some more info:
multi_cpu_stop seems to be spinning inside do { ... } while (curstate
!= MULTI_STOP_EXIT);
So, multi_cpu_stop is an offload ([migration]) for: migrate_swap ->
stop_two_cpus -> wait_for_completion() sequence... for cross-migrating
2 tasks.
Based on task structs from the callers' stacks:
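For reference, the loop being described is multi_cpu_stop() in
kernel/stop_machine.c. An abridged sketch from memory (the real function
also handles MULTI_STOP_PREPARE and the per-CPU active mask):

/*
 * Abridged sketch of multi_cpu_stop(): each participating CPU steps
 * through a shared state machine and spins until every CPU has acked
 * the final MULTI_STOP_EXIT state. A CPU that never sees the state
 * advance spins here forever, matching the soft lockup signature above.
 */
static int multi_cpu_stop(void *data)
{
	struct multi_stop_data *msdata = data;
	enum multi_stop_state curstate = MULTI_STOP_NONE;
	int err = 0;

	do {
		cpu_relax();	/* spin, re-reading the shared state */
		if (msdata->state != curstate) {
			curstate = msdata->state;
			switch (curstate) {
			case MULTI_STOP_DISABLE_IRQ:
				local_irq_disable();
				break;
			case MULTI_STOP_RUN:
				err = msdata->fn(msdata->data);	/* e.g. the task swap */
				break;
			default:
				break;
			}
			ack_state(msdata);	/* last CPU to ack advances the state */
		}
	} while (curstate != MULTI_STOP_EXIT);

	return err;
}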
Hi all,
I'm seeing the following lockup pretty often while fuzzing with trinity:
[ 880.960250] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 447s!
[migration/1:14]
[ 880.960700] Modules linked in:
[ 880.960700] irq event stamp: 380954
[ 880.960700] hardirqs last enabled at (380953):