On Sat, 2014-03-22 at 07:57 +0530, Srikar Dronamraju wrote:
> > > So reverting and applying v3 3/4 and 4/4 patches works for me.
> >
> > Ok, I verified that the above ends up resulting in the same tree as
> > the minimal patch I sent out, modulo (a) some comments and (b) an
> > #ifdef CONFIG_SMP
> > So reverting and applying v3 3/4 and 4/4 patches works for me.
>
> Ok, I verified that the above ends up resulting in the same tree as
> the minimal patch I sent out, modulo (a) some comments and (b) an
> #ifdef CONFIG_SMP in futex_get_mm() that doesn't really matter.
>
> So I committed
So reverting and applying v3 3/4 and 4/4 patches works for me.
Ok, I verified that the above ends up resulting in the same tree as
the minimal patch I sent out, modulo (a) some comments and (b) an
#ifdef CONFIG_SMP in futex_get_mm() that doesn't really matter.
So I committed the
On Sat, 2014-03-22 at 07:57 +0530, Srikar Dronamraju wrote:
So reverting and applying v3 3/4 and 4/4 patches works for me.
Ok, I verified that the above ends up resulting in the same tree as
the minimal patch I sent out, modulo (a) some comments and (b) an
#ifdef CONFIG_SMP in
On Thu, Mar 20, 2014 at 9:55 PM, Srikar Dronamraju
wrote:
>
> I reverted commits 99b60ce6 and b0c29f79. Then applied the patches in
> the above url. The last one had a reject but it was pretty
> straightforward to resolve it. After this, specjbb completes.
>
> So reverting and applying v3 3/4 and
>
> Ok, so a big reason why this patch doesn't apply cleanly after reverting
> is because *most* of the changes were done at the top of the file with
> regards to documenting the ordering guarantees, the actual code changes
> are quite minimal.
>
> I reverted commits 99b60ce6 (documentation) and
On Thu, Mar 20, 2014 at 1:20 PM, Davidlohr Bueso wrote:
>
> I reverted commits 99b60ce6 (documentation) and b0c29f79 (the offending
> commit), and then I cleanly applied the equivalent ones from v3 of the
> series (which was already *tested* and ready for upstream until you
> suggested looking
On Thu, 2014-03-20 at 09:31 -0700, Davidlohr Bueso wrote:
> hmmm looking at ppc spinlock code, it seems that it doesn't have ticket
> spinlocks -- in fact Torsten Duwe has been trying to get them upstream
> very recently. Since we rely on the counter for detecting waiters, this
> might explain the
On Thu, 2014-03-20 at 12:25 -0700, Linus Torvalds wrote:
> On Thu, Mar 20, 2014 at 12:08 PM, Davidlohr Bueso wrote:
> >
> > Oh, it does. This atomics technique was tested at a customer's site and
> > ready for upstream.
>
> I'm not worried about the *original* patch. I'm worried about the
>
On Thu, Mar 20, 2014 at 12:08 PM, Davidlohr Bueso wrote:
>
> Oh, it does. This atomics technique was tested at a customer's site and
> ready for upstream.
I'm not worried about the *original* patch. I'm worried about the
incremental one.
Your original patch never applied to my tree - I think it
On Thu, 2014-03-20 at 11:36 -0700, Linus Torvalds wrote:
> On Thu, Mar 20, 2014 at 10:18 AM, Davidlohr Bueso wrote:
> >
> > Comparing with the patch I sent earlier this morning, looks equivalent,
> > and fwiw, passes my initial qemu bootup, which is the first way of
> > detecting anything stupid
On Thu, Mar 20, 2014 at 10:18 AM, Davidlohr Bueso wrote:
>
> Comparing with the patch I sent earlier this morning, looks equivalent,
> and fwiw, passes my initial qemu bootup, which is the first way of
> detecting anything stupid going on.
>
> So, Srikar, please try this patch out, as opposed to
On Thu, Mar 20, 2014 at 11:03 AM, Davidlohr Bueso wrote:
>
> I still wonder about ppc and spinlocks (no ticketing!!) ... sure the
> "waiters" patch might fix the problem just because we explicitly count
> the members of the plist. And I guess if we cannot rely on all archs
> having an equivalent
On Thu, 2014-03-20 at 10:42 -0700, Linus Torvalds wrote:
> On Thu, Mar 20, 2014 at 10:18 AM, Davidlohr Bueso wrote:
> >> It strikes me that the "spin_is_locked()" test has no barriers wrt the
> >> writing of the new futex value on the wake path. And the read barrier
> >> obviously does nothing
On Thu, Mar 20, 2014 at 10:18 AM, Davidlohr Bueso wrote:
>> It strikes me that the "spin_is_locked()" test has no barriers wrt the
>> writing of the new futex value on the wake path. And the read barrier
>> obviously does nothing wrt the write either. Or am I missing
>> something? So the write
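For readers following the barrier argument quoted above, here is a minimal userspace sketch of the ordering in question. It is not the kernel code: futex_val, waiters, waiter_should_sleep() and waker_must_wake() are hypothetical names, and C11 atomics stand in for the kernel's own primitives. The idea is that the wait side announces itself and then re-reads the futex word, while the wake side writes the futex word and then looks for waiters. With a full barrier between the two steps on each side, at least one of them has to observe the other, so a wakeup should not be lost.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not the kernel's data structures. */
static atomic_int futex_val;  /* the user-visible futex word */
static atomic_int waiters;    /* tasks queued, or about to queue, on the bucket */

/* Wait side: announce ourselves, full barrier, then re-check the futex word. */
static bool waiter_should_sleep(int expected)
{
    atomic_fetch_add_explicit(&waiters, 1, memory_order_relaxed);
    /* Full barrier: the increment is ordered before the load below. */
    atomic_thread_fence(memory_order_seq_cst);
    if (atomic_load_explicit(&futex_val, memory_order_relaxed) != expected) {
        /* The value already changed, so back out instead of sleeping. */
        atomic_fetch_sub_explicit(&waiters, 1, memory_order_relaxed);
        return false;
    }
    return true;
}

/* Wake side: publish the new value, full barrier, then look for waiters. */
static bool waker_must_wake(int newval)
{
    atomic_store_explicit(&futex_val, newval, memory_order_relaxed);
    /* Full barrier: the store is ordered before the load below. */
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&waiters, memory_order_relaxed) != 0;
}
```

If either fence is dropped, the waiter can read the old value while the waker reads a zero count, and the waiter then sleeps with nobody left to wake it. That is the kind of missed wakeup being debated here; whether it is the actual ppc failure is not something this sketch can show.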
On Thu, 2014-03-20 at 09:41 -0700, Linus Torvalds wrote:
> On Wed, Mar 19, 2014 at 10:56 PM, Davidlohr Bueso wrote:
> >
> > This problem suggests that we missed a wakeup for a task that was adding
> > itself to the queue in a wait path. And the only place that can happen
> > is with the hb
On Wed, Mar 19, 2014 at 10:56 PM, Davidlohr Bueso wrote:
>
> This problem suggests that we missed a wakeup for a task that was adding
> itself to the queue in a wait path. And the only place that can happen
> is with the hb spinlock check for any pending waiters.
Ok, so thinking about
On Wed, 2014-03-19 at 22:56 -0700, Davidlohr Bueso wrote:
> On Thu, 2014-03-20 at 11:03 +0530, Srikar Dronamraju wrote:
> > > > Joy,.. let me look at that with ppc in mind.
> > >
> > > OK; so while pretty much all the comments from that patch are utter
> > > nonsense (what was I thinking), I
On Thu, 2014-03-20 at 15:38 +0530, Srikar Dronamraju wrote:
> > This problem suggests that we missed a wakeup for a task that was adding
> > itself to the queue in a wait path. And the only place that can happen
> > is with the hb spinlock check for any pending waiters. Just in case we
> > missed
> This problem suggests that we missed a wakeup for a task that was adding
> itself to the queue in a wait path. And the only place that can happen
> is with the hb spinlock check for any pending waiters. Just in case we
> missed some assumption about checking the hash bucket spinlock as a way
>
On Thu, Mar 20, 2014 at 11:03:50AM +0530, Srikar Dronamraju wrote:
> > > Joy,.. let me look at that with ppc in mind.
> >
> > OK; so while pretty much all the comments from that patch are utter
> > nonsense (what was I thinking), I cannot actually find a real bug.
> >
> > But could you try the
On Thu, Mar 20, 2014 at 11:03:50AM +0530, Srikar Dronamraju wrote:
Joy,.. let me look at that with ppc in mind.
OK; so while pretty much all the comments from that patch are utter
nonsense (what was I thinking), I cannot actually find a real bug.
But could you try the below which
This problem suggests that we missed a wakeup for a task that was adding
itself to the queue in a wait path. And the only place that can happen
is with the hb spinlock check for any pending waiters. Just in case we
missed some assumption about checking the hash bucket spinlock as a way
of
On Thu, 2014-03-20 at 15:38 +0530, Srikar Dronamraju wrote:
This problem suggests that we missed a wakeup for a task that was adding
itself to the queue in a wait path. And the only place that can happen
is with the hb spinlock check for any pending waiters. Just in case we
missed some
On Wed, 2014-03-19 at 22:56 -0700, Davidlohr Bueso wrote:
On Thu, 2014-03-20 at 11:03 +0530, Srikar Dronamraju wrote:
Joy,.. let me look at that with ppc in mind.
OK; so while pretty much all the comments from that patch are utter
nonsense (what was I thinking), I cannot actually
On Wed, Mar 19, 2014 at 10:56 PM, Davidlohr Bueso davidl...@hp.com wrote:
This problem suggests that we missed a wakeup for a task that was adding
itself to the queue in a wait path. And the only place that can happen
is with the hb spinlock check for any pending waiters.
Ok, so thinking
On Thu, 2014-03-20 at 09:41 -0700, Linus Torvalds wrote:
On Wed, Mar 19, 2014 at 10:56 PM, Davidlohr Bueso davidl...@hp.com wrote:
This problem suggests that we missed a wakeup for a task that was adding
itself to the queue in a wait path. And the only place that can happen
is with the hb
On Thu, Mar 20, 2014 at 10:18 AM, Davidlohr Bueso davidl...@hp.com wrote:
It strikes me that the spin_is_locked() test has no barriers wrt the
writing of the new futex value on the wake path. And the read barrier
obviously does nothing wrt the write either. Or am I missing
something? So the
On Thu, 2014-03-20 at 10:42 -0700, Linus Torvalds wrote:
On Thu, Mar 20, 2014 at 10:18 AM, Davidlohr Bueso davidl...@hp.com wrote:
It strikes me that the spin_is_locked() test has no barriers wrt the
writing of the new futex value on the wake path. And the read barrier
obviously does
On Thu, Mar 20, 2014 at 11:03 AM, Davidlohr Bueso davidl...@hp.com wrote:
I still wonder about ppc and spinlocks (no ticketing!!) ... sure the
waiters patch might fix the problem just because we explicitly count
the members of the plist. And I guess if we cannot rely on all archs
having an
On Thu, Mar 20, 2014 at 10:18 AM, Davidlohr Bueso davidl...@hp.com wrote:
Comparing with the patch I sent earlier this morning, looks equivalent,
and fwiw, passes my initial qemu bootup, which is the first way of
detecting anything stupid going on.
So, Srikar, please try this patch out, as
On Thu, 2014-03-20 at 11:36 -0700, Linus Torvalds wrote:
On Thu, Mar 20, 2014 at 10:18 AM, Davidlohr Bueso davidl...@hp.com wrote:
Comparing with the patch I sent earlier this morning, looks equivalent,
and fwiw, passes my initial qemu bootup, which is the first way of
detecting anything
On Thu, Mar 20, 2014 at 12:08 PM, Davidlohr Bueso davidl...@hp.com wrote:
Oh, it does. This atomics technique was tested at a customer's site and
ready for upstream.
I'm not worried about the *original* patch. I'm worried about the
incremental one.
Your original patch never applied to my tree
On Thu, 2014-03-20 at 12:25 -0700, Linus Torvalds wrote:
On Thu, Mar 20, 2014 at 12:08 PM, Davidlohr Bueso davidl...@hp.com wrote:
Oh, it does. This atomics technique was tested at a customer's site and
ready for upstream.
I'm not worried about the *original* patch. I'm worried about the
On Thu, 2014-03-20 at 09:31 -0700, Davidlohr Bueso wrote:
hmmm looking at ppc spinlock code, it seems that it doesn't have ticket
spinlocks -- in fact Torsten Duwe has been trying to get them upstream
very recently. Since we rely on the counter for detecting waiters, this
might explain the
On Thu, Mar 20, 2014 at 1:20 PM, Davidlohr Bueso davidl...@hp.com wrote:
I reverted commits 99b60ce6 (documentation) and b0c29f79 (the offending
commit), and then I cleanly applied the equivalent ones from v3 of the
series (which was already *tested* and ready for upstream until you
suggested
Ok, so a big reason why this patch doesn't apply cleanly after reverting
is because *most* of the changes were done at the top of the file with
regards to documenting the ordering guarantees, the actual code changes
are quite minimal.
I reverted commits 99b60ce6 (documentation) and
On Thu, Mar 20, 2014 at 9:55 PM, Srikar Dronamraju
sri...@linux.vnet.ibm.com wrote:
I reverted commits 99b60ce6 and b0c29f79. Then applied the patches in
the above url. The last one had a reject but it was pretty
straightforward to resolve it. After this, specjbb completes.
So reverting and
On Thu, 2014-03-20 at 11:03 +0530, Srikar Dronamraju wrote:
> > > Joy,.. let me look at that with ppc in mind.
> >
> > OK; so while pretty much all the comments from that patch are utter
> > nonsense (what was I thinking), I cannot actually find a real bug.
> >
> > But could you try the below
> > Joy,.. let me look at that with ppc in mind.
>
> OK; so while pretty much all the comments from that patch are utter
> nonsense (what was I thinking), I cannot actually find a real bug.
>
> But could you try the below which replaces a control dependency with a
> full barrier. The control
On Wed, 2014-03-19 at 18:08 +0100, Peter Zijlstra wrote:
> On Wed, Mar 19, 2014 at 04:47:05PM +0100, Peter Zijlstra wrote:
> > > I reverted b0c29f79ecea0b6fbcefc999e70f2843ae8306db on top of v3.14-rc6
> > > and confirmed that
> > > reverting the commit solved the problem.
> >
> > Joy,.. let me
On Wed, Mar 19, 2014 at 04:47:05PM +0100, Peter Zijlstra wrote:
> > I reverted b0c29f79ecea0b6fbcefc999e70f2843ae8306db on top of v3.14-rc6 and
> > confirmed that
> > reverting the commit solved the problem.
>
> Joy,.. let me look at that with ppc in mind.
OK; so while pretty much all the
On Wed, Mar 19, 2014 at 8:26 AM, Srikar Dronamraju
wrote:
>
> I reverted b0c29f79ecea0b6fbcefc999e70f2843ae8306db on top of v3.14-rc6 and
> confirmed that
> reverting the commit solved the problem.
Ok. I'll give Peter and Davidlohr a few days to perhaps find something
obvious, but I guess we'll
> >
> > In fact I can reproduce this if the java_constraint is either node, socket,
> > system.
> > However I am not able to reproduce if java_constraint is set to core.
>
> What's any of that mean?
>
Using the constraint, one can specify how many jvm instances should
participate in the
On Wed, Mar 19, 2014 at 08:56:19PM +0530, Srikar Dronamraju wrote:
> There are 332 tasks all stuck in futex_wait_queue_me().
> I am able to reproduce this consistently.
>
> In fact I can reproduce this if the java_constraint is either node, socket,
> system.
> However I am not able to reproduce
Hi,
When running specjbb on a power7 numa box, I am seeing java threads
getting stuck in futex
# ps -Ao pid,tt,user,fname,tmout,f,wchan | grep futex
14808 pts/0 root java - 0 futex_wait_queue_me
14925 pts/0 root java - 0 futex_wait_queue_me
#
stack traces, I
On Wed, Mar 19, 2014 at 08:56:19PM +0530, Srikar Dronamraju wrote:
There are 332 tasks all stuck in futex_wait_queue_me().
I am able to reproduce this consistently.
In fact I can reproduce this if the java_constraint is either node, socket,
system.
However I am not able to reproduce if
In fact I can reproduce this if the java_constraint is either node, socket,
system.
However I am not able to reproduce if java_constraint is set to core.
What's any of that mean?
Using the constraint, one can specify how many jvm instances should
participate in the specjbb run.
For
On Wed, Mar 19, 2014 at 8:26 AM, Srikar Dronamraju
sri...@linux.vnet.ibm.com wrote:
I reverted b0c29f79ecea0b6fbcefc999e70f2843ae8306db on top of v3.14-rc6 and
confirmed that
reverting the commit solved the problem.
Ok. I'll give Peter and Davidlohr a few days to perhaps find something
On Wed, Mar 19, 2014 at 04:47:05PM +0100, Peter Zijlstra wrote:
I reverted b0c29f79ecea0b6fbcefc999e70f2843ae8306db on top of v3.14-rc6 and
confirmed that
reverting the commit solved the problem.
Joy,.. let me look at that with ppc in mind.
OK; so while pretty much all the comments from
On Wed, 2014-03-19 at 18:08 +0100, Peter Zijlstra wrote:
On Wed, Mar 19, 2014 at 04:47:05PM +0100, Peter Zijlstra wrote:
I reverted b0c29f79ecea0b6fbcefc999e70f2843ae8306db on top of v3.14-rc6
and confirmed that
reverting the commit solved the problem.
Joy,.. let me look at that
Joy,.. let me look at that with ppc in mind.
OK; so while pretty much all the comments from that patch are utter
nonsense (what was I thinking), I cannot actually find a real bug.
But could you try the below which replaces a control dependency with a
full barrier. The control flow is
On Thu, 2014-03-20 at 11:03 +0530, Srikar Dronamraju wrote:
Joy,.. let me look at that with ppc in mind.
OK; so while pretty much all the comments from that patch are utter
nonsense (what was I thinking), I cannot actually find a real bug.
But could you try the below which replaces