On Tue, Oct 02, 2018 at 02:19:53PM +0100, Will Deacon wrote:
> On Mon, Oct 01, 2018 at 10:00:28PM +0200, Peter Zijlstra wrote:
> > Let me draw a picture of that..
> >
> >
> > > CPU0 CPU1 CPU2 CPU3
> >
> > 0) lock
> >
On Tue, Oct 02, 2018 at 02:22:09PM +0100, Will Deacon wrote:
> On Tue, Oct 02, 2018 at 02:31:52PM +0200, Andrea Parri wrote:
> > > consider this scenario with your patch:
> > >
> > > 1. CPU0 sees a locked val, and is about to do your xchg_relaxed() to set
> > >    pending.
> > >
> > > 2. CPU1
On Tue, Oct 02, 2018 at 02:31:52PM +0200, Andrea Parri wrote:
> > consider this scenario with your patch:
> >
> > 1. CPU0 sees a locked val, and is about to do your xchg_relaxed() to set
> >    pending.
> >
> > 2. CPU1 comes in and sets pending, spins on locked
> >
> > 3. CPU2 sees a pending
On Mon, Oct 01, 2018 at 10:00:28PM +0200, Peter Zijlstra wrote:
> On Mon, Oct 01, 2018 at 06:17:00PM +0100, Will Deacon wrote:
> > Thanks for chewing up my afternoon ;)
>
> I'll get you a beer in EDI ;-)
Just one?!
> > But actually,
> > consider this scenario with your patch:
> >
> > 1. CPU0
> consider this scenario with your patch:
>
> 1. CPU0 sees a locked val, and is about to do your xchg_relaxed() to set
>    pending.
>
> 2. CPU1 comes in and sets pending, spins on locked
>
> 3. CPU2 sees a pending and locked val, and is about to enter the head of
>    the waitqueue (i.e. it's
On Mon, Oct 01, 2018 at 06:17:00PM +0100, Will Deacon wrote:
> Hi Peter,
>
> Thanks for chewing up my afternoon ;)
I'll get you a beer in EDI ;-)
> On Wed, Sep 26, 2018 at 01:01:20PM +0200, Peter Zijlstra wrote:
> > /**
> > + * set_pending_fetch_acquire - fetch the whole lock value and set
Hi Peter,
Thanks for chewing up my afternoon ;)
On Wed, Sep 26, 2018 at 01:01:20PM +0200, Peter Zijlstra wrote:
> On x86 we cannot do fetch_or with a single instruction and end up
> using a cmpxchg loop, this reduces determinism. Replace the fetch_or
> with a very tricky composite xchg8 + load.
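For readers following the thread, the "cmpxchg loop" that the quoted patch description refers to can be sketched roughly as below. This is only an illustration in portable C11 atomics, not the kernel's qspinlock code: the helper name fetch_or_cas_loop and the choice of memory orders are assumptions made for the example. It shows why the loop is less deterministic than a single instruction: the retry count depends on how often other CPUs modify the word in between. The composite xchg8 + load replacement is deliberately not reproduced here, since it relies on mixed-size accesses that C11 does not model.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical helper (not taken from the kernel source): emulate an
 * atomic fetch_or using a compare-and-exchange retry loop.
 * Returns the value the word held before the OR was applied.
 */
static uint32_t fetch_or_cas_loop(_Atomic uint32_t *word, uint32_t mask)
{
	uint32_t old = atomic_load_explicit(word, memory_order_relaxed);

	/*
	 * On failure the compare-exchange refreshes 'old' with the current
	 * value, so the loop simply retries with the new snapshot. Under
	 * contention the number of iterations is not bounded.
	 */
	while (!atomic_compare_exchange_weak_explicit(word, &old, old | mask,
						      memory_order_acquire,
						      memory_order_relaxed))
		;

	return old;
}
```

Called on a word holding 0x1 with mask 0x100, the helper returns 0x1 and leaves 0x101 in the word; a second call with the same mask returns 0x101 and changes nothing.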
From: Peter Zijlstra
> Sent: 26 September 2018 12:01
>
> On x86 we cannot do fetch_or with a single instruction and end up
> using a cmpxchg loop, this reduces determinism. Replace the fetch_or
> with a very tricky composite xchg8 + load.
>
> The basic idea is that we use xchg8 to test-and-set
On Thu, Sep 27, 2018 at 10:13:15AM +0200, Andrea Parri wrote:
> On Thu, Sep 27, 2018 at 09:59:35AM +0200, Peter Zijlstra wrote:
> > On Thu, Sep 27, 2018 at 09:47:48AM +0200, Andrea Parri wrote:
> > > > LKMM in particular does _NOT_ deal with mixed sized atomics _at_all_.
> > >
> > > True, but it
On Thu, Sep 27, 2018 at 09:59:35AM +0200, Peter Zijlstra wrote:
> On Thu, Sep 27, 2018 at 09:47:48AM +0200, Andrea Parri wrote:
> > > LKMM in particular does _NOT_ deal with mixed sized atomics _at_all_.
> >
> > True, but it is nothing conceptually new to deal with: there're Cat
> > models that
On Thu, Sep 27, 2018 at 09:47:48AM +0200, Andrea Parri wrote:
> > LKMM in particular does _NOT_ deal with mixed sized atomics _at_all_.
>
> True, but it is nothing conceptually new to deal with: there're Cat
> models that handle mixed-size accesses, just give it time.
Sure, but until that time I
On Thu, Sep 27, 2018 at 09:17:47AM +0200, Peter Zijlstra wrote:
> On Wed, Sep 26, 2018 at 10:52:08PM +0200, Andrea Parri wrote:
> > On Wed, Sep 26, 2018 at 01:01:20PM +0200, Peter Zijlstra wrote:
> > > On x86 we cannot do fetch_or with a single instruction and end up
> > > using a cmpxchg loop,
On Wed, Sep 26, 2018 at 07:54:18PM +0200, Peter Zijlstra wrote:
> On Wed, Sep 26, 2018 at 12:30:36PM -0400, Waiman Long wrote:
> > On 09/26/2018 07:01 AM, Peter Zijlstra wrote:
> > > On x86 we cannot do fetch_or with a single instruction and end up
> > > using a cmpxchg loop, this reduces
On Wed, Sep 26, 2018 at 10:52:08PM +0200, Andrea Parri wrote:
> On Wed, Sep 26, 2018 at 01:01:20PM +0200, Peter Zijlstra wrote:
> > On x86 we cannot do fetch_or with a single instruction and end up
> > using a cmpxchg loop, this reduces determinism. Replace the fetch_or
> > with a very tricky
On Wed, Sep 26, 2018 at 01:01:20PM +0200, Peter Zijlstra wrote:
> On x86 we cannot do fetch_or with a single instruction and end up
> using a cmpxchg loop, this reduces determinism. Replace the fetch_or
> with a very tricky composite xchg8 + load.
>
> The basic idea is that we use xchg8 to
On Wed, Sep 26, 2018 at 12:30:36PM -0400, Waiman Long wrote:
> On 09/26/2018 07:01 AM, Peter Zijlstra wrote:
> > On x86 we cannot do fetch_or with a single instruction and end up
> > using a cmpxchg loop, this reduces determinism. Replace the fetch_or
> > with a very tricky composite xchg8 + load.
On 09/26/2018 07:01 AM, Peter Zijlstra wrote:
> On x86 we cannot do fetch_or with a single instruction and end up
> using a cmpxchg loop, this reduces determinism. Replace the fetch_or
> with a very tricky composite xchg8 + load.
>
> The basic idea is that we use xchg8 to test-and-set the pending