Re: [PATCH v3 2/2] locking/rwsem: Optimize down_read_trylock()

2019-02-15 Thread Will Deacon
On Thu, Feb 14, 2019 at 10:09:44AM -0800, Linus Torvalds wrote:
> On Thu, Feb 14, 2019 at 9:51 AM Linus Torvalds wrote:
> >
> > The arm64 numbers scaled horribly even before, and that's because
> > there is too much ping-pong, and it's probably because there is no
> > "stickiness" to the

Re: [PATCH v3 2/2] locking/rwsem: Optimize down_read_trylock()

2019-02-14 Thread Waiman Long
On 02/14/2019 01:02 PM, Will Deacon wrote:
> On Thu, Feb 14, 2019 at 11:33:33AM +0100, Peter Zijlstra wrote:
>> On Wed, Feb 13, 2019 at 03:32:12PM -0500, Waiman Long wrote:
>>> Modify __down_read_trylock() to optimize for an unlocked rwsem and make
>>> it generate slightly better code.
>>>
>>>

Re: [PATCH v3 2/2] locking/rwsem: Optimize down_read_trylock()

2019-02-14 Thread Linus Torvalds
On Thu, Feb 14, 2019 at 9:51 AM Linus Torvalds wrote:
>
> The arm64 numbers scaled horribly even before, and that's because
> there is too much ping-pong, and it's probably because there is no
> "stickiness" to the cacheline to the core, and thus adding the extra
> loop can make the ping-pong

Re: [PATCH v3 2/2] locking/rwsem: Optimize down_read_trylock()

2019-02-14 Thread Will Deacon
On Thu, Feb 14, 2019 at 11:33:33AM +0100, Peter Zijlstra wrote:
> On Wed, Feb 13, 2019 at 03:32:12PM -0500, Waiman Long wrote:
> > Modify __down_read_trylock() to optimize for an unlocked rwsem and make
> > it generate slightly better code.
> >
> > Before this patch, down_read_trylock:
> >
> >

Re: [PATCH v3 2/2] locking/rwsem: Optimize down_read_trylock()

2019-02-14 Thread Linus Torvalds
On Thu, Feb 14, 2019 at 6:53 AM Waiman Long wrote:
>
> The ARM64 result is what I would have expected given that the change was
> to optimize for the uncontended case. The x86-64 result is kind of an
> anomaly to me, but I haven't bothered to dig into that.
I would say that the ARM result is

Re: [PATCH v3 2/2] locking/rwsem: Optimize down_read_trylock()

2019-02-14 Thread Waiman Long
On 02/14/2019 05:33 AM, Peter Zijlstra wrote:
> On Wed, Feb 13, 2019 at 03:32:12PM -0500, Waiman Long wrote:
>> Modify __down_read_trylock() to optimize for an unlocked rwsem and make
>> it generate slightly better code.
>>
>> Before this patch, down_read_trylock:
>>
>>0x <+0>:

Re: [PATCH v3 2/2] locking/rwsem: Optimize down_read_trylock()

2019-02-14 Thread Peter Zijlstra
On Wed, Feb 13, 2019 at 03:32:12PM -0500, Waiman Long wrote:
> Modify __down_read_trylock() to optimize for an unlocked rwsem and make
> it generate slightly better code.
>
> Before this patch, down_read_trylock:
>
>0x <+0>: callq 0x5
>0x0005 <+5>:
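
For readers skimming the thread, the idea named in the patch description (a read trylock that optimizes for an unlocked rwsem) can be sketched in userspace C11 roughly as follows. This is only an illustration under stated assumptions, not the kernel patch itself: `toy_rwsem`, `TOY_READ_BIAS`, and `toy_down_read_trylock` are hypothetical names, and the convention that a negative count means "readers must not enter" is an assumption of this sketch.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reader increment; assumed for this sketch only. */
#define TOY_READ_BIAS 1L

/* Hypothetical toy semaphore: count == 0 means unlocked, count > 0 means
 * that many readers hold it, count < 0 means readers must not enter
 * (an assumption of this sketch). */
struct toy_rwsem {
    atomic_long count;
};

/* Try to take the lock for reading without blocking.
 * The first compare-exchange guesses that the lock is unlocked (0), so
 * the uncontended case needs no separate load of the counter. If the
 * guess is wrong, the failed compare-exchange writes the observed value
 * into `expected`, and we retry only while that value is non-negative. */
static bool toy_down_read_trylock(struct toy_rwsem *sem)
{
    long expected = 0; /* optimistic guess: unlocked */

    do {
        if (atomic_compare_exchange_strong_explicit(
                &sem->count, &expected, expected + TOY_READ_BIAS,
                memory_order_acquire, memory_order_relaxed))
            return true;
    } while (expected >= 0);

    return false;
}
```

The trade-off discussed in the replies above follows from this shape: guessing the unlocked value saves a load when the lock is free, but under contention the first compare-exchange is likely to fail and the loop runs at least once more, which can add cacheline traffic.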