Re: [PATCH v2 2/2] locking/rwsem: Optimize down_read_trylock()

2019-02-13 Thread Waiman Long
On 02/13/2019 02:45 AM, Ingo Molnar wrote:
> * Waiman Long wrote:
>
>> I looked at the assembly code in arch/x86/include/asm/rwsem.h. For both
>> trylocks (read & write), the count is read first before attempting to
>> lock it. We did the same for all trylock functions in other locks.
>>

Re: [PATCH v2 2/2] locking/rwsem: Optimize down_read_trylock()

2019-02-12 Thread Ingo Molnar
* Waiman Long wrote:
> I looked at the assembly code in arch/x86/include/asm/rwsem.h. For both
> trylocks (read & write), the count is read first before attempting to
> lock it. We did the same for all trylock functions in other locks.
> Depending on how the trylock is used and how contended
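For readers following the thread, the "read the count first, then attempt to lock" pattern described above can be sketched in portable C11. This is only an illustration under stated assumptions, not the kernel's rwsem code: `struct demo_rwsem`, `demo_read_trylock()`, `demo_write_trylock()`, and the count encoding (non-negative means that many readers, -1 means a writer holds the lock) are all hypothetical names and choices made for this example.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical lock word, NOT the kernel's layout:
 *   count >= 0  : number of readers currently holding the lock
 *   count == -1 : a writer holds the lock
 */
struct demo_rwsem {
	atomic_long count;
};

/* Read the count first; only attempt the atomic update if it looks lockable. */
static bool demo_read_trylock(struct demo_rwsem *sem)
{
	long old = atomic_load_explicit(&sem->count, memory_order_relaxed);

	while (old >= 0) {
		/* On failure, 'old' is refreshed with the current value and rechecked. */
		if (atomic_compare_exchange_weak_explicit(&sem->count, &old, old + 1,
							  memory_order_acquire,
							  memory_order_relaxed))
			return true;
	}
	return false;	/* a writer holds the lock */
}

/* Same idea for the write side: bail out early if the lock is visibly busy. */
static bool demo_write_trylock(struct demo_rwsem *sem)
{
	long old = atomic_load_explicit(&sem->count, memory_order_relaxed);

	if (old != 0)
		return false;	/* readers or a writer present, skip the cmpxchg */

	return atomic_compare_exchange_strong_explicit(&sem->count, &old, -1,
						       memory_order_acquire,
						       memory_order_relaxed);
}
```

The initial plain load lets a failing trylock return without issuing an atomic read-modify-write at all. Whether that is a win depends on how the trylock is used and how contended the lock is, which is the question this thread goes on to discuss.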

Re: [PATCH v2 2/2] locking/rwsem: Optimize down_read_trylock()

2019-02-12 Thread Waiman Long
On 02/12/2019 02:58 PM, Linus Torvalds wrote:
> On Mon, Feb 11, 2019 at 11:31 AM Waiman Long wrote:
>> Modify __down_read_trylock() to make it generate slightly better code
>> (smaller and maybe a tiny bit faster).
> This looks good, but I would ask you to try one slightly different approach.
>
>

Re: [PATCH v2 2/2] locking/rwsem: Optimize down_read_trylock()

2019-02-12 Thread Linus Torvalds
On Mon, Feb 11, 2019 at 11:31 AM Waiman Long wrote:
>
> Modify __down_read_trylock() to make it generate slightly better code
> (smaller and maybe a tiny bit faster).

This looks good, but I would ask you to try one slightly different approach.

Instead of this:

>long tmp =

Re: [PATCH v2 2/2] locking/rwsem: Optimize down_read_trylock()

2019-02-12 Thread Waiman Long
On 02/12/2019 01:36 PM, Waiman Long wrote:
> On 02/12/2019 08:25 AM, Peter Zijlstra wrote:
>> On Tue, Feb 12, 2019 at 02:24:04PM +0100, Peter Zijlstra wrote:
>>> On Mon, Feb 11, 2019 at 02:31:26PM -0500, Waiman Long wrote:
>>>> Modify __down_read_trylock() to make it generate slightly better code

Re: [PATCH v2 2/2] locking/rwsem: Optimize down_read_trylock()

2019-02-12 Thread Peter Zijlstra
On Mon, Feb 11, 2019 at 02:31:26PM -0500, Waiman Long wrote:
> Modify __down_read_trylock() to make it generate slightly better code
> (smaller and maybe a tiny bit faster).
>
> Before this patch, down_read_trylock:
>
>0x <+0>: callq 0x5
>0x0005 <+5>: