Re: [lttng-dev] [PATCH] urcu: avoid false sharing for rcu_gp_ctr

Lai Jiangshan Mon, 10 Dec 2012 16:34:02 -0800

On 12/11/2012 07:48 AM, Paul E. McKenney wrote:
> On Mon, Dec 10, 2012 at 01:01:12PM -0500, Mathieu Desnoyers wrote:
>> * Paul E. McKenney ([email protected]) wrote:
>>> On Fri, Dec 07, 2012 at 12:22:52PM -0500, Mathieu Desnoyers wrote:
>>>> * Lai Jiangshan ([email protected]) wrote:
>>>>> On Saturday, December 8, 2012, Mathieu Desnoyers wrote:
>>>>>
>>>>>> * Lai Jiangshan ([email protected] <javascript:;>) wrote:
>>>>>>> we can define rcu_gp_ctr and registry with aligned attribute, but it is
>>>>>> not
>>>>>>> reliable way
>>>>>>>
>>>>>>> We need only this:
>>>>>>> unsigned long rcu_gp_ctr __attribute((aligned and padded(don't put other
>>>>>>> var next to it except the futex)))
>>>>>>
>>>>>> In which situation would this be unreliable ?
>>>>>
>>>>>
>>>>>
>>>>> int a;
>>>>> int b __attribute__((aligned));
>>>>> int c;
>>>>>
>>>>> b and c will be in the same line, even we define c as aligned too, the
>>>>> compiler/linker may put a next to b, thus a and b in the same line
>>>>
>>>> So if our goal is to have rcu_gp_ctr and rcu_gp_futex on the same cache
>>>> line, which is different from that of the registry, we could do:
>>>>
>>>> typeA rcu_gp_ctr __attribute__((aligned(...)));
>>>> typeB rcu_gp_futex;
>>>> typeC registry __attribute__((aligned(...)));
>>>>
>>>> I would expect the compiler won't typically reorder rcu_gp_futex and
>>>> registry. But I guess there is no guarantee it is going to be always
>>>> true given by the C99 standard.
>>>>
>>>> I guess this is a case where we could bump the library version number
>>>> and do things properly.
>>>>
>>>> Let's think a bit more about it, anyone else has comments on this ?
>>>
>>> If you really want the alignment and padding, they should go into a
>>> structure.  ;-)
>>
>> OK, so we would bump to liburcu2 then for the 0.8 release ?
> 
> I don't believe that this optimization by itself warrants a version
> bump.  Perhaps hold it in reserve when the next need for a version bump
> comes along.
> 
> So, how to do this without a version bump?  One approach would be to
> use linker-script magic


I already did it, send soon.

> (yeah, now -that- will be portable!).  Or if
> the alignment attributes happen to do the right thing on popular gcc
> versions, go with them until the next version bump allows us to do
> it more deterministically.
> 
>                                                       Thanx, Paul
> 
>> Thanks,
>>
>> Mathieu
>>
>>>
>>>                                                     Thanx, Paul
>>>
>>>> Thanks,
>>>>
>>>> Mathieu
>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Mathieu
>>>>>>
>>>>>>>
>>>>>>> On Saturday, December 8, 2012, Mathieu Desnoyers wrote:
>>>>>>>
>>>>>>>> * Lai Jiangshan ([email protected] <javascript:;> <javascript:;>)
>>>>>> wrote:
>>>>>>>>> @rcu_gp_ctr and @registry share the same cache line, it causes
>>>>>>>>> false sharing and slowdown both of the read site and update site.
>>>>>>>>>
>>>>>>>>> Fix: Use different cache line for them.
>>>>>>>>>
>>>>>>>>> Although rcu_gp_futex is updated less than rcu_gp_ctr, but
>>>>>>>>> they always be accessed at almost the same time, so we also move
>>>>>>>> rcu_gp_futex
>>>>>>>>> to the cacheline of rcu_gp_ctr to reduce the cacheline-usage or
>>>>>>>> cache-missing
>>>>>>>>> of read site.
>>>>>>>>
>>>>>>>> Hi Lai,
>>>>>>>>
>>>>>>>> I agree on the goal: placing registry and rcu_gp_ctr on different
>>>>>>>> cache-lines. And yes, it makes sense to put rcu_gp_ctr and rcu_gp_futex
>>>>>>>> on the same cache-line. I agree that the next patch is fine too
>>>>>> (keeping
>>>>>>>> qsbr and other urcu similar). This is indeed what I try to ensure
>>>>>>>> myself.
>>>>>>>>
>>>>>>>> I'm just concerned that this patch seems to break ABI compability for
>>>>>>>> liburcu: the read-side, within applications, would have to be
>>>>>>>> recompiled. So I guess we should decide if we do this change in a way
>>>>>>>> that does not break the ABI (e.g. not introducing a structure), or if
>>>>>> we
>>>>>>>> choose to bump the library version number.
>>>>>>>>
>>>>>>>> Thoughts ?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Mathieu
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> test: (4X6=24 CPUs)
>>>>>>>>>
>>>>>>>>> Before patch:
>>>>>>>>>
>>>>>>>>> [root@localhost userspace-rcu]# ./tests/test_urcu_mb 20 1 20
>>>>>>>>> SUMMARY ./tests/test_urcu_mb      testdur   20 nr_readers  20 rdur
>>>>>>>>  0 wdur      0 nr_writers   1 wdelay      0 nr_reads   2100285330
>>>>>> nr_writes
>>>>>>>>      3390219 nr_ops   2103675549
>>>>>>>>> [root@localhost userspace-rcu]# ./tests/test_urcu_mb 20 1 20
>>>>>>>>> SUMMARY ./tests/test_urcu_mb      testdur   20 nr_readers  20 rdur
>>>>>>>>  0 wdur      0 nr_writers   1 wdelay      0 nr_reads   1619868562
>>>>>> nr_writes
>>>>>>>>      3529478 nr_ops   1623398040
>>>>>>>>> [root@localhost userspace-rcu]# ./tests/test_urcu_mb 20 1 20
>>>>>>>>> SUMMARY ./tests/test_urcu_mb      testdur   20 nr_readers  20 rdur
>>>>>>>>  0 wdur      0 nr_writers   1 wdelay      0 nr_reads   1949067038
>>>>>> nr_writes
>>>>>>>>      3469334 nr_ops   1952536372
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> after patch:
>>>>>>>>>
>>>>>>>>> [root@localhost userspace-rcu]# ./tests/test_urcu_mb 20 1 20
>>>>>>>>> SUMMARY ./tests/test_urcu_mb      testdur   20 nr_readers  20 rdur
>>>>>>>>  0 wdur      0 nr_writers   1 wdelay      0 nr_reads   3380191848
>>>>>> nr_writes
>>>>>>>>      4903248 nr_ops   3385095096
>>>>>>>>> [root@localhost userspace-rcu]# ./tests/test_urcu_mb 20 1 20
>>>>>>>>> SUMMARY ./tests/test_urcu_mb      testdur   20 nr_readers  20 rdur
>>>>>>>>  0 wdur      0 nr_writers   1 wdelay      0 nr_reads   3397637486
>>>>>> nr_writes
>>>>>>>>      4129809 nr_ops   3401767295
>>>>>>>>>
>>>>>>>>> Singed-by: Lai Jiangshan <[email protected] 
>>>>>>>>> <javascript:;><javascript:;>>
>>>>>>>>> ---
>>>>>>>>> diff --git a/urcu.c b/urcu.c
>>>>>>>>> index 15def09..436d71c 100644
>>>>>>>>> --- a/urcu.c
>>>>>>>>> +++ b/urcu.c
>>>>>>>>> @@ -83,16 +83,7 @@ void __attribute__((destructor)) rcu_exit(void);
>>>>>>>>>  #endif
>>>>>>>>>
>>>>>>>>>  static pthread_mutex_t rcu_gp_lock = PTHREAD_MUTEX_INITIALIZER;
>>>>>>>>> -
>>>>>>>>> -int32_t rcu_gp_futex;
>>>>>>>>> -
>>>>>>>>> -/*
>>>>>>>>> - * Global grace period counter.
>>>>>>>>> - * Contains the current RCU_GP_CTR_PHASE.
>>>>>>>>> - * Also has a RCU_GP_COUNT of 1, to accelerate the reader fast path.
>>>>>>>>> - * Written to only by writer with mutex taken. Read by both writer
>>>>>> and
>>>>>>>> readers.
>>>>>>>>> - */
>>>>>>>>> -unsigned long rcu_gp_ctr = RCU_GP_COUNT;
>>>>>>>>> +struct urcu_gp rcu_gp = { .ctr = RCU_GP_COUNT };
>>>>>>>>>
>>>>>>>>>  /*
>>>>>>>>>   * Written to only by each individual reader. Read by both the
>>>>>> reader
>>>>>>>> and the
>>>>>>>>> @@ -217,8 +208,8 @@ static void wait_gp(void)
>>>>>>>>>  {
>>>>>>>>>       /* Read reader_gp before read futex */
>>>>>>>>>       smp_mb_master(RCU_MB_GROUP);
>>>>>>>>> -     if (uatomic_read(&rcu_gp_futex) == -1)
>>>>>>>>> -             futex_async(&rcu_gp_futex, FUTEX_WAIT, -1,
>>>>>>>>> +     if (uatomic_read(&rcu_gp.futex) == -1)
>>>>>>>>> +             futex_async(&rcu_gp.futex, FUTEX_WAIT, -1,
>>>>>>>>>                     NULL, NULL, 0);
>>>>>>>>>  }
>>>>>>>>>
>>>>>>>>> @@ -232,12 +223,12 @@ static void wait_for_readers(struct
>>>>>> cds_list_head
>>>>>>>> *input_readers,
>>>>>>>>>       /*
>>>>>>>>>        * Wait for each thread URCU_TLS(rcu_reader).ctr to either
>>>>>>>>>        * indicate quiescence (not nested), or observe the current
>>>>>>>>> -      * rcu_gp_ctr value.
>>>>>>>>> +      * rcu_gp.ctr value.
>>>>>>>>>        */
>>>>>>>>>       for (;;) {
>>>>>>>>>               wait_loops++;
>>>>>>>>>               if (wait_loops == RCU_QS_ACTIVE_ATTEMPTS) {
>>>>>>>>> -                     uatomic_dec(&rcu_gp_futex);
>>>>>>>>> +                     uatomic_dec(&rcu_gp.futex);
>>>>>>>>>                       /* Write futex before read reader_gp */
>>>>>>>>>                       smp_mb_master(RCU_MB_GROUP);
>>>>>>>>>               }
>>>>>>>>> @@ -270,7 +261,7 @@ static void wait_for_readers(struct cds_list_head
>>>>>>>> *input_readers,
>>>>>>>>>                       if (wait_loops == RCU_QS_ACTIVE_ATTEMPTS) {
>>>>>>>>>                               /* Read reader_gp before write futex */
>>>>>>>>>                               smp_mb_master(RCU_MB_GROUP);
>>>>>>>>> -                             uatomic_set(&rcu_gp_futex, 0);
>>>>>>>>> +                             uatomic_set(&rcu_gp.futex, 0);
>>>>>>>>>                       }
>>>>>>>>>                       break;
>>>>>>>>>               } else {
>>>>>>>>> @@ -289,7 +280,7 @@ static void wait_for_readers(struct cds_list_head
>>>>>>>> *input_readers,
>>>>>>>>>                       if (wait_loops == RCU_QS_ACTIVE_ATTEMPTS) {
>>>>>>>>>                               /* Read reader_gp before write futex */
>>>>>>>>>                             > > 
>>>>>>>>> [email protected]<javascript:;><javascript:;>
>>>>>>>> http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
>>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Mathieu Desnoyers
>>>>>> Operating System Efficiency R&D Consultant
>>>>>> EfficiOS Inc.
>>>>>> http://www.efficios.com
>>>>>>
>>>>
>>>> -- 
>>>> Mathieu Desnoyers
>>>> Operating System Efficiency R&D Consultant
>>>> EfficiOS Inc.
>>>> http://www.efficios.com
>>>>
>>>
>>
>> -- 
>> Mathieu Desnoyers
>> Operating System Efficiency R&D Consultant
>> EfficiOS Inc.
>> http://www.efficios.com
>>
> 
> 


_______________________________________________
lttng-dev mailing list
[email protected]
http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev

Re: [lttng-dev] [PATCH] urcu: avoid false sharing for rcu_gp_ctr

Reply via email to