Re: [lttng-dev] [RFC PATCH] wfqueue: expand API, simplify implementation, small performance boost

Lai Jiangshan Wed, 15 Aug 2012 19:11:18 -0700

>>
>> Is it false sharing?
>> Access to q->head.next and access to q->tail have the same performance
>> because they are in the same cache line.
>
> Yes! you are right! And a quick benchmark confirms it:
>
> with head and tail on same cache line:
>
> SUMMARY /home/compudj/doc/userspace-rcu/tests/.libs/lt-test_urcu_wfq testdur  
>  10 nr_enqueuers   1 wdelay      0 nr_dequeuers   1 rdur      0 nr_enqueues   
>  100833595 nr_dequeues     88647134 successful enqueues 100833595 successful 
> dequeues     88646898 end_dequeues 12186697 nr_ops 189480729
>
> with 256 bytes of padding between head and tail, keeping the mutex on the
> "head" cache line:
>
> SUMMARY /home/compudj/doc/userspace-rcu/tests/.libs/lt-test_urcu_wfq testdur  
>  10 nr_enqueuers   1 wdelay      0 nr_dequeuers   1 rdur      0 nr_enqueues   
>  228992829 nr_dequeues    228921791 successful enqueues 228992829 successful 
> dequeues    228921367 end_dequeues 71462 nr_ops 457914620
>
> enqueue: 127% speedup
> dequeue: 158% speedup
>
> That is indeed a _really_ huge difference. However, to get this, we
> would have to increase the size of struct cds_wfq_queue beyond its
> current size, which would break API compatibility. Any idea on how to
> best do this without causing incompatibility would be welcome.
>


Choice 1) Two sets of APIs (cache-line-optimized and
non-cache-line-optimized)? Many users don't need the cache-line
optimization.
Choice 2) Just break compatibility for non-LGPL users. I think non-LGPL
users of it are rare, and the current version of urcu is still below 1.0.
I would rather not carry too much compatibility burden before 1.0.
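
To make the layout question concrete, here is a minimal sketch of what a
compact and a padded variant might look like. All names (demo_node,
demo_queue_compact, demo_queue_padded, DEMO_PAD) are hypothetical and are
not the urcu API. The 256-byte pad is taken from the benchmark quoted
above, and an explicit padding array is only one way to separate the
fields.

```c
#include <assert.h>
#include <stddef.h>
#include <pthread.h>

/* Assumption: 256 bytes, the padding size used in the quoted benchmark. */
#define DEMO_PAD 256

/* Hypothetical node type, for illustration only. */
struct demo_node {
	struct demo_node *next;
};

/* Compact layout: head and tail are adjacent, so they most likely
 * share a cache line and enqueuers/dequeuers contend on it. */
struct demo_queue_compact {
	struct demo_node head;
	struct demo_node *tail;
	pthread_mutex_t lock;
};

/* Padded layout: the mutex stays next to head, and DEMO_PAD bytes
 * separate that group from tail. */
struct demo_queue_padded {
	struct demo_node head;
	pthread_mutex_t lock;
	char pad[DEMO_PAD];
	struct demo_node *tail;
};

/* Distance in bytes between head and tail in each layout. */
static size_t demo_gap_compact(void)
{
	return offsetof(struct demo_queue_compact, tail)
		- offsetof(struct demo_queue_compact, head);
}

static size_t demo_gap_padded(void)
{
	return offsetof(struct demo_queue_padded, tail)
		- offsetof(struct demo_queue_padded, head);
}

/* The padded struct is necessarily larger, which is why code compiled
 * against the compact size cannot be handed the padded layout. */
static_assert(sizeof(struct demo_queue_padded) >
	      sizeof(struct demo_queue_compact),
	      "padding must grow the structure");
```

In the padded layout, head and tail are at least DEMO_PAD bytes apart, so
they fall on different cache lines whenever the line size is at most
DEMO_PAD. The size increase is what changes the ABI for callers that
embed or allocate the structure themselves.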


thanks,
Lai

_______________________________________________
lttng-dev mailing list
[email protected]
http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev