>> Is it false sharing?
>> Access to q->head.next and access to q->tail have the same performance
>> because they are in the same cache line.
>
> Yes! you are right! And a quick benchmark confirms it:
>
> with head and tail on same cache line:
>
> SUMMARY /home/compudj/doc/userspace-rcu/tests/.libs/lt-test_urcu_wfq testdur
> 10 nr_enqueuers 1 wdelay 0 nr_dequeuers 1 rdur 0 nr_enqueues
> 100833595 nr_dequeues 88647134 successful enqueues 100833595 successful
> dequeues 88646898 end_dequeues 12186697 nr_ops 189480729
>
> with a 256 bytes padding between head and tail, keeping the mutex on the
> "head" cache line:
>
> SUMMARY /home/compudj/doc/userspace-rcu/tests/.libs/lt-test_urcu_wfq testdur
> 10 nr_enqueuers 1 wdelay 0 nr_dequeuers 1 rdur 0 nr_enqueues
> 228992829 nr_dequeues 228921791 successful enqueues 228992829 successful
> dequeues 228921367 end_dequeues 71462 nr_ops 457914620
>
> enqueue: 127% speedup
> dequeue: 158% speedup
>
> That is indeed a _really_ huge difference. However, to get this, we
> would have to increase the size of struct cds_wfq_queue beyond its
> current size, which would break API compatibility. Any idea on how to
> best do this without causing incompatibility would be welcome.
>
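To make the layout trade-off concrete, here is a minimal sketch of the two arrangements being compared. It is not the real struct cds_wfq_queue: the demo_* names, the simplified members, and the 64-byte cache line size are all assumptions for illustration. Only the 256-byte padding and the choice to keep the mutex next to "head" come from the benchmark above.

```c
#include <assert.h>
#include <stddef.h>
#include <pthread.h>

#define DEMO_CACHE_LINE 64	/* assumption: 64-byte cache lines */
#define DEMO_PAD 256		/* padding size used in the benchmark above */

/* Hypothetical, simplified node type (not the urcu definition). */
struct demo_node {
	struct demo_node *next;
};

/* head and tail adjacent: enqueuer and dequeuer touch the same line. */
struct demo_queue_packed {
	struct demo_node head;
	struct demo_node **tail;
	pthread_mutex_t lock;
};

/* tail pushed away from head; the mutex stays on the "head" side. */
struct demo_queue_padded {
	struct demo_node head;
	pthread_mutex_t lock;
	char pad[DEMO_PAD];
	struct demo_node **tail;
};

/*
 * Return 1 if two member offsets fall in the same cache-line-sized
 * block, assuming the enclosing struct starts on a cache line boundary.
 */
static inline int demo_same_line(size_t a, size_t b)
{
	return a / DEMO_CACHE_LINE == b / DEMO_CACHE_LINE;
}
```

Comparing offsetof(..., head) and offsetof(..., tail) with demo_same_line() shows the packed layout sharing a line and the padded one not. The padded struct is also larger, and that growth in sizeof is exactly what breaks compatibility for code compiled against the old layout.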
Choice 1) Provide two sets of APIs (cache-line-optimized and
non-cache-line-optimized)? Many users don't need the cache-line
optimization.

Choice 2) Just break compatibility for non-LGPL users. I think non-LGPL
users of it are rare. Also, the current version of urcu is still < 1.0,
and I would rather not carry too much compatibility burden before 1.0.

Thanks,
Lai
