On 08/12/2011 08:08 PM, Mathieu Desnoyers wrote:
It is not accelerating synchronize_rcu(). It does two things:
1) by using futexes, it avoids burning CPU when a grace period is long.
It is actually effective even if the grace period is _not_ so long: 100
walks through the thread list take less than a millisecond, and you do
not want to call rcu_quiescent_state() that often;
2) once you're always using futexes, if you have frequent quiescent
states in one thread and rarer quiescent states in another, the former
thread will uselessly call FUTEX_WAKE on each quiescent state, even
though that thread is already in the next grace period.
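(For reference, a minimal sketch of the plain futex protocol being
discussed; gp_futex, wait_gp() and wake_up_gp() follow the urcu-qsbr
naming, while the raw futex() wrapper here is a simplified stand-in
for the one in urcu/futex.h:)

#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <urcu/uatomic.h>

int32_t gp_futex;

/* Simplified stand-in for the wrapper in urcu/futex.h. */
static int futex(int32_t *uaddr, int op, int32_t val)
{
        return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

/*
 * Writer side: gp_futex was lowered to -1 before re-checking the
 * reader registry, so if a reader has already woken us the value is
 * no longer -1 and FUTEX_WAIT returns immediately instead of
 * sleeping.
 */
static void wait_gp(void)
{
        futex(&gp_futex, FUTEX_WAIT, -1);
}

/*
 * Reader side, called from a quiescent state: wake the writer only
 * if it announced that it is about to sleep.
 */
static void wake_up_gp(void)
{
        if (uatomic_read(&gp_futex) == -1) {
                uatomic_set(&gp_futex, 0);
                futex(&gp_futex, FUTEX_WAKE, 1);
        }
}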
OK, so this might benefit URCU implementations other than qsbr too,
right?
Yes. I started from urcu-qsbr because that's what I am using, and
because it's the simplest implementation.
I think I did not convey my idea fully:
this would take care of re-decrementing the gp_futex value for the first
wait attempt and the following ones:
if (wait_loops >= RCU_QS_ACTIVE_ATTEMPTS) {
uatomic_dec(&gp_futex);
/* Write futex before read reader_gp */
cmm_smp_mb();
}
[...]
and this would be waiting for a wakeup:
if (wait_loops >= RCU_QS_ACTIVE_ATTEMPTS) {
wait_gp();
} else {
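(To show where the two fragments sit, here is a rough sketch of the
writer-side polling loop, assuming the urcu-qsbr structure;
all_readers_quiescent() is a hypothetical name for the registry walk,
and the exit path that resets gp_futex is illustrative:)

for (wait_loops = 0; ; wait_loops++) {
        if (wait_loops >= RCU_QS_ACTIVE_ATTEMPTS) {
                /* A waking reader resets gp_futex to 0, so the
                 * decrement re-arms it to -1 for the next wait. */
                uatomic_dec(&gp_futex);
                /* Write futex before read reader_gp */
                cmm_smp_mb();
        }
        if (all_readers_quiescent()) {  /* hypothetical registry walk */
                if (wait_loops >= RCU_QS_ACTIVE_ATTEMPTS) {
                        /* Read reader_gp before write futex */
                        cmm_smp_mb();
                        uatomic_set(&gp_futex, 0);
                }
                break;
        }
        if (wait_loops >= RCU_QS_ACTIVE_ATTEMPTS)
                wait_gp();              /* sleep until a reader wakes us */
        else
                caa_cpu_relax();        /* spin for the first attempts */
}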
But I agree that this does not handle readers with quite different
period lengths very well, because short-lived periods will trigger a
lot of useless futex wakeups.
Yes, that only covers (1) above.
@@ -136,7 +137,11 @@ extern int32_t gp_futex;
*/
static inline void wake_up_gp(void)
{
- if (unlikely(uatomic_read(&gp_futex) == -1)) {
+ if (unlikely(_CMM_LOAD_SHARED(rcu_reader.waiting))) {
+ _CMM_STORE_SHARED(rcu_reader.waiting, 0);
Commenting this memory barrier would be helpful too.
Ok.
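Something along these lines (the hunk above is truncated, so the
barrier placement and the futex_noasync() wake shown after the store
are illustrative rather than the exact final patch):

static inline void wake_up_gp(void)
{
        if (unlikely(_CMM_LOAD_SHARED(rcu_reader.waiting))) {
                _CMM_STORE_SHARED(rcu_reader.waiting, 0);
                /*
                 * Order the store to rcu_reader.waiting before the
                 * read of gp_futex: pairs with the writer's barrier
                 * between arming the futex and re-reading reader_gp,
                 * so a sleeping writer cannot be missed.
                 */
                cmm_smp_mb();
                if (uatomic_read(&gp_futex) != -1)
                        return;
                uatomic_set(&gp_futex, 0);
                futex_noasync(&gp_futex, FUTEX_WAKE, 1, NULL, NULL, 0);
        }
}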
Paolo