On Wed, Jan 08, 2014 at 11:07:38AM -0800, Alex Wang wrote:
> > I suspect that this approach to a rwlock is faster when lots of threads
> > try to take the read lock at once, because it should avoid cross-core
> > contention on cache lines.
>
> For avoiding the contention, is that only because each thread will have
> its own lock, or more precisely "fat_rwlock_perthread", which is aligned
> to a cache line?
Each thread having its own lock is the main benefit.  Cache-line
alignment ensures that a lock's memory doesn't happen to overlap the
memory used for a different thread's lock (or other data used by
another thread).

> > +    base = xmalloc(sizeof *perthread + CACHE_LINE_SIZE - 1);
> > +    perthread = (void *) ROUND_UP((uintptr_t) base, CACHE_LINE_SIZE);
> > +
> > +    perthread = xmalloc(sizeof *perthread);
> > +    perthread->base = base;
> > +    perthread->rwlock = rwlock;
> > +    list_push_back(&rwlock->threads, &perthread->list_node);
> > +    perthread->table = table;
> > +    hmap_insert(&table->perthreads, &perthread->hmap_node, rwlock->id);
> > +    ovs_mutex_init(&perthread->mutex);
> > +    ovs_mutex_lock(&perthread->mutex);
> > +    perthread->depth = 0;
>
> Here, you want to align the perthread to cache line size, right?

Yes.

> What is the meaning of this second "perthread = xmalloc(sizeof
> *perthread);" ?

That's a bug, thanks for reporting it.  I've deleted that line now.

_______________________________________________
dev mailing list
[email protected]
http://openvswitch.org/mailman/listinfo/dev
