* Paul E. McKenney ([email protected]) wrote: > On Mon, Jul 12, 2010 at 08:54:13PM -0400, Mathieu Desnoyers wrote: > > * Paul E. McKenney ([email protected]) wrote: > > > On Mon, Jul 12, 2010 at 11:50:44AM -0400, Mathieu Desnoyers wrote: > > > > Hi, > > > > > > > > I just did the lock-free stack, which end up being much simpler than > > > > the queue > > > > because there is no need for dummy head pointer. Comments are welcome. > > > > > > > > Even though I did not do formal verification of the queue and stack, I > > > > feel > > > > sufficiently confident to push them in urcu mainline. I'll wait for > > > > feedback > > > > before cutting a release though. I also created test_urcu_lfq and > > > > test_urcu_lfs > > > > which will also be in the tree. They perform heavy enqueue/dequeue and > > > > push/pop > > > > to stress-test the algorithms. They check if the number of operations > > > > (e.g. push > > > > vs pop) balance. > > > > > > This one looks OK. You definitely need some comments stating that > > > pop() needs to refrain from touching the rcu_lfs_node until after an > > > RCU grace period elapses, though. ;-) > > > > Sure, I'll add this comment. Thanks ! > > > > The discussion we had off-list made me wonder if a wait-free push, blocking > > pop > > implementation would not be better ? Here is the implementation of this > > variant: > > > > Thoughts ? > > Keeping in mind that the only atomic stack I have every used was for > a parallel memory allocator... > > My guess is that different applications would be better served by one > or the other. If a workload had a real-time component that did one > level of processing, then handed off to a non-real-time component, > but the situation was such that getting some of the work done by the > non-real-time component immediately was better than getting it all > done with a more uniform but longer delay, then your wait-free push > blocking pop might be just the ticket. > > However, if the stack was instead being used to communicate between > a pair of real-time components, the earlier implementation that > combined lock-free push and pop might be better.
Agreed, I'll leave both. > > Some relatively minor comments below... > > Thanx, Paul > > > Thanks! > > > > Mathieu > > > > /* > > * rcuwfstack.h > > * > > * Userspace RCU library - RCU Stack with Wait-Free push, Blocking pop. > > * > > * Copyright 2010 - Mathieu Desnoyers <[email protected]> > > * > > * This library is free software; you can redistribute it and/or > > * modify it under the terms of the GNU Lesser General Public > > * License as published by the Free Software Foundation; either > > * version 2.1 of the License, or (at your option) any later version. > > * > > * This library is distributed in the hope that it will be useful, > > * but WITHOUT ANY WARRANTY; without even the implied warranty of > > * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > > * Lesser General Public License for more details. > > * > > * You should have received a copy of the GNU Lesser General Public > > * License along with this library; if not, write to the Free Software > > * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 > > USA > > */ > > > > #if (!defined(_GNU_SOURCE) && !defined(_LGPL_SOURCE)) > > #error "Dynamic loader LGPL wrappers not implemented yet" > > #endif > > > > struct rcu_wfs_node { > > struct rcu_wfs_node *next; > > }; > > > > struct rcu_wfs_stack { > > struct rcu_wfs_node *head; > > struct rcu_wfs_node end; > > ->end is the dummy node? Ah, a sentinel for the bottom of the stack. > > But how is ->end really different than a NULL pointer? You don't seem > to dereference it anywhere other than initializing it. True. A 0x1 will do to flag the bottom of stack. > > > }; > > > > void rcu_wfs_node_init(struct rcu_wfs_node *node) > > { > > node->next = NULL; > > } > > > > void rcu_wfs_init(struct rcu_wfs_stack *s) > > { > > s->head = &s->end; > > rcu_wfs_node_init(&s->end); > > } > > > > void rcu_wfs_push(struct rcu_wfs_stack *s, struct rcu_wfs_node *node) > > { > > struct rcu_wfs_node *old_head; > > > > /* > > * uatomic_xchg() implicit memory barrier orders earlier stores to node > > * (setting it to NULL) before publication. > > */ > > old_head = uatomic_xchg(&s->head, node); > > Interesting... This can be in an implied RCU read-side critical section > because rcu_wfs_pop() might be waiting for this code while within an > RCU read-side critical section... Hrm. Good point. Then I think it's better to lock/unlock at each loop of rcu_wfs_pop to ensure we are not holding grace periods needlessly. > > > /* > > * At this point, dequeuers see a NULL node->next, they should busy-wait > > * until node->next is set to old_head. > > */ > > STORE_SHARED(node->next, old_head); > > } > > > > /* > > * The caller must wait for a grace period before freeing the returned node. > > * Returns NULL if stack is empty. > > * > > * cmpxchg is protected from ABA races by holding a RCU read lock between > > * s->head read and cmpxchg modifying s->head and requiring that dequeuers > > wait > > * for a grace period before freeing the returned node. > > And they must also wait for a grace period before in any way modifying > the ->next pointer (so watch it with the unions!!!). And they cannot > pass the node back to push() on the same stack that they got it from > without also waiting for a grace period. Good point. Will add comment about this. Thanks ! Mathieu > > > * > > * TODO: implement adaptative busy-wait and wait/wakeup scheme rather than > > busy > > * loops. Better for UP. > > */ > > struct rcu_wfs_node * > > rcu_wfs_pop(struct rcu_wfs_stack *s) > > { > > rcu_read_lock(); > > for (;;) { > > struct rcu_wfs_node *head = rcu_dereference(s->head); > > > > if (head != &s->end) { > > struct rcu_wfs_node *next = rcu_dereference(head->next); > > > > /* Retry while head is being set by push(). */ > > if (!next) > > continue; > > > > if (uatomic_cmpxchg(&s->head, head, next) == head) { > > rcu_read_unlock(); > > return head; > > } else { > > /* Concurrent modification. Retry. */ > > continue; > > } > > } else { > > /* Empty stack */ > > rcu_read_unlock(); > > return NULL; > > } > > } > > } > > > > -- > > Mathieu Desnoyers > > Operating System Efficiency R&D Consultant > > EfficiOS Inc. > > http://www.efficios.com -- Mathieu Desnoyers Operating System Efficiency R&D Consultant EfficiOS Inc. http://www.efficios.com _______________________________________________ ltt-dev mailing list [email protected] http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
