Hi Paul,

On Wed, Oct 11, 2017 at 03:32:30PM -0700, Paul E. McKenney wrote:
> Hello!
>
> At Linux Plumbers Conference, we got requests for a recipes document,
> and a further request to point to actual code in the Linux kernel.
> I have pulled together some examples for various litmus-test families,
> as shown below.  The decoder ring for the abbreviations (ISA2, LB, SB,
> MP, ...) is here:
>
> 	https://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test6.pdf
>
> This document is also checked into the memory-models git archive:
>
> 	https://github.com/aparri/memory-model.git
>
> I would be especially interested in simpler examples in general, and
> of course any example at all for the cases where I was unable to find
> any.  Thoughts?

Below are some examples we did discuss (at some point):

The comment in kernel/events/ring_buffer.c:perf_output_put_handle()
describes instances of MP+wmb+rmb and LB+ctrl+mb.

The comments in kernel/sched/core.c:try_to_wake_up() describe more
instances of MP ("plus locking") and LB (see finish_lock_switch()).

The comment in kernel/sched/core.c:task_rq_lock() describes an
instance of MP+wmb+addr-acqpo.

The comment in include/linux/wait.h:waitqueue_active() describes an
instance of SB+mb+mb.

63cae12bce986 ("perf/core: Fix sys_perf_event_open() vs. hotplug")
describes an instance of W+RWC+porel+mb+mb.

[...]

I wish we could say "any barrier (explicit or implicit) in sources
is accompanied by a comment mentioning the pattern of interest...",
but life is not always this simple. ;-)

  Andrea

>
> 							Thanx, Paul
>
> ------------------------------------------------------------------------
>
> This document lists the litmus-test patterns that we have been discussing,
> along with examples from the Linux kernel.  This is intended to feed into
> the recipes document.  All examples are from v4.13.
>
> 0.	Single-variable SC.
>
> 	a.	Within a single CPU, the use of the ->dynticks_nmi_nesting
> 		counter by rcu_nmi_enter() and rcu_nmi_exit() qualifies
> 		(see kernel/rcu/tree.c).  The counter is accessed by
> 		interrupts and NMIs as well as by process-level code.
> 		This counter can be accessed by other CPUs, but only
> 		for debug output.
>
> 	b.	Between CPUs, I would put forward the ->dflags
> 		updates, but this is anything but simple.  But maybe
> 		OK for an illustration?
>
> 1.	MP (see test6.pdf for nickname translation)
>
> 	a.	smp_store_release() / smp_load_acquire()
>
> 		init_stack_slab() in lib/stackdepot.c uses release-acquire
> 		to handle initialization of a slab of the stack.  Working
> 		out the mutual-exclusion design is left as an exercise for
> 		the reader.
>
> 	b.	rcu_assign_pointer() / rcu_dereference()
>
> 		expand_to_next_prime() does the rcu_assign_pointer(),
> 		and next_prime_number() does the rcu_dereference().
> 		This mediates access to a bit vector that is expanded
> 		as additional primes are needed.  These two functions
> 		are in lib/prime_numbers.c.
>
> 	c.	smp_wmb() / smp_rmb()
>
> 		xlog_state_switch_iclogs() contains the following:
>
> 			log->l_curr_block -= log->l_logBBsize;
> 			ASSERT(log->l_curr_block >= 0);
> 			smp_wmb();
> 			log->l_curr_cycle++;
>
> 		And xlog_valid_lsn() contains the following:
>
> 			cur_cycle = ACCESS_ONCE(log->l_curr_cycle);
> 			smp_rmb();
> 			cur_block = ACCESS_ONCE(log->l_curr_block);
>
> 	d.	Replacing either of the above with smp_mb()
>
> 		Holding off on this one for the moment...
>
> 2.	Release-acquire chains, AKA ISA2, Z6.2, LB, and 3.LB
>
> 	Lots of variety here, can in some cases substitute:
>
> 	a.	READ_ONCE() for smp_load_acquire()
> 	b.	WRITE_ONCE() for smp_store_release()
> 	c.	Dependencies for both smp_load_acquire() and
> 		smp_store_release().
> 	d.	smp_wmb() for smp_store_release() in first thread
> 		of ISA2 and Z6.2.
> 	e.	smp_rmb() for smp_load_acquire() in last thread of ISA2.
>
> 	The canonical illustration of LB involves the various memory
> 	allocators, where you don't want a load from about-to-be-freed
> 	memory to see a store initializing a later incarnation of that
> 	same memory area.  But the per-CPU caches make this a very
> 	long and complicated example.
>
> 	I am not aware of any three-CPU release-acquire chains in the
> 	Linux kernel.  There are three-CPU lock-based chains in RCU,
> 	but these are not at all simple, either.
>
> 	Thoughts?
>
> 3.	SB
>
> 	a.	smp_mb(), as in lockless wait-wakeup coordination.
> 		And as in sys_membarrier()-scheduler coordination,
> 		for that matter.
>
> 		Examples seem to be lacking.  Most cases use locking.
> 		Here is one rather strange one from RCU:
>
> 		void call_rcu_tasks(struct rcu_head *rhp, rcu_callback_t func)
> 		{
> 			unsigned long flags;
> 			bool needwake;
> 			bool havetask = READ_ONCE(rcu_tasks_kthread_ptr);
>
> 			rhp->next = NULL;
> 			rhp->func = func;
> 			raw_spin_lock_irqsave(&rcu_tasks_cbs_lock, flags);
> 			needwake = !rcu_tasks_cbs_head;
> 			*rcu_tasks_cbs_tail = rhp;
> 			rcu_tasks_cbs_tail = &rhp->next;
> 			raw_spin_unlock_irqrestore(&rcu_tasks_cbs_lock, flags);
> 			/* We can't create the thread unless interrupts are
> 			   enabled. */
> 			if ((needwake && havetask) ||
> 			    (!havetask && !irqs_disabled_flags(flags))) {
> 				rcu_spawn_tasks_kthread();
> 				wake_up(&rcu_tasks_cbs_wq);
> 			}
> 		}
>
> 		And for the wait side, using synchronize_sched() to supply
> 		the barrier for both ends, with the preemption disabling
> 		due to raw_spin_lock_irqsave() serving as the read-side
> 		critical section:
>
> 		if (!list) {
> 			wait_event_interruptible(rcu_tasks_cbs_wq,
> 						 rcu_tasks_cbs_head);
> 			if (!rcu_tasks_cbs_head) {
> 				WARN_ON(signal_pending(current));
> 				schedule_timeout_interruptible(HZ/10);
> 			}
> 			continue;
> 		}
> 		synchronize_sched();
>
> 		-----------------
>
> 		Here is another one that uses atomic_cmpxchg() as a
> 		full memory barrier:
>
> 		if (!wait_event_timeout(*wait, !atomic_read(stopping),
> 					msecs_to_jiffies(1000))) {
> 			atomic_set(stopping, 0);
> 			smp_mb();
> 			return -ETIMEDOUT;
> 		}
>
> 		int omap3isp_module_sync_is_stopping(wait_queue_head_t *wait,
> 						     atomic_t *stopping)
> 		{
> 			if (atomic_cmpxchg(stopping, 1, 0)) {
> 				wake_up(wait);
> 				return 1;
> 			}
>
> 			return 0;
> 		}
>
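
For readers who want the bare SB+mb+mb shape without the surrounding
kernel machinery, here is a minimal userspace sketch. It is not kernel
code: it assumes C11 atomics and POSIX threads, and it uses
atomic_thread_fence(memory_order_seq_cst) only as a rough stand-in for
smp_mb(). The names x, y, r0, r1, sb_thread0(), sb_thread1() and
sb_count_forbidden() are made up for illustration. Each thread stores
to its own variable, executes a full fence, then loads the other
variable; with both fences present, the outcome r0 == 0 && r1 == 0
should never be observed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Shared variables of the SB litmus test; all names are illustrative. */
static atomic_int x, y;
static int r0, r1;

static void *sb_thread0(void *arg)
{
	(void)arg;
	atomic_store_explicit(&x, 1, memory_order_relaxed);
	/* Assumption: a seq_cst fence plays the role of smp_mb() here. */
	atomic_thread_fence(memory_order_seq_cst);
	r0 = atomic_load_explicit(&y, memory_order_relaxed);
	return NULL;
}

static void *sb_thread1(void *arg)
{
	(void)arg;
	atomic_store_explicit(&y, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	r1 = atomic_load_explicit(&x, memory_order_relaxed);
	return NULL;
}

/* Run the test `trials` times and count how often both loads saw 0. */
int sb_count_forbidden(int trials)
{
	int forbidden = 0;

	for (int i = 0; i < trials; i++) {
		pthread_t t0, t1;
		int err;

		/* Reset the shared state before starting the threads. */
		atomic_store(&x, 0);
		atomic_store(&y, 0);

		err = pthread_create(&t0, NULL, sb_thread0, NULL);
		assert(!err);
		err = pthread_create(&t1, NULL, sb_thread1, NULL);
		assert(!err);

		/* Joining makes the plain stores to r0 and r1 visible. */
		pthread_join(t0, NULL);
		pthread_join(t1, NULL);

		if (r0 == 0 && r1 == 0)
			forbidden++;
	}
	return forbidden;
}
```

Calling sb_count_forbidden(1000) from a small main() and building with
-pthread is enough to try it. Removing either fence makes the "both
zero" outcome permitted, although a run this short may well not show it.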