On 02/09/2016 20:23, Pranith Kumar wrote:
> If I understand you correctly, this is what might happen without the
> barrier():
>
>         P0                              P1
> ----------------------------------------
> // bh = ctx->first_bh; optimized
> if (ctx->first_bh) {
>   // next = ctx->first_bh->next;
>                                  lock(bh_lock);
>                                  new_bh->next = ctx->first_bh;
>                                  smp_wmb(); // this alone is not sufficient
>                                             // for Alpha
>                                  ctx->first_bh = new_bh;
>                                  unlock(bh_lock);
>   // bh = next;
>   bh = ctx->first_bh->next;
>
>   if (bh) {do something}
> }
>
> Is this what might happen? If so, inserting a barrier() after the first load
> into bh will prevent the compiler from optimizing the load into bh since the
> compiler cannot optimize away loads and stores past the barrier().

Yes, this is what you can expect from a compiler.

> And on Alpha processors barrier() should really be smp_read_barrier_depends()
> to prevent this from happening because of its memory model (issue a barrier
> after loading a pointer to shared memory and before dereferencing it).

Yes; the actual effect you could see on the Alpha is even crazier.
Without the barrier, the "bh = ctx->first_bh" and "next = bh->next" can
be reordered.  If you take both threads into account, "next = bh->next"
can use a value from before P1's "ctx->first_bh = new_bh".  Indeed, it can
use a value from before P1's smp_wmb() or before P1's
"new_bh->next = ctx->first_bh".  For example, "next" could load a NULL
value.

FWIW, RISC-V currently has the same memory model as the Alpha, but they
plan to fix it before they finalize the relevant specs!

Paolo

>> > So instead of smp_read_barrier_depends() you could load ctx->first_bh
>> > and bh->next with the consume memory order, but you do need _something_.
>
> OK, if the above situation is possible, then I think I understand the need for
> this barrier.
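
For illustration, here is a rough sketch of the "consume memory order"
alternative using plain C11 atomics.  This is not the QEMU code: struct bh,
struct ctx, bh_publish() and bh_walk() are made-up names, bh_lock is not
modeled (the writer is simply assumed to hold it), and the release store
stands in for smp_wmb() followed by the store to ctx->first_bh.  Note that
current compilers generally implement memory_order_consume as acquire.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the bottom-half list discussed above. */
struct bh {
    _Atomic(struct bh *) next;
    int id;
};

struct ctx {
    _Atomic(struct bh *) first_bh;
};

/* Writer side (P1).  Assumption: the caller holds bh_lock. */
static void bh_publish(struct ctx *ctx, struct bh *new_bh)
{
    assert(new_bh != NULL);

    /* Initialize the new node before it becomes reachable. */
    struct bh *old = atomic_load_explicit(&ctx->first_bh,
                                          memory_order_relaxed);
    atomic_store_explicit(&new_bh->next, old, memory_order_relaxed);

    /* Release store: orders the write to new_bh->next before the
     * write that publishes new_bh (the role of smp_wmb() above). */
    atomic_store_explicit(&ctx->first_bh, new_bh, memory_order_release);
}

/* Reader side (P0).  Walks the list and returns the number of nodes. */
static int bh_walk(struct ctx *ctx)
{
    int n = 0;

    /* Consume load: the dependent load of bh->next below may not be
     * satisfied with a value older than the pointer just loaded. */
    struct bh *bh = atomic_load_explicit(&ctx->first_bh,
                                         memory_order_consume);
    while (bh != NULL) {
        struct bh *next = atomic_load_explicit(&bh->next,
                                               memory_order_consume);
        n++;        /* "do something" with bh would go here */
        bh = next;
    }
    return n;
}
```

With these orderings the reader either sees the old list head or sees
new_bh together with its initialized next pointer, never new_bh with a
stale next.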