On Wed, Nov 19, 2014 at 10:00 PM, Taylor R Campbell <campbell+netbsd-tech-k...@mumble.net> wrote: > Date: Wed, 19 Nov 2014 17:11:18 +0800 > From: Dennis Ferguson <dennis.c.fergu...@gmail.com> > > On 19 Nov, 2014, at 01:53 , Taylor R Campbell > <campbell+netbsd-tech-k...@mumble.net> wrote: > > The one tricky detail is that after a reader has fetched the tqe_next > > pointer, it must issue a membar_consumer before dereferencing the > > pointer: otherwise there is no guarantee about the order in which the > > CPU will fetch the tqe_next pointer and its contents (which it may > > have cached). > > I don't think it is correct to use a membar_consumer(). > > It is certainly /correct/. It may not be /necessary/ in some > architectures that provide more guarantees about load ordering than > are written in the program without membar_consumer.
By reading and comparing your opinions, I come to think that membar is not necessary to safely traverse the protected data structure (tqe_next pointers). It is the minimal requirement of pserialize(9) "update" section (mutex_enter() ... pserialize_perform()) that the protected data is updated atomically and kept topologically consistent to prevent readers from seeing corrupted data (pointers). Some readers may see the stale/old tqe_next, but it is OK at this point. In practice, there must be a membar around for readers to see the updated data; but such a membar is not always relative to tqe_next. Maybe it is, maybe not, only users (of pserialize(9)) know. Putting membars around TAILQ is too excessive. For example, consider struct ifnet. It will have a reference count to take a reference in a pserialize(9) critical section. Reference count is incremented by atomic_inc_*, which might imply membar on some archs. When adding a new ifnet, part of its content, promised as constant in the protected list (e.g. if_name), must be filled before being inserted and visible via the list. In this case, write to if_name is relative to tqe_next. > Since most machines don't need any barrier here it would be extremely > inefficient to add a membar_consumer() since that would make all machines > pay for the idiosyncrasies unique to the DEC Alpha. > > We could invent a membar_datadep_consumer for this case, and make it a > no-op on everything other than Alpha until another CPU designer > relaxes the memory ordering. The intent of the code is clearer with > the memory barrier, and it emphasizes the need for a paired > membar_producer elsewhere. I thought alpha sync op implementations imply instruction ordering ... but maybe I'm too optimistic. :)