On Tue, Aug 08, 2017 at 09:09:23AM -0500, Eric Blake wrote:
> On 08/08/2017 02:49 AM, Peter Xu wrote:
> >> This doesn't work for error-checking mutexes: rcu_init_child has a
> >> different PID than the parent, so the mutexes aren't unlocked. It's
> >> also true that right now we don't use error-checking mutexes (commit
> >> 24fa90499f, "qemu-thread: do not use PTHREAD_MUTEX_ERRORCHECK",
> >> 2015-03-10); however, that's also a bit sad.
> >> The reason for the undefined behavior is probably that some operating
> >> systems allocate memory in pthread_mutex_init, and initializing twice
> >> causes a memory leak. One such operating system is OpenBSD. :(
> > Good to know. :)
> > I thought pthread_atfork() was designed to solve such a locking
> > problem (in the child handler, we unlock all the held locks).
> What's also sad is that POSIX says that pthread_atfork() is rather
> useless - there's no way it can be reliably used to do everything that
> everyone wants (and I think this case of error-checking mutexes is just
> ONE of those reasons).
> > If
> > PTHREAD_MUTEX_ERRORCHECK cannot cooperate well with it, I'm not sure whether
> > that means we should just avoid using PTHREAD_MUTEX_ERRORCHECK in such
> > a use case (but we should be able to use the error checks in other
> > mutexes that do not need extra fork handling)?
> > Another idea: can we just destroy the mutex first and then re-init it
> > in the child process? A quick glance at the libpthread code shows that
> > at least pthread_mutex_destroy() won't check PTHREAD_MUTEX_ERRORCHECK.
> > Thanks,
> >> Eric, you chimed in on the patch that became commit 24fa90499f, what do
> >> you suggest?
> If, after forking, you can successfully destroy the mutex to then
> reinitialize it (even though you can't unlock it), then that sounds as
> good as anything I can come up with.
> An alternative approach might be to add a new mutex that anyone obtains
> just before forking; as long as you hold that mutex, you can then
> release any other mutex, fork, and then reobtain in the parent - but it
> still becomes tricky bookkeeping to know which locks need to be dropped
> and reobtained, and I worry that gating fork performance with such a
> heavy lock will have noticeable slowdowns.
It sounds like a MBQL (Much Bigger Qemu Lock :-).
I am wondering whether we can simplify this. After all, we have:
1. for "-daemonize", we need the pthread_atfork() thing to make sure
RCU works well even in the daemonized process
2. for all the rest of the fork cases, we assume that a quick exec()
will always follow, so there is no need to maintain memory consistency
Then, why don't we just initialize RCU after os_daemonize()? IIUC, we
can throw pthread_atfork() away then (considering that it is not even
officially recommended for use).
I guess it also depends on whether we'll need RCU before
os_daemonize(). I assume not, but I may be wrong.