Re: [Qemu-devel] [PATCH for-2.11] rcu: init globals only once
On Tue, Aug 08, 2017 at 09:09:23AM -0500, Eric Blake wrote: > On 08/08/2017 02:49 AM, Peter Xu wrote: > >> This doesn't work for error-checking mutexes: rcu_init_child has a > >> different PID than the parent, so the mutexes aren't unlocked. It's > >> also true that right now we don't use error-checking mutexes (commit > >> 24fa90499f, "qemu-thread: do not use PTHREAD_MUTEX_ERRORCHECK", > >> 2015-03-10); however, that's also a bit sad. > >> > >> The reason for the undefined behavior is probably that some operating > >> systems allocate memory in pthread_mutex_init, and initializing twice > >> causes a memory leak. One such operating system is OpenBSD. :( > > > > Good to know. :) > > > > I thought pthread_atfork() was designed to solve such a locking > > problem (in child hanlder, we unlock all the held locks). > > What's also sad is that POSIX says that pthread_atfork() is rather > useless - there's no way it can be reliably used to do everything that > everyone wants (and I think this case of error-checking mutexes is just > ONE of those reasons). > > > If > > PTHREAD_MUTEX_ERRORCHECK cannot coop well with it, not sure whether > > that means we should just avoid using PTHREAD_MUTEX_ERRORCHECK in such > > a use case (but we should be able to use the error checks in other > > mutexes that do not need extra fork handling)? > > > > Another idea is: can we just destroy the mutex first then re-init it > > in subprocess? A quick glance in libpthread code shows that at least > > pthread_mutex_destroy() won't check PTHREAD_MUTEX_ERRORCHECK. > > > > Thanks, > > > >> > >> Eric, you chimed in on the patch that became commit 24fa90499f, what do > >> you suggest? > > If, after forking, you can successfully destroy the mutex to then > reinitialize it (even though you can't unlock it), then that sounds as > good as anything I can come up with. > > An alternative approach might be to add a new mutex that anyone obtains > just before forking; as long as you hold that mutex, you can then > release any other mutex, fork, and then reobtain in the parent - but it > still becomes tricky bookkeeping to know which locks need to be dropped > and reobtained, and I worry that gating fork performance with such a > heavy lock will have noticeable slowdowns. It sounds like a MBQL (Much Bigger Qemu Lock :-). I am thinking whether we can simplify this. After all, we have existing assumptions: 1. for "-daemonize", we need the pthread_atfork() thing to make sure RCU works well even in the daemonized process 2. for all the rest of the fork cases, we assume that it will always be a quick exec() afterward, so need to maintain memory consistency Then, why not we just initialize RCU after os_daemonize()? IIUC, we can throw pthread_atfork() away then (considering that even it is not suggested to use "officially"). I guess it also depends on whether we'll need RCU before os_daemonize(). I assume not, but I may be wrong. Thanks, -- Peter Xu
Re: [Qemu-devel] [PATCH for-2.11] rcu: init globals only once
On 08/08/2017 02:49 AM, Peter Xu wrote: >> This doesn't work for error-checking mutexes: rcu_init_child has a >> different PID than the parent, so the mutexes aren't unlocked. It's >> also true that right now we don't use error-checking mutexes (commit >> 24fa90499f, "qemu-thread: do not use PTHREAD_MUTEX_ERRORCHECK", >> 2015-03-10); however, that's also a bit sad. >> >> The reason for the undefined behavior is probably that some operating >> systems allocate memory in pthread_mutex_init, and initializing twice >> causes a memory leak. One such operating system is OpenBSD. :( > > Good to know. :) > > I thought pthread_atfork() was designed to solve such a locking > problem (in child hanlder, we unlock all the held locks). What's also sad is that POSIX says that pthread_atfork() is rather useless - there's no way it can be reliably used to do everything that everyone wants (and I think this case of error-checking mutexes is just ONE of those reasons). > If > PTHREAD_MUTEX_ERRORCHECK cannot coop well with it, not sure whether > that means we should just avoid using PTHREAD_MUTEX_ERRORCHECK in such > a use case (but we should be able to use the error checks in other > mutexes that do not need extra fork handling)? > > Another idea is: can we just destroy the mutex first then re-init it > in subprocess? A quick glance in libpthread code shows that at least > pthread_mutex_destroy() won't check PTHREAD_MUTEX_ERRORCHECK. > > Thanks, > >> >> Eric, you chimed in on the patch that became commit 24fa90499f, what do >> you suggest? If, after forking, you can successfully destroy the mutex to then reinitialize it (even though you can't unlock it), then that sounds as good as anything I can come up with. An alternative approach might be to add a new mutex that anyone obtains just before forking; as long as you hold that mutex, you can then release any other mutex, fork, and then reobtain in the parent - but it still becomes tricky bookkeeping to know which locks need to be dropped and reobtained, and I worry that gating fork performance with such a heavy lock will have noticeable slowdowns. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH for-2.11] rcu: init globals only once
On 08/08/2017 09:49, Peter Xu wrote: > On Tue, Aug 08, 2017 at 09:26:43AM +0200, Paolo Bonzini wrote: >> On 08/08/2017 09:00, Peter Xu wrote: >>> We were calling rcu_init_complete() twice in the child processes when >>> fork happened. However the pthread library does not really suggest to do >>> it that way: >>> >>> http://pubs.opengroup.org/onlinepubs/7908799/xsh/pthread_mutex_init.html >>> >>> "Attempting to initialise an already initialised mutex results in >>> undefined behaviour." >>> >>> Actually, IMHO we can do it in a more natural way: Firstly, we only init >>> the RCU globals once in rcu_init(). Then, in rcu_init_child(), we unlock >>> all the locks held in rcu_init_lock() just like what we do in the parent >>> process, then do the rest of RCU re-init (e.g., create the RCU thread). >> >> This doesn't work for error-checking mutexes: rcu_init_child has a >> different PID than the parent, so the mutexes aren't unlocked. It's >> also true that right now we don't use error-checking mutexes (commit >> 24fa90499f, "qemu-thread: do not use PTHREAD_MUTEX_ERRORCHECK", >> 2015-03-10); however, that's also a bit sad. >> >> The reason for the undefined behavior is probably that some operating >> systems allocate memory in pthread_mutex_init, and initializing twice >> causes a memory leak. One such operating system is OpenBSD. :( > > Good to know. :) > > I thought pthread_atfork() was designed to solve such a locking > problem (in child hanlder, we unlock all the held locks). If > PTHREAD_MUTEX_ERRORCHECK cannot coop well with it, not sure whether > that means we should just avoid using PTHREAD_MUTEX_ERRORCHECK in such > a use case (but we should be able to use the error checks in other > mutexes that do not need extra fork handling)? > > Another idea is: can we just destroy the mutex first then re-init it > in subprocess? A quick glance in libpthread code shows that at least > pthread_mutex_destroy() won't check PTHREAD_MUTEX_ERRORCHECK. Destroying a mutex that is locked is also undefined behavior, even if it has no waiters. So it's a nice catch-22: we cannot destroy before unlocking, and we cannot unlock an error-checking mutex, so we cannot destroy before init. Actually, unlocking in atfork's child callback is interesting even for non-error-checking mutexes. If there are waiters, waking up the first one in pthread_mutex_unlock fails. This is okay if (as on Linux) pthread_mutex_lock does none of the woken-up process's work. However, on OpenBSD pthread_mutex_lock sets mutex->owner... but in the child there's no other thread to wake up and thus no one will set mutex->owner to NULL. This should lead to deadlock scenarios with pthread_atfork. So pthread_atfork is impossible to use correctly with mutexes, and reinitializing is the best we can do (the memory leak being a lesser evil). The workaround would be to replace mutexes with semaphores, and to implement QemuEvent natively for OpenBSD using its __thrsleep and __thrwakeup system calls (no, I am not going to do that :)). Let's wait and hear what Eric has to say. Paolo > Thanks, > >> >> Eric, you chimed in on the patch that became commit 24fa90499f, what do >> you suggest? >> >> Paolo >
Re: [Qemu-devel] [PATCH for-2.11] rcu: init globals only once
On Tue, Aug 08, 2017 at 09:26:43AM +0200, Paolo Bonzini wrote: > On 08/08/2017 09:00, Peter Xu wrote: > > We were calling rcu_init_complete() twice in the child processes when > > fork happened. However the pthread library does not really suggest to do > > it that way: > > > > http://pubs.opengroup.org/onlinepubs/7908799/xsh/pthread_mutex_init.html > > > > "Attempting to initialise an already initialised mutex results in > > undefined behaviour." > > > > Actually, IMHO we can do it in a more natural way: Firstly, we only init > > the RCU globals once in rcu_init(). Then, in rcu_init_child(), we unlock > > all the locks held in rcu_init_lock() just like what we do in the parent > > process, then do the rest of RCU re-init (e.g., create the RCU thread). > > This doesn't work for error-checking mutexes: rcu_init_child has a > different PID than the parent, so the mutexes aren't unlocked. It's > also true that right now we don't use error-checking mutexes (commit > 24fa90499f, "qemu-thread: do not use PTHREAD_MUTEX_ERRORCHECK", > 2015-03-10); however, that's also a bit sad. > > The reason for the undefined behavior is probably that some operating > systems allocate memory in pthread_mutex_init, and initializing twice > causes a memory leak. One such operating system is OpenBSD. :( Good to know. :) I thought pthread_atfork() was designed to solve such a locking problem (in child hanlder, we unlock all the held locks). If PTHREAD_MUTEX_ERRORCHECK cannot coop well with it, not sure whether that means we should just avoid using PTHREAD_MUTEX_ERRORCHECK in such a use case (but we should be able to use the error checks in other mutexes that do not need extra fork handling)? Another idea is: can we just destroy the mutex first then re-init it in subprocess? A quick glance in libpthread code shows that at least pthread_mutex_destroy() won't check PTHREAD_MUTEX_ERRORCHECK. Thanks, > > Eric, you chimed in on the patch that became commit 24fa90499f, what do > you suggest? > > Paolo -- Peter Xu
Re: [Qemu-devel] [PATCH for-2.11] rcu: init globals only once
On 08/08/2017 09:00, Peter Xu wrote: > We were calling rcu_init_complete() twice in the child processes when > fork happened. However the pthread library does not really suggest to do > it that way: > > http://pubs.opengroup.org/onlinepubs/7908799/xsh/pthread_mutex_init.html > > "Attempting to initialise an already initialised mutex results in > undefined behaviour." > > Actually, IMHO we can do it in a more natural way: Firstly, we only init > the RCU globals once in rcu_init(). Then, in rcu_init_child(), we unlock > all the locks held in rcu_init_lock() just like what we do in the parent > process, then do the rest of RCU re-init (e.g., create the RCU thread). This doesn't work for error-checking mutexes: rcu_init_child has a different PID than the parent, so the mutexes aren't unlocked. It's also true that right now we don't use error-checking mutexes (commit 24fa90499f, "qemu-thread: do not use PTHREAD_MUTEX_ERRORCHECK", 2015-03-10); however, that's also a bit sad. The reason for the undefined behavior is probably that some operating systems allocate memory in pthread_mutex_init, and initializing twice causes a memory leak. One such operating system is OpenBSD. :( Eric, you chimed in on the patch that became commit 24fa90499f, what do you suggest? Paolo > CC: Paolo Bonzini > Signed-off-by: Peter Xu > --- > this is based on Paolo's series: > "[PATCH for-2.10 0/2] RCU: forking fix and cleanups" > --- > util/rcu.c | 18 ++ > 1 file changed, 14 insertions(+), 4 deletions(-) > > diff --git a/util/rcu.c b/util/rcu.c > index ca5a63e..6fbbe4c 100644 > --- a/util/rcu.c > +++ b/util/rcu.c > @@ -299,15 +299,17 @@ void rcu_unregister_thread(void) > qemu_mutex_unlock(&rcu_registry_lock); > } > > -static void rcu_init_complete(void) > +static void rcu_init_globals(void) > { > -QemuThread thread; > - > qemu_mutex_init(&rcu_registry_lock); > qemu_mutex_init(&rcu_sync_lock); > qemu_event_init(&rcu_gp_event, true); > - > qemu_event_init(&rcu_call_ready_event, false); > +} > + > +static void rcu_init_complete(void) > +{ > +QemuThread thread; > > /* The caller is assumed to have iothread lock, so the call_rcu thread > * must have been quiescent even after forking, just recreate it. > @@ -357,6 +359,13 @@ static void rcu_init_child(void) > return; > } > > +rcu_init_unlock(); > + > +/* > + * For the newly forked child, we need something extra: since > + * after fork the threads are all gone, we need to re-init the RCU > + * thread, along with the globals. > + */ > memset(®istry, 0, sizeof(registry)); > rcu_init_complete(); > } > @@ -367,5 +376,6 @@ static void __attribute__((__constructor__)) > rcu_init(void) > #ifdef CONFIG_POSIX > pthread_atfork(rcu_init_lock, rcu_init_unlock, rcu_init_child); > #endif > +rcu_init_globals(); > rcu_init_complete(); > } >
[Qemu-devel] [PATCH for-2.11] rcu: init globals only once
We were calling rcu_init_complete() twice in the child processes when fork happened. However the pthread library does not really suggest to do it that way: http://pubs.opengroup.org/onlinepubs/7908799/xsh/pthread_mutex_init.html "Attempting to initialise an already initialised mutex results in undefined behaviour." Actually, IMHO we can do it in a more natural way: Firstly, we only init the RCU globals once in rcu_init(). Then, in rcu_init_child(), we unlock all the locks held in rcu_init_lock() just like what we do in the parent process, then do the rest of RCU re-init (e.g., create the RCU thread). CC: Paolo Bonzini Signed-off-by: Peter Xu --- this is based on Paolo's series: "[PATCH for-2.10 0/2] RCU: forking fix and cleanups" --- util/rcu.c | 18 ++ 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/util/rcu.c b/util/rcu.c index ca5a63e..6fbbe4c 100644 --- a/util/rcu.c +++ b/util/rcu.c @@ -299,15 +299,17 @@ void rcu_unregister_thread(void) qemu_mutex_unlock(&rcu_registry_lock); } -static void rcu_init_complete(void) +static void rcu_init_globals(void) { -QemuThread thread; - qemu_mutex_init(&rcu_registry_lock); qemu_mutex_init(&rcu_sync_lock); qemu_event_init(&rcu_gp_event, true); - qemu_event_init(&rcu_call_ready_event, false); +} + +static void rcu_init_complete(void) +{ +QemuThread thread; /* The caller is assumed to have iothread lock, so the call_rcu thread * must have been quiescent even after forking, just recreate it. @@ -357,6 +359,13 @@ static void rcu_init_child(void) return; } +rcu_init_unlock(); + +/* + * For the newly forked child, we need something extra: since + * after fork the threads are all gone, we need to re-init the RCU + * thread, along with the globals. + */ memset(®istry, 0, sizeof(registry)); rcu_init_complete(); } @@ -367,5 +376,6 @@ static void __attribute__((__constructor__)) rcu_init(void) #ifdef CONFIG_POSIX pthread_atfork(rcu_init_lock, rcu_init_unlock, rcu_init_child); #endif +rcu_init_globals(); rcu_init_complete(); } -- 2.7.4