On 18 October 2012 16:01, Paolo Bonzini <pbonz...@redhat.com> wrote: > Il 18/10/2012 16:28, Peter Maydell ha scritto: >> The function vnc_stop_worker_thread() is buggy, beacuse it tries to >> delete jobs from the worker thread's queue but the worker thread itself >> will not cope with this happening (it would end up trying to remove >> an already-removed list item from its queue list). Fortunately >> nobody ever calls vnc_stop_worker_thread(), so we can fix this by >> simply deleting all the untested racy code. > > Note that there is just one queue. The queue global == the arg argument > of vnc_worker_thread == the queue argument of vnc_worker_thread_loop. > So I'm not sure I follow your reasoning.
vnc_worker_thread_loop does this: lock queue get pointer to first job in queue unlock queue do stuff with job lock queue QTAILQ_REMOVE(&queue->jobs, job, next) unlock queue g_free(job) vnc_jobs_clear does this: lock queue QTAILQ_REMOVE each job from the queue unlock queue So two problems: (1) any job removed by vnc_jobs_clear is never g_free()d (2) if vnc_jobs_clear removes job X from the queue while the worker loop is in its "do stuff with job" phase, then the worker loop will subsequently try to QTAILQ_REMOVE an object from a list it is not in, which will probably crash or otherwise misbehave Here's a third which I just noticed: (3) vnc_stop_worker thread sets queue->exit to true, and then does a number of things with 'queue'. However, as soon as you've set queue->exit you can't touch 'queue' again, because the worker thread might (if you were unlucky) be just about to do the 'if (queue->exit) { return -1; }', which will cause it to destroy the condvar and muteux and free the queue memory. In particular, even doing the 'vnc_unlock_queue()' in vnc_stop_worker_thread() is unsafe, because the worker thread does its exit-check without the lock held, so it could destroy the mutex before you unlocked it. > So the bug may be that we never call vnc_stop_worker_thread from > vnc_disconnect_finish. BTW vnc_jobs_join is called there so we could > just assert that the queue is empty... Yes, actually trying to find somewhere to make a clean shutdown of the worker thread and fixing all the bugs in the shutdown code would be the other approach. That seems like hard work to me :-) -- PMM