Weston Pace created ARROW-12879:
-----------------------------------

             Summary: [C++] Thread pool leaks memory when forking (and could 
maybe deadlock) if threads exist at the time of fork
                 Key: ARROW-12879
                 URL: https://issues.apache.org/jira/browse/ARROW-12879
             Project: Apache Arrow
          Issue Type: Bug
          Components: C++
    Affects Versions: 4.0.0
            Reporter: Weston Pace


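While working on ARROW-12878 I made this leak more obvious.  After a fork, the 
child cannot delete any of the remaining std::thread objects, because the 
threads they refer to do not exist in the child.  In addition, the child cannot 
safely use any mutex that might have been held by one of those threads at the 
time of the fork.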

The existing implementation works around this by creating a new 
ThreadPool::State instance in the child.  However, shared_ptrs to the old 
instance are still held by the (now defunct) std::thread instances, so the old 
state object is never deleted (valgrind confirms this).
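
The reference-count side of this can be sketched without forking at all.  The 
State and WorkerClosure types below are hypothetical stand-ins, not Arrow's 
actual ThreadPool::State; the sketch only assumes that each worker closure 
holds its own shared_ptr to the state.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for the pool's shared state.
struct State {
  int pending_tasks = 0;
};

// Hypothetical stand-in for the closure owned by a worker std::thread.  In a
// forked child nothing ever runs or destroys these closures, so the
// reference they hold is never dropped.
struct WorkerClosure {
  std::shared_ptr<State> state;
};

// Returns how many references still pin the old State after the pool has
// switched to a fresh one.
long OldStateReferencesAfterReset(int num_workers) {
  assert(num_workers >= 0);
  auto pool_state = std::make_shared<State>();
  std::weak_ptr<State> old_state = pool_state;  // observes, does not own

  // Every worker closure keeps its own shared_ptr to the state.
  std::vector<WorkerClosure> workers(num_workers, WorkerClosure{pool_state});

  // The workaround: point the pool at a brand new State.
  pool_state = std::make_shared<State>();

  // The pool's own reference is gone, but one reference per worker remains.
  return old_state.use_count();
}
```

With four workers the function returns 4.  In this sketch the closures are 
destroyed when the function returns, so nothing actually leaks; in a forked 
child there is no thread left to do that, which is why the old state stays 
alive.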

Furthermore, if the fork happens while a thread pool task is running and holds 
a mutex (e.g. any of the ones used in the datasets API), that mutex will never 
be released in the child, and any attempt to lock it there will deadlock.
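
A small POSIX-only sketch of this hazard follows.  The function name is made 
up for illustration, and the child uses try_lock rather than lock so that the 
demonstration reports the problem instead of hanging.  It assumes a platform 
(such as Linux) where the child may inspect the inherited mutex this way.

```cpp
#include <sys/wait.h>
#include <unistd.h>

#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Returns true if a forked child sees the mutex as still locked, even though
// the thread that locked it does not exist in the child.
bool MutexStaysLockedInChild() {
  std::mutex mu;
  std::atomic<bool> locked{false};
  std::atomic<bool> release{false};

  // Stand-in for a pool task that holds a mutex while it runs.
  std::thread holder([&] {
    std::lock_guard<std::mutex> guard(mu);
    locked.store(true);
    while (!release.load()) std::this_thread::yield();
  });
  while (!locked.load()) std::this_thread::yield();

  pid_t pid = fork();
  assert(pid >= 0);  // this sketch assumes fork() itself succeeds
  if (pid == 0) {
    // Child: the holder thread was not copied, but the locked mutex was.
    // A blocking mu.lock() here would wait forever.
    _exit(mu.try_lock() ? 0 : 1);
  }

  int status = 0;
  waitpid(pid, &status, 0);

  // Parent: let the holder finish normally.
  release.store(true);
  holder.join();
  return WIFEXITED(status) && WEXITSTATUS(status) == 1;
}
```

The parent is unaffected, since its holder thread still exists and releases 
the mutex as usual.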

A more correct workaround would be to hook into pthread_atfork and shut down 
all threads before forking (there is no need to wait for all jobs to 
complete), then restart the threads in BOTH the child and the parent (today we 
restart only in the child and leave the parent's threads running).
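
A minimal sketch of that idea is below.  MiniPool, g_pool, and the handler 
names are hypothetical and are not Arrow's ThreadPool; only pthread_atfork, 
fork, and waitpid are real APIs.  It assumes the pool's workers are the only 
extra threads in the process, and it leaves open what should happen to tasks 
that are still queued at the time of the fork (here both processes inherit 
them).

```cpp
#include <pthread.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cassert>
#include <chrono>
#include <condition_variable>
#include <deque>
#include <functional>
#include <future>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical minimal pool, for illustration only.
class MiniPool {
 public:
  explicit MiniPool(int num_threads) : num_threads_(num_threads) { Start(); }
  ~MiniPool() { Stop(); }

  void Submit(std::function<void()> task) {
    { std::lock_guard<std::mutex> lock(mu_); tasks_.push_back(std::move(task)); }
    cv_.notify_one();
  }

  void Start() {
    { std::lock_guard<std::mutex> lock(mu_); stop_ = false; }
    for (int i = 0; i < num_threads_; ++i) {
      workers_.emplace_back([this] { WorkerLoop(); });
    }
  }

  // Joins every worker.  A task that is already running finishes first, but
  // tasks still in the queue are left there.
  void Stop() {
    { std::lock_guard<std::mutex> lock(mu_); stop_ = true; }
    cv_.notify_all();
    for (auto& worker : workers_) worker.join();
    workers_.clear();
  }

 private:
  void WorkerLoop() {
    std::unique_lock<std::mutex> lock(mu_);
    for (;;) {
      cv_.wait(lock, [this] { return stop_ || !tasks_.empty(); });
      if (stop_) return;
      std::function<void()> task = std::move(tasks_.front());
      tasks_.pop_front();
      lock.unlock();  // never hold the pool mutex while running a task
      task();
      lock.lock();
    }
  }

  const int num_threads_;
  std::mutex mu_;
  std::condition_variable cv_;
  std::deque<std::function<void()>> tasks_;
  std::vector<std::thread> workers_;
  bool stop_ = false;
};

// Handlers cannot be unregistered, so the pool must outlive any later fork().
MiniPool* g_pool = nullptr;
void BeforeFork() { if (g_pool != nullptr) g_pool->Stop(); }
void AfterFork() { if (g_pool != nullptr) g_pool->Start(); }

// Call once: stop before fork(), restart in both the parent and the child.
void InstallForkHandlers(MiniPool* pool) {
  g_pool = pool;
  int rc = pthread_atfork(BeforeFork, AfterFork, AfterFork);
  assert(rc == 0);
  (void)rc;
}

// Forks, runs one task on the child's restarted pool, and returns the
// child's exit status (0 if the task ran).
int ForkAndRunTaskInChild(MiniPool& pool) {
  pid_t pid = fork();  // runs BeforeFork, then AfterFork on each side
  if (pid < 0) return -1;
  if (pid == 0) {
    std::promise<void> done;
    std::future<void> ready = done.get_future();
    pool.Submit([&done] { done.set_value(); });
    bool ok = ready.wait_for(std::chrono::seconds(5)) == std::future_status::ready;
    _exit(ok ? 0 : 1);
  }
  int status = 0;
  waitpid(pid, &status, 0);
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because every worker is joined before the fork, no worker can be holding the 
pool's mutex when the process is copied, and both sides start again from 
fresh std::thread objects.  Note that POSIX only guarantees async-signal-safe 
calls in the child of a multi-threaded process, so restarting threads from 
the child handler relies on the process being effectively single-threaded at 
the moment of the fork.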



