[ 
https://issues.apache.org/jira/browse/ARROW-12879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17351610#comment-17351610
 ] 

Antoine Pitrou commented on ARROW-12879:
----------------------------------------

Leaking memory when forking a process with threads is an unavoidable fact of 
life (the dead threads will still hold on to unreleased memory, for example 
through shared_ptrs held in local frames of execution). I'm not sure there's 
any point in trying to solve this. If you fork a process with threads, the only 
reasonable thing you can do in the child is spawn another executable (using 
e.g. exec()).
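
As a minimal sketch of the fork-then-exec pattern described above (the helper 
name RunInFreshProcess is hypothetical and is not part of Arrow), the child 
does nothing except replace itself with a new executable:

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: run the executable at `path` in a child process and
// return its exit status, or -1 if the fork/wait failed or the child did not
// exit normally.
int RunInFreshProcess(const char* path, char* const argv[]) {
  pid_t pid = fork();
  if (pid < 0) return -1;  // fork failed
  if (pid == 0) {
    // Child: touch no inherited state (locks, thread pools, heap objects);
    // immediately replace the process image.
    execv(path, argv);
    _exit(127);  // reached only if execv failed
  }
  // Parent: wait for the child and report how it exited.
  int status = 0;
  if (waitpid(pid, &status, 0) < 0) return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the child calls only execv() and _exit(), it never touches memory or 
mutexes that were owned by threads which no longer exist after the fork.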

> [C++] Thread pool leaks memory when forking (and could maybe deadlock) if 
> threads exist at the time of fork
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-12879
>                 URL: https://issues.apache.org/jira/browse/ARROW-12879
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++
>    Affects Versions: 4.0.0
>            Reporter: Weston Pace
>            Priority: Major
>
> While working on ARROW-12878 I have made the leak more obvious.  When we fork 
> we cannot delete any remaining std::thread.  In addition, we cannot safely 
> use any mutexes that might have been claimed by child threads.
>  
> The existing implementation works around this by creating a new 
> ThreadPool::State instance.  However, shared_ptrs to the old instance are 
> still held by (now defunct) std::thread instances and so the state object 
> will never be deleted (valgrind confirms this).
>  
> Furthermore, if the fork were to happen while a thread task was running and 
> had captured some mutex (e.g. any of the ones used in the datasets API) then 
> that mutex will never be released.
>  
> A more correct workaround would be to hook into pthread_atfork, shut down 
> all threads before forking (without waiting for all jobs to complete), and 
> then restart the threads in BOTH the child and the parent (today we restart 
> only in the child and leave the parent running).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
