[ 
https://issues.apache.org/jira/browse/BEAM-7109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16822256#comment-16822256
 ] 

Ankur Goenka commented on BEAM-7109:
------------------------------------

Thanks [~yifanzou] for the detailed instructions.

The issue seems to be that the logging client does not terminate and tries to 
reconnect indefinitely. As a result, the SDKHarness stays alive and keeps 
creating new threads, and eventually we run out of threads.

Attached is the thread dump ([^thread_dump.txt]) of such a process.
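
For illustration only, here is a minimal Python sketch of the failure mode 
described above. It is not the Beam code and not the upcoming fix. 
ReconnectingClient, always_fails and retry_delay_s are hypothetical names, and 
the sketch assumes the retry loop runs in its own thread. The point is that 
the loop needs a stop signal, otherwise the thread never exits.

```python
import threading


class ReconnectingClient:
    """Hypothetical stand-in for a client that retries a failing connection."""

    def __init__(self, connect, retry_delay_s=0.01):
        self._connect = connect              # callable; raises ConnectionError on failure
        self._retry_delay_s = retry_delay_s
        self._stop = threading.Event()       # set by close() to end the retry loop
        self.attempts = 0
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def _run(self):
        # Without the stop check, this loop would retry forever and its
        # thread would stay alive for as long as the process runs.
        while not self._stop.is_set():
            self.attempts += 1
            try:
                self._connect()
                return
            except ConnectionError:
                # wait() returns early once close() sets the event.
                self._stop.wait(self._retry_delay_s)

    def close(self, timeout_s=1.0):
        """Signal the loop to stop; return True if the thread has exited."""
        self._stop.set()
        self._thread.join(timeout_s)
        return not self._thread.is_alive()


def always_fails():
    # Simulates an endpoint that has gone away for good.
    raise ConnectionError("endpoint is gone")
```

With this shape, calling close() on shutdown lets the retry thread exit even 
though the endpoint never comes back.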

I will be sending a PR to solve this shortly.

> Thread leaking in Portable Python Precommit 
> --------------------------------------------
>
>                 Key: BEAM-7109
>                 URL: https://issues.apache.org/jira/browse/BEAM-7109
>             Project: Beam
>          Issue Type: Bug
>          Components: testing
>            Reporter: yifan zou
>            Assignee: Ankur Goenka
>            Priority: Critical
>         Attachments: threadDump.txt, thread_dump.txt
>
>
> Beam Jenkins builds constantly break due to errors such as "Unable to 
> create new native thread". The recent build worker failure happened on 
> [apache-beam-jenkins-8] 
> ([https://builds.apache.org/computer/apache-beam-jenkins-8/builds]). Checking 
> the thread number on that VM shows: 
> Thread limit: kernel.pid_max = 32768 
> Actual used: 32411
>  
> Dumping the thread usage (see [^threadDump.txt]) exposed thread leaks in 
> some Python tests. Based on the execution history of jenkins-8, the 
> [beam_PreCommit_Portable_Python_Commit] 
> ([https://builds.apache.org/job/beam_PreCommit_Portable_Python_Commit]) job is 
> suspect. We ran this test multiple times on a plain node and observed that 
> some threads started by +_apache_beam.runners.worker.sdk_worker_main_+ were 
> not torn down after the tests completed. The stale threads accumulated until 
> they exhausted the VM's kernel thread quota. 
>  
> cc: [~alanmyrvold], [~jasonkuster], [~altay]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
