[
https://issues.apache.org/jira/browse/BEAM-7109?focusedWorklogId=230326&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230326
]
ASF GitHub Bot logged work on BEAM-7109:
----------------------------------------
Author: ASF GitHub Bot
Created on: 19/Apr/19 23:59
Start Date: 19/Apr/19 23:59
Worklog Time Spent: 10m
Work Description: angoenka commented on pull request #8367: [BEAM-7109]
Do not reconnect logging at termination
URL: https://github.com/apache/beam/pull/8367
Indefinite attempts to reconnect the logging client cause the SDKHarness to
get stuck during termination and eventually cause a thread leak, as each
reconnect attempt creates some threads on the gRPC side.
This PR deletes the logging stub before reconnecting so that gRPC can clean
up its state.
The PR also introduces an _alive flag, which is used to determine whether the
logging client should reconnect on disconnect.
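
The reconnect guard described above can be illustrated with a minimal sketch.
This is not the code from the PR: `ReconnectingClient`, `connect`,
`retry_delay` and `always_fail` are hypothetical names, and a plain callable
stands in for the gRPC logging stub. Only the `_alive` flag and the idea of
dropping the stub before reconnecting come from the description.

```python
import threading
import time


class ReconnectingClient:
    """Hypothetical stand-in for a logging client; not Beam's actual class."""

    def __init__(self, connect, retry_delay=0.01):
        self._connect = connect          # callable that returns a stub or raises
        self._retry_delay = retry_delay
        self._alive = True               # cleared by close() to stop reconnecting
        self._stub = None
        self.attempts = 0
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while self._alive:
            self.attempts += 1
            try:
                self._stub = self._connect()
                self._stub()             # stand-in for streaming; returns on disconnect
            except ConnectionError:
                pass                     # treat a connection failure as a disconnect
            finally:
                self._stub = None        # drop the old stub before reconnecting
            if self._alive:
                time.sleep(self._retry_delay)

    def close(self):
        """Stop reconnecting and wait for the worker thread to exit."""
        self._alive = False
        self._thread.join(timeout=1.0)
        return not self._thread.is_alive()


def always_fail():
    # Simulates a logging endpoint that is already gone at termination.
    raise ConnectionError("endpoint unavailable")


client = ReconnectingClient(always_fail)
time.sleep(0.05)                         # let a few reconnect attempts happen
stopped = client.close()
attempts_at_close = client.attempts
time.sleep(0.05)                         # no further attempts should occur
```

Without the `_alive` check the loop would keep retrying after `close()`, which
is the kind of indefinite reconnect the description reports as blocking
termination.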
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 230326)
Time Spent: 10m
Remaining Estimate: 0h
> Thread leaking in Portable Python Precommit
> --------------------------------------------
>
> Key: BEAM-7109
> URL: https://issues.apache.org/jira/browse/BEAM-7109
> Project: Beam
> Issue Type: Bug
> Components: testing
> Reporter: yifan zou
> Assignee: Ankur Goenka
> Priority: Critical
> Attachments: threadDump.txt, thread_dump.txt
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Beam Jenkins builds constantly break due to errors such as "Unable to
> create new native thread". The recent build worker failure happened on
> [apache-beam-jenkins-8]
> ([https://builds.apache.org/computer/apache-beam-jenkins-8/builds]). Checking
> the thread count on that VM shows:
> Thread limit: kernel.pid_max = 32768
> Actual used: 32411
>
> Dumping the thread usage (see [^threadDump.txt]) exposed thread leaks in
> some Python tests. Based on the execution history of jenkins-8, the
> [beam_PreCommit_Portable_Python_Commit]
> ([https://builds.apache.org/job/beam_PreCommit_Portable_Python_Commit]) job is
> suspicious. We ran this test multiple times on a plain node and observed that
> some threads started by +_apache_beam.runners.worker.sdk_worker_main_+ were
> not torn down after the tests completed. The stale threads accumulated and
> eventually exhausted the VM's kernel thread quota.
>
> cc: [~alanmyrvold], [~jasonkuster], [~altay]
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)