[ 
https://issues.apache.org/jira/browse/BEAM-7412?focusedWorklogId=250730&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-250730
 ]

ASF GitHub Bot logged work on BEAM-7412:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 30/May/19 05:50
            Start Date: 30/May/19 05:50
    Worklog Time Spent: 10m 
      Work Description: ibzib commented on pull request #8722: [BEAM-7412] shut 
down executor service in fn harness
URL: https://github.com/apache/beam/pull/8722
 
 
   I was somewhat surprised to find that the unbounded thread growth I 
encountered when running portable validates runner tests on local Spark is was 
actually caused by 
[`GcsOptions.java`](https://github.com/apache/beam/blob/39dab79b7bd56a47c0f2a02033ab4ca3bd4a67e2/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/options/GcsOptions.java#L150-L156).
 I'm guessing this hasn't caused problems before because the SDK harness was 
always run in a separately managed process/container/whatever that terminated 
when work was complete.
   
   Incidentally, this was a pain to track down because thousands of threads 
used for any number of purposes were all uniformly named `pool-N-thread-M` (as 
they all used Java's 
[`defaultThreadFactory`](https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/Executors.html#defaultThreadFactory--)).
 Going forward, I might look into naming Beam threads so it's easier to debug 
and analyze performance.
   
   ------------------------
   
   R: @angoenka 
   
   Post-Commit Tests Status (on master branch)
   
------------------------------------------------------------------------------------------------
   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | --- | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)<br>[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)<br>[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)<br>[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)
 <br> [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/)
 | --- | --- | ---
   
   Pre-Commit Tests Status (on master branch)
   
------------------------------------------------------------------------------------------------
   
   --- |Java | Python | Go | Website
   --- | --- | --- | --- | ---
   Non-portable | [![Build 
Status](https://builds.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PreCommit_Go_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Go_Cron/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PreCommit_Website_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Website_Cron/lastCompletedBuild/)
 
   Portable | --- | [![Build 
Status](https://builds.apache.org/job/beam_PreCommit_Portable_Python_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Portable_Python_Cron/lastCompletedBuild/)
 | --- | ---
   
   See 
[.test-infra/jenkins/README](https://github.com/apache/beam/blob/master/.test-infra/jenkins/README.md)
 for trigger phrase, status and link of all Jenkins jobs.
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

            Worklog Id:     (was: 250730)
            Time Spent: 10m
    Remaining Estimate: 0h

> portable spark: thread/memory leak in local mode
> ------------------------------------------------
>
>                 Key: BEAM-7412
>                 URL: https://issues.apache.org/jira/browse/BEAM-7412
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-spark
>            Reporter: Kyle Weaver
>            Assignee: Kyle Weaver
>            Priority: Major
>         Attachments: Screenshot from 2019-05-20 14-26-03.png
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> When running validatesPortableRunnerBatch test on local mode, the portable 
> Spark runner creates ~18k threads before becoming unable to create any more 
> threads, at which point it crashes. This does not happen on standalone 
> cluster mode, where ephemeral executors (independent JVMs) are used and then 
> shut down, preventing whatever leakage is occurring from becoming too much of 
> a problem.
> Sample stack trace:
> {{"pool-3965-thread-3" #16059 daemon prio=5 os_prio=0 tid=0x00007f9f85a81000 
> nid=0x32a0 waiting on condition [0x00007f9f8b9f1000]}}
> {{ java.lang.Thread.State: TIMED_WAITING (parking)}}
> {{ at (C/C++) 0x00007fa2f32c2dae (Unknown Source)}}
> {{ at (C/C++) 0x00007fa2f2851351 (Unknown Source)}}
> {{ at sun.misc.Unsafe.park(Native Method)}}
> {{ - parking to wait for <0x0000000727bda848> (a 
> java.util.concurrent.SynchronousQueue$TransferStack)}}
> {{ at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)}}
> {{ at 
> java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)}}
> {{ at 
> java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)}}
> {{ at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)}}
> {{ at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)}}
> {{ at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)}}
> {{ at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)}}
> {{ at java.lang.Thread.run(Thread.java:748)}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to