[ https://issues.apache.org/jira/browse/BEAM-9049?focusedWorklogId=366145&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-366145 ]
ASF GitHub Bot logged work on BEAM-9049:
----------------------------------------
Author: ASF GitHub Bot
Created on: 04/Jan/20 01:15
Start Date: 04/Jan/20 01:15
Worklog Time Spent: 10m
Work Description: TheNeuralBit commented on issue #10499: [BEAM-9049] Add
opt-in --shutDownOnThrashing flag
URL: https://github.com/apache/beam/pull/10499#issuecomment-570743842
After investigating further I think this may not be the best solution.
In the streaming worker, we call `MemoryMonitor#waitForResources` before
grabbing a work item. This has the effect of reducing parallelism when there's
memory pressure. This is supposed to be the primary method of dealing with
thrashing, and shutting down the JVM when there's sustained thrashing is
intended as a last resort, when we are unable to process a single work item
without encountering thrashing.
The batch worker never calls `waitForResources`. I'm not sure whether this is
intentional or an oversight, but it seems the actual root of our problem is
that in batch we aren't using these two features together as they are
intended. Now I'm thinking we should either:
1) add a wait to the batch worker to reduce parallelism in the face of
memory pressure, or
2) disable the JVM shutdown on thrashing, but just for the batch worker.
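
To make the streaming behavior described above concrete, here is a minimal
sketch of a worker that waits on a memory monitor before grabbing a work item.
`SketchMemoryMonitor`, `SketchWorker`, `setUnderPressure`, `submit`, and
`processNext` are hypothetical names invented for illustration; this is not
the actual Beam `MemoryMonitor` or either Dataflow worker, only the general
shape of "block before taking work while there is memory pressure".

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical stand-in for a memory monitor; NOT Beam's MemoryMonitor. */
final class SketchMemoryMonitor {
  private boolean underPressure = false;

  /** Assumed hook: something else (e.g. a GC watcher) reports pressure here. */
  synchronized void setUnderPressure(boolean value) {
    underPressure = value;
    notifyAll(); // wake any worker threads blocked in waitForResources()
  }

  /** Blocks the calling thread for as long as memory pressure is reported. */
  synchronized void waitForResources() throws InterruptedException {
    while (underPressure) {
      wait();
    }
  }
}

/** Hypothetical worker: waits for resources, then takes one work item. */
final class SketchWorker {
  private final SketchMemoryMonitor monitor;
  private final Queue<Runnable> workItems = new ConcurrentLinkedQueue<>();

  SketchWorker(SketchMemoryMonitor monitor) {
    this.monitor = monitor;
  }

  void submit(Runnable item) {
    workItems.add(item);
  }

  /** Returns true if an item was processed; false if idle or interrupted. */
  boolean processNext() {
    try {
      // Threads parked here do not pick up new work, so the effective
      // parallelism drops while the monitor reports pressure.
      monitor.waitForResources();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
    Runnable item = workItems.poll();
    if (item == null) {
      return false;
    }
    item.run();
    return true;
  }
}
```

In this sketch, option 1 would correspond to adding the `waitForResources()`
call to the batch worker's equivalent of `processNext()`. Option 2 would leave
that loop alone and only turn off the shutdown path, which is not modeled here.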
Issue Time Tracking
-------------------
Worklog Id: (was: 366145)
Time Spent: 1h 20m (was: 1h 10m)
> Add knobs for dataflow worker MemoryMonitor
> -------------------------------------------
>
> Key: BEAM-9049
> URL: https://issues.apache.org/jira/browse/BEAM-9049
> Project: Beam
> Issue Type: Improvement
> Components: runner-dataflow
> Reporter: Brian Hulette
> Assignee: Brian Hulette
> Priority: Major
> Fix For: 2.19.0
>
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> MemoryMonitor has many knobs that are just hard-coded for now. We should make
> some (or all) of these knobs available via pipeline options, so that users
> have control over when workers will be killed due to thrashing.
> https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/util/MemoryMonitor.java#L123