[
https://issues.apache.org/jira/browse/BEAM-9550?focusedWorklogId=413105&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-413105
]
ASF GitHub Bot logged work on BEAM-9550:
----------------------------------------
Author: ASF GitHub Bot
Created on: 31/Mar/20 11:55
Start Date: 31/Mar/20 11:55
Worklog Time Spent: 10m
Work Description: kamilwu commented on issue #11193: [BEAM-9550] Increase
JVM Metaspace size for the TaskExecutors.
URL: https://github.com/apache/beam/pull/11193#issuecomment-606581424
@mxm I checked two tests from the [GBK suite](https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_LoadTests_GBK_Flink_Python.groovy)
and one test from the [coGBK suite](https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_LoadTests_coGBK_Flink_Python.groovy).
Each of them got stuck at some point. It is highly likely that every test
in these suites is affected, because the input size is the same in all of
them.
The problem may be connected with a lack of Managed Memory. After I
increased `memory.managed.fraction` by `0.1`, the first two tests from the
GBK suite passed, but the pipeline got stuck on the third test.
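For anyone who wants to reproduce the change, here is a minimal sketch of raising the managed memory fraction in a `flink-conf.yaml` file. It rests on assumptions not stated in the comment above: that `memory.managed.fraction` is shorthand for the Flink option `taskmanager.memory.managed.fraction`, and that Flink's documented default of `0.4` applies when the key is not set. The helper name `bump_managed_fraction` is hypothetical and is not part of Beam or Flink.

```python
from decimal import Decimal

# Assumption: the comment's `memory.managed.fraction` refers to this Flink
# option; the comment itself does not give the full key.
MANAGED_FRACTION_KEY = "taskmanager.memory.managed.fraction"
# Assumption: Flink's documented default is used when the key is not set.
DEFAULT_MANAGED_FRACTION = Decimal("0.4")


def bump_managed_fraction(conf_text, delta="0.1"):
    """Return flink-conf.yaml text with the managed fraction raised by delta.

    Hypothetical helper: handles plain `key: value` lines only, without
    inline comments after the value.
    """
    lines = conf_text.splitlines()
    current = DEFAULT_MANAGED_FRACTION
    index = None
    for i, line in enumerate(lines):
        key, sep, value = line.partition(":")
        if sep and key.strip() == MANAGED_FRACTION_KEY:
            current = Decimal(value.strip())
            index = i
    new_value = current + Decimal(delta)
    # Sanity check: a fraction outside (0, 1) cannot be a valid share of memory.
    if not Decimal("0") < new_value < Decimal("1"):
        raise ValueError("managed fraction must be between 0 and 1, got %s" % new_value)
    entry = "%s: %s" % (MANAGED_FRACTION_KEY, new_value)
    if index is None:
        lines.append(entry)  # key was not set, so add it explicitly
    else:
        lines[index] = entry  # replace the existing setting in place
    return "\n".join(lines) + "\n"
```

Decimal arithmetic is used so that the written value stays exact (`0.5` rather than a binary floating-point approximation). Raising the fraction leaves less room for the other memory pools in the TaskManager process, so the change is best verified against the failing load tests rather than assumed safe.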
Issue Time Tracking
-------------------
Worklog Id: (was: 413105)
Time Spent: 6h 50m (was: 6h 40m)
> beam_PostCommit_Python_Chicago_Taxi_Flink OOM
> ---------------------------------------------
>
> Key: BEAM-9550
> URL: https://issues.apache.org/jira/browse/BEAM-9550
> Project: Beam
> Issue Type: Bug
> Components: runner-flink, test-failures
> Reporter: Kyle Weaver
> Assignee: Kamil Wasilewski
> Priority: Major
> Labels: currently-failing
> Time Spent: 6h 50m
> Remaining Estimate: 0h
>
> https://builds.apache.org/job/beam_PostCommit_Python_Chicago_Taxi_Flink/
> The following error has been occurring consistently for several days:
> 07:57:26 ERROR:root:java.lang.OutOfMemoryError: Metaspace
> 07:57:27 Traceback (most recent call last):
> 07:57:27 File "tfdv_analyze_and_validate.py", line 227, in <module>
> 07:57:27 main()
> 07:57:27 File "tfdv_analyze_and_validate.py", line 212, in main
> 07:57:27 project=known_args.metric_reporting_project)
> 07:57:27 File "tfdv_analyze_and_validate.py", line 132, in compute_stats
> 07:57:27 result.wait_until_finish()
> 07:57:27 File
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Flink/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/apache_beam/runners/portability/portable_runner.py",
> line 545, in wait_until_finish
> 07:57:27 (self._job_id, self._state, self._last_error_message()))
> 07:57:27 RuntimeError: Pipeline
> chicago-taxi-tfdv-20200317-144954-eval_9742ac2b-26bf-4d1d-835e-572d4efacfcb
> failed in state FAILED: java.lang.OutOfMemoryError: Metaspace