Kamil Wasilewski created BEAM-9673:
--------------------------------------
Summary: Migrate the memory configuration of Flink cluster from
Flink <= 1.9 to 1.10
Key: BEAM-9673
URL: https://issues.apache.org/jira/browse/BEAM-9673
Project: Beam
Issue Type: Bug
Components: testing
Reporter: Kamil Wasilewski
Our Google Cloud Dataproc setup [1], which runs a Flink cluster for testing
purposes, needs to be reviewed and updated before it can be reused with the
latest version of Flink (1.10).
There is an official migration guide [2] which can help to update the
configuration.
Here is also a list of ideas I came up with during the initial investigation:
1) JVM Metaspace Size must be increased to prevent OOM errors. This can be done
by setting _taskmanager.memory.jvm-metaspace.size_ to "512 mb" (default value
is 256 mb).
2) It appears that the size of _Managed Memory_ is too low for tests involving
GBK and coGBK operations [3] [4]. I managed to run those tests successfully by
increasing _taskmanager.memory.managed.fraction_ to 0.8 and changing the type
of the Dataproc workers to n1-highmem-4. There may be other options, though.
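As a rough illustration of the two ideas above, here is a minimal Python sketch
that applies both settings to the text of a flink-conf.yaml file. Only the two
option keys and their values come from this issue; the helper name
apply_overrides and the sample input are hypothetical, and the file is assumed
to use the flat "key: value" layout.

```python
# Sketch only: the option keys and values are the ones proposed in this issue;
# everything else (helper name, sample input) is hypothetical.
OVERRIDES = {
    "taskmanager.memory.jvm-metaspace.size": "512 mb",
    "taskmanager.memory.managed.fraction": "0.8",
}


def apply_overrides(conf_text, overrides):
    """Return flink-conf.yaml text with the given keys set to the new values."""
    written = set()
    out = []
    for line in conf_text.splitlines():
        key = line.split(":", 1)[0].strip()
        is_comment = line.lstrip().startswith("#")
        if not is_comment and key in overrides:
            # Replace the first occurrence and drop any later duplicates.
            if key not in written:
                out.append(f"{key}: {overrides[key]}")
                written.add(key)
        else:
            out.append(line)
    # Append keys that were not present in the original file.
    for key, value in overrides.items():
        if key not in written:
            out.append(f"{key}: {value}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = (
        "jobmanager.rpc.address: localhost\n"
        "taskmanager.memory.managed.fraction: 0.4\n"
    )
    print(apply_overrides(sample, OVERRIDES), end="")
```

Existing entries are rewritten in place and missing ones are appended, so the
rest of the file keeps its order. The worker machine type (n1-highmem-4) is a
cluster setting and is not covered by this sketch.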
[1] [https://github.com/apache/beam/tree/master/.test-infra/dataproc]
[2] [https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/memory/mem_migration.html]
[3] [https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_LoadTests_GBK_Flink_Python.groovy]
[4] [https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_LoadTests_coGBK_Flink_Python.groovy]