[jira] [Commented] (FLINK-8826) In Flip6 mode, when starting yarn cluster, configured taskmanager.heap.mb is ignored
    [ https://issues.apache.org/jira/browse/FLINK-8826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16384535#comment-16384535 ]

ASF GitHub Bot commented on FLINK-8826:
---

Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/5625

> In Flip6 mode, when starting yarn cluster, configured taskmanager.heap.mb is ignored
>
> Key: FLINK-8826
> URL: https://issues.apache.org/jira/browse/FLINK-8826
> Project: Flink
> Issue Type: Bug
> Components: ResourceManager, YARN
> Affects Versions: 1.5.0
> Reporter: Piotr Nowojski
> Assignee: Till Rohrmann
> Priority: Blocker
> Fix For: 1.5.0, 1.6.0
>
> When I tried running some job on the cluster, despite setting
> taskmanager.heap.mb = 3072
> taskmanager.network.memory.fraction: 0.4
> and reported in the console
> {code:java}
> Cluster specification: ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=3072, numberTaskManagers=92, slotsPerTaskManager=1}{code}
> The actual settings were:
> {noformat}
> 2018-03-01 14:53:18,918 INFO org.apache.flink.yarn.YarnTaskExecutorRunner -
> 2018-03-01 14:53:18,921 INFO org.apache.flink.yarn.YarnTaskExecutorRunner - Starting YARN TaskExecutor runner (Version: 1.5-SNAPSHOT, Rev:e92eb39, Date:28.02.2018 @ 17:43:39 UTC)
> 2018-03-01 14:53:18,921 INFO org.apache.flink.yarn.YarnTaskExecutorRunner - OS current user: yarn
> 2018-03-01 14:53:19,780 INFO org.apache.flink.yarn.YarnTaskExecutorRunner - Current Hadoop/Kerberos user: hadoop
> 2018-03-01 14:53:19,781 INFO org.apache.flink.yarn.YarnTaskExecutorRunner - JVM: OpenJDK 64-Bit Server VM - Oracle Corporation - 1.8/25.161-b14
> 2018-03-01 14:53:19,781 INFO org.apache.flink.yarn.YarnTaskExecutorRunner - Maximum heap size: 245 MiBytes
> 2018-03-01 14:53:19,781 INFO org.apache.flink.yarn.YarnTaskExecutorRunner - JAVA_HOME: /usr/lib/jvm/java-openjdk
> 2018-03-01 14:53:19,783 INFO org.apache.flink.yarn.YarnTaskExecutorRunner - Hadoop version: 2.4.1
> 2018-03-01 14:53:19,783 INFO org.apache.flink.yarn.YarnTaskExecutorRunner - JVM Options:
> 2018-03-01 14:53:19,783 INFO org.apache.flink.yarn.YarnTaskExecutorRunner - -Xms255m
> 2018-03-01 14:53:19,784 INFO org.apache.flink.yarn.YarnTaskExecutorRunner - -Xmx255m
> 2018-03-01 14:53:19,784 INFO org.apache.flink.yarn.YarnTaskExecutorRunner - -XX:MaxDirectMemorySize=769m
> 2018-03-01 14:53:19,784 INFO org.apache.flink.yarn.YarnTaskExecutorRunner - -Dlog.file=/var/log/hadoop-yarn/containers/application_1516373731080_1150/container_1516373731080_1150_01_000105/taskmanager.log
> 2018-03-01 14:53:19,784 INFO org.apache.flink.yarn.YarnTaskExecutorRunner - -Dlogback.configurationFile=file:./logback.xml
> 2018-03-01 14:53:19,784 INFO org.apache.flink.yarn.YarnTaskExecutorRunner - -Dlog4j.configuration=file:./log4j.properties
> 2018-03-01 14:53:19,784 INFO org.apache.flink.yarn.YarnTaskExecutorRunner - Program Arguments:
> 2018-03-01 14:53:19,784 INFO org.apache.flink.yarn.YarnTaskExecutorRunner - --configDir{noformat}
> Heap was set to 255, while with default cuts of it should be 1383. 255MB seems like coming from default taskmanager.heap.mb value of 1024.
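
The two figures quoted in the report (255 and 1383) can be reproduced with a short calculation. The sketch below is only an illustration: it assumes the Flink 1.5 defaults for the container heap cutoff (ratio 0.25, minimum 600 MB) and for the network buffer bounds (64 MB to 1 GB), none of which are stated in the report. The class and method names are hypothetical and are not Flink's API.

```java
// Hypothetical sketch of the memory arithmetic; not Flink code.
public class HeapSizeSketch {
    // Assumed defaults (not given in the report).
    static final double CUTOFF_RATIO = 0.25;           // containerized.heap-cutoff-ratio
    static final long CUTOFF_MIN_MB = 600;             // containerized.heap-cutoff-min
    static final long NETWORK_MIN_BYTES = 64L << 20;   // taskmanager.network.memory.min
    static final long NETWORK_MAX_BYTES = 1024L << 20; // taskmanager.network.memory.max

    // Memory reserved for the container itself, outside the JVM budget.
    static long cutoffMB(long containerMB) {
        return Math.max(CUTOFF_MIN_MB, (long) (containerMB * CUTOFF_RATIO));
    }

    // Network buffer memory: a fraction of the JVM budget, clamped to [min, max].
    static long networkMB(long javaMemoryMB, double fraction) {
        long bytes = (long) ((javaMemoryMB << 20) * fraction);
        long clamped = Math.min(NETWORK_MAX_BYTES, Math.max(NETWORK_MIN_BYTES, bytes));
        return clamped >> 20;
    }

    // Heap (-Xms/-Xmx) is what remains after cutoff and network buffers.
    static long heapMB(long containerMB, double fraction) {
        long javaMemoryMB = containerMB - cutoffMB(containerMB);
        return javaMemoryMB - networkMB(javaMemoryMB, fraction);
    }

    // Direct memory limit: everything in the container that is not heap.
    static long directMB(long containerMB, double fraction) {
        return containerMB - heapMB(containerMB, fraction);
    }

    public static void main(String[] args) {
        double fraction = 0.4; // taskmanager.network.memory.fraction from the report
        System.out.println(heapMB(1024, fraction));   // prints 255
        System.out.println(directMB(1024, fraction)); // prints 769
        System.out.println(heapMB(3072, fraction));   // prints 1383
    }
}
```

Under these assumptions a 1024 MB container yields -Xmx255m and -XX:MaxDirectMemorySize=769m, matching the log above, while the configured 3072 MB would yield a 1383 MB heap. This is consistent with the reporter's guess that the default taskmanager.heap.mb of 1024 was used instead of the configured value.
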
> When starting in non flip6 everything works as expected:
> {noformat}
> 2018-03-01 14:04:49,650 INFO org.apache.flink.yarn.YarnTaskManagerRunnerFactory -
> 2018-03-01 14:04:49,700 INFO org.apache.flink.yarn.YarnTaskManagerRunnerFactory - Starting YARN TaskManager (Version: 1.5-SNAPSHOT, Rev:e92eb39, Date:28.02.2018 @ 17:43:39 UTC)
> 2018-03-01 14:04:49,700 INFO org.apache.flink.yarn.YarnTaskManagerRunnerFactory - OS current user: yarn
> 2018-03-01 14:04:53,277 INFO org.apache.flink.yarn.YarnTaskManagerRunnerFactory - Current Hadoop/Kerberos user: hadoop
> 2018-03-01 14:04:53,278 INFO org.apache.flink.yarn.YarnTaskManagerRunnerFactory - JVM: OpenJDK 64-Bit Server VM - Oracle Corporation - 1.8/25.161-b14
> 2018-03-01 14:04:53,279 INFO org.apache.flink.yarn.YarnTaskManagerRunnerFactory - Maximum heap size: 1326 MiBytes
> 2018-03-01 14:04:53,279 INFO org.apache.flink.yarn.YarnTaskManagerRunnerFactory -
[jira] [Commented] (FLINK-8826) In Flip6 mode, when starting yarn cluster, configured taskmanager.heap.mb is ignored
    [ https://issues.apache.org/jira/browse/FLINK-8826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16384000#comment-16384000 ]

ASF GitHub Bot commented on FLINK-8826:
---

GitHub user tillrohrmann opened a pull request:

    https://github.com/apache/flink/pull/5625

    [FLINK-8826] [flip6] Start Yarn TaskExecutor with proper slots and memory

    ## What is the purpose of the change

    Read the default TaskManager memory and number of slots from the configuration when the YarnResourceManager is started.

    ## Verifying this change

    - Added `YarnConfigurationITCase`

    ## Does this pull request potentially affect one of the following parts:

    - Dependencies (does it add or upgrade a dependency): (no)
    - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
    - The serializers: (no)
    - The runtime per-record code paths (performance sensitive): (no)
    - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes)
    - The S3 file system connector: (no)

    ## Documentation

    - Does this pull request introduce a new feature? (no)
    - If yes, how is the feature documented? (not applicable)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tillrohrmann/flink fixCheckpointCoordinator

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/5625.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5625

commit ef080927aba095a72929cd8be390b46afa4dcab8
Author: Till Rohrmann
Date: 2018-03-02T14:27:13Z

    [FLINK-8840] [yarn] Pull YarnClient and YarnConfiguration instantiation out of AbstractYarnClusterClient

    For better testability, this commit moves the YarnClient and YarnConfiguration
    out of the AbstractYarnClusterDescriptor.

commit 173e272c1a0250ecdc4a4f975a2f7991b9dd53c1
Author: Till Rohrmann
Date: 2018-03-01T19:09:55Z

    [hotfix] [flip6] Harden JobMaster#triggerSavepoint

    Check first whether the CheckpointCoordinator has been set before triggering
    a savepoint. If it has not been set, then return a failure message.

commit 2d2a6d5a84a586bf8ef59656aa7754e94b6e034b
Author: Till Rohrmann
Date: 2018-03-01T22:35:25Z

    [FLINK-8826] [flip6] Start Yarn TaskExecutor with proper slots and memory

    Read the default TaskManager memory and number of slots from the configuration
    when the YarnResourceManager is started.

commit d8f3cfa0c89f465ad995ebab7470f37b37b18678
Author: Till Rohrmann
Date: 2018-03-02T11:18:05Z

    [hotfix] Set default number of TaskManagers in FlinkYarnSessionCli for Flip6

commit 0475dc343f0fa703bcf82a585482bdd15ae168ea
Author: Till Rohrmann
Date: 2018-03-02T11:42:43Z

    [hotfix] Print correct web monitor URL in FlinkYarnSessionCli

> In Flip6 mode, when starting yarn cluster, configured taskmanager.heap.mb is ignored
>
> Key: FLINK-8826
> URL: https://issues.apache.org/jira/browse/FLINK-8826
> Project: Flink
> Issue Type: Bug
> Components: ResourceManager, YARN
> Affects Versions: 1.5.0
> Reporter: Piotr Nowojski
> Assignee: Till Rohrmann
> Priority: Blocker
>
> When I tried running some job on the cluster, despite setting
> taskmanager.heap.mb = 3072
> taskmanager.network.memory.fraction: 0.4
> and reported in the console
> {code:java}
> Cluster specification: ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=3072, numberTaskManagers=92, slotsPerTaskManager=1}{code}
> The actual settings were:
> {noformat}
> 2018-03-01 14:53:18,918 INFO org.apache.flink.yarn.YarnTaskExecutorRunner -
> 2018-03-01 14:53:18,921 INFO org.apache.flink.yarn.YarnTaskExecutorRunner - Starting YARN TaskExecutor runner (Version: 1.5-SNAPSHOT, Rev:e92eb39, Date:28.02.2018 @ 17:43:39 UTC)
> 2018-03-01 14:53:18,921 INFO org.apache.flink.yarn.YarnTaskExecutorRunner - OS current user: yarn
> 2018-03-01 14:53:19,780 INFO org.apache.flink.yarn.YarnTaskExecutorRunner - Current Hadoop/Kerberos user: hadoop
> 2018-03-01 14:53:19,781 INFO org.apache.flink.yarn.YarnTaskExecutorRunner - JVM: OpenJDK 64-Bit Server VM - Oracle Corporation - 1.8/25.161-b14
> 2018-03-01 14:53:19,781 INFO org.apache.flink.yarn.YarnTaskExecutorRunner -