[ 
https://issues.apache.org/jira/browse/FLINK-14968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aljoscha Krettek updated FLINK-14968:
-------------------------------------
    Description: 
This change made the test flaky: 
https://github.com/apache/flink/commit/749965348170e4608ff2a23c9617f67b8c341df5
It changes the job to have two sources instead of one, which, under normal 
circumstances, requires more slots than are available, so the job fails.

The setup of this test is very intricate. We configure YARN to have two 
NodeManagers (NMs) with 2500 MB of memory each: 
https://github.com/apache/flink/blob/413a77157caf25dbbfb8b0caaf2c9e12c7374d98/flink-end-to-end-tests/test-scripts/docker-hadoop-secure-cluster/config/yarn-site.xml#L39
We run the job with parallelism 3 and configure Flink to use 1000 MB of 
TaskManager (TM) memory and 1000 MB of JobManager memory. This means that the 
job fits into the YARN memory budget, but additional TaskManagers would not. 
We also don't simply increase the YARN resources, because we want the Flink 
job to use TMs on different NMs: we had a bug where shipping the Kerberos 
config file did not work correctly, but the bug did not materialise when all 
TMs were on the same NM.
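
The memory budget described above can be checked with a small calculation. 
This is only a sketch: the node and process sizes come from the description, 
but it assumes one slot per TaskManager and that YARN grants containers of 
exactly the requested size, with no rounding or extra overhead. All function 
and constant names are made up for illustration.

```python
import math

# Values stated in the issue description.
NODE_MANAGERS = 2
NODE_MANAGER_MB = 2500
TASK_MANAGER_MB = 1000
JOB_MANAGER_MB = 1000
PARALLELISM = 3

# Assumption (not stated in the issue): one slot per TaskManager.
SLOTS_PER_TASK_MANAGER = 1


def containers_per_node(container_mb: int) -> int:
    """How many containers of the given size fit on a single NodeManager."""
    return NODE_MANAGER_MB // container_mb


def cluster_container_capacity(container_mb: int) -> int:
    """How many containers of the given size fit across all NodeManagers."""
    return NODE_MANAGERS * containers_per_node(container_mb)


def required_containers(parallelism: int) -> int:
    """TaskManager containers needed for the parallelism, plus one JobManager."""
    task_managers = math.ceil(parallelism / SLOTS_PER_TASK_MANAGER)
    return task_managers + 1


def fits(parallelism: int) -> bool:
    """True if the job fits. JM and TM containers are the same size here,
    so a single container size is enough for the comparison."""
    return required_containers(parallelism) <= cluster_container_capacity(TASK_MANAGER_MB)


if __name__ == "__main__":
    print("containers per NM:", containers_per_node(TASK_MANAGER_MB))
    print("cluster capacity:", cluster_container_capacity(TASK_MANAGER_MB))
    print("required at parallelism 3:", required_containers(PARALLELISM))
    print("fits at parallelism 3:", fits(PARALLELISM))
    print("fits at parallelism 4:", fits(PARALLELISM + 1))
```

Under these assumptions each NM holds at most two 1000 MB containers, so the 
three TMs cannot all land on the same NM, and a job that needs a fourth TM 
exceeds the cluster capacity.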

https://api.travis-ci.org/v3/job/612782888/log.txt

  was:https://api.travis-ci.org/v3/job/612782888/log.txt


> Kerberized YARN on Docker test (custom fs plugin) fails on Travis
> -----------------------------------------------------------------
>
>                 Key: FLINK-14968
>                 URL: https://issues.apache.org/jira/browse/FLINK-14968
>             Project: Flink
>          Issue Type: Bug
>          Components: Deployment / YARN, Tests
>    Affects Versions: 1.10.0
>            Reporter: Gary Yao
>            Assignee: Aljoscha Krettek
>            Priority: Blocker
>              Labels: test-stability
>             Fix For: 1.10.0
>
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
