GitHub user rmetzger commented on the pull request:
https://github.com/apache/flink/pull/1741#issuecomment-195362473
I tested the change on a secured CDH 5.3 setup and it worked, including
recovery after TaskManager failures:
```
2016-03-11 05:27:34,732 INFO org.apache.flink.yarn.YarnJobManager - Status of job 58bd9ea7c4c4515ecd1cb29928523662 (Flink Java Job at Fri Mar 11 05:27:33 PST 2016) changed to FINISHED.
2016-03-11 05:28:16,832 WARN akka.remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://[email protected]:48995] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
2016-03-11 05:28:20,345 INFO org.apache.flink.yarn.YarnFlinkResourceManager - Container ResourceID{resourceId='container_1440768826963_0010_01_000002'} failed. Exit status: 143
2016-03-11 05:28:20,346 INFO org.apache.flink.yarn.YarnFlinkResourceManager - Diagnostics for container ResourceID{resourceId='container_1440768826963_0010_01_000002'} in state COMPLETE : exitStatus=143 diagnostics=Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
Killed by external signal
2016-03-11 05:28:20,346 INFO org.apache.flink.yarn.YarnFlinkResourceManager - Total number of failed containers so far: 1
2016-03-11 05:28:20,346 INFO org.apache.flink.yarn.YarnFlinkResourceManager - Requesting new TaskManager container with 1024 megabytes memory. Pending requests: 1
2016-03-11 05:28:20,346 INFO org.apache.flink.yarn.YarnJobManager - Task manager akka.tcp://[email protected]:48995/user/taskmanager terminated.
2016-03-11 05:28:20,347 INFO org.apache.flink.runtime.instance.InstanceManager - Unregistered task manager akka.tcp://[email protected]:48995/user/taskmanager. Number of registered task managers 0. Number of available slots 0.
2016-03-11 05:28:21,859 WARN Remoting - Tried to associate with unreachable remote address [akka.tcp://[email protected]:48995]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: Connection refused: /127.0.0.1:48995
2016-03-11 05:28:25,850 INFO org.apache.flink.yarn.YarnFlinkResourceManager - Received new container: container_1440768826963_0010_01_000003 - Remaining pending container requests: 0
2016-03-11 05:28:25,850 INFO org.apache.flink.yarn.YarnFlinkResourceManager - Launching TaskManager in container ResourceID{resourceId='container_1440768826963_0010_01_000003'} on host quickstart.cloudera
2016-03-11 05:28:25,859 INFO org.apache.flink.yarn.YarnFlinkResourceManager - Received new container: container_1440768826963_0010_01_000004 - Remaining pending container requests: 0
2016-03-11 05:28:25,859 INFO org.apache.flink.yarn.YarnFlinkResourceManager - Returning excess container container_1440768826963_0010_01_000004
2016-03-11 05:28:26,354 INFO org.apache.flink.yarn.YarnFlinkResourceManager - Container ResourceID{resourceId='container_1440768826963_0010_01_000004'} completed successfully with diagnostics: Container released by application
2016-03-11 05:28:28,034 INFO org.apache.flink.runtime.instance.InstanceManager - Registered TaskManager at quickstart (akka.tcp://[email protected]:36513/user/taskmanager) as 22bdcc2288142acf37c723fa2930d447. Current number of registered hosts is 1. Current number of alive task slots is 1.
```