[ https://issues.apache.org/jira/browse/FLINK-14340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Weijie Guo updated FLINK-14340:
-------------------------------
    Affects Version/s: 2.1.0

> Specify an unique DFSClient name for Hadoop FileSystem
> ------------------------------------------------------
>
>                 Key: FLINK-14340
>                 URL: https://issues.apache.org/jira/browse/FLINK-14340
>             Project: Flink
>          Issue Type: Improvement
>          Components: FileSystems
>    Affects Versions: 2.1.0
>            Reporter: Congxian Qiu
>            Priority: Minor
>              Labels: auto-deprioritized-major
>             Fix For: 2.0.0
>
>
> Currently, when Flink reads from or writes to HDFS, we do not set the DFSClient name for the connections, so we cannot distinguish them and cannot quickly find the specific Job or TaskManager (TM).
> This issue proposes adding the {{container_id}} as a unique name when initializing the Hadoop FileSystem, so we can easily tell which Job/TM each connection belongs to.
>
> The core change is to add a line such as the one below in {{org.apache.flink.runtime.fs.hdfs.HadoopFsFactory#create}}:
>
> {code:java}
> hadoopConfig.set("mapreduce.task.attempt.id",
>     System.getenv().getOrDefault(CONTAINER_KEY_IN_ENV, DEFAULT_CONTAINER_ID));
> {code}
>
> Currently, both {{YarnResourceManager}} and {{MesosResourceManager}} have an environment key {{ENV_FLINK_CONTAINER_ID = "_FLINK_CONTAINER_ID"}}, so we may want to introduce this key in {{StandaloneResourceManager}} as well.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
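For illustration, here is a minimal sketch of what the proposed change could look like, not the actual Flink implementation. The constant names {{CONTAINER_KEY_IN_ENV}} and {{DEFAULT_CONTAINER_ID}} come from the snippet in the description, but their values and the helper class/method shown here are assumptions.

{code:java}
import org.apache.hadoop.conf.Configuration;

// Hedged sketch only: shows how HadoopFsFactory#create could tag each DFSClient
// with the container id. Constant values and the helper method are assumptions,
// not the actual Flink code.
public class DfsClientNameSketch {

    // Assumed to mirror ENV_FLINK_CONTAINER_ID = "_FLINK_CONTAINER_ID" used by YarnResourceManager.
    private static final String CONTAINER_KEY_IN_ENV = "_FLINK_CONTAINER_ID";

    // Hypothetical fallback when the environment variable is absent (e.g. standalone mode).
    private static final String DEFAULT_CONTAINER_ID = "unknown-container";

    /** Sets the container id as the task attempt id, which HDFS embeds in the DFSClient name. */
    public static void tagDfsClient(Configuration hadoopConfig) {
        hadoopConfig.set("mapreduce.task.attempt.id",
                System.getenv().getOrDefault(CONTAINER_KEY_IN_ENV, DEFAULT_CONTAINER_ID));
    }
}
{code}

Since the DFSClient name is derived from {{mapreduce.task.attempt.id}}, HDFS-side output that prints the client name (for example lease/open-file listings) would then carry the container id, which is what makes a connection traceable back to a specific Job/TM.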