[ https://issues.apache.org/jira/browse/FLINK-14340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Weijie Guo updated FLINK-14340:
-------------------------------
    Affects Version/s: 2.1.0

> Specify an unique DFSClient name for Hadoop FileSystem
> ------------------------------------------------------
>
>                 Key: FLINK-14340
>                 URL: https://issues.apache.org/jira/browse/FLINK-14340
>             Project: Flink
>          Issue Type: Improvement
>          Components: FileSystems
>    Affects Versions: 2.1.0
>            Reporter: Congxian Qiu
>            Priority: Minor
>              Labels: auto-deprioritized-major
>             Fix For: 2.0.0
>
>
> Currently, when Flink reads from or writes to HDFS, we do not set the DFSClient name for the connections, so we cannot distinguish them and cannot quickly find the specific Job or TaskManager (TM).
> This issue proposes adding the {{container_id}} as a unique name when initializing the Hadoop FileSystem, so we can easily tell which Job/TM each connection belongs to.
>
> The core change is to add a line such as the one below in {{org.apache.flink.runtime.fs.hdfs.HadoopFsFactory#create}}:
>
> {code:java}
> hadoopConfig.set("mapreduce.task.attempt.id",
>     System.getenv().getOrDefault(CONTAINER_KEY_IN_ENV, DEFAULT_CONTAINER_ID));
> {code}
>
> Currently, both {{YarnResourceManager}} and {{MesosResourceManager}} have an environment key {{ENV_FLINK_CONTAINER_ID = "_FLINK_CONTAINER_ID"}}, so we may want to introduce this key in {{StandaloneResourceManager}} as well.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
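For illustration, here is a minimal sketch of what the proposed change could look like, not the actual Flink implementation. The constant names {{CONTAINER_KEY_IN_ENV}} and {{DEFAULT_CONTAINER_ID}} come from the snippet in the description, but their values and the helper class/method shown here are assumptions.

{code:java}
import org.apache.hadoop.conf.Configuration;

// Hedged sketch only: shows how HadoopFsFactory#create could tag each DFSClient
// with the container id. Constant values and the helper method are assumptions,
// not the actual Flink code.
public class DfsClientNameSketch {

    // Assumed to mirror ENV_FLINK_CONTAINER_ID = "_FLINK_CONTAINER_ID" used by YarnResourceManager.
    private static final String CONTAINER_KEY_IN_ENV = "_FLINK_CONTAINER_ID";

    // Hypothetical fallback when the environment variable is absent (e.g. standalone mode).
    private static final String DEFAULT_CONTAINER_ID = "unknown-container";

    /** Sets the container id as the task attempt id, which HDFS embeds in the DFSClient name. */
    public static void tagDfsClient(Configuration hadoopConfig) {
        hadoopConfig.set("mapreduce.task.attempt.id",
                System.getenv().getOrDefault(CONTAINER_KEY_IN_ENV, DEFAULT_CONTAINER_ID));
    }
}
{code}

Since the DFSClient name is derived from {{mapreduce.task.attempt.id}}, HDFS-side output that prints the client name (for example lease/open-file listings) would then carry the container id, which is what makes a connection traceable back to a specific Job/TM.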