[
https://issues.apache.org/jira/browse/FLINK-25029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448457#comment-17448457
]
David Morávek commented on FLINK-25029:
---------------------------------------
I'd say this is mostly related to filesystems (namely to the Hadoop
filesystem). If I understand it correctly, it should be enough to call
`CallerContext#setCurrent` [1] before initiating any calls that interact with
HDFS. Whether it's also a coordination issue depends on whether the filesystems
have enough information to create a "descriptive" context for the call (e.g. a
context hierarchy such as "flink -> <jobId> -> <taskId> -> ???").
[1]
https://github.com/apache/hadoop/blob/rel/release-2.8.5/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/CallerContext.java#L147
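For illustration, here is a minimal sketch of that idea. It is not from the ticket: the helper class name and the "flink_<jobId>_<taskId>" context format are assumptions, only `org.apache.hadoop.ipc.CallerContext` and its `Builder`/`setCurrent` are the actual Hadoop API [1].

```java
import org.apache.hadoop.ipc.CallerContext;

/**
 * Hypothetical helper sketching the approach above: build a descriptive
 * context string (the "flink_<jobId>_<taskId>" format is an assumption) and
 * install it on the current thread before any HDFS RPC is issued, so the
 * NameNode audit log can attribute the operation to the Flink job/task.
 */
public final class FlinkCallerContextUtil {

    private FlinkCallerContextUtil() {}

    public static void install(String jobId, String taskId) {
        // The caller context is thread-local; it has to be set on every
        // thread that ends up calling into the HDFS client.
        CallerContext context =
                new CallerContext.Builder("flink_" + jobId + "_" + taskId).build();
        CallerContext.setCurrent(context);
    }
}
```

Note that the NameNode only records the context when caller-context auditing is enabled on the HDFS side (the `hadoop.caller.context.enabled` property, if I remember correctly), and the string is truncated to the configured maximum length.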
> Hadoop Caller Context Setting In Flink
> --------------------------------------
>
> Key: FLINK-25029
> URL: https://issues.apache.org/jira/browse/FLINK-25029
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Task
> Reporter: 刘方奇
> Priority: Major
>
> For a given HDFS operation (e.g. delete file), it's very helpful to track
> which upper-level job issued it. The upper-level callers may be specific
> Oozie tasks, MR jobs, or Hive queries. One scenario is that when the NameNode
> (NN) is abused/spammed, the operator may want to know immediately which MR
> job is to blame so that it can be killed. To this end, the caller context
> contains at least an application-dependent "tracking id".
> The above is the main effect of the Caller Context: the HDFS client sets the
> caller context, and the NameNode then records it in its audit log for further
> analysis.
> Spark and Hive already set the Caller Context to meet HDFS job-audit
> requirements.
> In my company, Flink jobs often cause problems for HDFS, so we implemented
> this to help prevent such cases.
> If the feature is general enough, should we support it? If so, I can submit a
> PR for it.