[ 
https://issues.apache.org/jira/browse/FLINK-16544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-16544:
-----------------------------------
    Labels: stale-minor  (was: )

> Flink FileSystem for web.uploadDir
> ----------------------------------
>
>                 Key: FLINK-16544
>                 URL: https://issues.apache.org/jira/browse/FLINK-16544
>             Project: Flink
>          Issue Type: Improvement
>          Components: API / Core
>    Affects Versions: 1.10.0
>            Reporter: Angel Barragán
>            Priority: Minor
>              Labels: stale-minor
>
> Currently the configuration properties "web.upload.dir" and "web.upload.dir" 
> only support paths on the local filesystem. When Flink is deployed in 
> another cluster environment such as YARN, it would be more useful to be able 
> to configure those directories on HDFS. Sizing and maintenance would then be 
> easier than finding out on which node YARN has launched the JobManager 
> task and managing the upload directory there.
> In my concrete case, I ran into this management disadvantage when 
> creating an AWS EMR cluster with Flink, where the default configuration 
> creates this directory under /tmp on the local filesystem of the CORE node 
> where the JobManager is deployed by YARN. We found that the EMR cluster is 
> also configured to fully empty /tmp on a monthly basis, which removes the 
> upload directory for Flink and causes Flink to fail when you try to 
> submit a new job. We had to recreate the directory manually.
> The first solution I tried was to change the above configuration properties 
> to use HDFS, as we did with the configuration property 
> "state.checkpoints.dir", but we found that it doesn't work in a YARN 
> environment. So I checked the Flink code to see how this configuration is 
> used and found that it uses the local file system.
> I think this change would improve the management of Flink when running in 
> another cluster environment where shared storage such as HDFS or S3 is 
> available.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
