[
https://issues.apache.org/jira/browse/FLINK-5668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15886220#comment-15886220
]
Bill Liu edited comment on FLINK-5668 at 2/27/17 5:55 PM:
----------------------------------------------------------
[~rmetzger]
[~wheat9] and I are working on a Flink job deployer for YARN that uses
`HttpFs` and `S3`.
The YARN container can resolve the `http` and `s3` file schemes.
We use `HttpFs` instead of `HDFS` to bootstrap the JobManager.
Here is the code that sets up the AM container (JobManager):
```
// Resolve the filesystem from the path's own scheme (http here) and stat the jar.
Path resourcePath = new Path("http://localhost:19989/flink-dist.jar");
FileStatus fileStatus = resourcePath.getFileSystem(yarnConfiguration)
    .getFileStatus(resourcePath);
LOG.info("resource {}", ConverterUtils.getYarnUrlFromPath(resourcePath));
// Register the jar as a YARN local resource for the AM container.
LocalResource packageResource =
    LocalResource.newInstance(
        ConverterUtils.getYarnUrlFromPath(resourcePath),
        LocalResourceType.FILE, LocalResourceVisibility.APPLICATION,
        fileStatus.getLen(), fileStatus.getModificationTime());
LOG.info("add localresource {}", packageResource);
localResources.put("flink.jar", packageResource);
amContainer.setLocalResources(localResources);
```
`yarn.deploy.fs` is not a good idea, because these bootstrap jars/files may be
located on different filesystems.
It's better to parse each jar's Path to determine the underlying filesystem of that jar.
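To illustrate why a per-path lookup fits better than one global setting, here is a minimal standalone sketch that groups bootstrap resources by the scheme of their URI. It uses only `java.net.URI`, not the Hadoop API; in the real deployer `Path.getFileSystem(Configuration)` would do the resolution, as in the snippet above. The class name `BootstrapSchemes`, the fallback scheme, and every resource URL except the `flink-dist.jar` one are hypothetical.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class BootstrapSchemes {

    // Returns the URI scheme of a resource, or fallbackScheme when the
    // resource is a bare path that carries no scheme of its own.
    static String schemeOf(String resource, String fallbackScheme) {
        String scheme = URI.create(resource).getScheme();
        return scheme != null ? scheme : fallbackScheme;
    }

    // Groups resources by scheme, so each group can be handled by the
    // filesystem implementation registered for that scheme.
    static Map<String, List<String>> groupByScheme(List<String> resources,
                                                   String fallbackScheme) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String resource : resources) {
            grouped.computeIfAbsent(schemeOf(resource, fallbackScheme),
                                    k -> new ArrayList<>()).add(resource);
        }
        return grouped;
    }

    public static void main(String[] args) {
        // Hypothetical bootstrap files that live on three different filesystems.
        List<String> resources = Arrays.asList(
            "http://localhost:19989/flink-dist.jar",
            "s3://example-bucket/lib/user-job.jar",
            "/tmp/flink-conf.yaml");
        groupByScheme(resources, "file").forEach((scheme, files) ->
            System.out.println(scheme + " -> " + files));
    }
}
```

A single `yarn.deploy.fs` value could describe only one of these three groups, whereas the per-path lookup handles each file on its own.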
> Reduce dependency on HDFS at job startup time
> ---------------------------------------------
>
> Key: FLINK-5668
> URL: https://issues.apache.org/jira/browse/FLINK-5668
> Project: Flink
> Issue Type: Improvement
> Components: YARN
> Reporter: Bill Liu
> Original Estimate: 48h
> Remaining Estimate: 48h
>
> When creating a Flink cluster on YARN, the JobManager depends on HDFS to share
> taskmanager-conf.yaml with the TaskManagers.
> It would be better to serve taskmanager-conf.yaml from the JobManager web server
> instead of HDFS, which would reduce the HDFS dependency at job startup.