[ 
https://issues.apache.org/jira/browse/FLINK-5815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877859#comment-15877859
 ] 

ASF GitHub Bot commented on FLINK-5815:
---------------------------------------

GitHub user wenlong88 opened a pull request:

    https://github.com/apache/flink/pull/3388

    [FLINK-5815] Add resource files configuration for Yarn Mode

    This PR adds three common resource configuration options to YARN mode. 
They allow users to set single file resources from both the local filesystem 
and a remote HDFS filesystem, as is possible with MapReduce:
    1. -yfiles and -ylibjars add local resource files to a YARN per-job 
cluster. The user provides a list of file paths to add local jars or 
dictionary files to the YARN distributed cache.
    2. -yarchives adds remote resource files to a YARN per-job cluster. The 
user provides a list of URIs of files, which can be stored on HDFS, and can 
rename a file through the fragment of its URI.
    All of the files are distributed to every TM and JM by YARN and added 
to the classpath of the TM and JM.
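
    The fragment-based renaming mentioned in item 2 can be illustrated with a 
small sketch. This is not code from the PR: the class name ResourceLinkName, 
the method linkNameFor, and the example host namenode:8020 are hypothetical. 
It only assumes the MapReduce-style convention that the part after '#' in a 
resource URI becomes the name under which the file is made available, with 
the last path segment used when no fragment is given.

```java
import java.net.URI;

// Hypothetical helper, not part of Flink or of this PR.
final class ResourceLinkName {

    // Returns the name a resource would be exposed under:
    // the URI fragment if present, otherwise the last path segment.
    static String linkNameFor(String resourceUri) {
        URI uri = URI.create(resourceUri);

        String fragment = uri.getFragment();
        if (fragment != null && !fragment.isEmpty()) {
            return fragment;
        }

        String path = uri.getPath();
        if (path == null) {
            throw new IllegalArgumentException("URI has no path: " + resourceUri);
        }
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        // Remote file renamed through the fragment.
        System.out.println(linkNameFor("hdfs://namenode:8020/data/dict-v2.txt#dict.txt"));
        // Remote file without a fragment keeps its own file name.
        System.out.println(linkNameFor("hdfs://namenode:8020/libs/udf.jar"));
    }
}
```

    With such a convention, a dictionary stored as dict-v2.txt on HDFS could 
be read by the job under the stable name dict.txt without copying the file.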

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/wenlong88/flink jira-5815

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/3388.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3388
    
----
commit 77f27600368de02c26d9a45b2e575585728a2ddd
Author: wenlong.lwl <[email protected]>
Date:   2017-01-04T02:52:31Z

    add -yfiles -ylibjars -yarchives for yarn resource file management

commit e8957721e951d16861aff27af4d58e5ac42ec81b
Author: wenlong.lwl <[email protected]>
Date:   2017-02-22T09:35:55Z

    remove useless change

----


> Add resource files configuration for Yarn Mode
> ----------------------------------------------
>
>                 Key: FLINK-5815
>                 URL: https://issues.apache.org/jira/browse/FLINK-5815
>             Project: Flink
>          Issue Type: Improvement
>          Components: Client, YARN
>    Affects Versions: 1.3.0
>            Reporter: Wenlong Lyu
>            Assignee: Wenlong Lyu
>
> Currently in Flink, when we want to add a resource file to the distributed 
> cache, we need to make the file accessible remotely via a URL, and a service 
> like that is often difficult to maintain. What's more, when we want to add 
> some extra jar files to the job classpath, we need to copy the jar files to 
> the blob server when submitting the JobGraph. In YARN, especially in FLIP-6, 
> the blob server is not running yet when we try to start a Flink job. 
> YARN has an efficient distributed cache implementation for applications 
> running on it. What's more, files stored in HDFS can easily be shared across 
> different applications through the distributed cache without extra IO 
> operations. 
> I suggest introducing the -yfiles, -ylibjars and -yarchives options to 
> FlinkYarnCLI to let YARN users set up their job resource files through the 
> YARN distributed cache. 
> The options are compatible with those used in MapReduce, which makes them 
> easy to use for YARN users, who generally have experience with MapReduce.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)