[jira] [Comment Edited] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16709547#comment-16709547 ]

Zhankun Tang edited comment on YARN-8714 at 12/5/18 3:04 AM:
-------------------------------------------------------------

[~leftnoteasy], [~liuxun323]. Per my testing, the YARN localizer can handle directories. I changed distributedShell's client so that it localizes an HDFS directory "mydir" directly. This is enabled by YARN-2185.
{code:java}
// fs and localResources come from the surrounding client code.
// The HDFS directory itself is registered as the local resource.
Path p = new Path("hdfs:///user/yarn/submarine/jobs/tf-job-001/staging" + "/mydir");
FileStatus scFileStatus = fs.getFileStatus(p);
LocalResource r = LocalResource.newInstance(
    URL.fromURI(p.toUri()),
    LocalResourceType.FILE,
    LocalResourceVisibility.APPLICATION,
    scFileStatus.getLen(),
    scFileStatus.getModificationTime());
localResources.put("mydir", r);
{code}
And the YARN localizer indeed downloads the HDFS dir to the local disk for DistributedShell.
{code:java}
yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir
1.py  2.py  dir1  test_kill9.sh
yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/container_1543898305524_0004_01_01/ -l
total 20
lrwxrwxrwx 1 yarn hadoop 111 Dec  5 10:08 AppMaster.jar -> /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/10/AppMaster.jar
lrwxrwxrwx 1 yarn hadoop 103 Dec  5 10:08 mydir -> /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir
{code}
But the bad news is that Submarine uses the YARN native service, which is not aware of this YARN capability and blocks it.
{code:java}
2018-12-05 10:06:40,286 ERROR client.ApiServiceClient: srcFile=hdfs://192.168.50.191:9000/user/yarn/submarine/jobs/tf-job-001/staging/mydir is a directory, which is not supported.
{code}
Two solutions are ahead of us at present:
1. Fix the improper handling of directories in the native service and then implement this feature on top of it.
2. Go ahead with our download, zip and upload approach, which is more complex, and refactor it after 1 is done.

I personally prefer solution 2 because in that case Submarine won't depend on a newer YARN (3.1.0). Any thoughts?


was (Author: tangzhankun):
[~leftnoteasy], [~liuxun323]. Per my testing, the YARN localizer can handle directories. I changed distributedShell's client so that it localizes an HDFS directory "mydir" directly.
{code:java}
Path p = new Path("hdfs:///user/yarn/submarine/jobs/tf-job-001/staging" + "/mydir");
FileStatus scFileStatus = fs.getFileStatus(p);
LocalResource r = LocalResource.newInstance(
    URL.fromURI(p.toUri()),
    LocalResourceType.FILE,
    LocalResourceVisibility.APPLICATION,
    scFileStatus.getLen(),
    scFileStatus.getModificationTime());
localResources.put("mydir", r);
{code}
And the YARN localizer indeed downloads the HDFS dir to the local disk for DistributedShell.
{code:java}
yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir
1.py  2.py  dir1  test_kill9.sh
yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/container_1543898305524_0004_01_01/ -l
total 20
lrwxrwxrwx 1 yarn hadoop 111 Dec  5 10:08 AppMaster.jar -> /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/10/AppMaster.jar
lrwxrwxrwx 1 yarn hadoop 103 Dec  5 10:08 mydir -> /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir
{code}
But the bad news is that Submarine uses the YARN native service, which is not aware of this YARN capability and blocks it.
{code:java}
2018-12-05 10:06:40,286 ERROR client.ApiServiceClient: srcFile=hdfs://192.168.50.191:9000/user/yarn/submarine/jobs/tf-job-001/staging/mydir is a directory, which is not supported.
{code}
Two solutions are ahead of us at present:
1. Fix the improper handling of directories in the native service and then implement this feature on top of it.
2. Go ahead with our download, zip and upload approach, which is more complex, and refactor it after 1 is done.

Any thoughts?


> [Submarine] Support files/tarballs to be localized for a training job.
> ----------------------------------------------------------------------
>
>                 Key: YARN-8714
>                 URL: https://issues.apache.org/jira/browse/YARN-8714
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Wangda Tan
>            Assignee: Zhankun Tang
>            Priority: Major
>         Attachments: YARN-8714-WIP1-trunk-001.patch, YARN-8714-WIP1-trunk-002.patch, YARN-8714-trunk.001.patch, YARN-8714-trunk.002.patch
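The "zip" step of solution 2 (download, zip and upload) can be sketched with the JDK alone. This is a minimal illustration and not code from the YARN-8714 patches: the class name `DirZipper` and the method `zipDirectory` are hypothetical, and the upload to HDFS is left out. It assumes the zip file is written outside the source directory.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper for illustration only; not part of Submarine or YARN.
final class DirZipper {

  /** Zips every regular file under srcDir into zipFile, keeping relative paths. */
  static void zipDirectory(Path srcDir, Path zipFile) throws IOException {
    // Collect the files first so the directory stream is closed before writing.
    List<Path> files;
    try (Stream<Path> walk = Files.walk(srcDir)) {
      files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
    }
    try (OutputStream out = Files.newOutputStream(zipFile);
         ZipOutputStream zos = new ZipOutputStream(out)) {
      for (Path file : files) {
        // Zip entry names use '/' as the separator on every platform.
        String name = srcDir.relativize(file).toString().replace('\\', '/');
        zos.putNextEntry(new ZipEntry(name));
        Files.copy(file, zos);
        zos.closeEntry();
      }
    }
  }
}
```

Note that this sketch does not record empty directories, so a layout that relies on them would need explicit directory entries.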
[jira] [Comment Edited] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16709547#comment-16709547 ]

Zhankun Tang edited comment on YARN-8714 at 12/5/18 2:33 AM:
-------------------------------------------------------------

[~leftnoteasy], [~liuxun323]. Per my testing, the YARN localizer can handle directories. I changed distributedShell's client so that it localizes an HDFS directory "mydir" directly.
{code:java}
Path p = new Path("hdfs:///user/yarn/submarine/jobs/tf-job-001/staging" + "/mydir");
FileStatus scFileStatus = fs.getFileStatus(p);
LocalResource r = LocalResource.newInstance(
    URL.fromURI(p.toUri()),
    LocalResourceType.FILE,
    LocalResourceVisibility.APPLICATION,
    scFileStatus.getLen(),
    scFileStatus.getModificationTime());
localResources.put("mydir", r);
{code}
And the YARN localizer indeed downloads the HDFS dir to the local disk for DistributedShell.
{code:java}
yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir
1.py  2.py  dir1  test_kill9.sh
yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/container_1543898305524_0004_01_01/ -l
total 20
lrwxrwxrwx 1 yarn hadoop 111 Dec  5 10:08 AppMaster.jar -> /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/10/AppMaster.jar
lrwxrwxrwx 1 yarn hadoop 103 Dec  5 10:08 mydir -> /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir
{code}
But the bad news is that Submarine uses the YARN native service, which is not aware of this YARN capability and blocks it.
{code:java}
2018-12-05 10:06:40,286 ERROR client.ApiServiceClient: srcFile=hdfs://192.168.50.191:9000/user/yarn/submarine/jobs/tf-job-001/staging/mydir is a directory, which is not supported.
{code}
Two solutions are ahead of us at present:
1. Fix the improper handling of directories in the native service and then implement this feature on top of it.
2. Go ahead with our download, zip and upload approach, which is more complex, and refactor it after 1 is done.

Any thoughts?


was (Author: tangzhankun):
[~leftnoteasy], [~liuxun323]. Per my testing, the YARN localizer can handle directories. I changed distributedShell's client so that it localizes an HDFS directory "mydir" directly.
{code:java}
Path p = new Path("hdfs:///user/yarn/submarine/jobs/tf-job-001/staging" + "/mydir");
FileStatus scFileStatus = fs.getFileStatus(p);
LocalResource r = LocalResource.newInstance(
    URL.fromURI(p.toUri()),
    LocalResourceType.FILE,
    LocalResourceVisibility.APPLICATION,
    scFileStatus.getLen(),
    scFileStatus.getModificationTime());
localResources.put("mydir", r);
{code}
And the YARN localizer indeed downloads the HDFS dir to the local disk for DistributedShell.
{code:java}
yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir
1.py  2.py  dir1  test_kill9.sh
yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/container_1543898305524_0004_01_01/ -l
total 20
lrwxrwxrwx 1 yarn hadoop 111 Dec  5 10:08 AppMaster.jar -> /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/10/AppMaster.jar
lrwxrwxrwx 1 yarn hadoop 103 Dec  5 10:08 mydir -> /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir
{code}
But the bad news is that Submarine uses the YARN native server, which is not aware of this YARN capability and blocks it.
{code:java}
2018-12-05 10:06:40,286 ERROR client.ApiServiceClient: srcFile=hdfs://192.168.50.191:9000/user/yarn/submarine/jobs/tf-job-001/staging/mydir is a directory, which is not supported.
{code}
Two solutions are ahead of us at present:
1. Fix the improper handling of directories in the native service and then implement this feature on top of it.
2. Go ahead with our download, zip and upload approach, which is more complex, and refactor it after 1 is done.

Any thoughts?
[jira] [Comment Edited] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16708874#comment-16708874 ]

Zhankun Tang edited comment on YARN-8714 at 12/4/18 3:32 PM:
-------------------------------------------------------------

[~leftnoteasy], [~liuxun323],

While refining the patch, I found that the YARN localizer seems to be able to localize a *remote directory (hdfs, s3, etc.)*. In FSDownload.java#downloadAndUnpack, it uses "FileUtil.copy", which can handle directories. Depending on this can greatly simplify our implementation: there is no need to download a remote dir or zip a local dir anymore.

We may still need a configuration to limit the size of the remote file/dir to be localized to the container.

I will verify and update the patch tomorrow.


was (Author: tangzhankun):
[~leftnoteasy], [~liuxun323],

While refining the patch, I found that the YARN localizer seems to be able to localize a *remote directory (hdfs, s3, etc.)*. In FSDownload.java#downloadAndUnpack, it uses "FileUtil.copy", which can handle directories. Depending on this can greatly simplify our implementation: there is no need to download a remote dir or zip a local dir anymore.

I will verify and update the patch tomorrow.


> [Submarine] Support files/tarballs to be localized for a training job.
> ----------------------------------------------------------------------
>
>                 Key: YARN-8714
>                 URL: https://issues.apache.org/jira/browse/YARN-8714
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Wangda Tan
>            Assignee: Zhankun Tang
>            Priority: Major
>         Attachments: YARN-8714-WIP1-trunk-001.patch, YARN-8714-WIP1-trunk-002.patch, YARN-8714-trunk.001.patch, YARN-8714-trunk.002.patch
>
>
> See [https://docs.google.com/document/d/199J4pB3blqgV9SCNvBbTqkEoQdjoyGMjESV4MktCo0k/edit#heading=h.vkxp9edl11m7],
> {{job run --localization ...}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
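The size limit mentioned in the comment above could take roughly the following shape. This is a sketch under stated assumptions, not the patch: `LocalizationSizeGuard`, `totalBytes`, `checkWithinLimit` and the `maxBytes` parameter are hypothetical names, and the code measures a local path with the JDK only. For a remote path, Hadoop's `FileSystem#getContentSummary(Path)` and `ContentSummary#getLength()` might play the role of `totalBytes`.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical guard for illustration only; not part of Submarine or YARN.
final class LocalizationSizeGuard {

  /** Sums the sizes of all regular files under root (root may also be a single file). */
  static long totalBytes(Path root) throws IOException {
    try (Stream<Path> walk = Files.walk(root)) {
      return walk.filter(Files::isRegularFile)
                 .mapToLong(p -> p.toFile().length())
                 .sum();
    }
  }

  /** Rejects a file or directory whose total size exceeds maxBytes. */
  static void checkWithinLimit(Path root, long maxBytes) throws IOException {
    long size = totalBytes(root);
    if (size > maxBytes) {
      throw new IllegalArgumentException(
          root + " is " + size + " bytes, which exceeds the limit of " + maxBytes + " bytes");
    }
  }
}
```

Running the check before submission keeps an oversized directory from ever reaching the localizer.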
[jira] [Comment Edited] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16708874#comment-16708874 ]

Zhankun Tang edited comment on YARN-8714 at 12/4/18 3:29 PM:
-------------------------------------------------------------

[~leftnoteasy], [~liuxun323],

While refining the patch, I found that the YARN localizer seems to be able to localize a *remote directory (hdfs, s3, etc.)*. In FSDownload.java#downloadAndUnpack, it uses "FileUtil.copy", which can handle directories. Depending on this can greatly simplify our implementation: there is no need to download a remote dir or zip a local dir anymore.

I will verify and update the patch tomorrow.


was (Author: tangzhankun):
[~leftnoteasy], [~liuxun323],

While refining the patch, I found that the YARN localizer seems to be able to localize a *remote directory*. In FSDownload.java#downloadAndUnpack, it uses "FileUtil.copy", which can handle directories. Depending on this can greatly simplify our implementation: there is no need to download a remote dir or zip a local dir anymore.

I will verify and update the patch tomorrow.
[jira] [Comment Edited] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16704227#comment-16704227 ]

Zhankun Tang edited comment on YARN-8714 at 11/30/18 6:47 AM:
--------------------------------------------------------------

{quote}1) Why is this hardcoded to handle {{hdfs://}}? It could be s3, abfs, gs, etc. I'd definitely prefer to make this more general. This includes code and comments.
{quote}
{color:#d04437}Zhankun=>{color} Yeah. Agree with this.
{quote}Because downloading files from a remote fs could be risky. What if the user accidentally specified "/"? If a user needs to localize a directory of files, he or she should tar or zip it before uploading to HDFS.
{quote}
{color:#d04437}Zhankun=>{color} Yeah. Maybe we could do a capacity check to avoid the risk. Let's confirm this requirement with [~liuxun323]. Is downloading from a remote directory a good choice considering the potential security issue?


was (Author: tangzhankun):
{quote}1) Why is this hardcoded to handle {{hdfs://}}? It could be s3, abfs, gs, etc. I'd definitely prefer to make this more general. This includes code and comments.
{quote}
{color:#d04437}Zhankun=>{color} Yeah. Agree with this.
{quote}Because downloading files from a remote fs could be risky. What if the user accidentally specified "/"? If a user needs to localize a directory of files, he or she should tar or zip it before uploading to HDFS.
{quote}
{color:#d04437}Zhankun=>{color} Yeah. Let's confirm this requirement with [~liuxun323]. Is downloading from a remote directory a good choice considering the potential security issue?
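Both review points above (do not hardcode {{hdfs://}}, and guard against an accidental "/") can be illustrated without naming any particular file system. The following is a minimal sketch that assumes a path counts as remote when it carries any URI scheme other than `file`; `RemotePathCheck`, `isRemote` and `isRoot` are hypothetical names, not Submarine APIs.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper for illustration only; not part of Submarine or YARN.
final class RemotePathCheck {

  /** True when the path has a URI scheme other than "file" (hdfs, s3a, abfs, gs, ...). */
  static boolean isRemote(String path) {
    try {
      String scheme = new URI(path).getScheme();
      return scheme != null && !scheme.equalsIgnoreCase("file");
    } catch (URISyntaxException e) {
      // Not a valid URI (for example, it contains spaces): treat it as a local path.
      return false;
    }
  }

  /** True when the path points at the root of its file system, e.g. "/" or "hdfs:///". */
  static boolean isRoot(String path) {
    try {
      String p = new URI(path).getPath();
      return p == null || p.isEmpty() || p.equals("/");
    } catch (URISyntaxException e) {
      return false;
    }
  }
}
```

A caller might reject any source for which `isRoot` returns true before attempting a download. One caveat: a Windows-style path such as `C:/data` parses as a URI with scheme `C`, so this sketch would need extra handling on that platform.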
[jira] [Comment Edited] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692813#comment-16692813 ]

Xun Liu edited comment on YARN-8714 at 11/20/18 8:22 AM:
---------------------------------------------------------

[~tangzhankun], I think there is no problem.

[~leftnoteasy], would {color:#FF}--mount{color} be a more suitable name for the {color:#FF}--localizations{color} parameter?


was (Author: liuxun323):
[~tangzhankun], I think there is no problem.
[jira] [Comment Edited] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692696#comment-16692696 ]

Zhankun Tang edited comment on YARN-8714 at 11/20/18 5:58 AM:
--------------------------------------------------------------

[~sunilg], [~liuxun323], [~leftnoteasy], [~yuan_zac]. Updated a new version. More tests are in progress. But please help to review in case our direction is not right.

For all ps and worker components, it provides --localization, whose first part can be either an HDFS or a local directory (it may be downloaded, zipped in a temp dir and then uploaded to HDFS). It also supports mounting into the container with a permission (default is rw).

Goals are:
{code:java}
# Download, zip and upload the zip file to HDFS, then mount it as the "mydir" folder in the container's work dir: {PWD}/mydir
--localization hdfs:///user/yarn/mydir:.:ro
# Zip the local dir and upload the zip file to HDFS, then mount it at /opt/dir2 in the container
--localization /user/yarn/mydir2:/opt/dir2
# Upload a script to HDFS and mount it at {PWD}/script1.py in the container
--localization /user/yarn/script1.py:.:ro
# Upload and mount at /opt/script2.py in the container
--localization /user/yarn/script1.py:/opt/script2.py:rw
{code}


was (Author: tangzhankun):
[~liuxun323], [~leftnoteasy], [~yuan_zac]. Updated a new version. More tests are in progress. But please help to review in case our direction is not right.

For all ps and worker components, it provides --localization, whose first part can be either an HDFS or a local directory (it may be downloaded, zipped in a temp dir and then uploaded to HDFS). It also supports mounting into the container with a permission (default is rw).

Goals are:
{code:java}
# Download, zip and upload the zip file to HDFS, then mount it as the "mydir" folder in the container's work dir: {PWD}/mydir
--localization hdfs:///user/yarn/mydir:.:ro
# Zip the local dir and upload the zip file to HDFS, then mount it at /opt/dir2 in the container
--localization /user/yarn/mydir2:/opt/dir2
# Upload a script to HDFS and mount it at {PWD}/script1.py in the container
--localization /user/yarn/script1.py:.:ro
# Upload and mount at /opt/script2.py in the container
--localization /user/yarn/script1.py:/opt/script2.py:rw
{code}
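The four goal examples above imply a `<source>:<container path>[:ro|rw]` syntax in which the source itself may contain colons (as in `hdfs:///...`). One way to parse it is sketched below. This is an illustration under that assumption, not the parser from the patch: `LocalizationSpec`, `parse` and `resolve` are hypothetical names, and a source file whose name ends in ":ro" or ":rw" is not handled.

```java
// Hypothetical parser for illustration only; not part of Submarine or YARN.
final class LocalizationSpec {
  final String source;         // local path or remote URI, e.g. hdfs:///user/yarn/mydir
  final String containerPath;  // "." or a path inside the container
  final boolean readOnly;      // false means rw, the default stated in the comment above

  private LocalizationSpec(String source, String containerPath, boolean readOnly) {
    this.source = source;
    this.containerPath = containerPath;
    this.readOnly = readOnly;
  }

  /** Parses "<source>:<container path>[:ro|rw]". */
  static LocalizationSpec parse(String arg) {
    String rest = arg;
    boolean readOnly = false;
    if (rest.endsWith(":ro")) {
      readOnly = true;
      rest = rest.substring(0, rest.length() - 3);
    } else if (rest.endsWith(":rw")) {
      rest = rest.substring(0, rest.length() - 3);
    }
    // The last ':' separates source and container path; a ':' followed by "//"
    // belongs to a URI scheme, which means the container path is missing.
    int sep = rest.lastIndexOf(':');
    if (sep <= 0 || sep == rest.length() - 1 || rest.startsWith("//", sep + 1)) {
      throw new IllegalArgumentException(
          "Expected <source>:<container path>[:ro|rw], got: " + arg);
    }
    return new LocalizationSpec(rest.substring(0, sep), rest.substring(sep + 1), readOnly);
  }

  /** Maps "." to workDir/<source base name>; any other container path is returned as-is. */
  String resolve(String workDir) {
    if (!containerPath.equals(".")) {
      return containerPath;
    }
    String trimmed = source.endsWith("/") ? source.substring(0, source.length() - 1) : source;
    return workDir + "/" + trimmed.substring(trimmed.lastIndexOf('/') + 1);
  }
}
```

Splitting on the last colon rather than the first is what keeps `hdfs://host:9000/dir:/opt/dir` intact as a source, at the cost of requiring the container path to be present.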
[jira] [Comment Edited] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692696#comment-16692696 ]

Zhankun Tang edited comment on YARN-8714 at 11/20/18 5:55 AM:
--------------------------------------------------------------

[~liuxun323], [~leftnoteasy], [~yuan_zac]. Updated a new version. More tests are in progress. But please help to review in case our direction is not right.

For all ps and worker components, it provides --localization, whose first part can be either an HDFS or a local directory (it may be downloaded, zipped in a temp dir and then uploaded to HDFS). It also supports mounting into the container with a permission (default is rw).

Goals are:
{code:java}
# Download, zip and upload the zip file to HDFS, then mount it as the "mydir" folder in the container's work dir: {PWD}/mydir
--localization hdfs:///user/yarn/mydir:.:ro
# Zip the local dir and upload the zip file to HDFS, then mount it at /opt/dir2 in the container
--localization /user/yarn/mydir2:/opt/dir2
# Upload a script to HDFS and mount it at {PWD}/script1.py in the container
--localization /user/yarn/script1.py:.:ro
# Upload and mount at /opt/script2.py in the container
--localization /user/yarn/script1.py:/opt/script2.py:rw
{code}


was (Author: tangzhankun):
[~liuxun323], [~leftnoteasy], [~yuan_zac]. Updated a new version. More tests are needed. But please help to review in case the changes are too big.

For all ps and worker components, it provides --localization, whose first part can be either an HDFS or a local directory (it may be downloaded, zipped in a temp dir and then uploaded to HDFS). It also supports mounting into the container with a permission (default is rw).

Goals are:
{code:java}
# Download, zip and upload the zip file to HDFS, then mount it as the "mydir" folder in the container's work dir: {PWD}/mydir
--localization hdfs:///user/yarn/mydir:.:ro
# Zip the local dir and upload the zip file to HDFS, then mount it at /opt/dir2 in the container
--localization /user/yarn/mydir2:/opt/dir2
# Upload a script to HDFS and mount it at {PWD}/script1.py in the container
--localization /user/yarn/script1.py:.:ro
# Upload and mount at /opt/script2.py in the container
--localization /user/yarn/script1.py:/opt/script2.py:rw
{code}
[jira] [Comment Edited] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16684794#comment-16684794 ]

Xun Liu edited comment on YARN-8714 at 11/14/18 9:02 AM:
---------------------------------------------------------

1) Support multiple files or directories.
2) The original state must be maintained, such as the folder's subdirectory structure and whether a file is a zip file or a normal file; do not make any changes.
3) Mount the specified file or folder at an absolute path in the container. Because the user can customize the WORKDIR of the container through the Dockerfile, and the containers are dedicated and custom made, the user knows exactly where to mount the file.
4) Stay consistent with Docker usage: use `{color:#ff}:{color}` to separate the source directory from the destination directory.
5) Parameter format: -localizations hdfs:///user/yarn{color:#ff}:{color}/absolute/path

Requirements document: [https://docs.google.com/document/d/16YN8Kjmxt1Ym3clx5pDnGNXGajUT36hzQxjaik1cP4A/edit#heading=h.s07ukakieg7q]


was (Author: liuxun323):
1) Support multiple files or directories.
2) The original state must be maintained, such as the folder's subdirectory structure and whether a file is a zip file or a normal file; do not make any changes.
3) Mount the specified file or folder at an absolute path in the container. Because the user can customize the WORKDIR of the container through the Dockerfile, and the containers are dedicated and custom made, the user knows exactly where to mount the file.
4) Stay consistent with Docker usage: use `{color:#FF}:{color}` to separate the source directory from the destination directory.
5) Parameter format: -localizations hdfs:///user/yarn{color:#FF}:{color}/absolute/path
[jira] [Comment Edited] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16684794#comment-16684794 ]

Xun Liu edited comment on YARN-8714 at 11/14/18 7:52 AM:
---------------------------------------------------------

1) Support multiple files or directories.
2) The original state must be maintained, such as the folder's subdirectory structure and whether a file is a zip file or a normal file; do not make any changes.
3) Mount the specified file or folder at an absolute path in the container. Because the user can customize the WORKDIR of the container through the Dockerfile, and the containers are dedicated and custom made, the user knows exactly where to mount the file.
4) Stay consistent with Docker usage: use `{color:#FF}:{color}` to separate the source directory from the destination directory.
5) Parameter format: -localizations hdfs:///user/yarn{color:#FF}:{color}/absolute/path


was (Author: liuxun323):
# Support files and folders
# Support both the HDFS type (with the HDFS:// prefix) and the local file type
# Keep it intact: having Submarine automatically decompress an uploaded compressed package into the container is not suitable, because a file that I need to keep in compressed package format would be destroyed. It also introduces ambiguity.
# Parameter format: {color:#FF}-localizations hdfs:///user/yarn>{color:#ff}.{color}{color} # indicates the current execution path of the container
# Parameter format: {color:#FF}-localizations hdfs:///user/yarn>./abc{color} # indicates the abc folder under the current execution path of the container (Submarine packs the files under hdfs:///user/yarn into an abc.tar.gz archive, extracts the abc folder when starting the container, and then mounts it into the container)
[jira] [Comment Edited] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16684794#comment-16684794 ]

Xun Liu edited comment on YARN-8714 at 11/13/18 7:01 AM:
---------------------------------------------------------

# Support files and folders
# Support both the HDFS type (with the HDFS:// prefix) and the local file type
# Keep it intact: having Submarine automatically decompress an uploaded compressed package into the container is not suitable, because a file that I need to keep in compressed package format would be destroyed. It also introduces ambiguity.
# Parameter format: {color:#FF}-localizations hdfs:///user/yarn>{color:#ff}.{color}{color} # indicates the current execution path of the container
# Parameter format: {color:#FF}-localizations hdfs:///user/yarn>./abc{color} # indicates the abc folder under the current execution path of the container (Submarine packs the files under hdfs:///user/yarn into an abc.tar.gz archive, extracts the abc folder when starting the container, and then mounts it into the container)


was (Author: liuxun323):
# Support files and folders
# Support both the HDFS type (with the HDFS:// prefix) and the local file type
# Keep it intact: having Submarine automatically decompress an uploaded compressed package into the container is not suitable, because a file that I need to keep in compressed package format would be destroyed. It also introduces ambiguity.
# Parameter format: {color:#FF}--localizations hdfs:///user/yarn->.{color} # indicates the current execution path of the container
# Parameter format: {color:#FF}--localizations hdfs:///user/yarn->./abc{color} # indicates the abc folder under the current execution path of the container (Submarine packs the files under hdfs:///user/yarn into an abc.tar.gz archive, extracts the abc folder when starting the container, and then mounts it into the container)
[jira] [Comment Edited] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16684664#comment-16684664 ]

Zac Zhou edited comment on YARN-8714 at 11/13/18 3:14 AM:
----------------------------------------------------------

Looks great. It would be convenient for notebook apps, like Zeppelin, to submit the job if local files are supported.

I'm not sure whether the parameter name, localization, is OK. Would it be easier to understand if we used a parameter like "files" or "libjars", as in a MapReduce job?

Thanks,


was (Author: yuan_zac):
Looks great. It would be convenient for notebook apps, like Zeppelin, to submit the job if local files are supported.

I'm not sure whether the parameter name, localization, is OK. Would it be easier to understand if we used a parameter like "--files" or "--libjars", as in a MapReduce job?

Thanks,
[jira] [Comment Edited] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16684664#comment-16684664 ]

Zac Zhou edited comment on YARN-8714 at 11/13/18 3:14 AM:

Looks great. It would be convenient for notebook apps, like Zeppelin, to submit the job if local files are supported.

I'm not sure if the parameter name, localization, is OK. Would it be easier to understand if we used a parameter like "files" or "libjars", as used in MapReduce jobs?

Thanks,

was (Author: yuan_zac):
Looks great. It would be convenient for notebook apps, like Zeppelin, to submit the job if local files are supported.

I'm not sure if the parameter name, localization, is OK. Would it be easier to understand if we used a parameter like "files" or "libjars", as used in a MapReduce job?

Thanks,
[jira] [Comment Edited] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16684664#comment-16684664 ]

Zac Zhou edited comment on YARN-8714 at 11/13/18 3:13 AM:

Looks great. It would be convenient for notebook apps, like Zeppelin, to submit the job if local files are supported.

I'm not sure if the parameter name, localization, is OK. Would it be easier to understand if we used a parameter like "--files" or "--libjars", as used in a MapReduce job?

Thanks,

was (Author: yuan_zac):
Looks great. It would be convenient for notebook apps, like Zeppelin, to submit the job if local files are supported.

I'm not sure if the parameter name, localization, is OK. Would it be easier to understand if we used a parameter like "--files" or "--libjars", as used in a MapReduce job?
[jira] [Comment Edited] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677533#comment-16677533 ]

Zhankun Tang edited comment on YARN-8714 at 11/7/18 2:07 AM:

[~leftnoteasy], [~liuxun323], [~yuan_zac]

The "->" means we want the original file on the left to be localized to the named file on the right.

The example below means the job wants two files to be localized. The first, "hdfs:///user/yarn/script1.py", is localized into the worker's dir and named "algorithm1.py". The second, the local file "/opt/script2.py", is localized into the worker's dir as "script2.py". The underlying difference is that we don't upload "hdfs:///.." to HDFS again.
{code:java}
--localizations hdfs:///user/yarn/script1.py->algorithm1.py /opt/script2.py->script2.py
{code}

was (Author: tangzhankun):
[~leftnoteasy], [~liuxun323],

The "->" means we want the original file on the left to be localized to the named file on the right.

The example below means the job wants two files to be localized. The first, "hdfs:///user/yarn/script1.py", is localized into the worker's dir and named "algorithm1.py". The second, the local file "/opt/script2.py", is localized into the worker's dir as "script2.py". The underlying difference is that we don't upload "hdfs:///.." to HDFS again.
{code:java}
--localizations hdfs:///user/yarn/script1.py->algorithm1.py /opt/script2.py->script2.py
{code}
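The "->" convention described in the comment above can be sketched as a small parser. This is only an illustration under stated assumptions, not the code in the attached patch: the class LocalizationSketch and its members are hypothetical, and the rule that a source with no URI scheme (or a "file" scheme) lives on the client and needs an upload is an assumption drawn from the remark that "hdfs:///.." is not uploaded to HDFS again.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical holder for one "source->destination" pair; not a Submarine class. */
final class LocalizationSketch {
  final String source;       // left of "->": a remote URI or a client-side path
  final String destination;  // right of "->": file name inside the worker's dir

  private LocalizationSketch(String source, String destination) {
    this.source = source;
    this.destination = destination;
  }

  /** Splits one argument at the first "->" and rejects an empty half. */
  static LocalizationSketch parse(String arg) {
    int idx = arg.indexOf("->");
    if (idx <= 0 || idx + 2 >= arg.length()) {
      throw new IllegalArgumentException(
          "Expected <source>-><destination>, got: " + arg);
    }
    return new LocalizationSketch(arg.substring(0, idx), arg.substring(idx + 2));
  }

  /** Parses every value that follows the --localizations flag. */
  static List<LocalizationSketch> parseAll(String... args) {
    List<LocalizationSketch> result = new ArrayList<>();
    for (String arg : args) {
      result.add(parse(arg));
    }
    return result;
  }

  /**
   * Assumption: a source without a scheme, or with the "file" scheme,
   * is on the client and must be uploaded before localization;
   * a source such as hdfs:///... is already remote and is used in place.
   */
  boolean needsUpload() {
    try {
      String scheme = new URI(source).getScheme();
      return scheme == null || scheme.equals("file");
    } catch (URISyntaxException e) {
      throw new IllegalArgumentException("Bad source: " + source, e);
    }
  }

  public static void main(String[] args) {
    for (LocalizationSketch l : parseAll(
        "hdfs:///user/yarn/script1.py->algorithm1.py",
        "/opt/script2.py->script2.py")) {
      System.out.println(l.source + " => " + l.destination
          + " (upload first: " + l.needsUpload() + ")");
    }
  }
}
```

Splitting at the first "->" keeps the sketch short. A source path that itself contains "->" would need an escaping rule, which the comment does not define.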
[jira] [Comment Edited] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677533#comment-16677533 ]

Zhankun Tang edited comment on YARN-8714 at 11/7/18 2:06 AM:

[~leftnoteasy], [~liuxun323],

The "->" means we want the original file on the left to be localized to the named file on the right.

The example below means the job wants two files to be localized. The first, "hdfs:///user/yarn/script1.py", is localized into the worker's dir and named "algorithm1.py". The second, the local file "/opt/script2.py", is localized into the worker's dir as "script2.py". The underlying difference is that we don't upload "hdfs:///.." to HDFS again.
{code:java}
--localizations hdfs:///user/yarn/script1.py->algorithm1.py /opt/script2.py->script2.py
{code}

was (Author: tangzhankun):
[~leftnoteasy],

The "->" means we want the original file on the left to be localized to the named file on the right.

The example below means the job wants two files to be localized. The first, "hdfs:///user/yarn/script1.py", is localized into the worker's dir and named "algorithm1.py". The second, the local file "/opt/script2.py", is localized into the worker's dir as "script2.py". The underlying difference is that we don't upload "hdfs:///.." to HDFS again.
{code:java}
--localizations hdfs:///user/yarn/script1.py->algorithm1.py /opt/script2.py->script2.py
{code}
[jira] [Comment Edited] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677533#comment-16677533 ]

Zhankun Tang edited comment on YARN-8714 at 11/7/18 2:05 AM:

[~leftnoteasy],

The "->" means we want the original file on the left to be localized to the named file on the right.

The example below means the job wants two files to be localized. The first, "hdfs:///user/yarn/script1.py", is localized into the worker's dir and named "algorithm1.py". The second, the local file "/opt/script2.py", is localized into the worker's dir as "script2.py". The underlying difference is that we don't upload "hdfs:///.." to HDFS again.
{code:java}
--localizations hdfs:///user/yarn/script1.py->algorithm1.py /opt/script2.py->script2.py
{code}

was (Author: tangzhankun):
[~leftnoteasy],

The "->" means we want the original file on the left to be localized to the named file on the right.

The example below means the job wants two files to be localized. The first, "hdfs:///user/yarn/script1.py", is localized into the worker's dir and named "algorithm1.py". The second, the local file "/opt/script2.py", is localized into the worker's dir as "script2.py". The underlying difference is that we don't upload "hdfs:///.." to HDFS again.
{code:java}
--localization hdfs:///user/yarn/script1.py->algorithm1.py /opt/script2.py->script2.py
{code}
[jira] [Comment Edited] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677533#comment-16677533 ]

Zhankun Tang edited comment on YARN-8714 at 11/7/18 2:04 AM:

[~leftnoteasy],

The "->" means we want the original file on the left to be localized to the named file on the right.

The example below means the job wants two files to be localized. The first, "hdfs:///user/yarn/script1.py", is localized into the worker's dir and named "algorithm1.py". The second, the local file "/opt/script2.py", is localized into the worker's dir as "script2.py". The underlying difference is that we don't upload "hdfs:///.." to HDFS again.
{code:java}
--localization hdfs:///user/yarn/script1.py->algorithm1.py /opt/script2.py->script2.py
{code}

was (Author: tangzhankun):
[~leftnoteasy],

The "->" means we want the file on the left to be localized to a name on the right.

The example below means the job wants two files to be localized. The first, "hdfs:///user/yarn/script1.py", is localized into the worker's dir and named "algorithm1.py". The second, the local file "/opt/script2.py", is localized into the worker's dir as "script2.py". The underlying difference is that we don't upload "hdfs:///.." to HDFS again.
{code:java}
--localization hdfs:///user/yarn/script1.py->algorithm1.py /opt/script2.py->script2.py
{code}