[jira] [Comment Edited] (YARN-2185) Use pipes when localizing archives
[ https://issues.apache.org/jira/browse/YARN-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16337005#comment-16337005 ]

Miklos Szegedi edited comment on YARN-2185 at 1/24/18 6:30 AM:
---------------------------------------------------------------

I opened YARN-7803 for the unit test error. It is not related to this patch.

was (Author: miklos.szeg...@cloudera.com):
    I opened YARN-7803 for the unit test error.

> Use pipes when localizing archives
> ----------------------------------
>
>                 Key: YARN-2185
>                 URL: https://issues.apache.org/jira/browse/YARN-2185
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager
>    Affects Versions: 2.4.0
>            Reporter: Jason Lowe
>            Assignee: Miklos Szegedi
>            Priority: Major
>         Attachments: YARN-2185.000.patch, YARN-2185.001.patch,
> YARN-2185.002.patch, YARN-2185.003.patch, YARN-2185.004.patch,
> YARN-2185.005.patch, YARN-2185.006.patch, YARN-2185.007.patch,
> YARN-2185.008.patch, YARN-2185.009.patch
>
>
> Currently the nodemanager downloads an archive to a local file, unpacks it,
> and then removes it. It would be more efficient to stream the data as it's
> being unpacked to avoid both the extra disk space requirements and the
> additional disk activity from storing the archive.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-2185) Use pipes when localizing archives
[ https://issues.apache.org/jira/browse/YARN-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302123#comment-16302123 ]

Miklos Szegedi edited comment on YARN-2185 at 12/23/17 12:10 AM:
-----------------------------------------------------------------

Attaching my suggestion how to solve this. The code streams HDFS as standard input to the tar and gzip commands. It handles Windows as well. As an addition I create the temporary directory with permissions 700 instead of 755. I do not create any additional temporary directories for extraction, one is enough. A difference is that I use jar command for zips as well, so that it handles Windows properly. Also I added an additional switch to be able to disable the modification time check specifying -1 as the timestamp. I also do parallel copy for directory localization to leverage the distributed storage in HDFS.

was (Author: miklos.szeg...@cloudera.com):
    Attaching my suggestion how to solve this. The code streams HDFS as standard input to the tar and gzip commands. It handles Windows as well. As an addition I create temporary files with permissions 700 instead of 755. I do not create any additional temporary directories for extraction, one is enough. A difference is that I use jar command for zips as well, so that it handles Windows properly. Also I added an additional switch to be able to disable the modification time check specifying -1 as the timestamp. I also do parallel copy for directory localization to leverage the distributed storage in HDFS.

> Use pipes when localizing archives
> ----------------------------------
>
>                 Key: YARN-2185
>                 URL: https://issues.apache.org/jira/browse/YARN-2185
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager
>    Affects Versions: 2.4.0
>            Reporter: Jason Lowe
>            Assignee: Miklos Szegedi
>         Attachments: YARN-2185.000.patch
>
>
> Currently the nodemanager downloads an archive to a local file, unpacks it,
> and then removes it. It would be more efficient to stream the data as it's
> being unpacked to avoid both the extra disk space requirements and the
> additional disk activity from storing the archive.
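
The streaming idea in the comment above (unpack the archive while it is being read, so the archive file itself never lands on local disk) can be illustrated with a minimal sketch. This is not the patch's code: the patch pipes the HDFS stream into the external tar and gzip commands, whereas this sketch swaps in Python's tarfile module in stream mode. The in-memory buffer stands in for an HDFS input stream, and the names make_sample_archive, stream_extract and demo are hypothetical.

```python
import io
import os
import tarfile
import tempfile


def make_sample_archive() -> bytes:
    """Build a small .tar.gz in memory to stand in for a remote archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        payload = b"hello from the archive\n"
        info = tarfile.TarInfo(name="data/hello.txt")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def stream_extract(source, dest_dir: str) -> list:
    """Unpack a gzip-compressed tar from a forward-only stream into dest_dir.

    Mode "r|gz" reads the stream sequentially and never seeks, so the
    archive itself is never stored on local disk.
    """
    names = []
    root = os.path.realpath(dest_dir)
    with tarfile.open(fileobj=source, mode="r|gz") as tar:
        for member in tar:
            # Refuse entries whose name would resolve outside the destination.
            target = os.path.realpath(os.path.join(root, member.name))
            if os.path.commonpath([root, target]) != root:
                raise ValueError("unsafe path in archive: " + member.name)
            tar.extract(member, path=root)
            names.append(member.name)
    return names


def demo() -> str:
    # mkdtemp creates a directory accessible only by its owner (mode 0700).
    dest = tempfile.mkdtemp(prefix="localize-")
    extracted = stream_extract(io.BytesIO(make_sample_archive()), dest)
    with open(os.path.join(dest, extracted[0]), "rb") as f:
        return f.read().decode()
```

Here a single owner-only temporary directory is the only thing created on disk besides the extracted files, which mirrors the "one is enough" point in the comment. A real localizer would pass the remote file's input stream as source instead of the in-memory buffer.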
[jira] [Comment Edited] (YARN-2185) Use pipes when localizing archives
[ https://issues.apache.org/jira/browse/YARN-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302123#comment-16302123 ]

Miklos Szegedi edited comment on YARN-2185 at 12/23/17 12:08 AM:
-----------------------------------------------------------------

Attaching my suggestion how to solve this. The code streams HDFS as standard input to the tar and gzip commands. It handles Windows as well. As an addition I create temporary files with permissions 700 instead of 755. I do not create any additional temporary directories for extraction, one is enough. A difference is that I use jar command for zips as well, so that it handles Windows properly. Also I added an additional switch to be able to disable the modification time check specifying -1 as the timestamp. I also do parallel copy for directory localization to leverage the distributed storage in HDFS.

was (Author: miklos.szeg...@cloudera.com):
    Attaching my suggestion how to solve this. The code streams HDFS as standard input to the tar and gzip commands. It handles Windows as well. As an addition I create temporary files with permissions 700 instead of 755. I do not create any additional temporary directories for extraction, one is enough. A difference is that I use jar command for zips as well, so that it handles Windows properly. Also I added an additional switch to be able to disable the modification time check specifying -1 as the timestamp.

> Use pipes when localizing archives
> ----------------------------------
>
>                 Key: YARN-2185
>                 URL: https://issues.apache.org/jira/browse/YARN-2185
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager
>    Affects Versions: 2.4.0
>            Reporter: Jason Lowe
>            Assignee: Miklos Szegedi
>         Attachments: YARN-2185.000.patch
>
>
> Currently the nodemanager downloads an archive to a local file, unpacks it,
> and then removes it. It would be more efficient to stream the data as it's
> being unpacked to avoid both the extra disk space requirements and the
> additional disk activity from storing the archive.