[ https://issues.apache.org/jira/browse/OOZIE-2402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15027681#comment-15027681 ]
Illya Yalovyy commented on OOZIE-2402:
--------------------------------------
[~rkanter],
Thank you for the prompt review.
Please see my notes below:
1. Will fix it.
2. I'll update related documentation.
3. Will fix it.
4. Will fix it.
5. {{IOUtils.copyBytes(in, out, fs.getConf(), true);}} closes both streams
internally. The {{close()}} call in the catch block is needed only for the case
where {{out = fs.create(new Path(dstPath, file.getName()));}} fails with an
exception.
6. Will fix it.
7. I wanted to avoid the overhead of the Hadoop FS implementation, but I will
run some tests to measure the actual difference. If it is not significant, I
will use {{fs.copyFromLocalFile}} to copy individual files.
8. Will add a unit test.
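
To illustrate note 5, here is a minimal sketch that uses only JDK classes, not the code from the patch. {{copyAndClose}} is a hypothetical stand-in for a copy helper that closes both streams itself, and {{OutputOpener}} is a hypothetical stand-in for the {{fs.create(...)}} call. It shows why the cleanup in the catch block only matters when opening the destination fails.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class CopyCloseSketch {

    /** Hypothetical stand-in for fs.create(...): opens the destination stream. */
    interface OutputOpener {
        OutputStream open() throws IOException;
    }

    /** JDK-only stand-in for a copy helper that closes both streams when done. */
    static void copyAndClose(InputStream in, OutputStream out) throws IOException {
        // try-with-resources closes 'o' and then 'i', on success and on failure.
        try (InputStream i = in; OutputStream o = out) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = i.read(buf)) != -1) {
                o.write(buf, 0, n);
            }
        }
    }

    /** Copies one input to a destination that is opened lazily. */
    static void copyOne(InputStream in, OutputOpener opener) throws IOException {
        OutputStream out = null;
        try {
            out = opener.open();    // may throw before the copy helper ever runs
            copyAndClose(in, out);  // closes both streams itself
        } catch (IOException e) {
            // Streams can still be open here only if opener.open() failed;
            // after copyAndClose these calls are harmless repeated closes.
            closeQuietly(in);
            closeQuietly(out);
            throw e;
        }
    }

    private static void closeQuietly(Closeable c) {
        if (c == null) {
            return;
        }
        try {
            c.close();
        } catch (IOException ignored) {
            // best-effort cleanup
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        copyOne(new ByteArrayInputStream("sharelib".getBytes()), () -> sink);
        System.out.println(sink.size() + " bytes copied");
    }
}
```

If opening the destination throws, the input stream is the only resource that is still open, and the catch block is what releases it.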
> oozie-setup.sh sharelib create takes a long time on large clusters
> ------------------------------------------------------------------
>
> Key: OOZIE-2402
> URL: https://issues.apache.org/jira/browse/OOZIE-2402
> Project: Oozie
> Issue Type: Improvement
> Components: tools
> Affects Versions: 4.2.0
> Reporter: Illya Yalovyy
> Assignee: Illya Yalovyy
> Attachments: OOZIE-2402-1.patch
>
>
> When a cluster has 256+ nodes, it can take up to 5 minutes to create a sharelib.
> Copying the tarball itself takes only around 10 seconds. It seems that
> performance could be improved by loading files concurrently in many threads.
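
The idea in the description can be sketched roughly as follows. This is not the attached patch: it uses only the JDK, {{Files.copy}} on the local file system stands in for the per-file upload to HDFS, and the class name, method name, and thread count are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class ConcurrentCopySketch {

    /** Copies every regular file under srcDir to dstDir using a fixed thread pool. */
    static int copyTree(Path srcDir, Path dstDir, int threads)
            throws IOException, InterruptedException, ExecutionException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(srcDir)) {
            files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
        }

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Path>> pending = new ArrayList<>();
            for (Path src : files) {
                Path dst = dstDir.resolve(srcDir.relativize(src));
                // Each file is an independent task, so slow files do not block the rest.
                pending.add(pool.submit(() -> {
                    Files.createDirectories(dst.getParent());
                    return Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
                }));
            }
            for (Future<Path> f : pending) {
                f.get(); // propagates the first failure as an ExecutionException
            }
            return pending.size();
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        Path src = Files.createTempDirectory("sharelib-src");
        Path dst = Files.createTempDirectory("sharelib-dst");
        Files.write(src.resolve("example.jar"), new byte[] {1, 2, 3});
        System.out.println(copyTree(src, dst, 4) + " file(s) copied");
    }
}
```

Whether this pays off on a real cluster depends on the per-file overhead of the target file system, so the thread count is something to measure rather than assume.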
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)