[ https://issues.apache.org/jira/browse/HADOOP-12747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15135167#comment-15135167 ]
Chris Trezzo commented on HADOOP-12747:
---------------------------------------
Thanks [~sjlee0] for breaking down the two options. Currently I am in favor of
option 1. I think it is valuable to keep the libjars syntax similar to the
classpath syntax, as a large number of users are already familiar with it.
Another plus of expanding the wildcard before the conf is written is that we
get the per-path file size information in the config. Additionally, the conf is
more explicit about what the task depends on, which hopefully makes it harder
to miss a task that draws in a large number of dependencies (unintentionally or
intentionally).
The v2 patch looks good to me. Do we want to separate out the changes to
ApplicationClassLoader into another patch? I could go either way. The change is
small, but orthogonal to this jira.
> support wildcard in libjars argument
> ------------------------------------
>
> Key: HADOOP-12747
> URL: https://issues.apache.org/jira/browse/HADOOP-12747
> Project: Hadoop Common
> Issue Type: New Feature
> Components: util
> Reporter: Sangjin Lee
> Assignee: Sangjin Lee
> Attachments: HADOOP-12747.01.patch, HADOOP-12747.02.patch
>
>
> There is a problem when a user job adds too many dependency jars on the
> command line. The HADOOP_CLASSPATH part can be addressed, including by using
> wildcards (*). But the same cannot be done with the -libjars argument. Today
> it takes only fully specified file paths.
> We may want to consider supporting wildcards as a way to help users in this
> situation. The idea is to handle it the same way the JVM does it: * expands
> to the list of jars in that directory. It does not traverse into any child
> directory.
> Also, it probably would be a good idea to do it only for libjars (i.e. don't
> do it for -files and -archives).
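
For illustration, the expansion described above could look roughly like the
sketch below. This is not the code from the attached patches. The class name
LibJarsWildcardSketch and the method expand are hypothetical, and the sketch
assumes local paths with "/" as the separator. It treats an entry ending in
"/*" (or a bare "*") the way the JVM treats a classpath wildcard: it is
replaced by the .jar/.JAR files found directly in that directory, without
descending into child directories. Sorting the result is an extra choice made
here only to keep the output deterministic.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper for illustration; not taken from the HADOOP-12747 patches. */
class LibJarsWildcardSketch {

  /**
   * Expands a comma-separated libjars value. Entries that are "*" or end in
   * "/*" are replaced by the jar files directly inside that directory; all
   * other entries are passed through unchanged.
   */
  static List<String> expand(String libjars) throws IOException {
    List<String> result = new ArrayList<>();
    for (String raw : libjars.split(",")) {
      String entry = raw.trim();
      if (entry.isEmpty()) {
        continue; // skip empty entries such as a trailing comma
      }
      if (entry.equals("*") || entry.endsWith("/*")) {
        result.addAll(listJars(parentOf(entry)));
      } else {
        result.add(entry); // fully specified path: keep as-is
      }
    }
    return result;
  }

  /** Returns the directory part of a wildcard entry. */
  private static Path parentOf(String wildcardEntry) {
    if (wildcardEntry.equals("*")) {
      return Paths.get("."); // bare "*" means the current directory
    }
    String dir = wildcardEntry.substring(0, wildcardEntry.length() - 2);
    return Paths.get(dir.isEmpty() ? "/" : dir); // "/*" means the root directory
  }

  /** Lists jar files directly in dir (no recursion), sorted by path. */
  private static List<String> listJars(Path dir) throws IOException {
    List<String> jars = new ArrayList<>();
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
      for (Path p : stream) {
        String name = p.getFileName().toString();
        boolean isJar = name.endsWith(".jar") || name.endsWith(".JAR");
        if (isJar && Files.isRegularFile(p)) {
          jars.add(p.toString());
        }
      }
    }
    Collections.sort(jars); // assumption: sorted only for stable output
    return jars;
  }
}
```

With a directory /deps that contains a.jar, b.JAR, notes.txt and a child
directory sub/ holding d.jar, expand("/deps/*,x.jar") would yield /deps/a.jar,
/deps/b.JAR and x.jar. The text file and the jar under sub/ are left out.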