[ https://issues.apache.org/jira/browse/MAPREDUCE-6719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Templeton updated MAPREDUCE-6719:
----------------------------------------
    Attachment: MAPREDUCE-6719.001.patch

This patch is based on patch 3 from YARN-4958, plus [~sjlee0]'s suggestions. I
tested it, and it does work with the local job runner.

> -libjars should use wildcards to reduce the application footprint in the 
> state store
> ------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6719
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6719
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: distributed-cache
>    Affects Versions: 2.8.0
>            Reporter: Daniel Templeton
>            Assignee: Daniel Templeton
>            Priority: Critical
>         Attachments: MAPREDUCE-6719.001.patch
>
>
> When using the -libjars option to add classes to the classpath, every library
> so added is explicitly listed in the ContainerLaunchContext's local resources,
> even though they're all uploaded to the same directory in HDFS. When using
> tools like Crunch without an uber JAR, or when trying to take advantage of the
> shared cache, the number of libraries can be quite large. We've seen many
> cases where we had to lower the maximum number of applications to prevent
> ZooKeeper (ZK) from running out of heap because of the size of the state
> store entries.
>
> This JIRA proposes to allow wildcards both in the internal processing of
> the -libjars switch and in paths added through the Job and DistributedCache
> classes. Rather than listing all files independently, it proposes to
> replace the complete list of libdir files with the wildcarded libdir
> directory, e.g. "libdir/*". The resulting behavior is the same as the current
> behavior when using -libjars, but avoids explicitly listing every file.
>
> This capability will also be exposed by the
> {{DistributedCache.addCacheFile()}} method.
>
> See YARN-4958 for the NM side of the implementation and additional discussion.
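
To illustrate the idea in the description, here is a minimal sketch in plain Java of collapsing an explicit file list into a wildcarded directory entry such as "libdir/*". It is not the code from the attached patch: the class name LibjarsWildcardSketch and the method collapse() are hypothetical, paths are treated as plain strings rather than HDFS paths, and the sketch assumes that every file in a directory with more than one listed entry is meant to be included.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not taken from MAPREDUCE-6719.001.patch.
public class LibjarsWildcardSketch {

    /**
     * Replaces groups of paths that share a parent directory with a single
     * "dir/*" entry. Assumes every file in such a directory is wanted, as is
     * the case when -libjars uploads all libraries to one directory.
     */
    public static List<String> collapse(List<String> paths) {
        // Group paths by parent directory, preserving first-seen order.
        Map<String, List<String>> byDir = new LinkedHashMap<>();
        for (String path : paths) {
            int slash = path.lastIndexOf('/');
            String dir = slash < 0 ? "" : path.substring(0, slash);
            byDir.computeIfAbsent(dir, d -> new ArrayList<>()).add(path);
        }

        List<String> result = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : byDir.entrySet()) {
            List<String> files = entry.getValue();
            if (files.size() > 1 && !entry.getKey().isEmpty()) {
                // One wildcard entry stands in for the whole directory.
                result.add(entry.getKey() + "/*");
            } else {
                // A lone file (or one without a named parent) stays listed as-is.
                result.addAll(files);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> listed = Arrays.asList(
            "libdir/a.jar", "libdir/b.jar", "libdir/c.jar", "other/d.jar");
        // prints [libdir/*, other/d.jar]
        System.out.println(collapse(listed));
    }
}
```

The point of the sketch is only the size reduction: three "libdir" entries become one, which is what keeps the per-application record in the state store small when the library count is large.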



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
