[ https://issues.apache.org/jira/browse/MAPREDUCE-6719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sangjin Lee updated MAPREDUCE-6719:
-----------------------------------
       Resolution: Fixed
     Hadoop Flags: Reviewed
    Fix Version/s: 2.9.0
           Status: Resolved  (was: Patch Available)

Committed patch v.2 to trunk and branch-2 (2.9.0). Thanks [~templedf] for your contribution!

> The list of -libjars archives should be replaced with a wildcard in the distributed cache to reduce the application footprint in the state store
> ------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6719
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6719
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: distributed-cache
>    Affects Versions: 2.8.0
>            Reporter: Daniel Templeton
>            Assignee: Daniel Templeton
>            Priority: Critical
>             Fix For: 2.9.0
>
>         Attachments: MAPREDUCE-6719.001.patch, MAPREDUCE-6719.002.patch
>
>
> When using the -libjars option to add classes to the classpath, every library so added is explicitly listed in the ContainerLaunchContext's local resources even though they're all uploaded to the same directory in HDFS. When using tools like Crunch without an uber JAR or when trying to take advantage of the shared cache, the number of libraries can be quite large. We've seen many cases where we had to turn down the max number of applications to prevent ZK from running out of heap because of the size of the state store entries.
>
> This JIRA proposes to allow for wildcards both in the internal processing of the -libjars switch and in paths added through the Job and DistributedCache classes. Rather than listing all files independently, this JIRA proposes to replace the complete list of libdir files with the wildcarded libdir directory, e.g. "libdir/*". This behavior is the same as the current behavior when using -libjars, but avoids explicitly listing every file.
> This capability will also be exposed by the {{DistributedCache.addCacheFile()}} method.
> See YARN-4958 for the NM side of the implementation and additional discussion.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
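
The wildcard replacement described in the quoted issue can be illustrated with a minimal, standalone sketch. This is not the committed patch and does not use any Hadoop API: the class name `LibjarsWildcardSketch`, the `collapse` helper, and the HDFS paths are all hypothetical, and the sketch assumes that every file in a collapsed directory was meant to be localized (as is the case for a job's libjars staging directory).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the code from MAPREDUCE-6719.
public class LibjarsWildcardSketch {

    /** Returns the parent directory of a slash-separated path, or "" if there is none. */
    static String parentOf(String path) {
        int slash = path.lastIndexOf('/');
        return slash < 0 ? "" : path.substring(0, slash);
    }

    /**
     * Groups resources by parent directory. A directory holding more than one
     * listed resource is replaced by a single "dir/*" entry; everything else
     * is kept as-is. Assumes the directory contains only the listed files.
     */
    static List<String> collapse(List<String> resources) {
        Map<String, List<String>> byDir = new LinkedHashMap<>();
        for (String resource : resources) {
            byDir.computeIfAbsent(parentOf(resource), k -> new ArrayList<>()).add(resource);
        }
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : byDir.entrySet()) {
            if (entry.getValue().size() > 1 && !entry.getKey().isEmpty()) {
                result.add(entry.getKey() + "/*");   // one wildcard entry for the whole directory
            } else {
                result.addAll(entry.getValue());     // lone file: list it explicitly
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical staging paths, chosen only for the example.
        List<String> libjars = Arrays.asList(
            "hdfs:///staging/job_1/libjars/a.jar",
            "hdfs:///staging/job_1/libjars/b.jar",
            "hdfs:///staging/job_1/libjars/c.jar",
            "hdfs:///staging/job_1/job.jar");
        System.out.println(collapse(libjars));
    }
}
```

With the sample input, the three library entries shrink to one wildcard entry while the lone `job.jar` stays listed explicitly, which is the kind of reduction in per-application state that the issue is after.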