[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857148#action_12857148
 ] 

Scott Carey commented on MAPREDUCE-1700:
----------------------------------------

The documentation for DistributedCache says:
"Its efficiency stems from the fact that the files are only copied _once per 
job_ and the ability to cache archives which are un-archived on the slaves."

Is the documentation wrong, or is the claim incorrect that distribution happens 
once per tasktracker and that multiple jobs can share it? 
The documentation above is ambiguous: does it copy items once per job, 
un-archiving once per slave per job, or does it cache un-archived data on 
slaves over a longer period of time?

What I am suggesting is not a job-scoped cache, but something with a much 
longer scope (days, weeks, months) that many different jobs can share 
without per-job copying or unpacking unless the contents have changed. It is 
unclear from the documentation on DistributedCache whether there is any 
optimization outside of the job scope. If it already has this sort of 
optimization, that would be great.
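
To make the suggestion concrete, here is a minimal sketch of the behaviour 
described above: a per-node cache that outlives any single job and re-copies 
a file only when its contents appear to have changed. Everything in it is an 
assumption for illustration. The class and method names are hypothetical, the 
modification time stands in for "contents have changed", and none of this is 
meant to describe how DistributedCache actually works.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch of a node-local cache shared across jobs.
 * Not Hadoop's DistributedCache; names and behaviour are assumptions.
 */
public class LongLivedCacheSketch {
    // Source URI -> modification time of the copy already present on this node.
    private final Map<String, Long> localized = new HashMap<>();
    // Counts how many times a copy (and unpack) was actually needed.
    private int copies = 0;

    /**
     * Called by each job that needs the file on this node.
     * Returns true if the file had to be copied and unpacked,
     * false if the existing local copy could be reused.
     */
    public boolean localize(String uri, long sourceModTime) {
        Long cached = localized.get(uri);
        if (cached != null && cached.longValue() == sourceModTime) {
            return false; // unchanged since the last job: reuse as-is
        }
        // A real implementation would copy and un-archive the file here.
        localized.put(uri, sourceModTime);
        copies++;
        return true;
    }

    /** Number of copies performed so far on this node. */
    public int copyCount() {
        return copies;
    }
}
```

With this shape, a second job asking for the same URI with the same 
modification time costs no copy and no unpacking. Only a changed file 
triggers another copy, regardless of how many jobs ran in between.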



> User supplied dependencies may conflict with MapReduce system JARs
> ------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1700
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1700
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: task
>            Reporter: Tom White
>
> If user code has a dependency on a version of a JAR that is different to the 
> one that happens to be used by Hadoop, then it may not work correctly. This 
> happened with user code using a different version of Avro, as reported 
> [here|https://issues.apache.org/jira/browse/AVRO-493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852081#action_12852081].
> The problem is analogous to the one that application servers have with WAR 
> loading. Using a specialized classloader in the Child JVM is probably the way 
> to solve this.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
