Hi Henning, I don't know if you've seen https://issues.apache.org/jira/browse/MAPREDUCE-1938 and https://issues.apache.org/jira/browse/MAPREDUCE-1700 which have discussion about this issue.
Cheers,
Tom

On Fri, Sep 24, 2010 at 3:41 AM, Henning Blohm <[email protected]> wrote:
> Short update on the issue:
>
> I tried to find a way to separate class path configurations by modifying
> the scripts in HADOOP_HOME/bin, but found that TaskRunner actually copies
> the class path setting from the parent process when starting a local task,
> so I do not see a way of having less on a job's classpath without
> modifying Hadoop.
>
> As that will present a real issue when running our jobs on Hadoop, I would
> like to propose changing TaskRunner so that it sets a class path
> specifically for M/R tasks. That class path could be defined in the scripts
> (as for the other processes) using a particular environment variable (e.g.
> HADOOP_JOB_CLASSPATH). It could default to the current VM's class path,
> preserving today's behavior.
>
> Is it ok to enter this as an issue?
>
> Thanks,
> Henning
>
>
> On Friday, 17.09.2010, at 16:01 +0000, Allen Wittenauer wrote:
>
> On Sep 17, 2010, at 4:56 AM, Henning Blohm wrote:
>
>> When running map reduce tasks in Hadoop I run into classpath issues.
>> Contrary to previous posts, my problem is not that I am missing classes
>> on the Task's class path (we have a perfect solution for that) but rather
>> that I find too many (e.g. ECJ classes or jetty).
>
> The fact that you mention:
>
>> The libs in HADOOP_HOME/lib seem to contain everything needed to run
>> anything in Hadoop which is, I assume, much more than is needed to run a
>> map reduce task.
>
> hints that your perfect solution is to throw all your custom stuff in lib.
> If so, that's a huge mistake. Use the distributed cache instead.
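As a rough sketch of Allen's suggestion (the jar and path names below are hypothetical): instead of dropping job dependencies into HADOOP_HOME/lib, they can be shipped per-job through the distributed cache, for example with the -libjars option that GenericOptionsParser understands:

```shell
# Hypothetical invocation: ship job-specific dependencies via the
# distributed cache rather than HADOOP_HOME/lib. The -libjars option
# (handled by GenericOptionsParser, assuming the job uses Tool/ToolRunner)
# uploads the listed jars and adds them to the tasks' classpath only,
# so they do not pollute the daemons' class path.
hadoop jar myjob.jar com.example.MyJob \
    -libjars mydep1.jar,mydep2.jar \
    /input/path /output/path
```

This keeps the task classpath scoped to the job, though it does not remove the framework jars (jetty, ECJ, etc.) that Henning is concerned about, which is what the proposed HADOOP_JOB_CLASSPATH change would address.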
