I wonder if there is a misunderstanding here - the problem is that the classpath has too many classes on it (and clashes with user classes), rather than it being a text string which is too long.
I would suggest that the technical discussion of how to fix this goes onto the JIRA.

Cheers,
Tom

On Thu, Oct 7, 2010 at 1:23 AM, Alejandro Abdelnur <[email protected]> wrote:
> well, if the issue is a too long classpath, the softlink thingy will give
> some room to breathe as the total CP length will be much smaller.
>
> A
> On Thu, Oct 7, 2010 at 3:43 PM, Henning Blohm <[email protected]>
> wrote:
>>
>> So that's actually another issue, right? Besides splitting the classpath
>> into those three groups, you want the TT to create soft-links on demand to
>> simplify the computation of the classpath string. Is that right?
>>
>> But it's the TT that actually starts the job VM. Why does it matter what
>> the string actually looks like, as long as it has the right content?
>>
>> Thanks,
>> Henning
>>
>> On Thu, 2010-10-07 at 13:22 +0800, Alejandro Abdelnur wrote:
>>
>> [sent too soon]
>>
>> The first CP shown is how the CP of a task looks today. If we change it
>> to pick up all the job JARs from the current dir, then the classpath will be
>> much shorter (second CP shown). We can easily achieve this by soft-linking
>> the job JARs in the work dir of the task.
>>
>> Alejandro
>>
>> On Thu, Oct 7, 2010 at 1:02 PM, Alejandro Abdelnur <[email protected]>
>> wrote:
>>
>> Fragmentation of Hadoop classpaths is another issue: hadoop should
>> differentiate the CP in 3:
>>
>> 1*client CP: what is needed to submit a job (only the nachos)
>>
>> 2*server CP (JT/NN/TT/DD): what is needed to run the cluster (the whole
>> enchilada)
>>
>> 3*job CP: what is needed to run a job (some of the enchilada)
>>
>> But I'm not trying to get into that here. What I'm suggesting is:
>>
>> -----
>> # Hadoop JARs:
>> /Users/tucu/dev-apps/hadoop/conf
>> /System/Library/Frameworks/JavaVM.framework/Versions/1.6/Home/lib/tools.jar
>> /Users/tucu/dev-apps/hadoop/bin/..
>> /Users/tucu/dev-apps/hadoop/bin/../hadoop-core-0.20.3-CDH3-SNAPSHOT.jar
>> /Users/tucu/dev-apps/hadoop/bin/../lib/aspectjrt-1.6.5.jar
>> ..... (about 30 jars from hadoop lib/ )
>> /Users/tucu/dev-apps/hadoop/bin/../lib/jsp-2.1/jsp-api-2.1.jar
>>
>> # Job JARs (for a job with only 2 JARs):
>> /Users/tucu/dev-apps/hadoop/dirs/mapred/taskTracker/distcache/-2707763075630339038_639898034_1993697040/localhost/user/tucu/oozie-tucu/0000003-101004184132247-oozie-tucu-W/java-node--java/java-launcher.jar
>> /Users/tucu/dev-apps/hadoop/dirs/mapred/taskTracker/distcache/3613772770922728555_-588832047_1993624983/localhost/user/tucu/examples/apps/java-main/lib/oozie-examples-2.2.1-CDH3B3-SNAPSHOT.jar
>> /Users/tucu/dev-apps/hadoop/dirs/mapred/taskTracker/tucu/jobcache/job_201010041326_0058/attempt_201010041326_0058_m_000000_0/work
>> -----
>>
>> What I'm suggesting is that the latter group, the job JARs, be
>> soft-linked (by the TT) into the working directory; then their classpath is
>> just:
>>
>> -----
>> java-launcher.jar
>> oozie-examples-2.2.1-CDH3B3-SNAPSHOT.jar
>> .
>> -----
>>
>> Alejandro
>>
>> On Wed, Oct 6, 2010 at 7:57 PM, Henning Blohm <[email protected]>
>> wrote:
>>
>> Hi Alejandro,
>>
>> yes, it can of course be done right (sorry if my wording seemed to
>> imply otherwise). Just saying that I think that Hadoop M/R should not go
>> into that class loader / module separation business. It's one Job, one VM,
>> right? So the problem is to assign just the stuff needed to let the Job do
>> its business without becoming an obstacle.
>>
>> Must admit I didn't understand your proposal 2. How would that remove
>> (e.g.) jetty libs from the job's classpath?
>>
>> Thanks,
>> Henning
>>
>> On Wednesday, 06.10.2010, at 18:28 +0800, Alejandro Abdelnur wrote:
>>
>> 1. Classloader business can be done right. Actually it could be done as
>> spec-ed for servlet web-apps.
>>
>> 2. If the issue is strictly 'too large classpath', then a simpler solution
>> would be to soft-link all JARs to the current directory and create the
>> classpath with the JAR names only (no path). Note that the soft-linking
>> business is already supported by the DistributedCache. So the changes would
>> be mostly in the TT to create the JAR names only classpath before starting
>> the child.
>>
>> Alejandro
>>
>> On Wed, Oct 6, 2010 at 5:57 PM, Henning Blohm <[email protected]>
>> wrote:
>>
>> Hi Tom,
>>
>> that's exactly it. Thanks! I don't think that I can comment on the
>> issues in Jira so I will do it here.
>>
>> Tricking with class paths and deviating from the default class loading
>> delegation has never been anything but a short term relief. Fixing things
>> by imposing a "better" order of stuff on the class path will not work when
>> people do actually use child loaders (as the parent wins) - like we do. Also
>> it may easily lead to very confusing situations because the former part of
>> the class path is not complete and gets other stuff from a latter part etc.
>> etc.... no good.
>>
>> Child loaders are good for module separation but should not be used to
>> "hide" type visibility from the parent. That almost certainly leads to a
>> Class Loader Constraint Violation - once you lose control (which is usually
>> earlier than expected).
>>
>> The suggestion to reduce the Job class path to the required minimum is
>> the most practical approach. There is some gray area there of course and it
>> will not be feasible to reach the absolute minimal set of types there - but
>> something reasonable, i.e. the hadoop core that suffices to run the job.
>> Certainly jetty & co are not required for job execution (btw. I "hacked"
>> 0.20.2 to remove anything in "server/" from the classpath before setting the
>> job class path).
>>
>> I would suggest to
>>
>> a) introduce some HADOOP_JOB_CLASSPATH var that, if set, is the
>> additional classpath, added to the "core" classpath (as described above). If
>> not set, for compatibility, preserve today's behavior.
>> b) not get into custom child loaders for jobs as part of hadoop M/R.
>> It's non-trivial to get it right and feels to be beyond scope.
>>
>> I wouldn't mind helping btw.
>>
>> Thanks,
>> Henning
>>
>> On Tue, 2010-10-05 at 15:59 -0700, Tom White wrote:
>>
>> Hi Henning,
>>
>> I don't know if you've seen
>> https://issues.apache.org/jira/browse/MAPREDUCE-1938 and
>> https://issues.apache.org/jira/browse/MAPREDUCE-1700 which have
>> discussion about this issue.
>>
>> Cheers
>> Tom
>>
>> On Fri, Sep 24, 2010 at 3:41 AM, Henning Blohm <[email protected]>
>> wrote:
>> > Short update on the issue:
>> >
>> > I tried to find a way to separate class path configurations by modifying
>> > the scripts in HADOOP_HOME/bin but found that TaskRunner actually copies
>> > the class path setting from the parent process when starting a local task
>> > so that I do not see a way of having less on a job's classpath without
>> > modifying Hadoop.
>> >
>> > As that will present a real issue when running our jobs on Hadoop I would
>> > like to propose to change TaskRunner so that it sets a class path
>> > specifically for M/R tasks. That class path could be defined in the
>> > scripts (as for the other processes) using a particular environment
>> > variable (e.g. HADOOP_JOB_CLASSPATH). It could default to the current VM's
>> > class path, preserving today's behavior.
>> >
>> > Is it ok to enter this as an issue?
>> >
>> > Thanks,
>> > Henning
>> >
>> >
>> > On Friday, 17.09.2010, at 16:01 +0000, Allen Wittenauer wrote:
>> >
>> > On Sep 17, 2010, at 4:56 AM, Henning Blohm wrote:
>> >
>> >> When running map reduce tasks in Hadoop I run into classpath issues.
>> >> Contrary to previous posts, my problem is not that I am missing classes on
>> >> the Task's class path (we have a perfect solution for that) but rather find
>> >> too many (e.g. ECJ classes or jetty).
>> >
>> > The fact that you mention:
>> >
>> >> The libs in HADOOP_HOME/lib seem to contain everything needed to run
>> >> anything in Hadoop which is, I assume, much more than is needed to run a map
>> >> reduce task.
>> >
>> > hints that your perfect solution is to throw all your custom stuff in lib.
>> > If so, that's a huge mistake. Use distributed cache instead.
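
For illustration only, here is a minimal Java sketch of the soft-link idea discussed in this thread: link each job JAR into the task's working directory, then build the classpath from bare JAR names plus ".". The class name JobClasspathSketch and the method linkAndBuildClasspath are hypothetical and are not part of Hadoop or the TaskTracker. The sketch uses only java.nio.file from the JDK (Java 7 or later) and assumes a file system that supports symbolic links.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not Hadoop code: illustrates the "JAR names only" classpath.
final class JobClasspathSketch {

    /**
     * Creates a symbolic link in workDir for every job JAR and returns a
     * classpath string made of the bare JAR file names followed by ".".
     * All entries are relative, so they only resolve when the child JVM
     * runs with workDir as its current directory.
     */
    static String linkAndBuildClasspath(Path workDir, List<Path> jobJars) throws IOException {
        List<String> entries = new ArrayList<>();
        for (Path jar : jobJars) {
            String name = jar.getFileName().toString();
            // Throws FileAlreadyExistsException if two JARs share a file name,
            // which surfaces name clashes instead of silently hiding one JAR.
            Files.createSymbolicLink(workDir.resolve(name), jar.toAbsolutePath());
            entries.add(name);
        }
        entries.add("."); // the work directory itself, as in the second CP shown above
        return String.join(File.pathSeparator, entries);
    }

    // Usage: java JobClasspathSketch <workDir> <jar> [<jar> ...]
    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: JobClasspathSketch <workDir> <jar> [<jar> ...]");
            return;
        }
        List<Path> jars = new ArrayList<>();
        for (int i = 1; i < args.length; i++) {
            jars.add(Paths.get(args[i]));
        }
        System.out.println(linkAndBuildClasspath(Paths.get(args[0]), jars));
    }
}
```

Note that this only shortens the classpath string. As the replies above point out, it does not remove any Hadoop classes (jetty and so on) from the task's classpath, so it would not by itself address clashes with user classes.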
