That does sound like a problem, and I can guess what it is. I believe the script looks for the job files in different places than the binary distribution puts them. Let me open a JIRA for this. You can probably find the answer quickly, though, if you debug the script a little and compare where it looks for the job file with where the file really is. I imagine the script can just search more widely.
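
For the "where it really is" half of that comparison, here is a minimal Python sketch that lists every job file under a distribution directory. It rests on assumptions not confirmed in this thread: that the job files have names ending in "job.jar", and that the MAHOUT_HOME environment variable points at the unpacked distribution. find_job_files is a hypothetical helper, not part of Mahout.

```python
import fnmatch
import os


def find_job_files(root, pattern="*job.jar"):
    """Return the sorted paths under root whose file name matches pattern.

    The default pattern is an assumption about how the job files are
    named; adjust it to whatever the launcher script actually looks for.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if fnmatch.fnmatch(name, pattern):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


if __name__ == "__main__":
    # Assumption: MAHOUT_HOME points at the unpacked distribution;
    # fall back to the current directory if it is not set.
    root = os.environ.get("MAHOUT_HOME", ".")
    for path in find_job_files(root):
        print(path)
```

Running this against both the binary and the source tree, then comparing the output with the paths the script checks, should show whether the script is simply looking in the wrong directory.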

On Thu, Jun 9, 2011 at 3:38 PM, Mark <[email protected]> wrote:

> Thanks for the explanation. I understand that Hadoop needs all required
> jars bundled together to work across nodes, since the nodes obviously
> need those dependencies. I also understand that binary distributions are
> built from source, but I'm still confused about why running seq2sparse
> using the source distribution works while the bin distribution doesn't.
>
> For example, I tried seq2sparse on the binary distribution using the
> bin/mahout launcher and I receive:
>
> 11/06/08 21:17:00 INFO mapred.JobClient: Task Id : attempt_201106061352_0066_r_000001_1, Status : FAILED
> Error: java.lang.ClassNotFoundException: org.apache.lucene.analysis.TokenStream
>
> The same command on the source distribution works as expected.
>
> On 6/9/11 12:32 AM, Sean Owen wrote:
>
>> These aren't specific to Mahout.
>>
>> To run a Hadoop job, you have to give it all dependencies together. This
>> is the error you're getting. To help with this, the distro has 'job'
>> files with all dependencies packaged together for you. Your next error
>> is just another symptom of this. No Hadoop job can run without its
>> dependencies available on workers.
>>
>> Here as in most projects, the bin and src files are built from the same
>> source. The difference is that bin contains the compiled artifacts and
>> not the source code, and is "ready to run". In bin I believe the
>> configuration is built into the compiled jar, yes.
>>
>> On Thu, Jun 9, 2011 at 2:18 AM, Mark <[email protected]> wrote:
>>
>>> I explained in an earlier post that I was having problems running some
>>> examples on a cluster when using the binary distribution. My cluster
>>> was complaining about missing classes, i.e. the Lucene analyzer and
>>> Google Preconditions. However, when I tried the same thing on a src
>>> distribution (after mvn package) I didn't receive those errors.
>>>
>>> How do the bin and src distributions differ?
>>>
>>> I also noticed that I was able to directly modify the
>>> driver.classes.props file using the src distribution, and those changes
>>> were available immediately. When I tried the same on the binary
>>> distribution, my changes never appeared. Is this to be expected?
>>>
>>> Thanks for any clarifications.
