I just mean that I'm not familiar with the particular code you are running, but I think the problem is with how elastic-mapreduce is being called in general, which has nothing to do with the JAR itself. Indeed, nothing indicates a problem with the JAR file. As I said in reply to your other message, I think you need "--arg --foo=bar", not "--arg --foo bar".
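To make the "--arg --foo=bar" suggestion concrete, here is a rough sketch of how the step from the original message might look with each option and its value joined into one token. It only prints the command (a dry run) rather than calling elastic-mapreduce. Two things are assumptions, not verified facts: that each --arg forwards exactly one token to the main class, and that the Driver's option parser accepts the joined --name=value form. The helper name print_step_cmd is made up for this sketch, and --field=doc1 is simply carried over from the original command; the field question is still open.

```shell
#!/bin/sh
# Dry run only: prints the step command instead of executing it.
# Assumption (from the reply above, not verified here): each --arg forwards
# exactly one token to the Java main class, and the Driver's option parser
# accepts the joined --name=value form.

JOBFLOW_ID="j-2NSJRI6N9EQJ4"   # job flow ID quoted in the thread
JAR="s3n://mahout-bucket/jars/mahout-examples-0.7-job.jar"
MAIN_CLASS="org.apache.mahout.utils.vectors.lucene.Driver"

# print_step_cmd is a hypothetical helper name used only for this sketch.
print_step_cmd() {
  echo ./elastic-mapreduce -j "$JOBFLOW_ID" \
    --jar "$JAR" \
    --main-class "$MAIN_CLASS" \
    --arg "--dir=s3n://mahout-input/input1/index/" \
    --arg "--field=doc1" \
    --arg "--dictOut=s3n://mahout-output/solr-dict-out/dict.txt" \
    --arg "--output=s3n://mahout-output/solr-vect-out/vectors"
}

print_step_cmd
```

If the printed command looks right, it can be pasted into the shell and run against the job flow. If the parser turns out not to accept --name=value, the first assumption would suggest trying a separate --arg for the option name and another for its value.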
The utils module has become integration, but this is unrelated.

On Wed, Dec 12, 2012 at 12:37 PM, hellen maziku <[email protected]> wrote:
> Also, what do you mean by "don't know much about this particular job"? Does
> the type of the job jar file matter? I thought that as long as I could locate
> the org.apache.mahout.utils.vectors.lucene.Driver class, I was good to use
> that job jar file.
>
> Btw, whenever I installed and compiled mahout 0.7 (both from source and
> trunk), I could not locate the mahout-utils-0.7 jar. Why is this so?
>
> Thank you again.
>
> ________________________________
> From: Sean Owen <[email protected]>
> To: Mahout User List <[email protected]>; hellen maziku <[email protected]>
> Sent: Wednesday, December 12, 2012 6:05 AM
> Subject: Re: Creating vectors from lucene index on EMR via the CLI
>
> I don't know much about this particular job, but the general problem
> here is that you are passing arguments to a binary called
> elastic-mapreduce, and not to the Java program. There is likely some
> mechanism to package up arguments that need to be sent to the program,
> as an argument to the elastic-mapreduce binary.
>
> On Wed, Dec 12, 2012 at 11:55 AM, hellen maziku <[email protected]> wrote:
>> Hi,
>> I installed mahout and solr.
>>
>> I created an index from the dictionary.txt using the command below:
>>
>> curl "http://localhost:8983/solr/update/extract?literal.id=doc1&commit=true"
>> -F "[email protected]"
>>
>> To create the vectors from my index, I needed the
>> org.apache.mahout.utils.vectors.lucene.Driver class. I could not locate
>> this class in mahout-core-0.7-job.jar. I could only locate it in
>> mahout-examples-0.7-job.jar, so I uploaded the
>> mahout-examples-0.7-job.jar to an s3 bucket.
>>
>> I also uploaded the dictionary index to a separate s3 bucket. I created
>> another bucket with two folders to store my dictOut and vectors.
>>
>> I created a job flow on the CLI:
>>
>> /elastic-mapreduce --create --alive --log-uri s3n://mahout-output/logs/
>> --name dict_vectorize
>>
>> I added the step to vectorize my index using the following command:
>>
>> ./elastic-mapreduce -j j-2NSJRI6N9EQJ4 --jar
>> s3n://mahout-bucket/jars/mahout-examples-0.7-job.jar --main-class
>> org.apache.mahout.utils.vectors.lucene.Driver --arg --dir
>> s3n://mahout-input/input1/index/ --arg --field doc1 --arg --dictOut
>> s3n://mahout-output/solr-dict-out/dict.txt --arg --output
>> s3n://mahout-output/solr-vect-out/vectors
>>
>> But in the logs I get the following error:
>>
>> 2012-12-12 09:37:17,883 ERROR org.apache.mahout.utils.vectors.lucene.Driver
>> (main): Exception
>> org.apache.commons.cli2.OptionException: Missing value(s) --dir
>>     at org.apache.commons.cli2.option.ArgumentImpl.validate(ArgumentImpl.java:241)
>>     at org.apache.commons.cli2.option.ParentImpl.validate(ParentImpl.java:124)
>>     at org.apache.commons.cli2.option.DefaultOption.validate(DefaultOption.java:176)
>>     at org.apache.commons.cli2.option.GroupImpl.validate(GroupImpl.java:265)
>>     at org.apache.commons.cli2.commandline.Parser.parse(Parser.java:104)
>>     at org.apache.mahout.utils.vectors.lucene.Driver.main(Driver.java:197)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>     at org.apache.hadoop.util.RunJar.main(RunJar.java:187)
>>
>> What am I doing wrong?
>>
>> Another question: what is the correct value of the --field argument? Is it
>> doc1 (the id) or dictionary (from the filename dictionary.txt)? I am asking
>> this because when I issue the query with q=doc1 on solr I get no
>> results. But when I issue the query with q=dictionary, I see my content.
>>
>> Thank you so much for your help. I am a newbie, so please excuse my being
>> too verbose.
