Mahout should work with any Hadoop 1 or 2 version.

On Apr 26, 2015, at 9:22 PM, lastarsenal <lastarse...@163.com> wrote:

Thanks for your help. Maybe our Hadoop system and the jar packages on the classpath (possibly the apache-cli version) were NOT compatible with Mahout. So I rewrote the jobs in ItemSimilarityJob in my own project, and then it works!

At 2015-04-16 21:21:06, "Pat Ferrel" <p...@occamsmachete.com> wrote:
> As I said below “mahout itemsimilarity …”
>
> “mahout” will show a list of commands
> “mahout itemsimilarity” will show the command help
>
> You are using HDFS and I suspect /home/hadoop/itembased/user_item is not a
> valid HDFS path? If so, put the data in HDFS and use that path. Usually there
> is no need to specify the tmp dir.
>
> On Apr 14, 2015, at 9:05 PM, lastarsenal <lastarse...@163.com> wrote:
>
> Hi, Pat,
>
> I have tried to give ItemSimilarityJob a minimal set of arguments, as below:
>
> hadoop jar mahout-core-0.9-job.jar org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob -i /home/hadoop/itembased/user_item -o /home/hadoop/itembased/output -s SIMILARITY_EUCLIDEAN_DISTANCE
>
> The argument parser error disappeared, but another error came out:
>
> Exception in thread "main" java.io.IOException: resolve path must start with /, temp/prepareRatingMatrix/numUsers.bin
>   at org.apache.hadoop.fs.viewfs.MountTree.resolve(MountTree.java:272)
>   at org.apache.hadoop.fs.viewfs.ViewFs.open(ViewFs.java:139)
>   at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:394)
>   at org.apache.mahout.common.HadoopUtil.readInt(HadoopUtil.java:339)
>   at org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob.run(ItemSimilarityJob.java:147)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>   at org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob.main(ItemSimilarityJob.java:93)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:601)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:166)
>
> Then I tried to add the --tempDir argument:
>
> hadoop jar mahout-core-0.9-job.jar org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob -i /home/hadoop/itembased/user_item -o /home/hadoop/itembased/output -s SIMILARITY_EUCLIDEAN_DISTANCE --tempDir=/tmp
>
> The argument parser error was back:
>
> ERROR common.AbstractJob: Unexpected --tempDir=/tmp while processing Job-Specific Options:
> Unexpected --tempDir=/tmp while processing Job-Specific Options:
>
> Usage:
>
> [--input <input> --output <output> --similarityClassname <similarityClassname>
> --maxSimilaritiesPerItem <maxSimilaritiesPerItem> --maxPrefs <maxPrefs>
> --minPrefsPerUser <minPrefsPerUser> --booleanData <booleanData> --threshold
> <threshold> --randomSeed <randomSeed> --help --tempDir <tempDir> --startPhase
> <startPhase> --endPhase <endPhase>]
>
> So... you advised using the command line "mahout xxx"; however, there is no
> mahout command here. How can I solve it?
>
> Thanks a lot!
>
> At 2015-04-15 03:13:23, "Pat Ferrel" <p...@occamsmachete.com> wrote:
>
>> Also, you don’t need to specify -mp 0; that is always allowed. You are
>> specifying a minimum, if there is any, so -mp 0 is not valid. Omit it.
>>
>> On Apr 14, 2015, at 11:59 AM, Pat Ferrel <p...@occamsmachete.com> wrote:
>>
>> use
>>
>> “mahout itemsimilarity …”
>>
>> But be aware that you have to convert all your user and item ids into
>> non-negative ints. Basically, inside Mahout-MapReduce they are assumed to be
>> row and column numbers in a big matrix of all input.
>>
>> BTW no need to move data; Mahout-Spark reads anything Mahout-MapReduce can
>> read, without the ID restrictions.
>>
>> On Apr 12, 2015, at 8:04 PM, lastarsenal <lastarse...@163.com> wrote:
>>
>> Hi, Pat,
>> I think it would be better to follow the existing system instead of making a
>> large-scale data transfer.
>>
>> So I would very much appreciate it if somebody could give advice based on
>> Hadoop. Thank you.
>>
>> At 2015-04-13 00:33:48, "Pat Ferrel" <p...@occamsmachete.com> wrote:
>>> You are invoking it incorrectly, but I’d suggest using the newer Spark
>>> version. It’s easier to use and about 10x faster.
>>>
>>> You’ll need to install Spark alongside Mahout, then invoke with:
>>>
>>> mahout spark-itemsimilarity -i input -o output ….
>>>
>>> The driver is documented here:
>>> http://mahout.apache.org/users/algorithms/intro-cooccurrence-spark.html
>>>
>>> On Apr 11, 2015, at 12:34 AM, lastarsenal <lastarse...@163.com> wrote:
>>>
>>> Hi,
>>>
>>> I'm a rookie with Mahout. Recently, when I tried to run ItemSimilarityJob
>>> on my own Hadoop, I ran into a problem. The command is:
>>>
>>> hadoop jar mahout-core-0.9-job.jar org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob -i /home/hadoop/itembased/user_item -o /home/hadoop/itembased/output -s SIMILARITY_EUCLIDEAN_DISTANCE -mp 0 -b true --startPhase 0 --endPhase 0
>>>
>>> There is 1 error:
>>>
>>> 15/04/10 15:06:02 ERROR common.AbstractJob: Unexpected 0 while processing Job-Specific Options:
>>> Unexpected 0 while processing Job-Specific Options:
>>>
>>> Usage:
>>>
>>> [--input <input> --output <output> --similarityClassname <similarityClassname>
>>> --maxSimilaritiesPerItem <maxSimilaritiesPerItem> --maxPrefs <maxPrefs>
>>> --minPrefsPerUser <minPrefsPerUser> --booleanData <booleanData> --threshold
>>> <threshold> --randomSeed <randomSeed> --help --tempDir <tempDir>
>>> --startPhase <startPhase> --endPhase <endPhase>]
>>>
>>> What's the reason for this situation? Thank you!
>>>
>>> Best Regards,
>>> lastarsenal
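
On the ID restriction mentioned in the thread (user and item ids must be non-negative ints for Mahout-MapReduce): one way to prepare the input is a small preprocessing step that assigns consecutive integers to the original ids and keeps the mapping so results can be translated back. The following is only a minimal sketch, not part of Mahout. It assumes the input is comma-separated "user,item[,preference]" text; the function name build_id_maps and the sample ids are hypothetical.

```python
import csv
import io


def build_id_maps(rows):
    """Map arbitrary user/item ids to consecutive non-negative ints.

    Ids are numbered in first-seen order, starting at 0. Any extra
    columns (e.g. a preference value) are passed through unchanged.
    Hypothetical helper for illustration; not a Mahout API.
    """
    user_ids, item_ids = {}, {}
    mapped = []
    for user, item, *rest in rows:
        # setdefault returns the existing index, or stores the next free one
        u = user_ids.setdefault(user, len(user_ids))
        i = item_ids.setdefault(item, len(item_ids))
        mapped.append([u, i, *rest])
    return mapped, user_ids, item_ids


# Made-up sample input, assumed to be "user,item,preference" lines
raw = "alice,book-17,4.0\nbob,book-17,3.5\nalice,dvd-2,5.0\n"
rows = list(csv.reader(io.StringIO(raw)))

mapped, user_ids, item_ids = build_id_maps(rows)

# Write the converted rows back out as CSV text
out = io.StringIO()
csv.writer(out, lineterminator="\n").writerows(mapped)
print(out.getvalue(), end="")
```

The two dictionaries (user_ids, item_ids) would need to be saved alongside the converted file, since the job output refers to the integer ids rather than the original ones. For data too large for one machine's memory, the same idea would have to be done as a distributed job instead.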