On Tue, Jun 14, 2011 at 11:57 AM, Prashant Sharma <[email protected]> wrote:
> 1. Is there a difference between
> org.apache.mahout.cf.taste.hadoop.item.RecommenderJob and
> org.apache.mahout.cf.taste.hadoop.pseudo.RecommenderJob apart from one being
> for fully distributed and other for pseudo distributed mode. As one has an
> implementation of recommender class to run and the other similarity.

Yes that's the difference. The "pseudo-distributed" version isn't really a distributed algorithm. It just splits the work among n non-distributed recommender instances, which can still be useful.

> 2. There is no documentation given about the inbuilt similarity classes, can
> you suggest me some reading which gives detail about the implementation of
> those classes, also an example on how to write one of our own would be very
> helpful.

The book does a great job of exploring these differences, if I do say so myself! If you have more specific questions, you can ask here. The implementation is open source and in most cases a pretty straightforward implementation of the definition of various similarity metrics, which you can look up on Wikipedia.

Sebastian -- Chapter 6 of the book never *quite* covered the actual distributed computation in Mahout. It is too complex even for one chapter of a book. It explains a somewhat simplified version of the computation as an intro to Mahout on Hadoop. In my opinion once you understand the simplified outline, what's in Mahout now is fairly clear from the docs. In any event, it's already more or less "to press" now. (We'll see: I can already smell a 2nd edition...)
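
On writing a similarity of your own: to give a feel for how close these metrics are to their textbook definitions, here is a minimal sketch in plain Java. It is not Mahout's code and does not use the Mahout API. The class name SimilaritySketch and both method names are made up for illustration, and it assumes the two arrays hold the preference values of two items (or users) over the same co-rated entries, in the same order.

```java
// Illustrative only: a hypothetical helper class, not part of Mahout.
public final class SimilaritySketch {

    private SimilaritySketch() {}

    /** Pearson correlation of two equal-length vectors; NaN when undefined. */
    public static double pearson(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("vectors must have equal length");
        }
        int n = x.length;
        if (n == 0) {
            return Double.NaN;
        }
        double sumX = 0.0;
        double sumY = 0.0;
        for (int i = 0; i < n; i++) {
            sumX += x[i];
            sumY += y[i];
        }
        double meanX = sumX / n;
        double meanY = sumY / n;
        double sxy = 0.0;
        double sxx = 0.0;
        double syy = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        double denom = Math.sqrt(sxx) * Math.sqrt(syy);
        // A constant vector has zero variance, so the correlation is undefined.
        return denom == 0.0 ? Double.NaN : sxy / denom;
    }

    /** Cosine of the angle between two equal-length vectors; NaN when undefined. */
    public static double cosine(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("vectors must have equal length");
        }
        double dot = 0.0;
        double normX = 0.0;
        double normY = 0.0;
        for (int i = 0; i < x.length; i++) {
            dot += x[i] * y[i];
            normX += x[i] * x[i];
            normY += y[i] * y[i];
        }
        double denom = Math.sqrt(normX) * Math.sqrt(normY);
        // An all-zero (or empty) vector has no direction to compare.
        return denom == 0.0 ? Double.NaN : dot / denom;
    }

    public static void main(String[] args) {
        double[] a = {1.0, 2.0, 3.0};
        double[] b = {2.0, 4.0, 6.0};
        System.out.println("pearson = " + pearson(a, b));
        System.out.println("cosine  = " + cosine(a, b));
    }
}
```

To plug something like this into the non-distributed recommenders you would wrap it in whatever similarity interface your Mahout version defines; check the javadoc for the exact methods, since I'm not spelling them out here.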
