quickstart.mdtext

pat Sun, 08 Mar 2015 10:03:06 -0700

Author: pat
Date: Sun Mar  8 17:02:27 2015
New Revision: 1665051

URL: http://svn.apache.org/r1665051
Log:
CMS commit to mahout by pat


Modified:
    mahout/site/mahout_cms/trunk/content/users/recommender/quickstart.mdtext

Modified: 
mahout/site/mahout_cms/trunk/content/users/recommender/quickstart.mdtext
URL: 
http://svn.apache.org/viewvc/mahout/site/mahout_cms/trunk/content/users/recommender/quickstart.mdtext?rev=1665051&r1=1665050&r2=1665051&view=diff
==============================================================================
--- mahout/site/mahout_cms/trunk/content/users/recommender/quickstart.mdtext 
(original)
+++ mahout/site/mahout_cms/trunk/content/users/recommender/quickstart.mdtext 
Sun Mar  8 17:02:27 2015
@@ -1,12 +1,25 @@
 Title: Recommender Quickstart
 
-# Recommender Quickstart
+# Recommender Overview
 
-It's very easy to get started with Mahout's recommenders. You don't need to 
know and have Hadoop for this. Here we list resources that might be helpful for 
some first steps:
+Recommenders have changed over the years. Mahout contains a long list of them, 
which you can still use. But to get the best out of our more modern approach 
we'll need to think of the Recommender as a "model creation" 
component&mdash;supplied by Mahout's new spark-itemsimilarity job, and a 
"serving" component&mdash;supplied by a modern scalable search engine, like 
Solr.
 
- * Steve Cook created a [video 
tutorial](https://www.youtube.com/watch?v=yD40rVKUwPI) on how to create a 
simple item-based recommender from scratch using Eclipse. (Note that you can 
avoid manually downloading the library jars by including mahout as [maven 
dependency](/general/downloads.html) into your project). 
+![image](http://postimg.org/image/6yw9b3fdn/)
 
- * The paper [Collaborative Filtering with Apache 
Mahout](http://ssc.io/wp-content/uploads/2013/02/cf-mahout.pdf) by Sebastian 
Schelter and Sean Owen gives a short overview of Mahout's non-distributed 
recommenders and has pointers to research papers describing the underlying 
algorithms. 
+To integrate with your application you will collect user interactions, storing 
them in a DB and also in a form usable by Mahout. The simplest way to do this 
is to log interactions to CSV files (user-id, item-id). The DB should be set up to 
contain the last n user interactions, which will form part of the query for 
recommendations.
 
- * For a more full featured Multimodal Recommender based on the newest Spark 
version of Mahout and integration with a 
-fast server using a search engine see references on the [Mahout 
site](http://mahout.apache.org/users/recommender/intro-cooccurrence-spark.html).
\ No newline at end of file
+Mahout's spark-itemsimilarity will create a table of (item-id, 
list-of-similar-items) in csv form. Think of this as an item collection with 
one field containing the item-ids of similar items. Index this with your search 
engine. 
+
+When your application needs recommendations for a specific person, get the 
latest user history of interactions from the DB and query the indicator 
collection with this history. You will get back an ordered list of item-ids. 
These are your recommendations. You may wish to filter out any that the user 
has already seen but that will depend on your use case.
+
+##References
+
+1. A free ebook, which talks about the general idea: [Practical Machine 
Learning](https://www.mapr.com/practical-machine-learning)
+2. A slide deck, which talks about mixing actions or other indicators: 
[Creating a Multimodal Recommender with Mahout and a Search 
Engine](http://occamsmachete.com/ml/2014/10/07/creating-a-unified-recommender-with-mahout-and-a-search-engine/)
+3. Two blog posts: [What's New in Recommenders: part 
#1](http://occamsmachete.com/ml/2014/08/11/mahout-on-spark-whats-new-in-recommenders/)
+and  [What's New in Recommenders: part 
#2](http://occamsmachete.com/ml/2014/09/09/mahout-on-spark-whats-new-in-recommenders-part-2/)
+3. A post describing the log-likelihood ratio: [Surprise and 
Coincidence](http://tdunning.blogspot.com/2008/03/surprise-and-coincidence.html)
  LLR is used to reduce noise in the data while keeping the calculations O(n) 
complexity.
+
+##Mahout Jobs
+
+See the page describing 
[*spark-itemsimilarity*](http://mahout.apache.org/users/recommender/intro-cooccurrence-spark.html)
 for more details.
\ No newline at end of file
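
The collection step in the new page text (log each interaction as a (user-id, item-id) row and keep the last n interactions per user for querying) could be sketched roughly as follows. This is only an illustration: the function names are hypothetical, an in-memory `deque` stands in for the DB, and a `StringIO` buffer stands in for the CSV log file.

```python
import csv
import io
from collections import defaultdict, deque


def make_history_store(n):
    """Hypothetical stand-in for the DB: keeps only the last n items per user."""
    return defaultdict(lambda: deque(maxlen=n))


def record_interaction(writer, history, user_id, item_id):
    """Append one (user-id, item-id) row to the log and update the user's history."""
    writer.writerow([user_id, item_id])
    history[user_id].append(item_id)


# Assumption: a StringIO buffer replaces a real CSV file on disk.
log = io.StringIO()
writer = csv.writer(log)
history = make_history_store(3)

events = [
    ("u1", "iphone"),
    ("u1", "ipad"),
    ("u2", "nexus"),
    ("u1", "galaxy"),
    ("u1", "surface"),
]
for user, item in events:
    record_interaction(writer, history, user, item)

rows = log.getvalue().splitlines()
```

The CSV rows are what a batch job would read later, while `history[user_id]` holds the recent interactions that become part of the recommendation query.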

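The query step (look up the user's recent history, match it against the indexed list-of-similar-items field, and optionally filter out items already seen) can be illustrated with a toy scorer. Everything here is an assumption made for illustration: the `recommend` function is hypothetical, the indicator data is made up, and counting overlapping item-ids is only a stand-in for the relevance ranking a real search engine such as Solr would apply.

```python
def recommend(indicators, history, exclude_seen=True, limit=10):
    """Rank items by how many history items appear in their similar-items field.

    indicators: dict mapping item-id -> list of similar item-ids
    history:    list of item-ids the user interacted with recently
    """
    seen = set(history)
    scored = []
    for item_id, similar in indicators.items():
        if exclude_seen and item_id in seen:
            continue  # optional filter: skip items the user has already seen
        score = len(seen.intersection(similar))
        if score > 0:
            scored.append((score, item_id))
    # Highest overlap first; ties broken by item-id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item_id for _, item_id in scored[:limit]]


# Made-up indicator table in the (item-id, list-of-similar-items) shape.
indicators = {
    "iphone": ["ipad", "galaxy"],
    "ipad": ["iphone", "surface"],
    "nexus": ["galaxy", "iphone"],
    "surface": ["ipad"],
}

recs = recommend(indicators, ["iphone", "galaxy"])
recs_unfiltered = recommend(indicators, ["iphone", "galaxy"], exclude_seen=False)
```

Whether to pass `exclude_seen=True` depends on the use case, as the page text notes; a real deployment would express the same filter in the search engine query instead.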