Author: apalumbo
Date: Sat Mar 21 21:19:13 2015
New Revision: 1668322

URL: http://svn.apache.org/r1668322
Log:
copy recommender/quickstart to algorithms dir, rename to recommender-overview

Added:
    mahout/site/mahout_cms/trunk/content/users/algorithms/recommender-overview.mdtext
Modified:
    mahout/site/mahout_cms/trunk/templates/standard.html

Added: mahout/site/mahout_cms/trunk/content/users/algorithms/recommender-overview.mdtext
URL: http://svn.apache.org/viewvc/mahout/site/mahout_cms/trunk/content/users/algorithms/recommender-overview.mdtext?rev=1668322&view=auto
==============================================================================
--- mahout/site/mahout_cms/trunk/content/users/algorithms/recommender-overview.mdtext (added)
+++ mahout/site/mahout_cms/trunk/content/users/algorithms/recommender-overview.mdtext Sat Mar 21 21:19:13 2015
@@ -0,0 +1,27 @@
+Title: Recommender Overview
+
+# Recommender Overview
+
+Recommenders have changed over the years. Mahout contains a long list of them, 
which you can still use. But to get the best out of our more modern approach, 
we'll need to think of the recommender as two parts: a "model creation" 
component, supplied by Mahout's new spark-itemsimilarity job, and a 
"serving" component, supplied by a modern scalable search engine such as 
Solr.
+
+![image](http://i.imgur.com/fliHMBo.png)
+
+To integrate with your application, you will collect user interactions, storing 
them in a DB and also in a form usable by Mahout. The simplest way to do this 
is to log user interactions to csv files (user-id, item-id). The DB should be 
set up to contain each user's last n interactions, which will form part of the 
query for recommendations.
+
+Mahout's spark-itemsimilarity will create a table of (item-id, 
list-of-similar-items) in csv form. Think of this as an item collection with 
one field containing the item-ids of similar items. Index this with your search 
engine. 
+
+When your application needs recommendations for a specific person, get that 
user's latest interaction history from the DB and use it to query the indexed 
item collection (the indicator collection). You will get back an ordered list of item-ids. 
These are your recommendations. You may wish to filter out any that the user 
has already seen, but that will depend on your use case.
+
+All ids for users and items are preserved as string tokens, so they work as 
external keys in DBs or as doc ids for search engines. They also work as tokens 
for search queries.
+
+## References
+
+1. A free ebook, which talks about the general idea: [Practical Machine 
Learning](https://www.mapr.com/practical-machine-learning)
+2. A slide deck, which talks about mixing actions or other indicators: 
[Creating a Multimodal Recommender with Mahout and a Search 
Engine](http://occamsmachete.com/ml/2014/10/07/creating-a-unified-recommender-with-mahout-and-a-search-engine/)
+3. Two blog posts: [What's New in Recommenders: part 
#1](http://occamsmachete.com/ml/2014/08/11/mahout-on-spark-whats-new-in-recommenders/)
+and [What's New in Recommenders: part 
#2](http://occamsmachete.com/ml/2014/09/09/mahout-on-spark-whats-new-in-recommenders-part-2/)
+4. A post describing the log-likelihood ratio (LLR): [Surprise and 
Coincidence](http://tdunning.blogspot.com/2008/03/surprise-and-coincidence.html).
  LLR is used to reduce noise in the data while keeping the calculations at O(n) 
complexity.
+
+## Mahout Model Creation
+
+See the page describing 
[*spark-itemsimilarity*](http://mahout.apache.org/users/recommender/intro-cooccurrence-spark.html)
 for more details.
\ No newline at end of file
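
The flow described in recommender-overview.mdtext above (log interactions, keep the last n per user, query the indicator collection with that history) can be sketched roughly as follows. This is only an illustration under stated assumptions: every name here (`log_interaction`, `recommend`, `N_RECENT`) is hypothetical, the in-memory structures stand in for the DB and the search index, the indicator table is written by hand rather than produced by spark-itemsimilarity, and a plain overlap count stands in for the relevance score a real search engine such as Solr would compute.

```python
import csv
import io
from collections import defaultdict, deque

N_RECENT = 3  # hypothetical "last n" window kept per user


def log_interaction(writer, history, user_id, item_id):
    """Append one (user-id, item-id) row to the CSV log and update recent history."""
    writer.writerow([user_id, item_id])
    history[user_id].append(item_id)


def recommend(indicators, user_history, exclude_seen=True):
    """Rank items by how many of the user's recent items occur in their indicator field.

    The overlap count is an assumption standing in for search-engine scoring.
    """
    seen = set(user_history)
    scores = {}
    for item_id, similar in indicators.items():
        if exclude_seen and item_id in seen:
            continue
        overlap = len(seen.intersection(similar))
        if overlap:
            scores[item_id] = overlap
    # Highest overlap first; ties broken by item-id so the order is deterministic.
    return sorted(scores, key=lambda i: (-scores[i], i))


# In-memory stand-ins: a StringIO for the CSV log, a dict of deques for the DB.
log = io.StringIO()
writer = csv.writer(log)
history = defaultdict(lambda: deque(maxlen=N_RECENT))

for user, item in [("u1", "iphone"), ("u1", "ipad"), ("u2", "nexus")]:
    log_interaction(writer, history, user, item)

# Hand-written (item-id, list-of-similar-items) table for illustration only.
indicators = {
    "iphone": ["ipad", "galaxy"],
    "ipad": ["iphone", "surface"],
    "galaxy": ["iphone", "nexus"],
    "surface": ["ipad", "iphone"],
    "nexus": ["galaxy"],
}

recommendations = recommend(indicators, history["u1"])
print(recommendations)
```

Because ids stay as string tokens end to end, the same values serve as log fields, history entries, and query terms. Whether to pass `exclude_seen=False` depends on the use case, as the overview notes.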

Modified: mahout/site/mahout_cms/trunk/templates/standard.html
URL: http://svn.apache.org/viewvc/mahout/site/mahout_cms/trunk/templates/standard.html?rev=1668322&r1=1668321&r2=1668322&view=diff
==============================================================================
--- mahout/site/mahout_cms/trunk/templates/standard.html (original)
+++ mahout/site/mahout_cms/trunk/templates/standard.html Sat Mar 21 21:19:13 2015
@@ -168,6 +168,7 @@
                   <li class="nav-header">Matrix Decomposition</li>
                   <li><a href="/users/algorithms/d-qr.html">Distributed 
QR</a></li>
                   <li class="nav-header">Recommendations</li>
+                  <li><a 
href="/users/algorithms/recommender-overview.html">Recommender Overview</a></li>
                   <li><a 
href="/users/algorithms/intro-cooccurrence-spark.html">Intro to 
cooccurrence-based<br/> recommendations with Spark</a></li>
                   <li class="nav-header">Classification</li>
                   <li><a href="/users/algorithms/spark-naive-bayes.html">Spark 
Naive Bayes</a></li>
@@ -199,7 +200,6 @@
                   <li><a href="/users/clustering/cluster-dumper.html">Cluster 
Dumper tool</a></li>
                   <li><a 
href="/users/clustering/visualizing-sample-clusters.html">Cluster 
visualisation</a></li>
                   <li class="nav-header">Recommendations</li>
-                  <li><a 
href="/users/recommender/quickstart.html">Quickstart</a></li>
                   <li><a 
href="/users/recommender/recommender-first-timer-faq.html">First Timer 
FAQ</a></li>
                   <li><a href="/users/recommender/userbased-5-minutes.html">A 
user-based recommender <br/>in 5 minutes</a></li>
                  <li><a 
href="/users/recommender/matrix-factorization.html">Matrix 
factorization-based<br/> recommenders</a></li>

