Author: buildbot
Date: Sun Mar 8 20:57:32 2015
New Revision: 942911
Log:
Staging update by buildbot for mahout
Modified:
websites/staging/mahout/trunk/content/ (props changed)
websites/staging/mahout/trunk/content/users/recommender/quickstart.html
Propchange: websites/staging/mahout/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Sun Mar 8 20:57:32 2015
@@ -1 +1 @@
-1665057
+1665077
Modified:
websites/staging/mahout/trunk/content/users/recommender/quickstart.html
==============================================================================
--- websites/staging/mahout/trunk/content/users/recommender/quickstart.html
(original)
+++ websites/staging/mahout/trunk/content/users/recommender/quickstart.html Sun
Mar 8 20:57:32 2015
@@ -251,6 +251,7 @@
<p>To integrate with your application you will collect user interactions,
storing them in a DB and also in a form usable by Mahout. The simplest way to
do this is to log interactions to csv files (user-id, item-id). The DB should be
set up to contain the last n user interactions, which will form part of the
query for recommendations.</p>
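<p>A minimal sketch of the logging step described above, assuming Python and an
in-memory store standing in for the DB. The class name
<code>InteractionLog</code>, its methods, and the default value of n are
hypothetical and chosen only for illustration.</p>

```python
import csv
from collections import defaultdict, deque


class InteractionLog:
    """Appends (user-id, item-id) rows to a CSV file and keeps the
    last n item ids per user in memory (a stand-in for the DB)."""

    def __init__(self, csv_file, n_recent=3):
        # csv_file is any writable text file object opened by the caller.
        self._writer = csv.writer(csv_file)
        # deque(maxlen=n) silently drops the oldest entry once it is full.
        self._recent = defaultdict(lambda: deque(maxlen=n_recent))

    def record(self, user_id, item_id):
        # Ids are kept as plain strings, exactly as the application supplies them.
        self._writer.writerow([user_id, item_id])
        self._recent[user_id].append(item_id)

    def history(self, user_id):
        # Oldest first; an unknown user yields an empty list.
        return list(self._recent.get(user_id, ()))
```

<p>In a real deployment the in-memory dictionary would be replaced by a DB table
keyed by user id; only the "keep the last n" behaviour matters here.</p>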
<p>Mahout's spark-itemsimilarity will create a table of (item-id,
list-of-similar-items) in csv form. Think of this as an item collection with
one field containing the item-ids of similar items. Index this with your search
engine. </p>
<p>When your application needs recommendations for a specific person, get the
latest user history of interactions from the DB and query the indicator
collection with this history. You will get back an ordered list of item-ids.
These are your recommendations. You may wish to filter out any that the user
has already seen but that will depend on your use case.</p>
+<p>All ids for users and items are preserved as string tokens, so they work
as external keys in DBs, as doc ids for search engines, and as tokens for
search queries.</p>
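<p>The query step can be sketched roughly as follows, assuming Python and a
plain dictionary standing in for the indexed indicator collection. The function
<code>recommend</code> and its parameters are hypothetical, and the
overlap-count scoring is a deliberate simplification of what a search engine's
relevance ranking would do.</p>

```python
def recommend(indicators, history, exclude_seen=True, limit=10):
    """Rank items by how many of the user's history ids appear in
    each item's list of similar items.

    indicators: dict mapping item-id -> list of similar item-ids (strings)
    history:    list of item-ids the user recently interacted with
    """
    history_set = set(history)
    scored = []
    for item_id, similar in indicators.items():
        # Optionally skip items the user has already seen.
        if exclude_seen and item_id in history_set:
            continue
        overlap = len(history_set.intersection(similar))
        if overlap > 0:
            scored.append((overlap, item_id))
    # Highest overlap first; ties are broken by item id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item_id for _, item_id in scored[:limit]]
```

<p>Because the ids are ordinary strings, the same values can be used unchanged
as dictionary keys here, as DB keys, and as query tokens.</p>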
<h2 id="references">References</h2>
<ol>
<li>A free ebook, which talks about the general idea: <a
href="https://www.mapr.com/practical-machine-learning">Practical Machine
Learning</a></li>
@@ -259,7 +260,7 @@
and <a
href="http://occamsmachete.com/ml/2014/09/09/mahout-on-spark-whats-new-in-recommenders-part-2/">What's
New in Recommenders: part #2</a></li>
<li>A post describing the log-likelihood ratio (LLR): <a
href="http://tdunning.blogspot.com/2008/03/surprise-and-coincidence.html">Surprise
and Coincidence</a>. LLR is used to reduce noise in the data while keeping the
calculations at O(n) complexity.</li>
</ol>
-<h2 id="mahout-jobs">Mahout Jobs</h2>
+<h2 id="mahout-model-creation">Mahout Model Creation</h2>
<p>See the page describing <a
href="http://mahout.apache.org/users/recommender/intro-cooccurrence-spark.html"><em>spark-itemsimilarity</em></a>
for more details.</p>
</div>
</div>