Author: buildbot
Date: Sun Sep 21 15:19:28 2014
New Revision: 923072

Log:
Staging update by buildbot for mahout

Modified:
    websites/staging/mahout/trunk/content/   (props changed)
    websites/staging/mahout/trunk/content/users/recommender/intro-cooccurrence-spark.html

Propchange: websites/staging/mahout/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Sun Sep 21 15:19:28 2014
@@ -1 +1 @@
-1622719
+1626592

Modified: websites/staging/mahout/trunk/content/users/recommender/intro-cooccurrence-spark.html
==============================================================================
--- websites/staging/mahout/trunk/content/users/recommender/intro-cooccurrence-spark.html (original)
+++ websites/staging/mahout/trunk/content/users/recommender/intro-cooccurrence-spark.html Sun Sep 21 15:19:28 2014
@@ -348,8 +348,17 @@ to recommend.   </p>
 </pre></div>
 
 
-<h3 id="more-complex-input">More Complex Input</h3>
-<p>For input of the form:</p>
+<h3 id="how-to-use-multiple-user-actions">How to use Multiple User Actions</h3>
+<p>Often we record various actions the user takes for later analytics. These 
can now be used to make recommendations. 
+The idea of a recommender is to recommend the action you want the user to 
take. For an e-commerce app this might be 
+a purchase action. It is usually not a good idea to treat other actions 
the same as the action you want to recommend. 
+For instance, a view of an item does not indicate the same intent as a purchase, 
and if you simply mix the two together you 
+might even make worse recommendations. It is tempting, though, since there are 
so many more views than purchases. With <em>spark-itemsimilarity</em>
+we can now use both actions. Mahout uses cross-action cooccurrence 
analysis to limit the views to ones that do predict purchases.
+We do this by treating the primary action (purchase) as data for the indicator 
matrix and using the secondary action (view) 
+to calculate the cross-indicator matrix.</p>
+<p><em>spark-itemsimilarity</em> can read separate actions from separate files 
or from a mixed action log by filtering certain lines. For a mixed 
+action log of the form:</p>
 <div class="codehilite"><pre><span class="n">u1</span><span 
class="p">,</span><span class="n">purchase</span><span class="p">,</span><span 
class="n">iphone</span>
 <span class="n">u1</span><span class="p">,</span><span 
class="n">purchase</span><span class="p">,</span><span class="n">ipad</span>
 <span class="n">u2</span><span class="p">,</span><span 
class="n">purchase</span><span class="p">,</span><span class="n">nexus</span>
@@ -374,7 +383,7 @@ to recommend.   </p>
 
 
 <h3 id="command-line">Command Line</h3>
-<p>Use the following options can be used:</p>
+<p>Use the following options:</p>
 <div class="codehilite"><pre><span class="n">bash</span>$ <span 
class="n">mahout</span> <span class="n">spark</span><span 
class="o">-</span><span class="n">itemsimilarity</span> <span class="o">\</span>
     <span class="o">--</span><span class="n">input</span> <span 
class="n">in</span><span class="o">-</span><span class="n">file</span> <span 
class="o">\</span>     # <span class="n">where</span> <span class="n">to</span> 
<span class="n">look</span> <span class="k">for</span> <span 
class="n">data</span>
     <span class="o">--</span><span class="n">output</span> <span 
class="n">out</span><span class="o">-</span><span class="n">path</span> <span 
class="o">\</span>   # <span class="n">root</span> <span class="n">dir</span> 
<span class="k">for</span> <span class="n">output</span>
@@ -388,7 +397,8 @@ to recommend.   </p>
 
 
 <h3 id="output">Output</h3>
-<p>The output of the job will be the standard text version of two Mahout DRMs. 
This is a case where we are calculating cross-cooccurrence so a primary 
indicator matrix and cross-indicator matrix will be created</p>
+<p>The output of the job will be the standard text version of two Mahout DRMs. 
This is a case where we are calculating 
+cross-cooccurrence, so a primary indicator matrix and a cross-indicator matrix 
will be created.</p>
 <div class="codehilite"><pre><span class="n">out</span><span 
class="o">-</span><span class="n">path</span>
   <span class="o">|--</span> <span class="n">indicator</span><span 
class="o">-</span><span class="n">matrix</span> <span class="o">-</span> <span 
class="n">TDF</span> <span class="n">part</span> <span class="n">files</span>
   <span class="o">\--</span> <span class="nb">cross</span><span 
class="o">-</span><span class="n">indicator</span><span class="o">-</span><span 
class="n">matrix</span> <span class="o">-</span> <span class="n">TDF</span> 
<span class="n">part</span><span class="o">-</span><span class="n">files</span>
@@ -413,6 +423,8 @@ to recommend.   </p>
 </pre></div>
 
 
+<p><strong>Note:</strong> You can run this multiple times to use more than two 
actions, or you can use the underlying 
+SimilarityAnalysis.cooccurrence API, which will more efficiently calculate any 
number of cross-indicators.</p>
 <h3 id="log-file-input">Log File Input</h3>
 <p>A common method of storing data is in log files. If they are written using 
some delimiter they can be consumed directly by spark-itemsimilarity. For 
instance input of the form:</p>
 <div class="codehilite"><pre>2014<span class="o">-</span>06<span 
class="o">-</span>23 14<span class="p">:</span>46<span 
class="p">:</span>53<span class="p">.</span>115<span class="o">\</span><span 
class="n">tu1</span><span class="o">\</span><span 
class="n">tpurchase</span><span class="o">\</span><span 
class="n">trandom</span> <span class="n">text</span><span 
class="o">\</span><span class="n">tiphone</span>

