Author: buildbot
Date: Mon Apr 20 21:26:46 2015
New Revision: 948507

Log:
Staging update by buildbot for mahout

Modified:
    websites/staging/mahout/trunk/content/   (props changed)
    
websites/staging/mahout/trunk/content/users/environment/how-to-build-an-app.html

Propchange: websites/staging/mahout/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Mon Apr 20 21:26:46 2015
@@ -1 +1 @@
-1674987
+1674989

Modified: 
websites/staging/mahout/trunk/content/users/environment/how-to-build-an-app.html
==============================================================================
--- 
websites/staging/mahout/trunk/content/users/environment/how-to-build-an-app.html
 (original)
+++ 
websites/staging/mahout/trunk/content/users/environment/how-to-build-an-app.html
 Mon Apr 20 21:26:46 2015
@@ -261,7 +261,54 @@
 
   <div id="content-wrap" class="clearfix">
    <div id="main">
-    <h1 id="how-to-build-an-app">How to build an app</h1>
+    <h1 id="multiple-indicator-creation">Multiple Indicator Creation</h1>
+<p>This is an example of how to create more than two indicators from more than 
two user interaction types with Mahout. We will use very simple hand-created 
example data of the kind one might see in an ecommerce application. The 
application records three interaction types: item-purchase, item-detail-view, and 
category-preference (search for or click on a category).</p>
+<p><em>spark-itemsimilarity</em> handles two inputs, but here we have three, 
so rather than running <em>spark-itemsimilarity</em> twice we will create our 
own app to do it.</p>
+<h2 id="setup">Setup</h2>
+<p>In order to build and run the CooccurrenceDriver you need to install the 
following:</p>
+<ul>
+<li>Install the Java 7 JDK from Oracle. Mac users look here: <a 
href="http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html">Java
 SE Development Kit 7u72</a>.</li>
+<li>Install sbt (simple build tool) 0.13.x for <a 
href="Installing-sbt-on-Mac.html">Mac</a>, <a 
href="Installing-sbt-on-Windows.html">Windows</a>,
+<a href="Installing-sbt-on-Linux.html">Linux</a>,  or
+<a href="Manual-Installation.html">manual installation</a>.</li>
+<li>Install <a 
href="http://mahout.apache.org/general/downloads.html">Mahout</a>. Don't forget 
to set up MAHOUT_HOME and MAHOUT_LOCAL.</li>
+</ul>
+<h2 id="build">Build</h2>
+<p>Build the examples from the project's root folder:</p>
+<div class="codehilite"><pre>$ <span class="n">sbt</span> <span 
class="n">pack</span>
+</pre></div>
+
+
+<p>This will automatically set up some launcher scripts for the driver. To run 
the driver, execute:</p>
+<div class="codehilite"><pre>$ <span class="n">target</span><span 
class="o">/</span><span class="n">pack</span><span class="o">/</span><span 
class="n">bin</span><span class="o">/</span><span class="n">cooc</span>
+</pre></div>
+
+
+<p>The driver will execute in Spark standalone mode on the provided sample 
data and log information about the input data. The output will be in 
/path/to/3-input-cooc/data/indicators/<em>indicator-type</em>.</p>
+<h2 id="cooccurrencedriver">CooccurrenceDriver</h2>
+<p>This driver takes three actions in three separate input files. The input is 
in tuple form (user-id,item-id), one tuple per line. It calculates all 
cooccurrence and cross-cooccurrence indicators. The sample actions are trivial 
hand-made examples with somewhat intuitive data.</p>
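+<p>As a rough illustration of the tuple format described above, the following plain-Python sketch parses (user-id,item-id) lines into tuples. It is not Mahout's reader code: the function name <code>read_tuples</code>, the comma delimiter, and the sample IDs are assumptions made for this example only.</p>

```python
import io


def read_tuples(lines, delimiter=","):
    """Parse 'user-id,item-id' lines into (user_id, item_id) tuples.

    Blank lines are skipped. The delimiter is an assumption; use whatever
    the actual input files are delimited with.
    """
    tuples = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        user_id, item_id = line.split(delimiter, 1)
        tuples.append((user_id.strip(), item_id.strip()))
    return tuples


# Hypothetical sample input, standing in for one action file.
sample = io.StringIO("u1,iphone\nu1,ipad\n\nu2,nexus\n")
parsed = read_tuples(sample)
print(parsed)
```

+<p>Any iterable of lines works as input, so an open file handle can be passed in place of the in-memory sample.</p>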
+<p>Actions:</p>
+<ol>
+<li><strong>Purchase</strong>: user purchases</li>
+<li><strong>View</strong>: user views of product details</li>
+<li><strong>Category</strong>: user preference for category tags</li>
+</ol>
+<p>Indicators:</p>
+<ol>
+<li><strong>Purchase cooccurrence</strong>: may be interpreted as a list of 
similar items for each item, similar in terms of which users purchased 
them.</li>
+<li><strong>View cross-cooccurrence</strong>: may be interpreted as a list of 
similar items in terms of which users viewed the item where the view led to a 
purchase.</li>
+<li><strong>Category cross-cooccurrence</strong>: may be interpreted as a 
list of similar categories in terms of which users preferred the category and 
this led to a purchase.</li>
+</ol>
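+<p>To make the difference between cooccurrence and cross-cooccurrence concrete, here is a minimal plain-Python sketch that counts, for each pair of items, how many users performed both actions. It uses raw counts on hypothetical data and is a simplification for illustration, not the computation the CooccurrenceDriver performs. The function name <code>cross_cooccurrence</code> and its <code>skip_identical</code> parameter are invented for this example.</p>

```python
from collections import defaultdict


def cross_cooccurrence(primary, secondary, skip_identical=False):
    """Count users shared by each (primary item, secondary item) pair.

    primary and secondary are lists of (user_id, item_id) tuples.
    Passing the same action for both, with skip_identical=True,
    gives plain cooccurrence without pairing an item with itself.
    """
    primary_by_user = defaultdict(set)
    for user, item in primary:
        primary_by_user[user].add(item)
    secondary_by_user = defaultdict(set)
    for user, item in secondary:
        secondary_by_user[user].add(item)

    counts = defaultdict(int)
    for user, primary_items in primary_by_user.items():
        for p in primary_items:
            for s in secondary_by_user.get(user, ()):
                if skip_identical and p == s:
                    continue
                counts[(p, s)] += 1
    return dict(counts)


# Hypothetical hand-made actions as (user-id, item-id) tuples.
purchases = [("u1", "iphone"), ("u1", "ipad"),
             ("u2", "iphone"), ("u2", "ipad"), ("u3", "nexus")]
views = [("u1", "ipad"), ("u2", "nexus"), ("u3", "nexus")]

# Purchase cooccurrence: items bought by the same users.
purchase_cooc = cross_cooccurrence(purchases, purchases, skip_identical=True)
# View cross-cooccurrence: viewed items paired with items the same user purchased.
view_cross = cross_cooccurrence(purchases, views)
print(purchase_cooc)
print(view_cross)
```

+<p>In this sketch the purchase action plays the primary role in both calls, matching the description above in which views and category preferences are related back to purchases.</p>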
+<h2 id="data">Data</h2>
+<p>Mahout has reader traits that will read text-delimited files. Input for 
<em>spark-itemsimilarity</em> and this CooccurrenceDriver consists of tuples of 
(user-id,item-id) with one line per tuple. The inputs for CooccurrenceDriver 
are files, but in <em>spark-itemsimilarity</em> they may be directories of 
"part-xxxxx" files. The sample input files can be found in the 
<code>data</code> directory.</p>
+<h2 id="using-a-debugger">Using a Debugger</h2>
+<p>To build and run this example in a debugger like IntelliJ IDEA, install 
IDEA from the IntelliJ site and add the Scala plugin.</p>
+<p>Open IDEA and go to the menu File-&gt;New-&gt;Project from existing 
sources-&gt;SBT-&gt;/path/to/3-input-cooc. This will create an IDEA project 
from <code>build.sbt</code> in the root directory.</p>
+<p>At this point you may create a "Debug Configuration" to run. In the menu 
choose Run-&gt;Edit Configurations. Under "Default" choose "Application". In 
the dialog hit the ellipsis button "..." to the right of "Environment Variables" 
and fill in your versions of JAVA_HOME, SPARK_HOME, and MAHOUT_HOME. In the 
configuration editor under "Use classpath from" choose the root-3-input-cooc 
module.</p>
+<p><img alt="image" src="http://mahout.apache.org/images/debug-config.png" 
title="=400x" /></p>
+<p>Now choose "Application" in the left pane and hit the plus sign "+". Give 
the config a name and hit the ellipsis button to the right of the "Main class" 
field as shown.</p>
+<p><img alt="image" src="http://mahout.apache.org/images/debug-config-2.png" 
title="=600x" /></p>
+<p>After setting breakpoints you are now ready to debug the configuration. Go 
to the Run-&gt;Debug... menu and pick your configuration. This will execute 
using a local standalone instance of Spark.</p>
    </div>
   </div>     
 </div> 

