http://git-wip-us.apache.org/repos/asf/incubator-predictionio-site/blob/138f9481/templates/ecommercerecommendation/dase/index.html
----------------------------------------------------------------------
diff --git a/templates/ecommercerecommendation/dase/index.html 
b/templates/ecommercerecommendation/dase/index.html
new file mode 100644
index 0000000..33c4da4
--- /dev/null
+++ b/templates/ecommercerecommendation/dase/index.html
@@ -0,0 +1,806 @@
+<!DOCTYPE html><html><head><title>DASE Components Explained (E-Commerce 
Recommendation)</title><meta charset="utf-8"/><meta content="IE=edge,chrome=1" 
http-equiv="X-UA-Compatible"/><meta name="viewport" 
content="width=device-width, initial-scale=1.0"/><meta class="swiftype" 
name="title" data-type="string" content="DASE Components Explained (E-Commerce 
Recommendation)"/><link rel="canonical" 
href="https://docs.prediction.io/templates/ecommercerecommendation/dase/"/><link
 href="/images/favicon/normal-b330020a.png" rel="shortcut icon"/><link 
href="/images/favicon/apple-c0febcf2.png" rel="apple-touch-icon"/><link 
href="//fonts.googleapis.com/css?family=Open+Sans:300italic,400italic,600italic,700italic,800italic,400,300,600,700,800"
 rel="stylesheet"/><link 
href="//maxcdn.bootstrapcdn.com/font-awesome/4.2.0/css/font-awesome.min.css" 
rel="stylesheet"/><link href="/stylesheets/application-a2a2f408.css" 
rel="stylesheet" type="text/css"/><script 
src="//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.2/html5shiv.min.js"></script><script 
src="//cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script><script
 src="//use.typekit.net/pqo0itb.js"></script><script>try{Typekit.load({ async: 
true });}catch(e){}</script></head><body><div id="global"><header><div 
class="container" id="header-wrapper"><div class="row"><div 
class="col-sm-12"><div id="logo-wrapper"><span id="drawer-toggle"></span><a 
href="#"></a><a href="http://predictionio.incubator.apache.org/"><img 
alt="PredictionIO" id="logo" 
src="/images/logos/logo-ee2b9bb3.png"/></a></div><div id="menu-wrapper"><div 
id="pill-wrapper"><a class="pill left" 
href="/gallery/template-gallery">TEMPLATES</a> <a class="pill right" 
href="//github.com/apache/incubator-predictionio/">OPEN 
SOURCE</a></div></div><img class="mobile-search-bar-toggler hidden-md 
hidden-lg" 
src="/images/icons/search-glass-704bd4ff.png"/></div></div></div></header><div 
id="search-bar-row-wrapper"><div class="container-fluid" id="search-bar-row"><div class="row"><div class="col-md-9 col-sm-11 col-xs-11"><div 
class="hidden-md hidden-lg" id="mobile-page-heading-wrapper"><p>PredictionIO 
Docs</p><h4>DASE Components Explained (E-Commerce Recommendation)</h4></div><h4 
class="hidden-sm hidden-xs">PredictionIO Docs</h4></div><div class="col-md-3 
col-sm-1 col-xs-1 hidden-md hidden-lg"><img id="left-menu-indicator" 
src="/images/icons/down-arrow-dfe9f7fe.png"/></div><div class="col-md-3 
col-sm-12 col-xs-12 swiftype-wrapper"><div class="swiftype"><form 
class="search-form"><img class="search-box-toggler hidden-xs hidden-sm" 
src="/images/icons/search-glass-704bd4ff.png"/><div class="search-box"><img 
src="/images/icons/search-glass-704bd4ff.png"/><input type="text" 
id="st-search-input" class="st-search-input" placeholder="Search 
Doc..."/></div><img class="swiftype-row-hider hidden-md hidden-lg" 
src="/images/icons/drawer-toggle-active-fcbef12a.png"/></form></div></div><div 
class="mobile-left-menu-toggler hidden-md hidden-lg"></div></div></div></div><div id="page" class="container-fluid"><div class="row"><div 
id="left-menu-wrapper" class="col-md-3"><nav id="nav-main"><ul><li 
class="level-1"><a class="expandible" href="/"><span>Apache PredictionIO 
(incubating) Documentation</span></a><ul><li class="level-2"><a class="final" 
href="/"><span>Welcome to Apache PredictionIO 
(incubating)</span></a></li></ul></li><li class="level-1"><a 
class="expandible" href="#"><span>Getting Started</span></a><ul><li 
class="level-2"><a class="final" href="/start/"><span>A Quick 
Intro</span></a></li><li class="level-2"><a class="final" 
href="/install/"><span>Installing Apache PredictionIO 
(incubating)</span></a></li><li class="level-2"><a class="final" 
href="/start/download/"><span>Downloading an Engine 
Template</span></a></li><li class="level-2"><a class="final" 
href="/start/deploy/"><span>Deploying Your First Engine</span></a></li><li 
class="level-2"><a class="final" href="/start/customize/"><span>Customizing 
the Engine</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Integrating 
with Your App</span></a><ul><li class="level-2"><a class="final" 
href="/appintegration/"><span>App Integration Overview</span></a></li><li 
class="level-2"><a class="expandible" href="/sdk/"><span>List of 
SDKs</span></a><ul><li class="level-3"><a class="final" 
href="/sdk/java/"><span>Java & Android SDK</span></a></li><li 
class="level-3"><a class="final" href="/sdk/php/"><span>PHP 
SDK</span></a></li><li class="level-3"><a class="final" 
href="/sdk/python/"><span>Python SDK</span></a></li><li class="level-3"><a 
class="final" href="/sdk/ruby/"><span>Ruby SDK</span></a></li><li 
class="level-3"><a class="final" href="/sdk/community/"><span>Community Powered 
SDKs</span></a></li></ul></li></ul></li><li class="level-1"><a 
class="expandible" href="#"><span>Deploying an Engine</span></a><ul><li 
class="level-2"><a class="final" href="/deploy/"><span>Deploying as a Web 
Service</span></a></li><li class="level-2"><a class="final" href="/cli/#engine-commands"><span>Engine Command-line 
Interface</span></a></li><li class="level-2"><a class="final" 
href="/deploy/monitoring/"><span>Monitoring Engine</span></a></li><li 
class="level-2"><a class="final" href="/deploy/engineparams/"><span>Setting 
Engine Parameters</span></a></li><li class="level-2"><a class="final" 
href="/deploy/enginevariants/"><span>Deploying Multiple Engine 
Variants</span></a></li></ul></li><li class="level-1"><a class="expandible" 
href="#"><span>Customizing an Engine</span></a><ul><li class="level-2"><a 
class="final" href="/customize/"><span>Learning DASE</span></a></li><li 
class="level-2"><a class="final" href="/customize/dase/"><span>Implement 
DASE</span></a></li><li class="level-2"><a class="final" 
href="/customize/troubleshooting/"><span>Troubleshooting Engine 
Development</span></a></li><li class="level-2"><a class="final" 
href="/api/current/#package"><span>Engine Scala 
APIs</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Collecting and Analyzing Data</span></a><ul><li 
class="level-2"><a class="final" href="/datacollection/"><span>Event Server 
Overview</span></a></li><li class="level-2"><a class="final" 
href="/cli/#event-server-commands"><span>Event Server Command-line 
Interface</span></a></li><li class="level-2"><a class="final" 
href="/datacollection/eventapi/"><span>Collecting Data with 
REST/SDKs</span></a></li><li class="level-2"><a class="final" 
href="/datacollection/eventmodel/"><span>Events Modeling</span></a></li><li 
class="level-2"><a class="final" 
href="/datacollection/webhooks/"><span>Unifying Multichannel Data with 
Webhooks</span></a></li><li class="level-2"><a class="final" 
href="/datacollection/channel/"><span>Channel</span></a></li><li 
class="level-2"><a class="final" 
href="/datacollection/batchimport/"><span>Importing Data in 
Batch</span></a></li><li class="level-2"><a class="final" 
href="/datacollection/analytics/"><span>Using Analytics 
Tools</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Choosing an 
Algorithm(s)</span></a><ul><li class="level-2"><a class="final" 
href="/algorithm/"><span>Built-in Algorithm Libraries</span></a></li><li 
class="level-2"><a class="final" href="/algorithm/switch/"><span>Switching to 
Another Algorithm</span></a></li><li class="level-2"><a class="final" 
href="/algorithm/multiple/"><span>Combining Multiple 
Algorithms</span></a></li><li class="level-2"><a class="final" 
href="/algorithm/custom/"><span>Adding Your Own 
Algorithms</span></a></li></ul></li><li class="level-1"><a class="expandible" 
href="#"><span>ML Tuning and Evaluation</span></a><ul><li class="level-2"><a 
class="final" href="/evaluation/"><span>Overview</span></a></li><li 
class="level-2"><a class="final" 
href="/evaluation/paramtuning/"><span>Hyperparameter 
Tuning</span></a></li><li class="level-2"><a class="final" 
href="/evaluation/evaluationdashboard/"><span>Evaluation 
Dashboard</span></a></li><li class="level-2"><a 
class="final" href="/evaluation/metricchoose/"><span>Choosing Evaluation 
Metrics</span></a></li><li class="level-2"><a class="final" 
href="/evaluation/metricbuild/"><span>Building Evaluation 
Metrics</span></a></li></ul></li><li class="level-1"><a class="expandible" 
href="#"><span>System Architecture</span></a><ul><li class="level-2"><a 
class="final" href="/system/"><span>Architecture Overview</span></a></li><li 
class="level-2"><a class="final" href="/system/anotherdatastore/"><span>Using 
Another Data Store</span></a></li></ul></li><li class="level-1"><a 
class="expandible" href="#"><span>Engine Template Gallery</span></a><ul><li 
class="level-2"><a class="final" 
href="/gallery/template-gallery/"><span>Browse</span></a></li><li 
class="level-2"><a class="final" 
href="/community/submit-template/"><span>Submit your Engine as a 
Template</span></a></li></ul></li><li class="level-1"><a class="expandible" 
href="#"><span>Demo Tutorials</span></a><ul><li class="level-2"><a 
class="final" href="/demo/tapster/"><span>Comics Recommendation Demo</span></a></li><li 
class="level-2"><a class="final" href="/demo/community/"><span>Community 
Contributed Demo</span></a></li><li class="level-2"><a class="final" 
href="/demo/textclassification/"><span>Text Classification Engine 
Tutorial</span></a></li></ul></li><li class="level-1"><a class="expandible" 
href="/community/"><span>Getting Involved</span></a><ul><li class="level-2"><a 
class="final" href="/community/contribute-code/"><span>Contribute 
Code</span></a></li><li class="level-2"><a class="final" 
href="/community/contribute-documentation/"><span>Contribute 
Documentation</span></a></li><li class="level-2"><a class="final" 
href="/community/contribute-sdk/"><span>Contribute a SDK</span></a></li><li 
class="level-2"><a class="final" 
href="/community/contribute-webhook/"><span>Contribute a 
Webhook</span></a></li><li class="level-2"><a class="final" 
href="/community/projects/"><span>Community 
Projects</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Getting Help</span></a><ul><li 
class="level-2"><a class="final" 
href="/resources/faq/"><span>FAQs</span></a></li><li class="level-2"><a 
class="final" href="/support/"><span>Support</span></a></li></ul></li><li 
class="level-1"><a class="expandible" 
href="#"><span>Resources</span></a><ul><li class="level-2"><a class="final" 
href="/resources/intellij/"><span>Developing Engines with IntelliJ 
IDEA</span></a></li><li class="level-2"><a class="final" 
href="/resources/upgrade/"><span>Upgrade Instructions</span></a></li><li 
class="level-2"><a class="final" 
href="/resources/glossary/"><span>Glossary</span></a></li></ul></li></ul></nav></div><div
 class="col-md-9 col-sm-12"><div class="content-header hidden-md 
hidden-lg"><div id="page-title"><h1>DASE Components Explained (E-Commerce 
Recommendation)</h1></div></div><div id="table-of-content-wrapper"><h5>On this 
page</h5><aside id="table-of-contents"><ul> <li> <a 
href="#the-engine-design">The Engine Design</a> </li> <li> <a href="#data">Data</a> </li> <li> <a 
href="#algorithm">Algorithm</a> </li> <li> <a href="#serving">Serving</a> </li> 
</ul> </aside><hr/><a id="edit-page-link" 
href="https://github.com/apache/incubator-predictionio/tree/livedoc/docs/manual/source/templates/ecommercerecommendation/dase.html.md.erb"><img
 src="/images/icons/edit-pencil-d6c1bb3d.png"/>Edit this page</a></div><div 
class="content-header hidden-sm hidden-xs"><div id="page-title"><h1>DASE 
Components Explained (E-Commerce Recommendation)</h1></div></div><div 
class="content"><p>PredictionIO&#39;s DASE architecture brings the 
separation-of-concerns design principle to predictive engine development. DASE 
stands for the following components of an engine:</p> <ul> 
<li><strong>D</strong>ata - includes Data Source and Data Preparator</li> 
<li><strong>A</strong>lgorithm(s)</li> <li><strong>S</strong>erving</li> 
<li><strong>E</strong>valuator</li> </ul> <p><p>Let&#39;s look at the code and 
see how you can customize the engine you built from the E-Commerce Recommendation Engine 
Template.</p><div class="alert-message note"><p>Evaluator will not be covered 
in this tutorial.</p></div></p><h2 id='the-engine-design' 
class='header-anchors'>The Engine Design</h2><p>As you can see from the Quick 
Start, <em>MyECommerceRecommendation</em> takes a JSON prediction query, e.g. 
<code>{ &quot;user&quot;: &quot;u1&quot;, &quot;num&quot;: 4 }</code>, and 
returns a JSON predicted result. In 
MyECommerceRecommendation/src/main/scala/<strong><em>Engine.scala</em></strong>,
 the <code>Query</code> case class defines the format of such a 
<strong>query</strong>:</p><div class="highlight scala"><table 
style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: 
right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7</pre></td><td class="code"><pre><span class="k">case</span> <span 
class="k">class</span> <span class="nc">Query</span><span class="o">(</span>
+  <span class="n">user</span><span class="k">:</span> <span 
class="kt">String</span><span class="o">,</span>
+  <span class="n">num</span><span class="k">:</span> <span 
class="kt">Int</span><span class="o">,</span>
+  <span class="n">categories</span><span class="k">:</span> <span 
class="kt">Option</span><span class="o">[</span><span 
class="kt">Set</span><span class="o">[</span><span 
class="kt">String</span><span class="o">]],</span>
+  <span class="n">whiteList</span><span class="k">:</span> <span 
class="kt">Option</span><span class="o">[</span><span 
class="kt">Set</span><span class="o">[</span><span 
class="kt">String</span><span class="o">]],</span>
+  <span class="n">blackList</span><span class="k">:</span> <span 
class="kt">Option</span><span class="o">[</span><span 
class="kt">Set</span><span class="o">[</span><span 
class="kt">String</span><span class="o">]]</span>
+<span class="o">)</span> <span class="k">extends</span> <span 
class="nc">Serializable</span>
+</pre></td></tr></tbody></table> </div> <p>The <code>PredictedResult</code> 
case class defines the format of the <strong>predicted result</strong>, such 
as</p><div class="highlight json"><table style="border-spacing: 
0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre 
class="lineno">1
+2
+3
+4
+5
+6</pre></td><td class="code"><pre><span class="p">{</span><span 
class="s2">"itemScores"</span><span class="p">:[</span><span class="w">
+  </span><span class="p">{</span><span class="s2">"item"</span><span 
class="p">:</span><span class="s2">"22"</span><span class="p">,</span><span 
class="s2">"score"</span><span class="p">:</span><span 
class="mf">4.07</span><span class="p">},</span><span class="w">
+  </span><span class="p">{</span><span class="s2">"item"</span><span 
class="p">:</span><span class="s2">"62"</span><span class="p">,</span><span 
class="s2">"score"</span><span class="p">:</span><span 
class="mf">4.05</span><span class="p">},</span><span class="w">
+  </span><span class="p">{</span><span class="s2">"item"</span><span 
class="p">:</span><span class="s2">"75"</span><span class="p">,</span><span 
class="s2">"score"</span><span class="p">:</span><span 
class="mf">4.04</span><span class="p">},</span><span class="w">
+  </span><span class="p">{</span><span class="s2">"item"</span><span 
class="p">:</span><span class="s2">"68"</span><span class="p">,</span><span 
class="s2">"score"</span><span class="p">:</span><span 
class="mf">3.81</span><span class="p">}</span><span class="w">
+</span><span class="p">]}</span><span class="w">
+</span></pre></td></tr></tbody></table> </div> <p>with:</p><div 
class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td 
class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8</pre></td><td class="code"><pre><span class="k">case</span> <span 
class="k">class</span> <span class="nc">PredictedResult</span><span 
class="o">(</span>
+  <span class="n">itemScores</span><span class="k">:</span> <span 
class="kt">Array</span><span class="o">[</span><span 
class="kt">ItemScore</span><span class="o">]</span>
+<span class="o">)</span> <span class="k">extends</span> <span 
class="nc">Serializable</span>
+
+<span class="k">case</span> <span class="k">class</span> <span 
class="nc">ItemScore</span><span class="o">(</span>
+  <span class="n">item</span><span class="k">:</span> <span 
class="kt">String</span><span class="o">,</span>
+  <span class="n">score</span><span class="k">:</span> <span 
class="kt">Double</span>
+<span class="o">)</span> <span class="k">extends</span> <span 
class="nc">Serializable</span>
+</pre></td></tr></tbody></table> </div> <p>Finally, 
<code>ECommerceRecommendationEngine</code> is the <em>Engine Factory</em> that 
defines the components this engine will use: Data Source, Data Preparator, 
Algorithm(s) and Serving components.</p><div class="highlight scala"><table 
style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: 
right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9</pre></td><td class="code"><pre><span class="k">object</span> <span 
class="nc">ECommerceRecommendationEngine</span> <span class="k">extends</span> 
<span class="nc">IEngineFactory</span> <span class="o">{</span>
+  <span class="k">def</span> <span class="n">apply</span><span 
class="o">()</span> <span class="k">=</span> <span class="o">{</span>
+    <span class="k">new</span> <span class="nc">Engine</span><span 
class="o">(</span>
+      <span class="n">classOf</span><span class="o">[</span><span 
class="kt">DataSource</span><span class="o">],</span>
+      <span class="n">classOf</span><span class="o">[</span><span 
class="kt">Preparator</span><span class="o">],</span>
+      <span class="nc">Map</span><span class="o">(</span><span 
class="s">"ecomm"</span> <span class="o">-&gt;</span> <span 
class="n">classOf</span><span class="o">[</span><span 
class="kt">ECommAlgorithm</span><span class="o">]),</span>
+      <span class="n">classOf</span><span class="o">[</span><span 
class="kt">Serving</span><span class="o">])</span>
+  <span class="o">}</span>
+<span class="o">}</span>
+</pre></td></tr></tbody></table> </div> <h3 id='spark-mllib' 
class='header-anchors'>Spark MLlib</h3><p>The PredictionIO E-Commerce 
Recommendation Engine Template integrates Spark&#39;s MLlib ALS algorithm under 
the DASE architecture. We will take a closer look at the DASE code 
below.</p><p>The MLlib ALS algorithm takes training data as an RDD, i.e. 
<code>RDD[Rating]</code>, and trains a model, which is a 
<code>MatrixFactorizationModel</code> object.</p><p>You can visit <a 
href="https://spark.apache.org/docs/latest/mllib-collaborative-filtering.html">here</a>
 to learn more about MLlib&#39;s ALS collaborative filtering algorithm.</p><h2 
id='data' class='header-anchors'>Data</h2><p>In the DASE architecture, data is 
prepared by two components in sequence: <em>DataSource</em> and 
<em>DataPreparator</em>. They take data from the data store and prepare it 
for the Algorithm.</p><h3 id='data-source' class='header-anchors'>Data 
Source</h3><p>In MyECommerceRecommendation/src/main/scala/<strong><em>DataSource.scala</em></strong>, the <code>readTraining</code> method of 
class <code>DataSource</code> reads and selects data from the <em>Event 
Store</em> (data store of the <em>Event Server</em>). It returns 
<code>TrainingData</code>.</p><div class="highlight scala"><table 
style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: 
right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14
+15
+16
+17
+18
+19
+20
+21
+22
+23
+24
+25
+26
+27
+28
+29
+30
+31
+32
+33
+34</pre></td><td class="code"><pre><span class="k">case</span> <span 
class="k">class</span> <span class="nc">DataSourceParams</span><span 
class="o">(</span><span class="n">appName</span><span class="k">:</span> <span 
class="kt">String</span><span class="o">)</span> <span class="k">extends</span> 
<span class="nc">Params</span>
+
+<span class="k">class</span> <span class="nc">DataSource</span><span 
class="o">(</span><span class="k">val</span> <span class="n">dsp</span><span 
class="k">:</span> <span class="kt">DataSourceParams</span><span 
class="o">)</span>
+  <span class="k">extends</span> <span class="nc">PDataSource</span><span 
class="o">[</span><span class="kt">TrainingData</span>,
+      <span class="kt">EmptyEvaluationInfo</span>, <span 
class="kt">Query</span>, <span class="kt">EmptyActualResult</span><span 
class="o">]</span> <span class="o">{</span>
+
+  <span class="nd">@transient</span> <span class="k">lazy</span> <span 
class="k">val</span> <span class="n">logger</span> <span class="k">=</span> 
<span class="nc">Logger</span><span class="o">[</span><span 
class="kt">this.</span><span class="k">type</span><span class="o">]</span>
+
+  <span class="k">override</span>
+  <span class="k">def</span> <span class="n">readTraining</span><span 
class="o">(</span><span class="n">sc</span><span class="k">:</span> <span 
class="kt">SparkContext</span><span class="o">)</span><span class="k">:</span> 
<span class="kt">TrainingData</span> <span class="o">=</span> <span 
class="o">{</span>
+
+    <span class="c1">// create a RDD of (entityID, User)
+</span>    <span class="k">val</span> <span class="n">usersRDD</span><span 
class="k">:</span> <span class="kt">RDD</span><span class="o">[(</span><span 
class="kt">String</span>, <span class="kt">User</span><span class="o">)]</span> 
<span class="k">=</span> <span class="nc">PEventStore</span><span 
class="o">.</span><span class="n">aggregateProperties</span><span 
class="o">(...)</span> <span class="o">...</span>
+
+    <span class="c1">// create a RDD of (entityID, Item)
+</span>    <span class="k">val</span> <span class="n">itemsRDD</span><span 
class="k">:</span> <span class="kt">RDD</span><span class="o">[(</span><span 
class="kt">String</span>, <span class="kt">Item</span><span class="o">)]</span> 
<span class="k">=</span> <span class="nc">PEventStore</span><span 
class="o">.</span><span class="n">aggregateProperties</span><span 
class="o">(...)</span> <span class="o">...</span>
+
+    <span class="c1">// get all "user" "view" or "buy" "item" events from 
event store
+</span>    <span class="k">val</span> <span class="n">eventsRDD</span><span 
class="k">:</span> <span class="kt">RDD</span><span class="o">[</span><span 
class="kt">Event</span><span class="o">]</span> <span class="k">=</span> <span 
class="nc">PEventStore</span><span class="o">.</span><span 
class="n">find</span><span class="o">(...)</span> <span class="o">...</span>
+
+    <span class="c1">// filter all view events
+</span>    <span class="k">val</span> <span 
class="n">viewEventsRDD</span><span class="k">:</span> <span 
class="kt">RDD</span><span class="o">[</span><span 
class="kt">ViewEvent</span><span class="o">]</span> <span class="k">=</span> 
<span class="n">eventsRDD</span><span class="o">.</span><span 
class="n">filter</span> <span class="o">{</span> <span class="o">...</span> 
<span class="o">}</span> <span class="o">...</span>
+
+    <span class="c1">// filter all buy events
+</span>    <span class="k">val</span> <span class="n">buyEventsRDD</span><span 
class="k">:</span> <span class="kt">RDD</span><span class="o">[</span><span 
class="kt">BuyEvent</span><span class="o">]</span> <span class="k">=</span> 
<span class="n">eventsRDD</span><span class="o">.</span><span 
class="n">filter</span> <span class="o">{</span> <span class="o">...}</span> 
<span class="o">...</span>
+
+    <span class="k">new</span> <span class="nc">TrainingData</span><span 
class="o">(</span>
+      <span class="n">users</span> <span class="k">=</span> <span 
class="n">usersRDD</span><span class="o">,</span>
+      <span class="n">items</span> <span class="k">=</span> <span 
class="n">itemsRDD</span><span class="o">,</span>
+      <span class="n">viewEvents</span> <span class="k">=</span> <span 
class="n">viewEventsRDD</span><span class="o">,</span>
+      <span class="n">buyEvents</span> <span class="k">=</span> <span 
class="n">buyEventsRDD</span>
+    <span class="o">)</span>
+  <span class="o">}</span>
+<span class="o">}</span>
+</pre></td></tr></tbody></table> </div> <p>PredictionIO automatically loads 
the parameters of <em>datasource</em> specified in 
MyECommerceRecommendation/<strong><em>engine.json</em></strong>, including 
<em>appName</em>, into <code>dsp</code>.</p><p>In 
<strong><em>engine.json</em></strong>:</p><div class="highlight shell"><table 
style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: 
right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9</pre></td><td class="code"><pre><span class="o">{</span>
+  ...
+  <span class="s2">"datasource"</span>: <span class="o">{</span>
+    <span class="s2">"params"</span> : <span class="o">{</span>
+      <span class="s2">"appName"</span>: <span class="s2">"MyApp1"</span>
+    <span class="o">}</span>
+  <span class="o">}</span>,
+  ...
+<span class="o">}</span>
+</pre></td></tr></tbody></table> </div> <p>In <code>readTraining()</code>, 
<code>PEventStore</code> is an object that provides functions to access data 
collected by the PredictionIO Event Server.</p><p>This E-Commerce 
Recommendation Engine Template requires &quot;user&quot; and &quot;item&quot; 
entities that are set by 
events.</p><p><code>PEventStore.aggregateProperties(...)</code> aggregates 
properties of the <code>user</code> and <code>item</code> entities that are set, unset, 
or deleted by the special events <strong>$set</strong>, <strong>$unset</strong> and 
<strong>$delete</strong>. Please refer to the <a 
href="/datacollection/eventapi/#note-about-properties">Event API</a> for more 
details on using these events.</p><p>The following code aggregates the 
properties of <code>user</code> and then maps each result to a 
<code>User()</code> object.</p><div class="highlight scala"><table 
style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: 
right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14
+15
+16
+17
+18</pre></td><td class="code"><pre>
+  <span class="c1">// create a RDD of (entityID, User)
+</span>  <span class="k">val</span> <span class="n">usersRDD</span><span 
class="k">:</span> <span class="kt">RDD</span><span class="o">[(</span><span 
class="kt">String</span>, <span class="kt">User</span><span class="o">)]</span> 
<span class="k">=</span> <span class="nc">PEventStore</span><span 
class="o">.</span><span class="n">aggregateProperties</span><span 
class="o">(</span>
+    <span class="n">appName</span> <span class="k">=</span> <span 
class="n">dsp</span><span class="o">.</span><span class="n">appName</span><span 
class="o">,</span>
+    <span class="n">entityType</span> <span class="k">=</span> <span 
class="s">"user"</span>
+  <span class="o">)(</span><span class="n">sc</span><span 
class="o">).</span><span class="n">map</span> <span class="o">{</span> <span 
class="k">case</span> <span class="o">(</span><span 
class="n">entityId</span><span class="o">,</span> <span 
class="n">properties</span><span class="o">)</span> <span class="k">=&gt;</span>
+    <span class="k">val</span> <span class="n">user</span> <span 
class="k">=</span> <span class="k">try</span> <span class="o">{</span>
+      <span class="nc">User</span><span class="o">()</span>
+    <span class="o">}</span> <span class="k">catch</span> <span 
class="o">{</span>
+      <span class="k">case</span> <span class="n">e</span><span 
class="k">:</span> <span class="kt">Exception</span> <span 
class="o">=&gt;</span> <span class="o">{</span>
+        <span class="n">logger</span><span class="o">.</span><span 
class="n">error</span><span class="o">(</span><span class="n">s</span><span 
class="s">"Failed to get properties ${properties} of"</span> <span 
class="o">+</span>
+          <span class="n">s</span><span class="s">" user ${entityId}. 
Exception: ${e}."</span><span class="o">)</span>
+        <span class="k">throw</span> <span class="n">e</span>
+      <span class="o">}</span>
+    <span class="o">}</span>
+    <span class="o">(</span><span class="n">entityId</span><span 
class="o">,</span> <span class="n">user</span><span class="o">)</span>
+  <span class="o">}.</span><span class="n">cache</span><span 
class="o">()</span>
+
+</pre></td></tr></tbody></table> </div> <p>In the template, the 
<code>User()</code> object is a simple placeholder for you to 
customize and expand.</p><p>Similarly, the following code aggregates 
<code>item</code> properties and then maps each result to an <code>Item()</code> 
object. By default, this template assumes each item has an optional property 
<code>categories</code>, which is a list of strings.</p><div class="highlight 
scala"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" 
style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14
+15
+16
+17</pre></td><td class="code"><pre>  <span class="c1">// create a RDD of 
(entityID, Item)
+</span>  <span class="k">val</span> <span class="n">itemsRDD</span><span 
class="k">:</span> <span class="kt">RDD</span><span class="o">[(</span><span 
class="kt">String</span>, <span class="kt">Item</span><span class="o">)]</span> 
<span class="k">=</span> <span class="nc">PEventStore</span><span 
class="o">.</span><span class="n">aggregateProperties</span><span 
class="o">(</span>
+    <span class="n">appName</span> <span class="k">=</span> <span 
class="n">dsp</span><span class="o">.</span><span class="n">appName</span><span 
class="o">,</span>
+    <span class="n">entityType</span> <span class="k">=</span> <span 
class="s">"item"</span>
+  <span class="o">)(</span><span class="n">sc</span><span 
class="o">).</span><span class="n">map</span> <span class="o">{</span> <span 
class="k">case</span> <span class="o">(</span><span 
class="n">entityId</span><span class="o">,</span> <span 
class="n">properties</span><span class="o">)</span> <span class="k">=&gt;</span>
+    <span class="k">val</span> <span class="n">item</span> <span 
class="k">=</span> <span class="k">try</span> <span class="o">{</span>
+      <span class="c1">// Assume categories is optional property of item.
+</span>      <span class="nc">Item</span><span class="o">(</span><span 
class="n">categories</span> <span class="k">=</span> <span 
class="n">properties</span><span class="o">.</span><span 
class="n">getOpt</span><span class="o">[</span><span 
class="kt">List</span><span class="o">[</span><span 
class="kt">String</span><span class="o">]](</span><span 
class="s">"categories"</span><span class="o">))</span>
+    <span class="o">}</span> <span class="k">catch</span> <span 
class="o">{</span>
+      <span class="k">case</span> <span class="n">e</span><span 
class="k">:</span> <span class="kt">Exception</span> <span 
class="o">=&gt;</span> <span class="o">{</span>
+        <span class="n">logger</span><span class="o">.</span><span 
class="n">error</span><span class="o">(</span><span class="n">s</span><span 
class="s">"Failed to get properties ${properties} of"</span> <span 
class="o">+</span>
+          <span class="n">s</span><span class="s">" item ${entityId}. 
Exception: ${e}."</span><span class="o">)</span>
+        <span class="k">throw</span> <span class="n">e</span>
+      <span class="o">}</span>
+    <span class="o">}</span>
+    <span class="o">(</span><span class="n">entityId</span><span 
class="o">,</span> <span class="n">item</span><span class="o">)</span>
+  <span class="o">}.</span><span class="n">cache</span><span 
class="o">()</span>
+</pre></td></tr></tbody></table> </div> <p>The <code>Item</code> case class is 
defined as:</p><div class="highlight scala"><table style="border-spacing: 
0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre 
class="lineno">1</pre></td><td class="code"><pre><span class="k">case</span> 
<span class="k">class</span> <span class="nc">Item</span><span 
class="o">(</span><span class="n">categories</span><span class="k">:</span> 
<span class="kt">Option</span><span class="o">[</span><span 
class="kt">List</span><span class="o">[</span><span 
class="kt">String</span><span class="o">]])</span>
+</pre></td></tr></tbody></table> </div> <p><code>PEventStore.find(...)</code> 
specifies the events that you want to read. In this case, &quot;user view 
item&quot; and &quot;user buy item&quot; events are read:</p><div 
class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td 
class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9
+10</pre></td><td class="code"><pre>
+  <span class="c1">// get all "user" "view" or "buy" "item" events
+</span>  <span class="k">val</span> <span class="n">eventsRDD</span><span 
class="k">:</span> <span class="kt">RDD</span><span class="o">[</span><span 
class="kt">Event</span><span class="o">]</span> <span class="k">=</span> <span 
class="nc">PEventStore</span><span class="o">.</span><span 
class="n">find</span><span class="o">(</span>
+      <span class="n">appName</span> <span class="k">=</span> <span 
class="n">dsp</span><span class="o">.</span><span class="n">appName</span><span 
class="o">,</span>
+      <span class="n">entityType</span> <span class="k">=</span> <span 
class="nc">Some</span><span class="o">(</span><span 
class="s">"user"</span><span class="o">),</span>
+      <span class="n">eventNames</span> <span class="k">=</span> <span 
class="nc">Some</span><span class="o">(</span><span class="nc">List</span><span 
class="o">(</span><span class="s">"view"</span><span class="o">,</span> <span 
class="s">"buy"</span><span class="o">)),</span>
+      <span class="c1">// targetEntityType is optional field of an event.
+</span>      <span class="n">targetEntityType</span> <span class="k">=</span> 
<span class="nc">Some</span><span class="o">(</span><span 
class="nc">Some</span><span class="o">(</span><span 
class="s">"item"</span><span class="o">)))(</span><span 
class="n">sc</span><span class="o">)</span>
+      <span class="o">.</span><span class="n">cache</span><span 
class="o">()</span>
+
+</pre></td></tr></tbody></table> </div> <p>Note that <code>.cache()</code> is 
used to cache the RDD data in memory, since <code>eventsRDD</code> is used multiple 
times later.</p><p>Then we filter the events we are interested in and map each 
event to a <code>ViewEvent</code> object.</p><div class="highlight 
scala"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" 
style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14
+15
+16
+17</pre></td><td class="code"><pre>
+  <span class="k">val</span> <span class="n">viewEventsRDD</span><span 
class="k">:</span> <span class="kt">RDD</span><span class="o">[</span><span 
class="kt">ViewEvent</span><span class="o">]</span> <span class="k">=</span> 
<span class="n">eventsRDD</span>
+      <span class="o">.</span><span class="n">filter</span> <span 
class="o">{</span> <span class="n">event</span> <span class="k">=&gt;</span> 
<span class="n">event</span><span class="o">.</span><span 
class="n">event</span> <span class="o">==</span> <span class="s">"view"</span> 
<span class="o">}</span>
+      <span class="o">.</span><span class="n">map</span> <span 
class="o">{</span> <span class="n">event</span> <span class="k">=&gt;</span>
+        <span class="k">try</span> <span class="o">{</span>
+          <span class="nc">ViewEvent</span><span class="o">(</span>
+            <span class="n">user</span> <span class="k">=</span> <span 
class="n">event</span><span class="o">.</span><span 
class="n">entityId</span><span class="o">,</span>
+            <span class="n">item</span> <span class="k">=</span> <span 
class="n">event</span><span class="o">.</span><span 
class="n">targetEntityId</span><span class="o">.</span><span 
class="n">get</span><span class="o">,</span>
+            <span class="n">t</span> <span class="k">=</span> <span 
class="n">event</span><span class="o">.</span><span 
class="n">eventTime</span><span class="o">.</span><span 
class="n">getMillis</span>
+          <span class="o">)</span>
+        <span class="o">}</span> <span class="k">catch</span> <span 
class="o">{</span>
+          <span class="k">case</span> <span class="n">e</span><span 
class="k">:</span> <span class="kt">Exception</span> <span 
class="o">=&gt;</span>
+            <span class="n">logger</span><span class="o">.</span><span 
class="n">error</span><span class="o">(</span><span class="n">s</span><span 
class="s">"Cannot convert ${event} to ViewEvent."</span> <span 
class="o">+</span>
+              <span class="n">s</span><span class="s">" Exception: 
${e}."</span><span class="o">)</span>
+            <span class="k">throw</span> <span class="n">e</span>
+        <span class="o">}</span>
+      <span class="o">}</span>
+</pre></td></tr></tbody></table> </div> <p>The <code>ViewEvent</code> case class 
is defined as:</p><div class="highlight scala"><table style="border-spacing: 
0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre 
class="lineno">1</pre></td><td class="code"><pre><span class="k">case</span> 
<span class="k">class</span> <span class="nc">ViewEvent</span><span 
class="o">(</span><span class="n">user</span><span class="k">:</span> <span 
class="kt">String</span><span class="o">,</span> <span 
class="n">item</span><span class="k">:</span> <span 
class="kt">String</span><span class="o">,</span> <span class="n">t</span><span 
class="k">:</span> <span class="kt">Long</span><span class="o">)</span>
+</pre></td></tr></tbody></table> </div> <p>We filter the buy events in a similar way 
and map them to <code>BuyEvent</code> objects for later use.</p><div class="highlight 
scala"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" 
style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14
+15
+16
+17
+18</pre></td><td class="code"><pre>
+  <span class="k">val</span> <span class="n">buyEventsRDD</span><span 
class="k">:</span> <span class="kt">RDD</span><span class="o">[</span><span 
class="kt">BuyEvent</span><span class="o">]</span> <span class="k">=</span> 
<span class="n">eventsRDD</span>
+      <span class="o">.</span><span class="n">filter</span> <span 
class="o">{</span> <span class="n">event</span> <span class="k">=&gt;</span> 
<span class="n">event</span><span class="o">.</span><span 
class="n">event</span> <span class="o">==</span> <span class="s">"buy"</span> 
<span class="o">}</span>
+      <span class="o">.</span><span class="n">map</span> <span 
class="o">{</span> <span class="n">event</span> <span class="k">=&gt;</span>
+        <span class="k">try</span> <span class="o">{</span>
+          <span class="nc">BuyEvent</span><span class="o">(</span>
+            <span class="n">user</span> <span class="k">=</span> <span 
class="n">event</span><span class="o">.</span><span 
class="n">entityId</span><span class="o">,</span>
+            <span class="n">item</span> <span class="k">=</span> <span 
class="n">event</span><span class="o">.</span><span 
class="n">targetEntityId</span><span class="o">.</span><span 
class="n">get</span><span class="o">,</span>
+            <span class="n">t</span> <span class="k">=</span> <span 
class="n">event</span><span class="o">.</span><span 
class="n">eventTime</span><span class="o">.</span><span 
class="n">getMillis</span>
+          <span class="o">)</span>
+        <span class="o">}</span> <span class="k">catch</span> <span 
class="o">{</span>
+          <span class="k">case</span> <span class="n">e</span><span 
class="k">:</span> <span class="kt">Exception</span> <span 
class="o">=&gt;</span>
+            <span class="n">logger</span><span class="o">.</span><span 
class="n">error</span><span class="o">(</span><span class="n">s</span><span 
class="s">"Cannot convert ${event} to BuyEvent."</span> <span class="o">+</span>
+              <span class="n">s</span><span class="s">" Exception: 
${e}."</span><span class="o">)</span>
+            <span class="k">throw</span> <span class="n">e</span>
+        <span class="o">}</span>
+      <span class="o">}</span>
+
+</pre></td></tr></tbody></table> </div> <p>The <code>BuyEvent</code> case class is 
defined as:</p><div class="highlight scala"><table style="border-spacing: 
0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre 
class="lineno">1</pre></td><td class="code"><pre><span class="k">case</span> 
<span class="k">class</span> <span class="nc">BuyEvent</span><span 
class="o">(</span><span class="n">user</span><span class="k">:</span> <span 
class="kt">String</span><span class="o">,</span> <span 
class="n">item</span><span class="k">:</span> <span 
class="kt">String</span><span class="o">,</span> <span class="n">t</span><span 
class="k">:</span> <span class="kt">Long</span><span class="o">)</span>
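+</pre></td></tr></tbody></table> </div> <p>The filter-and-map step can also be tried without Spark. The following is a minimal sketch on a local <code>Seq</code>, assuming a simplified stand-in class <code>SimpleEvent</code> (hypothetical, not the PredictionIO <code>Event</code> class) whose <code>eventTime</code> is already a millisecond timestamp. Unlike the template code, it uses <code>collect</code> with a pattern match instead of <code>targetEntityId.get</code>, so events without a target item are skipped rather than raising an exception:</p><div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="code"><pre>// Hypothetical stand-in for the PredictionIO Event class (illustration only).
+case class SimpleEvent(
+  event: String,
+  entityId: String,
+  targetEntityId: Option[String],
+  eventTime: Long)
+
+case class ViewEvent(user: String, item: String, t: Long)
+case class BuyEvent(user: String, item: String, t: Long)
+
+// Made-up sample data.
+val events = Seq(
+  SimpleEvent("view", "u1", Some("i1"), 1000L),
+  SimpleEvent("buy", "u1", Some("i1"), 2000L),
+  SimpleEvent("view", "u2", None, 3000L) // malformed: no target item
+)
+
+// Keep only "view" events that carry a target item.
+val viewEvents = events.collect {
+  case SimpleEvent("view", user, Some(item), t) =&gt; ViewEvent(user, item, t)
+}
+
+// Keep only "buy" events that carry a target item.
+val buyEvents = events.collect {
+  case SimpleEvent("buy", user, Some(item), t) =&gt; BuyEvent(user, item, t)
+}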
+</pre></td></tr></tbody></table> </div> <div class="alert-message info"><p>For 
flexibility, this template is designed to support String user IDs and item 
IDs.</p></div><p><code>TrainingData</code> contains RDDs of 
<code>User</code>, <code>Item</code>, <code>ViewEvent</code> and <code>BuyEvent</code> objects. The 
class definition of <code>TrainingData</code> is:</p><div class="highlight 
scala"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" 
style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6</pre></td><td class="code"><pre><span class="k">class</span> <span 
class="nc">TrainingData</span><span class="o">(</span>
+  <span class="k">val</span> <span class="n">users</span><span 
class="k">:</span> <span class="kt">RDD</span><span class="o">[(</span><span 
class="kt">String</span>, <span class="kt">User</span><span class="o">)],</span>
+  <span class="k">val</span> <span class="n">items</span><span 
class="k">:</span> <span class="kt">RDD</span><span class="o">[(</span><span 
class="kt">String</span>, <span class="kt">Item</span><span class="o">)],</span>
+  <span class="k">val</span> <span class="n">viewEvents</span><span 
class="k">:</span> <span class="kt">RDD</span><span class="o">[</span><span 
class="kt">ViewEvent</span><span class="o">],</span>
+  <span class="k">val</span> <span class="n">buyEvents</span><span 
class="k">:</span> <span class="kt">RDD</span><span class="o">[</span><span 
class="kt">BuyEvent</span><span class="o">]</span>
+<span class="o">)</span> <span class="k">extends</span> <span 
class="nc">Serializable</span> <span class="o">{</span> <span 
class="o">...</span> <span class="o">}</span>
+</pre></td></tr></tbody></table> </div> <p>PredictionIO then passes the 
returned <code>TrainingData</code> object to <em>Data Preparator</em>.</p><div 
class="alert-message note"><p>You could modify the DataSource to read 
events other than the default <strong>view</strong> or 
<strong>buy</strong>.</p></div><h3 id='data-preparator' 
class='header-anchors'>Data Preparator</h3><p>In 
MyECommerceRecommendation/src/main/scala/<strong><em>Preparator.scala</em></strong>,
 the <code>prepare</code> method of class <code>Preparator</code> takes 
<code>TrainingData</code> as its input and performs any necessary feature 
selection and data processing tasks. At the end, it returns 
<code>PreparedData</code> which should contain the data <em>Algorithm</em> 
needs.</p><p>By default, <code>prepare</code> simply copies the unprocessed 
<code>TrainingData</code> data to <code>PreparedData</code>:</p><div 
class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td 
class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14
+15
+16
+17
+18</pre></td><td class="code"><pre><span class="k">class</span> <span 
class="nc">Preparator</span>
+  <span class="k">extends</span> <span class="nc">PPreparator</span><span 
class="o">[</span><span class="kt">TrainingData</span>, <span 
class="kt">PreparedData</span><span class="o">]</span> <span class="o">{</span>
+
+  <span class="k">def</span> <span class="n">prepare</span><span 
class="o">(</span><span class="n">sc</span><span class="k">:</span> <span 
class="kt">SparkContext</span><span class="o">,</span> <span 
class="n">trainingData</span><span class="k">:</span> <span 
class="kt">TrainingData</span><span class="o">)</span><span class="k">:</span> 
<span class="kt">PreparedData</span> <span class="o">=</span> <span 
class="o">{</span>
+    <span class="k">new</span> <span class="nc">PreparedData</span><span 
class="o">(</span>
+      <span class="n">users</span> <span class="k">=</span> <span 
class="n">trainingData</span><span class="o">.</span><span 
class="n">users</span><span class="o">,</span>
+      <span class="n">items</span> <span class="k">=</span> <span 
class="n">trainingData</span><span class="o">.</span><span 
class="n">items</span><span class="o">,</span>
+      <span class="n">viewEvents</span> <span class="k">=</span> <span 
class="n">trainingData</span><span class="o">.</span><span 
class="n">viewEvents</span><span class="o">,</span>
+      <span class="n">buyEvents</span> <span class="k">=</span> <span 
class="n">trainingData</span><span class="o">.</span><span 
class="n">buyEvents</span><span class="o">)</span>
+  <span class="o">}</span>
+<span class="o">}</span>
+
+<span class="k">class</span> <span class="nc">PreparedData</span><span 
class="o">(</span>
+  <span class="k">val</span> <span class="n">users</span><span 
class="k">:</span> <span class="kt">RDD</span><span class="o">[(</span><span 
class="kt">String</span>, <span class="kt">User</span><span class="o">)],</span>
+  <span class="k">val</span> <span class="n">items</span><span 
class="k">:</span> <span class="kt">RDD</span><span class="o">[(</span><span 
class="kt">String</span>, <span class="kt">Item</span><span class="o">)],</span>
+  <span class="k">val</span> <span class="n">viewEvents</span><span 
class="k">:</span> <span class="kt">RDD</span><span class="o">[</span><span 
class="kt">ViewEvent</span><span class="o">],</span>
+  <span class="k">val</span> <span class="n">buyEvents</span><span 
class="k">:</span> <span class="kt">RDD</span><span class="o">[</span><span 
class="kt">BuyEvent</span><span class="o">]</span>
+<span class="o">)</span> <span class="k">extends</span> <span 
class="nc">Serializable</span>
+</pre></td></tr></tbody></table> </div> <p>PredictionIO passes the returned 
<code>PreparedData</code> object to Algorithm&#39;s <code>train</code> 
function.</p><h2 id='algorithm' class='header-anchors'>Algorithm</h2><p>In 
MyECommerceRecommendation/src/main/scala/<strong><em>ECommAlgorithm.scala</em></strong>,
 the two methods of the algorithm class are <code>train</code> and 
<code>predict</code>. <code>train</code> is responsible for training the 
predictive model; <code>predict</code> is responsible for using this model to 
make predictions.</p><h3 id='algorithm-parameters' 
class='header-anchors'>Algorithm parameters</h3><p>The ECommAlgorithm takes the 
following parameters, as defined by the <code>ECommAlgorithmParams</code> case 
class:</p><div class="highlight scala"><table style="border-spacing: 
0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre 
class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9
+10</pre></td><td class="code"><pre><span class="k">case</span> <span 
class="k">class</span> <span class="nc">ECommAlgorithmParams</span><span 
class="o">(</span>
+  <span class="n">appName</span><span class="k">:</span> <span 
class="kt">String</span><span class="o">,</span>
+  <span class="n">unseenOnly</span><span class="k">:</span> <span 
class="kt">Boolean</span><span class="o">,</span>
+  <span class="n">seenEvents</span><span class="k">:</span> <span 
class="kt">List</span><span class="o">[</span><span 
class="kt">String</span><span class="o">],</span>
+  <span class="n">similarEvents</span><span class="k">:</span> <span 
class="kt">List</span><span class="o">[</span><span 
class="kt">String</span><span class="o">],</span>
+  <span class="n">rank</span><span class="k">:</span> <span 
class="kt">Int</span><span class="o">,</span>
+  <span class="n">numIterations</span><span class="k">:</span> <span 
class="kt">Int</span><span class="o">,</span>
+  <span class="n">lambda</span><span class="k">:</span> <span 
class="kt">Double</span><span class="o">,</span>
+  <span class="n">seed</span><span class="k">:</span> <span 
class="kt">Option</span><span class="o">[</span><span 
class="kt">Long</span><span class="o">]</span>
+<span class="o">)</span> <span class="k">extends</span> <span 
class="nc">Params</span>
+</pre></td></tr></tbody></table> </div> <p>Parameter description:</p> <ul> 
<li><strong>appName</strong>: Your App name. Events defined by 
&quot;seenEvents&quot; and &quot;similarEvents&quot; will be read from this app 
during <code>predict</code>.</li> <li><strong>unseenOnly</strong>: true or 
false. Set to true if you want to recommend unseen items only. Seen items are 
defined by <em>seenEvents</em>: if the user has any of these events on an 
item, the item is treated as <em>seen</em>.</li> 
<li><strong>seenEvents</strong>: A list of user-to-item events which will be 
treated as <em>seen</em> events. Used when <em>unseenOnly</em> is set to 
true.</li> <li><strong>similarEvents</strong>: A list of user-to-item events 
which will be used to find items similar to the items on which the user has 
performed these events.</li> <li><strong>rank</strong>: Parameter of the 
MLlib ALS algorithm. Number of latent features.</li> 
<li><strong>numIterations</strong>: Parameter of the MLlib ALS 
algorithm. Number of iterations.</li> <li><strong>lambda</strong>: 
Regularization parameter of the MLlib ALS algorithm.</li> 
<li><strong>seed</strong>: Optional. A random seed for the MLlib ALS algorithm. 
Specify a fixed value if you want deterministic results.</li> </ul> <h3 
id='train(...)' class='header-anchors'>train(...)</h3><p><code>train</code> is 
called when you run <strong>pio train</strong>. This is where the MLlib ALS 
algorithm, i.e. <code>ALS.trainImplicit()</code>, is used to train a predictive 
model. In addition, we count the number of times each item has been bought 
to build a default model, which is used when no ALS model or 
other useful information about the user is available during 
<code>predict</code>.</p><div class="highlight scala"><table 
style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: 
right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14
+15
+16
+17
+18
+19
+20
+21
+22
+23
+24
+25
+26
+27
+28
+29
+30
+31
+32
+33
+34
+35
+36
+37
+38
+39</pre></td><td class="code"><pre>
+  <span class="k">def</span> <span class="n">train</span><span 
class="o">(</span><span class="n">sc</span><span class="k">:</span> <span 
class="kt">SparkContext</span><span class="o">,</span> <span 
class="n">data</span><span class="k">:</span> <span 
class="kt">PreparedData</span><span class="o">)</span><span class="k">:</span> 
<span class="kt">ECommModel</span> <span class="o">=</span> <span 
class="o">{</span>
+    <span class="o">...</span>
+
+    <span class="c1">// create User and item's String ID to integer index BiMap
+</span>    <span class="k">val</span> <span class="n">userStringIntMap</span> 
<span class="k">=</span> <span class="nc">BiMap</span><span 
class="o">.</span><span class="n">stringInt</span><span class="o">(</span><span 
class="n">data</span><span class="o">.</span><span class="n">users</span><span 
class="o">.</span><span class="n">keys</span><span class="o">)</span>
+    <span class="k">val</span> <span class="n">itemStringIntMap</span> <span 
class="k">=</span> <span class="nc">BiMap</span><span class="o">.</span><span 
class="n">stringInt</span><span class="o">(</span><span 
class="n">data</span><span class="o">.</span><span class="n">items</span><span 
class="o">.</span><span class="n">keys</span><span class="o">)</span>
+
+    <span class="c1">// generate MLlibRating data for ALS algorithm
+</span>    <span class="k">val</span> <span class="n">mllibRatings</span><span 
class="k">:</span> <span class="kt">RDD</span><span class="o">[</span><span 
class="kt">MLlibRating</span><span class="o">]</span> <span class="k">=</span> 
<span class="n">genMLlibRating</span><span class="o">(</span>
+      <span class="n">userStringIntMap</span> <span class="k">=</span> <span 
class="n">userStringIntMap</span><span class="o">,</span>
+      <span class="n">itemStringIntMap</span> <span class="k">=</span> <span 
class="n">itemStringIntMap</span><span class="o">,</span>
+      <span class="n">data</span> <span class="k">=</span> <span 
class="n">data</span>
+    <span class="o">)</span>
+
+    <span class="c1">// seed for MLlib ALS
+</span>    <span class="k">val</span> <span class="n">seed</span> <span 
class="k">=</span> <span class="n">ap</span><span class="o">.</span><span 
class="n">seed</span><span class="o">.</span><span 
class="n">getOrElse</span><span class="o">(</span><span 
class="nc">System</span><span class="o">.</span><span 
class="n">nanoTime</span><span class="o">)</span>
+
+    <span class="k">val</span> <span class="n">m</span> <span 
class="k">=</span> <span class="nc">ALS</span><span class="o">.</span><span 
class="n">trainImplicit</span><span class="o">(</span>
+      <span class="n">ratings</span> <span class="k">=</span> <span 
class="n">mllibRatings</span><span class="o">,</span>
+      <span class="n">rank</span> <span class="k">=</span> <span 
class="n">ap</span><span class="o">.</span><span class="n">rank</span><span 
class="o">,</span>
+      <span class="n">iterations</span> <span class="k">=</span> <span 
class="n">ap</span><span class="o">.</span><span 
class="n">numIterations</span><span class="o">,</span>
+      <span class="n">lambda</span> <span class="k">=</span> <span 
class="n">ap</span><span class="o">.</span><span class="n">lambda</span><span 
class="o">,</span>
+      <span class="n">blocks</span> <span class="k">=</span> <span 
class="o">-</span><span class="mi">1</span><span class="o">,</span>
+      <span class="n">alpha</span> <span class="k">=</span> <span 
class="mf">1.0</span><span class="o">,</span>
+      <span class="n">seed</span> <span class="k">=</span> <span 
class="n">seed</span><span class="o">)</span>
+
+    <span class="o">...</span>
+
+    <span class="c1">// count how many times each item was bought, to recommend 
popular items as the default case
+</span>    <span class="k">val</span> <span class="n">popularCount</span> 
<span class="k">=</span> <span class="n">trainDefault</span><span 
class="o">(</span>
+      <span class="n">userStringIntMap</span> <span class="k">=</span> <span 
class="n">userStringIntMap</span><span class="o">,</span>
+      <span class="n">itemStringIntMap</span> <span class="k">=</span> <span 
class="n">itemStringIntMap</span><span class="o">,</span>
+      <span class="n">data</span> <span class="k">=</span> <span 
class="n">data</span>
+    <span class="o">)</span>
+    <span class="o">...</span>
+
+  <span class="o">}</span>
+
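+</pre></td></tr></tbody></table> </div> <p>The idea behind the default model can be illustrated without Spark. The sketch below counts, on a local collection, how many times each item was bought and orders items by that count. It only illustrates the counting idea; it is not the template's <code>trainDefault</code> implementation, and the sample data is made up:</p><div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="code"><pre>case class BuyEvent(user: String, item: String, t: Long)
+
+// Made-up sample data.
+val buyEvents = Seq(
+  BuyEvent("u1", "i1", 1000L),
+  BuyEvent("u2", "i1", 2000L),
+  BuyEvent("u2", "i2", 3000L)
+)
+
+// Number of buy events per item.
+val popularCount: Map[String, Int] =
+  buyEvents.groupBy(_.item).map { case (item, es) =&gt; (item, es.size) }
+
+// Items ordered from most to least bought, usable as a fallback recommendation.
+val mostPopular: Seq[String] =
+  popularCount.toSeq.sortBy { case (_, count) =&gt; -count }.map(_._1)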
+</pre></td></tr></tbody></table> </div> <h4 
id='working-with-spark-mllib&#39;s-als.trainimplicit(....)' 
class='header-anchors'>Working with Spark MLlib&#39;s 
ALS.trainImplicit(....)</h4><p>MLlib ALS does not support <code>String</code> 
user IDs and item IDs, so <code>ALS.trainImplicit</code> expects Int-only 
<code>Rating</code> objects. First, you can rename MLlib&#39;s Int-only 
<code>Rating</code> to <code>MLlibRating</code> for clarity:</p><div 
class="highlight shell"><table style="border-spacing: 0"><tbody><tr><td 
class="gutter gl" style="text-align: right"><pre class="lineno">1</pre></td><td 
class="code"><pre>import org.apache.spark.mllib.recommendation.<span 
class="o">{</span>Rating <span class="o">=</span>&gt; MLlibRating<span 
class="o">}</span>
+</pre></td></tr></tbody></table> </div> <p>In order to use MLlib&#39;s ALS 
algorithm, we need to convert the <code>viewEvents</code> into 
<code>MLlibRating</code>. There are two things we need to handle:</p> <ol> 
<li>Map the user and item String IDs of each ViewEvent to Integer IDs, as required 
by <code>MLlibRating</code>.</li> <li>A <code>ViewEvent</code> object is an 
implicit event that does not have an explicit rating value. 
<code>ALS.trainImplicit()</code> supports implicit preference: a higher 
<code>MLlibRating</code> rating value means higher confidence 
that the user prefers the item. Hence we can count how many times the user 
has viewed the item to indicate the confidence level that the user may prefer 
the item.</li> </ol> <p>You create a bi-directional map with 
<code>BiMap.stringInt</code> which maps each String record to an Integer 
index.</p><div class="highlight scala"><table style="border-spacing: 
0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1
+2</pre></td><td class="code"><pre><span class="k">val</span> <span 
class="n">userStringIntMap</span> <span class="k">=</span> <span 
class="nc">BiMap</span><span class="o">.</span><span 
class="n">stringInt</span><span class="o">(</span><span 
class="n">data</span><span class="o">.</span><span class="n">users</span><span 
class="o">.</span><span class="n">keys</span><span class="o">)</span>
+<span class="k">val</span> <span class="n">itemStringIntMap</span> <span 
class="k">=</span> <span class="nc">BiMap</span><span class="o">.</span><span 
class="n">stringInt</span><span class="o">(</span><span 
class="n">data</span><span class="o">.</span><span class="n">items</span><span 
class="o">.</span><span class="n">keys</span><span class="o">)</span>
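+</pre></td></tr></tbody></table> </div> <p>To see what such a mapping provides, here is a minimal sketch that uses plain Scala <code>Map</code>s in place of <code>BiMap</code>. This is an assumption made for illustration only; it does not reproduce the <code>BiMap</code> class, and the IDs are made up. It shows the forward lookup with a default of -1 for unknown IDs and the inverse lookup used to translate indices back to String IDs:</p><div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="code"><pre>// Made-up user IDs.
+val userIds = Seq("u1", "u2", "u3")
+
+// Forward map: String ID -&gt; Int index.
+val stringToInt: Map[String, Int] = userIds.distinct.zipWithIndex.toMap
+
+// Inverse map: Int index -&gt; String ID, used to translate results back.
+val intToString: Map[Int, String] = stringToInt.map(_.swap)
+
+// Known IDs map to their index; unknown IDs fall back to -1.
+val known = stringToInt.getOrElse("u2", -1)
+val unknown = stringToInt.getOrElse("u9", -1)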
+</pre></td></tr></tbody></table> </div> <p>Then convert the user and item 
String IDs in each ViewEvent to Int with these BiMaps. We use a default of -1 if the 
user or item String ID cannot be found in the BiMap, and later filter out 
events with an invalid user or item ID. After filtering, we use 
<code>reduceByKey()</code> to add up all values for the same key (uindex, 
iindex) and finally map each result to an <code>MLlibRating</code> object. You can find 
the code inside the function <code>genMLlibRating()</code>:</p><div 
class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td 
class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14
+15
+16
+17
+18
+19
+20
+21
+22
+23
+24
+25
+26
+27
+28
+29
+30
+31
+32
+33
+34
+35
+36</pre></td><td class="code"><pre>
+  <span class="k">def</span> <span class="n">genMLlibRating</span><span 
class="o">(</span>
+    <span class="n">userStringIntMap</span><span class="k">:</span> <span 
class="kt">BiMap</span><span class="o">[</span><span class="kt">String</span>, 
<span class="kt">Int</span><span class="o">],</span>
+    <span class="n">itemStringIntMap</span><span class="k">:</span> <span 
class="kt">BiMap</span><span class="o">[</span><span class="kt">String</span>, 
<span class="kt">Int</span><span class="o">],</span>
+    <span class="n">data</span><span class="k">:</span> <span 
class="kt">PreparedData</span><span class="o">)</span><span class="k">:</span> 
<span class="kt">RDD</span><span class="o">[</span><span 
class="kt">MLlibRating</span><span class="o">]</span> <span class="k">=</span> 
<span class="o">{</span>
+
+    <span class="k">val</span> <span class="n">mllibRatings</span> <span 
class="k">=</span> <span class="n">data</span><span class="o">.</span><span 
class="n">viewEvents</span>
+      <span class="o">.</span><span class="n">map</span> <span 
class="o">{</span> <span class="n">r</span> <span class="k">=&gt;</span>
+        <span class="c1">// Convert user and item String IDs to Int index for 
MLlib
+</span>        <span class="k">val</span> <span class="n">uindex</span> <span 
class="k">=</span> <span class="n">userStringIntMap</span><span 
class="o">.</span><span class="n">getOrElse</span><span class="o">(</span><span 
class="n">r</span><span class="o">.</span><span class="n">user</span><span 
class="o">,</span> <span class="o">-</span><span class="mi">1</span><span 
class="o">)</span>
+        <span class="k">val</span> <span class="n">iindex</span> <span 
class="k">=</span> <span class="n">itemStringIntMap</span><span 
class="o">.</span><span class="n">getOrElse</span><span class="o">(</span><span 
class="n">r</span><span class="o">.</span><span class="n">item</span><span 
class="o">,</span> <span class="o">-</span><span class="mi">1</span><span 
class="o">)</span>
+
+        <span class="k">if</span> <span class="o">(</span><span 
class="n">uindex</span> <span class="o">==</span> <span class="o">-</span><span 
class="mi">1</span><span class="o">)</span>
+          <span class="n">logger</span><span class="o">.</span><span 
class="n">info</span><span class="o">(</span><span class="n">s</span><span 
class="s">"Couldn't convert nonexistent user ID ${r.user}"</span>
+            <span class="o">+</span> <span class="s">" to Int 
index."</span><span class="o">)</span>
+
+        <span class="k">if</span> <span class="o">(</span><span 
class="n">iindex</span> <span class="o">==</span> <span class="o">-</span><span 
class="mi">1</span><span class="o">)</span>
+          <span class="n">logger</span><span class="o">.</span><span 
class="n">info</span><span class="o">(</span><span class="n">s</span><span 
class="s">"Couldn't convert nonexistent item ID ${r.item}"</span>
+            <span class="o">+</span> <span class="s">" to Int 
index."</span><span class="o">)</span>
+
+        <span class="o">((</span><span class="n">uindex</span><span 
class="o">,</span> <span class="n">iindex</span><span class="o">),</span> <span 
class="mi">1</span><span class="o">)</span>
+      <span class="o">}</span>
+      <span class="o">.</span><span class="n">filter</span> <span 
class="o">{</span> <span class="k">case</span> <span class="o">((</span><span 
class="n">u</span><span class="o">,</span> <span class="n">i</span><span 
class="o">),</span> <span class="n">v</span><span class="o">)</span> <span 
class="k">=&gt;</span>
+        <span class="c1">// keep events with valid user and item index
+</span>        <span class="o">(</span><span class="n">u</span> <span 
class="o">!=</span> <span class="o">-</span><span class="mi">1</span><span 
class="o">)</span> <span class="o">&amp;&amp;</span> <span 
class="o">(</span><span class="n">i</span> <span class="o">!=</span> <span 
class="o">-</span><span class="mi">1</span><span class="o">)</span>
+      <span class="o">}</span>
+      <span class="o">.</span><span class="n">reduceByKey</span><span 
class="o">(</span><span class="k">_</span> <span class="o">+</span> <span 
class="k">_</span><span class="o">)</span> <span class="c1">// aggregate all 
view events of same user-item pair
+</span>      <span class="o">.</span><span class="n">map</span> <span 
class="o">{</span> <span class="k">case</span> <span class="o">((</span><span 
class="n">u</span><span class="o">,</span> <span class="n">i</span><span 
class="o">),</span> <span class="n">v</span><span class="o">)</span> <span 
class="k">=&gt;</span>
+        <span class="c1">// MLlibRating requires integer index for user and 
item
+</span>        <span class="nc">MLlibRating</span><span 
class="o">(</span><span class="n">u</span><span class="o">,</span> <span 
class="n">i</span><span class="o">,</span> <span class="n">v</span><span 
class="o">)</span>
+      <span class="o">}</span>
+      <span class="o">.</span><span class="n">cache</span><span 
class="o">()</span>
+
+    <span class="n">mllibRatings</span>
+  <span class="o">}</span>
+
+</pre></td></tr></tbody></table> </div> <div class="alert-message note"><p>You 
can customize this function if you want to convert other events to MLlibRating 
or need different ways to aggregate the events into MLlibRating.</p></div><p>In 
addition to <code>RDD[MLlibRating]</code>, <code>ALS.trainImplicit</code> takes 
the following parameters: <em>rank</em>, <em>iterations</em>, <em>lambda</em> 
and <em>seed</em>.</p><p>The values of these parameters are specified in 
<em>algorithms</em> of 
MyECommerceRecommendation/<strong><em>engine.json</em></strong>:</p><div 
class="highlight shell"><table style="border-spacing: 0"><tbody><tr><td 
class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14
+15
+16
+17
+18
+19</pre></td><td class="code"><pre><span class="o">{</span>
+  ...
+  <span class="s2">"algorithms"</span>: <span class="o">[</span>
+    <span class="o">{</span>
+      <span class="s2">"name"</span>: <span class="s2">"als"</span>,
+      <span class="s2">"params"</span>: <span class="o">{</span>
+        <span class="s2">"appName"</span>: <span class="s2">"MyApp1"</span>,
+        <span class="s2">"unseenOnly"</span>: <span class="nb">true</span>,
+        <span class="s2">"seenEvents"</span>: <span class="o">[</span><span 
class="s2">"buy"</span>, <span class="s2">"view"</span><span class="o">]</span>,
+        <span class="s2">"similarEvents"</span> : <span 
class="o">[</span><span class="s2">"view"</span><span class="o">]</span>,
+        <span class="s2">"rank"</span>: 10,
+        <span class="s2">"numIterations"</span> : 20,
+        <span class="s2">"lambda"</span>: 0.01,
+        <span class="s2">"seed"</span>: 3
+      <span class="o">}</span>
+    <span class="o">}</span>
+  <span class="o">]</span>
+  ...
+<span class="o">}</span>
+</pre></td></tr></tbody></table> </div> <p>The parameters 
<code>appName</code>, <code>unseenOnly</code>, <code>seenEvents</code> and 
<code>similarEvents</code> are used during <code>predict()</code>, which will 
be explained later.</p><p>PredictionIO automatically loads these values 
into the constructor parameter <code>ap</code>, which has a corresponding case class 
<code>ECommAlgorithmParams</code>.</p><p>The <code>seed</code> parameter is 
optional; the MLlib ALS algorithm uses it internally to generate 
random values. If <code>seed</code> is not specified, the current system time 
is used, so each training run may produce different results. Specify a 
fixed value for <code>seed</code> if you want deterministic results 
(for example, when you are testing).</p><p><code>ALS.trainImplicit()</code> 
returns a <code>MatrixFactorizationModel</code> model which contains two RDDs: 
userFeatures and productFeatures. They correspond to the user X latent features 
matrix 
 and item X latent features matrix, respectively.</p><p>In addition to the 
latent feature vector, the item properties (e.g. categories) and popular count 
are also used during <code>predict()</code>. Hence, we also save this data 
along with the feature vector by joining them and then collecting the result as a 
local Map. Each item is represented by a <code>ProductModel</code> class, which 
consists of the <code>item</code> information, the <code>features</code> calculated 
by ALS, and the <code>count</code> returned by <code>trainDefault()</code>.</p><div 
class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td 
class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7</pre></td><td class="code"><pre>
+<span class="k">case</span> <span class="k">class</span> <span 
class="nc">ProductModel</span><span class="o">(</span>
+  <span class="n">item</span><span class="k">:</span> <span 
class="kt">Item</span><span class="o">,</span>
+  <span class="n">features</span><span class="k">:</span> <span 
class="kt">Option</span><span class="o">[</span><span 
class="kt">Array</span><span class="o">[</span><span 
class="kt">Double</span><span class="o">]],</span> <span class="c1">// features 
by ALS
+</span>  <span class="n">count</span><span class="k">:</span> <span 
class="kt">Int</span> <span class="c1">// popular count for default score
+</span><span class="o">)</span>
+
+</pre></td></tr></tbody></table> </div> <div class="highlight scala"><table 
style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: 
right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14
+15
+16
+17
+18
+19
+20
+21
+22
+23
+24
+25</pre></td><td class="code"><pre>    <span class="c1">// join item with the 
trained productFeatures
+</span>    <span class="k">val</span> <span 
class="n">productFeatures</span><span class="k">:</span> <span 
class="kt">Map</span><span class="o">[</span><span class="kt">Int</span>, <span 
class="o">(</span><span class="kt">Item</span>, <span 
class="kt">Option</span><span class="o">[</span><span 
class="kt">Array</span><span class="o">[</span><span 
class="kt">Double</span><span class="o">]])]</span> <span class="k">=</span>
+      <span class="n">items</span><span class="o">.</span><span 
class="n">leftOuterJoin</span><span class="o">(</span><span 
class="n">m</span><span class="o">.</span><span 
class="n">productFeatures</span><span class="o">).</span><span 
class="n">collectAsMap</span><span class="o">.</span><span 
class="n">toMap</span>
+
+    <span class="o">...</span>
+
+    <span class="k">val</span> <span class="n">productModels</span><span 
class="k">:</span> <span class="kt">Map</span><span class="o">[</span><span 
class="kt">Int</span>, <span class="kt">ProductModel</span><span 
class="o">]</span> <span class="k">=</span> <span 
class="n">productFeatures</span>
+      <span class="o">.</span><span class="n">map</span> <span 
class="o">{</span> <span class="k">case</span> <span class="o">(</span><span 
class="n">index</span><span class="o">,</span> <span class="o">(</span><span 
class="n">item</span><span class="o">,</span> <span 
class="n">features</span><span class="o">))</span> <span class="k">=&gt;</span>
+        <span class="k">val</span> <span class="n">pm</span> <span 
class="k">=</span> <span class="nc">ProductModel</span><span class="o">(</span>
+          <span class="n">item</span> <span class="k">=</span> <span 
class="n">item</span><span class="o">,</span>
+          <span class="n">features</span> <span class="k">=</span> <span 
class="n">features</span><span class="o">,</span>
+          <span class="c1">// NOTE: use getOrElse because popularCount may not 
contain all items.
+</span>          <span class="n">count</span> <span class="k">=</span> <span 
class="n">popularCount</span><span class="o">.</span><span 
class="n">getOrElse</span><span class="o">(</span><span 
class="n">index</span><span class="o">,</span> <span class="mi">0</span><span 
class="o">)</span>
+        <span class="o">)</span>
+        <span class="o">(</span><span class="n">index</span><span 
class="o">,</span> <span class="n">pm</span><span class="o">)</span>
+      <span class="o">}</span>
+
+    <span class="k">new</span> <span class="nc">ECommModel</span><span 
class="o">(</span>
+      <span class="n">rank</span> <span class="k">=</span> <span 
class="n">m</span><span class="o">.</span><span class="n">rank</span><span 
class="o">,</span>
+      <span class="n">userFeatures</span> <span class="k">=</span> <span 
class="n">userFeatures</span><span class="o">,</span>
+      <span class="n">productModels</span> <span class="k">=</span> <span 
class="n">productModels</span><span class="o">,</span>
+      <span class="n">userStringIntMap</span> <span class="k">=</span> <span 
class="n">userStringIntMap</span><span class="o">,</span>
+      <span class="n">itemStringIntMap</span> <span class="k">=</span> <span 
class="n">itemStringIntMap</span>
+    <span class="o">)</span>
+
+</pre></td></tr></tbody></table> </div> <p>Note that 
<code>leftOuterJoin</code> is used because the productFeatures returned by ALS 
may not contain all items.</p><p>The <code>ECommModel</code> is defined as 
follows:</p><div class="highlight scala"><table style="border-spacing: 
0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre 
class="lineno">1
+2
+3
+4
+5
+6
+7</pre></td><td class="code"><pre><span class="k">class</span> <span 
class="nc">ECommModel</span><span class="o">(</span>
+  <span class="k">val</span> <span class="n">rank</span><span 
class="k">:</span> <span class="kt">Int</span><span class="o">,</span>
+  <span class="k">val</span> <span class="n">userFeatures</span><span 
class="k">:</span> <span class="kt">Map</span><span class="o">[</span><span 
class="kt">Int</span>, <span class="kt">Array</span><span 
class="o">[</span><span class="kt">Double</span><span class="o">]],</span>
+  <span class="k">val</span> <span class="n">productModels</span><span 
class="k">:</span> <span class="kt">Map</span><span class="o">[</span><span 
class="kt">Int</span>, <span class="kt">ProductModel</span><span 
class="o">],</span>
+  <span class="k">val</span> <span class="n">userStringIntMap</span><span 
class="k">:</span> <span class="kt">BiMap</span><span class="o">[</span><span 
class="kt">String</span>, <span class="kt">Int</span><span class="o">],</span>
+  <span class="k">val</span> <span class="n">itemStringIntMap</span><span 
class="k">:</span> <span class="kt">BiMap</span><span class="o">[</span><span 
class="kt">String</span>, <span class="kt">Int</span><span class="o">]</span>
+<span class="o">)</span> <span class="k">extends</span> <span 
class="nc">Serializable</span>  <span class="o">{</span> <span 
class="o">...</span> <span class="o">}</span>
+</pre></td></tr></tbody></table> </div> <p>PredictionIO will automatically 
store the returned model after training, i.e. <code>ECommModel</code> in this 
example.</p><h3 id='predict(...)' 
class='header-anchors'>predict(...)</h3><p><code>predict</code> is called when 
you send a JSON query to <a 
href="http://localhost:8000/queries.json">http://localhost:8000/queries.json</a>.
 PredictionIO converts the query, such as <code>{ &quot;user&quot;: 
&quot;u1&quot;, &quot;num&quot;: 4 }</code>, to the <code>Query</code> class you 
defined previously.</p><p>We can use the userFeatures and productModels 
stored in ECommModel to calculate the scores of items for the user.</p><p>This 
template also supports additional business logic features, such as filtering 
items by categories, recommending items in the white list, excluding items in 
the black list, recommending unseen items only, and excluding unavailable items 
defined in the constraint event.</p><p>The <code>predict()</code> function does the 
following:</p> <ol> <li>Convert the items in the query&#39;s whiteList from string ID to 
integer index</li> <li>Get a list of items seen by the user (defined by the parameter 
<code>seenEvents</code>)</li> <li>Get the latest unavailableItems which is used 
to exclude unavailable items for all users</li> <li>Combine query&#39;s 
blackList, seenItems, and unavailableItems into a final black list of items to 
be excluded from recommendation.</li> <li>Get the user feature vector from the 
ECommModel.</li> <li>If there is a feature vector for the user, recommend the top N 
items based on the user feature and product features.</li> <li>If there is no 
feature vector for the user, use the recent items the user acted on (defined by the 
<code>similarEvents</code> parameter) to recommend similar items.</li> <li>If 
there is no recent <code>similarEvents</code> available for the user, popular 
items are then recommended (added in template version 0.4.0).</li> </ol> 
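<p>The fallback order in steps 5 to 8 can be sketched as follows. This is a minimal, self-contained illustration and not the template's own code: the object name <code>PredictFallbackSketch</code>, the function <code>topN</code> and its simplified argument types are hypothetical, and it assumes dot-product scoring for known users and summed cosine similarity for the recent-items fallback. The helpers in the template itself may differ.</p>

```scala
// Hypothetical sketch of the three-level fallback described in the list above.
// Names and types are simplified stand-ins, not the template's real classes.
object PredictFallbackSketch {
  type Features = Array[Double]

  // Dot product of two latent feature vectors of equal length.
  def dot(a: Features, b: Features): Double =
    a.zip(b).map { case (x, y) => x * y }.sum

  // Cosine similarity; returns 0.0 when either vector has zero norm.
  def cosine(a: Features, b: Features): Double = {
    val norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    if (norm == 0.0) 0.0 else dot(a, b) / norm
  }

  def topN(
    userFeature: Option[Features],    // None for users unseen during training
    recentFeatures: Vector[Features], // features of items the user acted on recently
    itemFeatures: Map[Int, Features], // item index -> latent features
    popularCount: Map[Int, Int],      // item index -> popularity count
    blackList: Set[Int],              // item indices that must not be recommended
    num: Int
  ): Seq[(Int, Double)] = {
    // Drop black-listed items before scoring (assumed simplification of isCandidate()).
    val candidates = itemFeatures.filter { case (i, _) => !blackList.contains(i) }

    val scores: Iterable[(Int, Double)] = userFeature match {
      // Known user: score each item against the user's feature vector.
      case Some(uf) =>
        candidates.map { case (i, f) => (i, dot(uf, f)) }
      // Unknown user with recent items: sum similarity to each recent item.
      case None if recentFeatures.nonEmpty =>
        candidates.map { case (i, f) => (i, recentFeatures.map(cosine(_, f)).sum) }
      // No usable signal at all: fall back to popularity.
      case None =>
        candidates.keys.map(i => (i, popularCount.getOrElse(i, 0).toDouble))
    }

    // Highest score first, keep the requested number of items.
    scores.toSeq.sortBy { case (_, s) => -s }.take(num)
  }
}
```

<p>In this sketch only items that have a feature vector are candidates for the popularity fallback, which is a simplification made to keep the example short.</p>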
<p>Only items which satisfy the <code>isCandidate()</code> condition will 
be recommended. By default, the item can be recommended if:</p> <ul> <li>it 
belongs to one of the categories defined in query.</li> <li>it is one of the 
white list items if white list is defined.</li> <li>it is not in the black 
list.</li> </ul> <div class="alert-message info"><p>You can easily modify 
<code>isCandidate()</code> checking or related logic if you have different 
requirements or condition to determine if an item is a candidate item to be 
recommended.</p></div><div class="highlight scala"><table 
style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: 
right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14
+15
+16
+17
+18
+19
+20
+21
+22
+23
+24
+25
+26
+27
+28
+29
+30
+31
+32
+33
+34
+35
+36
+37
+38
+39
+40
+41
+42
+43
+44
+45
+46
+47
+48
+49
+50
+51
+52
+53
+54
+55
+56
+57
+58
+59
+60
+61
+62
+63
+64
+65
+66
+67
+68
+69
+70</pre></td><td class="code"><pre>
+  <span class="k">def</span> <span class="n">predict</span><span 
class="o">(</span><span class="n">model</span><span class="k">:</span> <span 
class="kt">ECommModel</span><span class="o">,</span> <span 
class="n">query</span><span class="k">:</span> <span 
class="kt">Query</span><span class="o">)</span><span class="k">:</span> <span 
class="kt">PredictedResult</span> <span class="o">=</span> <span 
class="o">{</span>
+
+    <span class="k">val</span> <span class="n">userFeatures</span> <span 
class="k">=</span> <span class="n">model</span><span class="o">.</span><span 
class="n">userFeatures</span>
+    <span class="k">val</span> <span class="n">productModels</span> <span 
class="k">=</span> <span class="n">model</span><span class="o">.</span><span 
class="n">productModels</span>
+
+    <span class="c1">// convert whiteList's string ID to integer index
+</span>    <span class="k">val</span> <span class="n">whiteList</span><span 
class="k">:</span> <span class="kt">Option</span><span class="o">[</span><span 
class="kt">Set</span><span class="o">[</span><span class="kt">Int</span><span 
class="o">]]</span> <span class="k">=</span> <span class="n">query</span><span 
class="o">.</span><span class="n">whiteList</span><span class="o">.</span><span 
class="n">map</span><span class="o">(</span> <span class="n">set</span> <span 
class="k">=&gt;</span>
+      <span class="n">set</span><span class="o">.</span><span 
class="n">map</span><span class="o">(</span><span class="n">model</span><span 
class="o">.</span><span class="n">itemStringIntMap</span><span 
class="o">.</span><span class="n">get</span><span class="o">(</span><span 
class="k">_</span><span class="o">)).</span><span class="n">flatten</span>
+    <span class="o">)</span>
+
+    <span class="c1">// generate final blackList based on additional 
constraints
+</span>    <span class="k">val</span> <span 
class="n">finalBlackList</span><span class="k">:</span> <span 
class="kt">Set</span><span class="o">[</span><span class="kt">Int</span><span 
class="o">]</span> <span class="k">=</span> <span 
class="n">genBlackList</span><span class="o">(</span><span 
class="n">query</span> <span class="k">=</span> <span 
class="n">query</span><span class="o">)</span>
+      <span class="c1">// convert seen Items list from String ID to integer 
Index
+</span>      <span class="o">.</span><span class="n">flatMap</span><span 
class="o">(</span><span class="n">x</span> <span class="k">=&gt;</span> <span 
class="n">model</span><span class="o">.</span><span 
class="n">itemStringIntMap</span><span class="o">.</span><span 
class="n">get</span><span class="o">(</span><span class="n">x</span><span 
class="o">))</span>
+
+    <span class="c1">// look up user feature from model
+</span>    <span class="k">val</span> <span class="n">userFeature</span> <span 
class="k">=</span>
+      <span class="n">model</span><span class="o">.</span><span 
class="n">userStringIntMap</span><span class="o">.</span><span 
class="n">get</span><span class="o">(</span><span class="n">query</span><span 
class="o">.</span><span class="n">user</span><span class="o">).</span><span 
class="n">map</span> <span class="o">{</span> <span class="n">userIndex</span> 
<span class="k">=&gt;</span>
+        <span class="n">userFeatures</span><span class="o">.</span><span 
class="n">get</span><span class="o">(</span><span 
class="n">userIndex</span><span class="o">)</span>
+      <span class="o">}</span>
+      <span class="c1">// flatten Option[Option[Array[Double]]] to 
Option[Array[Double]]
+</span>      <span class="o">.</span><span class="n">flatten</span>
+
+    <span class="k">val</span> <span class="n">topScores</span><span 
class="k">:</span> <span class="kt">Array</span><span class="o">[(</span><span 
class="kt">Int</span>, <span class="kt">Double</span><span class="o">)]</span> 
<span class="k">=</span> <span class="k">if</span> <span 
class="o">(</span><span class="n">userFeature</span><span 
class="o">.</span><span class="n">isDefined</span><span class="o">)</span> 
<span class="o">{</span>
+      <span class="c1">// the user has feature vector
+</span>      <span class="n">predictKnownUser</span><span class="o">(</span>
+        <span class="n">userFeature</span> <span class="k">=</span> <span 
class="n">userFeature</span><span class="o">.</span><span 
class="n">get</span><span class="o">,</span>
+        <span class="n">productModels</span> <span class="k">=</span> <span 
class="n">productModels</span><span class="o">,</span>
+        <span class="n">query</span> <span class="k">=</span> <span 
class="n">query</span><span class="o">,</span>
+        <span class="n">whiteList</span> <span class="k">=</span> <span 
class="n">whiteList</span><span class="o">,</span>
+        <span class="n">blackList</span> <span class="k">=</span> <span 
class="n">finalBlackList</span>
+      <span class="o">)</span>
+    <span class="o">}</span> <span class="k">else</span> <span 
class="o">{</span>
+      <span class="c1">// the user doesn't have feature vector.
+</span>      <span class="c1">// For example, new user is created after model 
is trained.
+</span>      <span class="n">logger</span><span class="o">.</span><span 
class="n">info</span><span class="o">(</span><span class="n">s</span><span 
class="s">"No userFeature found for user ${query.user}."</span><span 
class="o">)</span>
+
+      <span class="c1">// check if the user has recent events on some items
+</span>      <span class="k">val</span> <span 
class="n">recentItems</span><span class="k">:</span> <span 
class="kt">Set</span><span class="o">[</span><span 
class="kt">String</span><span class="o">]</span> <span class="k">=</span> <span 
class="n">getRecentItems</span><span class="o">(</span><span 
class="n">query</span><span class="o">)</span>
+      <span class="k">val</span> <span class="n">recentList</span><span 
class="k">:</span> <span class="kt">Set</span><span class="o">[</span><span 
class="kt">Int</span><span class="o">]</span> <span class="k">=</span> <span 
class="n">recentItems</span><span class="o">.</span><span 
class="n">flatMap</span> <span class="o">(</span><span class="n">x</span> <span 
class="k">=&gt;</span>
+        <span class="n">model</span><span class="o">.</span><span 
class="n">itemStringIntMap</span><span class="o">.</span><span 
class="n">get</span><span class="o">(</span><span class="n">x</span><span 
class="o">))</span>
+
+      <span class="k">val</span> <span class="n">recentFeatures</span><span 
class="k">:</span> <span class="kt">Vector</span><span class="o">[</span><span 
class="kt">Array</span><span class="o">[</span><span 
class="kt">Double</span><span class="o">]]</span> <span class="k">=</span> 
<span class="n">recentList</span><span class="o">.</span><span 
class="n">toVector</span>
+        <span class="c1">// productModels may not contain the requested item
+</span>        <span class="o">.</span><span class="n">map</span> <span 
class="o">{</span> <span class="n">i</span> <span class="k">=&gt;</span>
+          <span class="n">productModels</span><span class="o">.</span><span 
class="n">get</span><span class="o">(</span><span class="n">i</span><span 
class="o">).</span><span class="n">flatMap</span> <span class="o">{</span> 
<span class="n">pm</span> <span class="k">=&gt;</span> <span 
class="n">pm</span><span class="o">.</span><span class="n">features</span> 
<span class="o">}</span>
+        <span class="o">}.</span><span class="n">flatten</span>
+
+      <span class="k">if</span> <span class="o">(</span><span 
class="n">recentFeatures</span><span class="o">.</span><span 
class="n">isEmpty</span><span class="o">)</span> <span class="o">{</span>
+        <span class="n">logger</span><span class="o">.</span><span 
class="n">info</span><span class="o">(</span><span class="n">s</span><span 
class="s">"No features vector for recent items ${recentItems}."</span><span 
class="o">)</span>
+        <span class="n">predictDefault</span><span class="o">(</span>
+          <span class="n">productModels</span> <span class="k">=</span> <span 
class="n">productModels</span><span class="o">,</span>
+          <span class="n">query</span> <span class="k">=</span> <span 
class="n">query</span><span class="o">,</span>
+          <span class="n">whiteList</span> <span class="k">=</span> <span 
class="n">whiteList</span><span class="o">,</span>
+          <span class="n">blackList</span> <span class="k">=</span> <span 
class="n">finalBlackList</span>
+        <span class="o">)</span>
+      <span class="o">}</span> <span class="k">else</span> <span 
class="o">{</span>
+        <span class="n">predictSimilar</span><span class="o">(</span>
+          <span class="n">recentFeatures</span> <span class="k">=</span> <span 
class="n">recentFeatures</span><span class="o">,</span>
+          <span class="n">productModels</span> <span class="k">=</span> <span 
class="n">productModels</span><span class="o">,</span>
+          <span class="n">query</span> <span class="k">=</span> <span 
class="n">query</span><span class="o">,</span>
+          <span class="n">whiteList</span> <span class="k">=</span> <span 
class="n">whiteList</span><span class="o">,</span>
+          <span class="n">blackList</span> <span class="k">=</span> <span 
class="n">finalBlackList</span>
+        <span class="o">)</span>
+      <span class="o">}</span>
+    <span class="o">}</span>
+
+    <span class="o">...</span>
+  <span class="o">}</span>
+</pre></td></tr></tbody></table> </div> <p>Note that the item IDs in top N 
results are the <code>Int</code> indices. You map them back to 
<code>String</code> with <code>itemIntStringMap</code> before they are 
returned.</p><div class="highlight scala"><table style="border-spacing: 
0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre 
class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9</pre></td><td class="code"><pre>  <span class="k">val</span> <span 
class="n">itemScores</span> <span class="k">=</span> <span 
class="n">topScores</span><span class="o">.</span><span class="n">map</span> 
<span class="o">{</span> <span class="k">case</span> <span 
class="o">(</span><span class="n">i</span><span class="o">,</span> <span 
class="n">s</span><span class="o">)</span> <span class="k">=&gt;</span>
+    <span class="k">new</span> <span class="nc">ItemScore</span><span 
class="o">(</span>
+      <span class="c1">// convert item int index back to string ID
+</span>      <span class="n">item</span> <span class="k">=</span> <span 
class="n">model</span><span class="o">.</span><span 
class="n">itemIntStringMap</span><span class="o">(</span><span 
class="n">i</span><span class="o">),</span>
+      <span class="n">score</span> <span class="k">=</span> <span 
class="n">s</span>
+    <span class="o">)</span>
+  <span class="o">}</span>
+
+  <span class="k">new</span> <span class="nc">PredictedResult</span><span 
class="o">(</span><span class="n">itemScores</span><span class="o">)</span>
+</pre></td></tr></tbody></table> </div> <p>PredictionIO passes the returned 
<code>PredictedResult</code> object to <em>Serving</em>.</p><h2 id='serving' 
class='header-anchors'>Serving</h2><p>The <code>serve</code> method of class 
<code>Serving</code> processes the predicted results. It is also responsible for 
combining multiple predicted results into one if you have more than one 
predictive model. <em>Serving</em> then returns the final predicted result. 
PredictionIO will convert it to a JSON response automatically.</p><p>In 
MyECommerceRecommendation/src/main/scala/<strong><em>Serving.scala</em></strong>,</p><div
 class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td 
class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9</pre></td><td class="code"><pre><span class="k">class</span> <span 
class="nc">Serving</span>
+  <span class="k">extends</span> <span class="nc">LServing</span><span 
class="o">[</span><span class="kt">Query</span>, <span 
class="kt">PredictedResult</span><span class="o">]</span> <span 
class="o">{</span>
+
+  <span class="k">override</span>
+  <span class="k">def</span> <span class="n">serve</span><span 
class="o">(</span><span class="n">query</span><span class="k">:</span> <span 
class="kt">Query</span><span class="o">,</span>
+    <span class="n">predictedResults</span><span class="k">:</span> <span 
class="kt">Seq</span><span class="o">[</span><span 
class="kt">PredictedResult</span><span class="o">])</span><span 
class="k">:</span> <span class="kt">PredictedResult</span> <span 
class="o">=</span> <span class="o">{</span>
+    <span class="n">predictedResults</span><span class="o">.</span><span 
class="n">head</span>
+  <span class="o">}</span>
+<span class="o">}</span>
+</pre></td></tr></tbody></table> </div> <p>When you send a JSON query to <a 
href="http://localhost:8000/queries.json">http://localhost:8000/queries.json</a>,
 <code>PredictedResult</code> from all models will be passed to 
<code>serve</code> as a sequence, i.e. <code>Seq[PredictedResult]</code>.</p> 
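<p>If more than one algorithm were configured, <code>serve</code> could merge the sequence instead of returning its head. The sketch below is one possible approach under stated assumptions: the <code>Query</code>, <code>ItemScore</code> and <code>PredictedResult</code> case classes are simplified, hypothetical stand-ins for the template's classes, the object does not extend <code>LServing</code> so that it stays self-contained, and summing scores per item is only one of several reasonable merge rules.</p>

```scala
// Hypothetical sketch: merge results from several algorithms by summing scores.
// The case classes are simplified stand-ins, not the template's definitions.
object CombinedServingSketch {
  case class Query(user: String, num: Int)
  case class ItemScore(item: String, score: Double)
  case class PredictedResult(itemScores: Array[ItemScore])

  def serve(query: Query, predictedResults: Seq[PredictedResult]): PredictedResult = {
    val combined = predictedResults
      .flatMap(_.itemScores)  // pool the item scores from every model
      .groupBy(_.item)        // collect all scores of the same item
      .map { case (item, scores) => ItemScore(item, scores.map(_.score).sum) }
      .toArray
      .sortBy(-_.score)       // highest combined score first
      .take(query.num)        // keep only as many items as the query asked for
    PredictedResult(combined)
  }
}
```

<p>Summing only makes sense when the algorithms produce scores on comparable scales; otherwise the scores would need to be normalized first.</p>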
<blockquote> <p>An engine can train multiple models if you specify more than 
one Algorithm component in <code>object RecommendationEngine</code> inside 
<strong><em>Engine.scala</em></strong>. Since only one 
<code>ECommAlgorithm</code> is implemented by default, this <code>Seq</code> 
contains one element.</p></blockquote> </div></div></div></div><footer><div 
class="container"><div class="seperator"></div><div class="row"><div 
class="col-md-6 col-xs-6 footer-link-column"><div 
class="footer-link-column-row"><h4>Community</h4><ul><li><a 
href="//docs.prediction.io/install/" target="blank">Download</a></li><li><a 
href="//docs.prediction.io/" target="blank">Docs</a></li><li><a 
href="//github.com/
 apache/incubator-predictionio" target="blank">GitHub</a></li><li><a 
href="mailto:user-subscribe@predictionio

<TRUNCATED>
http://git-wip-us.apache.org/repos/asf/incubator-predictionio-site/blob/138f9481/templates/ecommercerecommendation/dase/index.html.gz
----------------------------------------------------------------------
diff --git a/templates/ecommercerecommendation/dase/index.html.gz 
b/templates/ecommercerecommendation/dase/index.html.gz
new file mode 100644
index 0000000..2e15beb
Binary files /dev/null and 
b/templates/ecommercerecommendation/dase/index.html.gz differ

