http://git-wip-us.apache.org/repos/asf/predictionio-site/blob/765e178c/archived/tapster/index.html
----------------------------------------------------------------------
diff --git a/archived/tapster/index.html b/archived/tapster/index.html
new file mode 100644
index 0000000..aa68c35
--- /dev/null
+++ b/archived/tapster/index.html
@@ -0,0 +1,269 @@
+<!DOCTYPE html><html><head><title>Comics Recommendation Demo</title><meta 
charset="utf-8"/><meta content="IE=edge,chrome=1" 
http-equiv="X-UA-Compatible"/><meta name="viewport" 
content="width=device-width, initial-scale=1.0"/><meta class="swiftype" 
name="title" data-type="string" content="Comics Recommendation Demo"/><link 
rel="canonical" href="https://predictionio.apache.org/archived/tapster/"/><link 
href="/images/favicon/normal-b330020a.png" rel="shortcut icon"/><link 
href="/images/favicon/apple-c0febcf2.png" rel="apple-touch-icon"/><link 
href="//fonts.googleapis.com/css?family=Open+Sans:300italic,400italic,600italic,700italic,800italic,400,300,600,700,800"
 rel="stylesheet"/><link 
href="//maxcdn.bootstrapcdn.com/font-awesome/4.2.0/css/font-awesome.min.css" 
rel="stylesheet"/><link href="/stylesheets/application-eccfc6cb.css" 
rel="stylesheet" type="text/css"/><script 
src="//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.2/html5shiv.min.js"></script><script
 src="//cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script><script 
src="//use.typekit.net/pqo0itb.js"></script><script>try{Typekit.load({ async: 
true });}catch(e){}</script></head><body><div id="global"><header><div 
class="container" id="header-wrapper"><div class="row"><div 
class="col-sm-12"><div id="logo-wrapper"><span id="drawer-toggle"></span><a 
href="#"></a><a href="http://predictionio.apache.org/"><img alt="Apache 
PredictionIO" id="logo" 
src="/images/logos/logo-ee2b9bb3.png"/></a><span>®</span></div><div 
id="menu-wrapper"><div id="pill-wrapper"><a class="pill left" 
href="/gallery/template-gallery">TEMPLATES</a> <a class="pill right" 
href="//github.com/apache/predictionio/">OPEN SOURCE</a></div></div><img 
class="mobile-search-bar-toggler hidden-md hidden-lg" 
src="/images/icons/search-glass-704bd4ff.png"/></div></div></div></header><div 
id="search-bar-row-wrapper"><div class="container-fluid" 
id="search-bar-row"><div class="row"><div class="col-md-9 col-sm-11 
col-xs-11"><div class="hidden-md hidden-lg" id="mobile-page-heading-wrapper"><p>PredictionIO 
Docs</p><h4>Comics Recommendation Demo</h4></div><h4 class="hidden-sm 
hidden-xs">PredictionIO Docs</h4></div><div class="col-md-3 col-sm-1 col-xs-1 
hidden-md hidden-lg"><img id="left-menu-indicator" 
src="/images/icons/down-arrow-dfe9f7fe.png"/></div><div class="col-md-3 
col-sm-12 col-xs-12 swiftype-wrapper"><div class="swiftype"><form 
class="search-form"><img class="search-box-toggler hidden-xs hidden-sm" 
src="/images/icons/search-glass-704bd4ff.png"/><div class="search-box"><img 
src="/images/icons/search-glass-704bd4ff.png"/><input type="text" 
id="st-search-input" class="st-search-input" placeholder="Search 
Doc..."/></div><img class="swiftype-row-hider hidden-md hidden-lg" 
src="/images/icons/drawer-toggle-active-fcbef12a.png"/></form></div></div><div 
class="mobile-left-menu-toggler hidden-md 
hidden-lg"></div></div></div></div><div id="page" class="container-fluid"><div 
class="row"><div id="left-menu-wrapper"
  class="col-md-3"><nav id="nav-main"><ul><li class="level-1"><a 
class="expandible" href="/"><span>Apache PredictionIO® 
Documentation</span></a><ul><li class="level-2"><a class="final" 
href="/"><span>Welcome to Apache PredictionIO®</span></a></li></ul></li><li 
class="level-1"><a class="expandible" href="#"><span>Getting 
Started</span></a><ul><li class="level-2"><a class="final" 
href="/start/"><span>A Quick Intro</span></a></li><li class="level-2"><a 
class="final" href="/install/"><span>Installing Apache 
PredictionIO</span></a></li><li class="level-2"><a class="final" 
href="/start/download/"><span>Downloading an Engine Template</span></a></li><li 
class="level-2"><a class="final" href="/start/deploy/"><span>Deploying Your 
First Engine</span></a></li><li class="level-2"><a class="final" 
href="/start/customize/"><span>Customizing the 
Engine</span></a></li></ul></li><li class="level-1"><a class="expandible" 
href="#"><span>Integrating with Your App</span></a><ul><li class="level-2"><a 
class="final" href="/appintegration/"><span>App Integration 
Overview</span></a></li><li class="level-2"><a class="expandible" 
href="/sdk/"><span>List of SDKs</span></a><ul><li class="level-3"><a 
class="final" href="/sdk/java/"><span>Java &amp; Android SDK</span></a></li><li 
class="level-3"><a class="final" href="/sdk/php/"><span>PHP 
SDK</span></a></li><li class="level-3"><a class="final" 
href="/sdk/python/"><span>Python SDK</span></a></li><li class="level-3"><a 
class="final" href="/sdk/ruby/"><span>Ruby SDK</span></a></li><li 
class="level-3"><a class="final" 
href="/community/projects/#sdks"><span>Community Powered 
SDKs</span></a></li></ul></li></ul></li><li class="level-1"><a 
class="expandible" href="#"><span>Deploying an Engine</span></a><ul><li 
class="level-2"><a class="final" href="/deploy/"><span>Deploying as a Web 
Service</span></a></li><li class="level-2"><a class="final" 
href="/batchpredict/"><span>Batch Predictions</span></a></li><li 
class="level-2"><a class="final" 
href="/deploy/monitoring/"><span>Monitoring Engine</span></a></li><li class="level-2"><a 
class="final" href="/deploy/engineparams/"><span>Setting Engine 
Parameters</span></a></li><li class="level-2"><a class="final" 
href="/deploy/enginevariants/"><span>Deploying Multiple Engine 
Variants</span></a></li><li class="level-2"><a class="final" 
href="/deploy/plugin/"><span>Engine Server Plugin</span></a></li></ul></li><li 
class="level-1"><a class="expandible" href="#"><span>Customizing an 
Engine</span></a><ul><li class="level-2"><a class="final" 
href="/customize/"><span>Learning DASE</span></a></li><li class="level-2"><a 
class="final" href="/customize/dase/"><span>Implement DASE</span></a></li><li 
class="level-2"><a class="final" 
href="/customize/troubleshooting/"><span>Troubleshooting Engine 
Development</span></a></li><li class="level-2"><a class="final" 
href="/api/current/#package"><span>Engine Scala 
APIs</span></a></li></ul></li><li class="level-1"><a class="expandible" 
href="#"><span>Collecting and 
 Analyzing Data</span></a><ul><li class="level-2"><a class="final" 
href="/datacollection/"><span>Event Server Overview</span></a></li><li 
class="level-2"><a class="final" 
href="/datacollection/eventapi/"><span>Collecting Data with 
REST/SDKs</span></a></li><li class="level-2"><a class="final" 
href="/datacollection/eventmodel/"><span>Events Modeling</span></a></li><li 
class="level-2"><a class="final" 
href="/datacollection/webhooks/"><span>Unifying Multichannel Data with 
Webhooks</span></a></li><li class="level-2"><a class="final" 
href="/datacollection/channel/"><span>Channel</span></a></li><li 
class="level-2"><a class="final" 
href="/datacollection/batchimport/"><span>Importing Data in 
Batch</span></a></li><li class="level-2"><a class="final" 
href="/datacollection/analytics/"><span>Using Analytics 
Tools</span></a></li><li class="level-2"><a class="final" 
href="/datacollection/plugin/"><span>Event Server 
Plugin</span></a></li></ul></li><li class="level-1"><a class="expandible" 
href="#"><span>Choosing an Algorithm</span></a><ul><li class="level-2"><a class="final" 
href="/algorithm/"><span>Built-in Algorithm Libraries</span></a></li><li 
class="level-2"><a class="final" href="/algorithm/switch/"><span>Switching to 
Another Algorithm</span></a></li><li class="level-2"><a class="final" 
href="/algorithm/multiple/"><span>Combining Multiple 
Algorithms</span></a></li><li class="level-2"><a class="final" 
href="/algorithm/custom/"><span>Adding Your Own 
Algorithms</span></a></li></ul></li><li class="level-1"><a class="expandible" 
href="#"><span>Tuning and Evaluation</span></a><ul><li class="level-2"><a 
class="final" href="/evaluation/"><span>Overview</span></a></li><li 
class="level-2"><a class="final" 
href="/evaluation/paramtuning/"><span>Hyperparameter Tuning</span></a></li><li 
class="level-2"><a class="final" 
href="/evaluation/evaluationdashboard/"><span>Evaluation 
Dashboard</span></a></li><li class="level-2"><a class="final" 
href="/evaluation/metricchoose/"><span>Choosing Evaluation 
Metrics</span></a></li><li class="level-2"><a class="final" 
href="/evaluation/metricbuild/"><span>Building Evaluation 
Metrics</span></a></li></ul></li><li class="level-1"><a class="expandible" 
href="#"><span>System Architecture</span></a><ul><li class="level-2"><a 
class="final" href="/system/"><span>Architecture Overview</span></a></li><li 
class="level-2"><a class="final" href="/system/anotherdatastore/"><span>Using 
Another Data Store</span></a></li></ul></li><li class="level-1"><a 
class="expandible" href="#"><span>PredictionIO® Official 
Templates</span></a><ul><li class="level-2"><a class="final" 
href="/templates/"><span>Intro</span></a></li><li class="level-2"><a 
class="expandible" href="#"><span>Recommendation</span></a><ul><li 
class="level-3"><a class="final" 
href="/templates/recommendation/quickstart/"><span>Quick 
Start</span></a></li><li class="level-3"><a class="final" 
href="/templates/recommendation/dase/"><span>DASE</span></a></li><li 
class="level-3"><a class="final" 
href="/templates/recommendation/evaluation/"><span>Evaluation 
Explained</span></a></li><li class="level-3"><a class="final" 
href="/templates/recommendation/how-to/"><span>How-To</span></a></li><li 
class="level-3"><a class="final" 
href="/templates/recommendation/reading-custom-events/"><span>Read Custom 
Events</span></a></li><li class="level-3"><a class="final" 
href="/templates/recommendation/customize-data-prep/"><span>Customize Data 
Preparator</span></a></li><li class="level-3"><a class="final" 
href="/templates/recommendation/customize-serving/"><span>Customize 
Serving</span></a></li><li class="level-3"><a class="final" 
href="/templates/recommendation/training-with-implicit-preference/"><span>Train 
with Implicit Preference</span></a></li><li class="level-3"><a class="final" 
href="/templates/recommendation/blacklist-items/"><span>Filter Recommended 
Items by Blacklist in Query</span></a></li><li class="level-3"><a class="final" 
href="/templates/recommendation/batch-evaluator/"><span>Batch Persistable 
Evaluator</span></a></li></ul></li><li 
class="level-2"><a class="expandible" href="#"><span>E-Commerce 
Recommendation</span></a><ul><li class="level-3"><a class="final" 
href="/templates/ecommercerecommendation/quickstart/"><span>Quick 
Start</span></a></li><li class="level-3"><a class="final" 
href="/templates/ecommercerecommendation/dase/"><span>DASE</span></a></li><li 
class="level-3"><a class="final" 
href="/templates/ecommercerecommendation/how-to/"><span>How-To</span></a></li><li
 class="level-3"><a class="final" 
href="/templates/ecommercerecommendation/train-with-rate-event/"><span>Train 
with Rate Event</span></a></li><li class="level-3"><a class="final" 
href="/templates/ecommercerecommendation/adjust-score/"><span>Adjust 
Score</span></a></li></ul></li><li class="level-2"><a class="expandible" 
href="#"><span>Similar Product</span></a><ul><li class="level-3"><a 
class="final" href="/templates/similarproduct/quickstart/"><span>Quick 
Start</span></a></li><li 
class="level-3"><a class="final" 
href="/templates/similarproduct/dase/"><span>DASE</span></a></li><li 
class="level-3"><a class="final" 
href="/templates/similarproduct/how-to/"><span>How-To</span></a></li><li 
class="level-3"><a class="final" 
href="/templates/similarproduct/multi-events-multi-algos/"><span>Multiple 
Events and Multiple Algorithms</span></a></li><li class="level-3"><a 
class="final" 
href="/templates/similarproduct/return-item-properties/"><span>Returns Item 
Properties</span></a></li><li class="level-3"><a class="final" 
href="/templates/similarproduct/train-with-rate-event/"><span>Train with Rate 
Event</span></a></li><li class="level-3"><a class="final" 
href="/templates/similarproduct/rid-user-set-event/"><span>Get Rid of Events 
for Users</span></a></li><li class="level-3"><a class="final" 
href="/templates/similarproduct/recommended-user/"><span>Recommend 
Users</span></a></li></ul></li><li class="level-2"><a class="expandible" 
href="#"><span>Classification</span></a><ul><li 
class="level-3"><a class="final" 
href="/templates/classification/quickstart/"><span>Quick 
Start</span></a></li><li class="level-3"><a class="final" 
href="/templates/classification/dase/"><span>DASE</span></a></li><li 
class="level-3"><a class="final" 
href="/templates/classification/how-to/"><span>How-To</span></a></li><li 
class="level-3"><a class="final" 
href="/templates/classification/add-algorithm/"><span>Use Alternative 
Algorithm</span></a></li><li class="level-3"><a class="final" 
href="/templates/classification/reading-custom-properties/"><span>Read Custom 
Properties</span></a></li></ul></li></ul></li><li class="level-1"><a 
class="expandible" href="#"><span>Engine Template Gallery</span></a><ul><li 
class="level-2"><a class="final" 
href="/gallery/template-gallery/"><span>Browse</span></a></li><li 
class="level-2"><a class="final" 
href="/community/submit-template/"><span>Submit your Engine as a 
Template</span></a></li></ul></li><li class="level-1"><a class="expandible" 
href="#"><span>
 Demo Tutorials</span></a><ul><li class="level-2"><a class="final" 
href="/community/projects/#demos"><span>Community Contributed 
Demo</span></a></li><li class="level-2"><a class="final" 
href="/demo/textclassification/"><span>Text Classification Engine 
Tutorial</span></a></li></ul></li><li class="level-1"><a class="expandible" 
href="/community/"><span>Getting Involved</span></a><ul><li class="level-2"><a 
class="final" href="/community/contribute-code/"><span>Contribute 
Code</span></a></li><li class="level-2"><a class="final" 
href="/community/contribute-documentation/"><span>Contribute 
Documentation</span></a></li><li class="level-2"><a class="final" 
href="/community/contribute-sdk/"><span>Contribute a SDK</span></a></li><li 
class="level-2"><a class="final" 
href="/community/contribute-webhook/"><span>Contribute a 
Webhook</span></a></li><li class="level-2"><a class="final" 
href="/community/projects/"><span>Community 
Projects</span></a></li></ul></li><li class="level-1"><a 
class="expandible" href="#"><span>Getting Help</span></a><ul><li class="level-2"><a 
class="final" href="/resources/faq/"><span>FAQs</span></a></li><li 
class="level-2"><a class="final" 
href="/support/"><span>Support</span></a></li></ul></li><li class="level-1"><a 
class="expandible" href="#"><span>Resources</span></a><ul><li 
class="level-2"><a class="final" href="/cli/"><span>Command-line 
Interface</span></a></li><li class="level-2"><a class="final" 
href="/resources/release/"><span>Release Cadence</span></a></li><li 
class="level-2"><a class="final" href="/resources/intellij/"><span>Developing 
Engines with IntelliJ IDEA</span></a></li><li class="level-2"><a class="final" 
href="/resources/upgrade/"><span>Upgrade Instructions</span></a></li><li 
class="level-2"><a class="final" 
href="/resources/glossary/"><span>Glossary</span></a></li></ul></li><li 
class="level-1"><a class="expandible" href="#"><span>Apache Software 
Foundation</span></a><ul><li class="level-2"><a class="final" 
href="https://www.apache.org/"><span>Apache Homepage</span></a></li><li class="level-2"><a 
class="final" 
href="https://www.apache.org/licenses/"><span>License</span></a></li><li 
class="level-2"><a class="final" 
href="https://www.apache.org/foundation/sponsorship.html"><span>Sponsorship</span></a></li><li
 class="level-2"><a class="final" 
href="https://www.apache.org/foundation/thanks.html"><span>Thanks</span></a></li><li
 class="level-2"><a class="final" 
href="https://www.apache.org/security/"><span>Security</span></a></li></ul></li></ul></nav></div><div
 class="col-md-9 col-sm-12"><div class="content-header hidden-md 
hidden-lg"><div id="page-title"><h1>Comics Recommendation 
Demo</h1></div></div><div id="table-of-content-wrapper"><h5>On this 
page</h5><aside id="table-of-contents"><ul> <li> <a 
href="#introduction">Introduction</a> </li> <li> <a 
href="#tapster-demo-application">Tapster Demo Application</a> </li> <li> <a 
href="#apache-predictionio-setup">Apache PredictionIO Setup</a> </li> <li> <a 
href="#import-data">Import Data</a> </li> <li> <a 
href="#connect-demo-app-with-apache-predictionio">Connect Demo app with Apache 
PredictionIO</a> </li> <li> <a href="#links">Links</a> </li> <li> <a 
href="#conclusion">Conclusion</a> </li> </ul> </aside><hr/><a 
id="edit-page-link" 
href="https://github.com/apache/predictionio/tree/livedoc/docs/manual/source/archived/tapster.html.md"><img
 src="/images/icons/edit-pencil-d6c1bb3d.png"/>Edit this page</a></div><div 
class="content-header hidden-sm hidden-xs"><div id="page-title"><h1>Comics 
Recommendation Demo</h1></div></div><div class="content"> <h2 id='introduction' 
class='header-anchors'>Introduction</h2><p>In this demo, we will show you how 
to build a Tinder-style web application (named &quot;Tapster&quot;) that 
interactively recommends comics to users based on their likes and dislikes of 
episodes.</p><p>The demo will use the <a 
href="https://predictionio.apache.org/templates/similarproduct/quickstart/">Similar
 Product Template</a>. The Similar Product Template is a great choice if you 
want to make recommendations based on immediate user activities or for new 
users with limited history. It uses the MLlib Alternating Least Squares (ALS) 
recommendation algorithm, a <a 
href="http://en.wikipedia.org/wiki/Recommender_system#Collaborative_filtering">Collaborative
 filtering</a> (CF) algorithm commonly used for recommender systems. These 
techniques aim to fill in the missing entries of a user-item association 
matrix. Users and products are described by a small set of latent factors that 
can be used to predict missing entries. A layman&#39;s interpretation of 
Collaborative Filtering is &quot;People who like this comic, also like these 
comics.&quot;</p><p>All the code and data are on GitHub at: <a 
href="https://github.com/PredictionIO/Demo-Tapster">github.com/PredictionIO/Demo-Tapster</a>.</p><h3
 id='data' class='header-anchors'>Data</h3><p>The data comes from <a 
href="http://tapastic.com/">Tapastic</a>. You can find the data files <a 
href="https://github.com/PredictionIO/Demo-Tapster/tree/master/data">here</a>.</p><p>The 
data structure looks like this:</p><p><a 
href="https://github.com/PredictionIO/Demo-Tapster/blob/master/data/episode_list.csv">Episode
 List</a> <code>data/episode_list.csv</code></p><p><strong>Fields:</strong> 
episodeId | episodeTitle | episodeCategories | episodeUrl | 
episodeImageUrls</p><p>1,000 rows. Each row represents one episode.</p><p><a 
href="https://github.com/PredictionIO/Demo-Tapster/blob/master/data/user_list.csv">User
 Like Event List</a> 
<code>data/user_list.csv</code></p><p><strong>Fields:</strong> userId | 
episodeId | likedTimestamp</p><p>192,587 rows. Each row represents one user 
like for the given episode.</p><p>The tutorial has four major steps:</p> <ul> 
<li>Demo application setup</li> <li>PredictionIO installation and setup</li> 
<li>Import data into database and PredictionIO</li> <li>Integrate demo 
application with PredictionIO</li> </ul> <h2 id='tapster-demo-application' 
class='header-anchors'>Tapster Demo Application</h2><p>The demo application is built using 
Rails.</p><p>You can clone the existing application with:</p><div 
class="highlight shell"><table style="border-spacing: 0"><tbody><tr><td 
class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3</pre></td><td class="code"><pre><span class="gp">$ </span>git clone  
https://github.com/PredictionIO/Demo-Tapster.git
+<span class="gp">$ </span><span class="nb">cd </span>Demo-Tapster
+<span class="gp">$ </span>bundle install
+</pre></td></tr></tbody></table> </div> <p>You will need to edit 
<code>config/database.yml</code> to match your local database settings. We have 
provided some sensible defaults for PostgreSQL, MySQL, and SQLite.</p><p>Set up 
the database with:</p><div class="highlight shell"><table 
style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: 
right"><pre class="lineno">1
+2</pre></td><td class="code"><pre><span class="gp">$ </span>rake db:create
+<span class="gp">$ </span>rake db:migrate
+</pre></td></tr></tbody></table> </div> <p>At this point, you should have the 
demo application ready but with an empty database. Let's import the episode 
data into our database. We will do this with: <code>$ rake 
import:episodes</code>. An &quot;Episode&quot; is a single <a 
href="http://en.wikipedia.org/wiki/Comic_strip">comic strip</a>.</p><p><a 
href="https://github.com/PredictionIO/Demo-Tapster/blob/master/lib/tasks/import/episodes.rake">View
 on GitHub</a></p><p>This script is pretty simple. It loops through the CSV 
file and creates a new episode for each line in the file in our local 
database.</p><p>You can start the app and point your browser to <a 
href="http://localhost:3000">http://localhost:3000</a></p><div class="highlight 
shell"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" 
style="text-align: right"><pre class="lineno">1</pre></td><td 
class="code"><pre><span class="gp">$ </span>rails server
+</pre></td></tr></tbody></table> </div> <p><img alt="Rails Server" 
src="/images/demo/tapster/rails-server-997d690e.png"/></p><h2 
id='apache-predictionio-setup' class='header-anchors'>Apache PredictionIO 
Setup</h2><h3 id='install-apache-predictionio' class='header-anchors'>Install 
Apache PredictionIO</h3><p>Follow the installation instructions <a 
href="http://predictionio.apache.org/install/">here</a> or simply run:</p><div 
class="highlight shell"><table style="border-spacing: 0"><tbody><tr><td 
class="gutter gl" style="text-align: right"><pre class="lineno">1</pre></td><td 
class="code"><pre><span class="gp">$ </span>bash -c <span 
class="s2">"</span><span class="k">$(</span>curl -s 
https://raw.githubusercontent.com/apache/predictionio/master/bin/install.sh<span
 class="k">)</span><span class="s2">"</span>
+</pre></td></tr></tbody></table> </div> <p><img alt="PIO Install" 
src="/images/demo/tapster/pio-install-2d870aed.png"/></p><h3 
id='create-a-new-app' class='header-anchors'>Create a New App</h3><p>You will 
need to create a new app on Apache PredictionIO to house the Tapster demo. You 
can do this with:</p><div class="highlight shell"><table style="border-spacing: 
0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre 
class="lineno">1</pre></td><td class="code"><pre><span class="gp">$ </span>pio 
app new tapster
+</pre></td></tr></tbody></table> </div> <p>Take note of the App ID and Access 
Key.</p><p><img alt="PIO App New" 
src="/images/demo/tapster/pio-app-new-5a8ae503.png"/></p><h3 id='setup-engine' 
class='header-anchors'>Setup Engine</h3><p>We are going to copy the Similar 
Product Template into the PIO directory.</p><div class="highlight shell"><table 
style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: 
right"><pre class="lineno">1
+2</pre></td><td class="code"><pre><span class="gp">$ </span><span 
class="nb">cd </span>PredictionIO
+<span class="gp">$ </span>git clone 
https://github.com/apache/predictionio-template-similar-product.git 
tapster-episode-similar
+</pre></td></tr></tbody></table> </div> <p>Next we are going to update the App 
ID in the <code>engine.json</code> file to match the App ID we just created.</p><div 
class="highlight shell"><table style="border-spacing: 0"><tbody><tr><td 
class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3</pre></td><td class="code"><pre><span class="gp">$ </span><span 
class="nb">cd </span>tapster-episode-similar
+<span class="gp">$ </span>nano engine.json
+<span class="gp">$ </span><span class="nb">cd</span> ..
+</pre></td></tr></tbody></table> </div> <p><img alt="Engine Setup" 
src="/images/demo/tapster/pio-engine-setup-88e25cc0.png"/></p><h3 
id='modify--engine-template' class='header-anchors'>Modify Engine 
Template</h3><p>By default, the engine template reads the “view” 
events. We can easily change it to read “like” events.</p> <p>Modify 
<code>readTraining()</code> in DataSource.scala:</p><div class="highlight 
scala"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" 
style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14
+15
+16
+17
+18
+19
+20
+21
+22
+23
+24
+25
+26
+27
+28
+29
+30
+31
+32
+33
+34
+35
+36</pre></td><td class="code"><pre>
+  <span class="k">override</span>
+  <span class="k">def</span> <span class="n">readTraining</span><span 
class="o">(</span><span class="n">sc</span><span class="k">:</span> <span 
class="kt">SparkContext</span><span class="o">)</span><span class="k">:</span> 
<span class="kt">TrainingData</span> <span class="o">=</span> <span 
class="o">{</span>
+
+    <span class="o">...</span>
+
+    <span class="k">val</span> <span class="n">viewEventsRDD</span><span 
class="k">:</span> <span class="kt">RDD</span><span class="o">[</span><span 
class="kt">ViewEvent</span><span class="o">]</span> <span class="k">=</span> 
<span class="n">eventsDb</span><span class="o">.</span><span 
class="n">find</span><span class="o">(</span>
+      <span class="n">appId</span> <span class="k">=</span> <span 
class="n">dsp</span><span class="o">.</span><span class="n">appId</span><span 
class="o">,</span>
+      <span class="n">entityType</span> <span class="k">=</span> <span 
class="nc">Some</span><span class="o">(</span><span 
class="s">"user"</span><span class="o">),</span>
+      <span class="n">eventNames</span> <span class="k">=</span> <span 
class="nc">Some</span><span class="o">(</span><span class="nc">List</span><span 
class="o">(</span><span class="s">"like"</span><span class="o">)),</span> <span 
class="c1">// MODIFIED
+</span>      <span class="c1">// targetEntityType is optional field of an 
event.
+</span>      <span class="n">targetEntityType</span> <span class="k">=</span> 
<span class="nc">Some</span><span class="o">(</span><span 
class="nc">Some</span><span class="o">(</span><span 
class="s">"item"</span><span class="o">)))(</span><span 
class="n">sc</span><span class="o">)</span>
+      <span class="c1">// eventsDb.find() returns RDD[Event]
+</span>      <span class="o">.</span><span class="n">map</span> <span 
class="o">{</span> <span class="n">event</span> <span class="k">=&gt;</span>
+        <span class="k">val</span> <span class="n">viewEvent</span> <span 
class="k">=</span> <span class="k">try</span> <span class="o">{</span>
+          <span class="n">event</span><span class="o">.</span><span 
class="n">event</span> <span class="k">match</span> <span class="o">{</span>
+            <span class="k">case</span> <span class="s">"like"</span> <span 
class="k">=&gt;</span> <span class="nc">ViewEvent</span><span 
class="o">(</span> <span class="c1">// MODIFIED
+</span>              <span class="n">user</span> <span class="k">=</span> 
<span class="n">event</span><span class="o">.</span><span 
class="n">entityId</span><span class="o">,</span>
+              <span class="n">item</span> <span class="k">=</span> <span 
class="n">event</span><span class="o">.</span><span 
class="n">targetEntityId</span><span class="o">.</span><span 
class="n">get</span><span class="o">,</span>
+              <span class="n">t</span> <span class="k">=</span> <span 
class="n">event</span><span class="o">.</span><span 
class="n">eventTime</span><span class="o">.</span><span 
class="n">getMillis</span><span class="o">)</span>
+            <span class="k">case</span> <span class="k">_</span> <span 
class="k">=&gt;</span> <span class="k">throw</span> <span class="k">new</span> 
<span class="nc">Exception</span><span class="o">(</span><span 
class="n">s</span><span class="s">"Unexpected event ${event} is 
read."</span><span class="o">)</span>
+          <span class="o">}</span>
+        <span class="o">}</span> <span class="k">catch</span> <span 
class="o">{</span>
+          <span class="k">case</span> <span class="n">e</span><span 
class="k">:</span> <span class="kt">Exception</span> <span 
class="o">=&gt;</span> <span class="o">{</span>
+            <span class="n">logger</span><span class="o">.</span><span 
class="n">error</span><span class="o">(</span><span class="n">s</span><span 
class="s">"Cannot convert ${event} to ViewEvent."</span> <span 
class="o">+</span>
+              <span class="n">s</span><span class="s">" Exception: 
${e}."</span><span class="o">)</span>
+            <span class="k">throw</span> <span class="n">e</span>
+          <span class="o">}</span>
+        <span class="o">}</span>
+        <span class="n">viewEvent</span>
+      <span class="o">}</span>
+
+    <span class="o">...</span>
+  <span class="o">}</span>
+<span class="o">}</span>
+
+</pre></td></tr></tbody></table> </div> <p>Finally, to build the engine, we will 
run:</p><div class="highlight shell"><table style="border-spacing: 
0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre 
class="lineno">1
+2
+3</pre></td><td class="code"><pre><span class="gp">$ </span><span 
class="nb">cd </span>tapster-episode-similar
+<span class="gp">$ </span>pio build
+<span class="gp">$ </span><span class="nb">cd</span> ..
+</pre></td></tr></tbody></table> </div> <p><img alt="PIO Build" 
src="/images/demo/tapster/pio-build-e6eb1d7c.png"/></p><h2 id='import-data' 
class='header-anchors'>Import Data</h2><p>Once everything is installed, start 
the event server by running: <code>$ pio eventserver</code></p><p><img 
alt="Event Server" 
src="/images/demo/tapster/pio-eventserver-88889ec0.png"/></p><div 
class="alert-message info"><p>You can check the status of Apache PredictionIO 
at any time by running: <code>$ pio status</code></p></div><p>ALERT: If your 
laptop goes to sleep, you might need to restart HBase manually with:</p><div 
class="highlight shell"><table style="border-spacing: 0"><tbody><tr><td 
class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3</pre></td><td class="code"><pre><span class="gp">$ </span><span 
class="nb">cd </span>PredictionIO/vendors/hbase-0.98.6/bin
+<span class="gp">$ </span>./stop-hbase.sh
+<span class="gp">$ </span>./start-hbase.sh
+</pre></td></tr></tbody></table> </div> <p>The key event we are importing into 
the Apache PredictionIO Event Server is the &quot;Like&quot; event (for 
example, user X likes episode Y).</p><p>We will send this data to Apache 
PredictionIO by executing the <code>$ rake import:predictionio</code> 
command.</p><p><a 
href="https://github.com/PredictionIO/Demo-Tapster/blob/master/lib/tasks/import/predictionio.rake">View
 on GitHub</a></p><p>This script is a little more complex. First we need to 
connect to the Event Server.</p><div class="highlight shell"><table 
style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: 
right"><pre class="lineno">1</pre></td><td class="code"><pre>client <span 
class="o">=</span> PredictionIO::EventClient.new<span 
class="o">(</span>ENV[<span class="s1">'PIO_ACCESS_KEY'</span><span 
class="o">]</span>, ENV[<span class="s1">'PIO_EVENT_SERVER_URL'</span><span 
class="o">]</span>, THREADS<span class="o">)</span>
+</pre></td></tr></tbody></table> </div> <p>You will need to create the 
environment variables <code>PIO_ACCESS_KEY</code> and 
<code>PIO_EVENT_SERVER_URL</code>. The default Event Server URL is: <a 
href="http://localhost:7070">http://localhost:7070</a>.</p><div 
class="alert-message info"><p>If you forget your <strong>Access Key</strong> 
you can always run: <code>$ pio app list</code></p></div><p>You can set these 
values in the <code>.env</code> file located in the application root directory, 
and they will be automatically loaded into your environment each time Rails is 
run.</p><p>The next part of the script loops through each line of the 
<code>data/user_list.csv</code> file and returns an array of unique user and 
episode IDs. Once we have those we can send the data to Apache PredictionIO 
like this.</p><p>First the users:</p><div class="highlight shell"><table 
style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: 
right"><pre class="lineno">1
+2
+3
+4
+5</pre></td><td class="code"><pre>user_ids.each_with_index <span 
class="k">do</span> |id, i|
+  <span class="c"># Send unique user IDs to PredictionIO.</span>
+  client.aset_user<span class="o">(</span>id<span class="o">)</span>
+  puts <span class="s2">"Sent user ID #{id} to PredictionIO. Action #{i + 1} 
of #{user_count}"</span>
+end
+</pre></td></tr></tbody></table> </div> <p>And now the episodes:</p><div 
class="highlight shell"><table style="border-spacing: 0"><tbody><tr><td 
class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14
+15
+16
+17</pre></td><td class="code"><pre>episode_ids.each_with_index <span 
class="k">do</span> |id, i|
+  <span class="c"># Load episode from database - we will need this to include 
the categories!</span>
+  episode <span class="o">=</span> Episode.where<span 
class="o">(</span>episode_id: id<span class="o">)</span>.take
+
+  <span class="k">if </span>episode
+    <span class="c"># Send unique episode IDs to PredictionIO.</span>
+    client.acreate_event<span class="o">(</span>
+      <span class="s1">'$set'</span>,
+      <span class="s1">'item'</span>,
+      id,
+      properties: <span class="o">{</span> categories: episode.categories 
<span class="o">}</span>
+    <span class="o">)</span>
+    puts <span class="s2">"Sent episode ID #{id} to PredictionIO. Action #{i + 
1} of #{episode_count}"</span>
+  <span class="k">else
+    </span>puts <span class="s2">"Episode ID #{id} not found in database! 
Skipping!"</span>.color<span class="o">(</span>:red<span class="o">)</span>
+  end
+end
+</pre></td></tr></tbody></table> </div> <p>Finally, we loop through the 
<code>data/user_list.csv</code> file once more to send the like 
events:</p><div class="highlight shell"><table style="border-spacing: 
0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre 
class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14</pre></td><td class="code"><pre>CSV.foreach<span 
class="o">(</span>USER_LIST, headers: <span class="nb">true</span><span 
class="o">)</span> <span class="k">do</span> |row|
+  user_id <span class="o">=</span> row[0] <span class="c"># userId</span>
+  episode_id <span class="o">=</span> row[1] <span class="c"># episodeId</span>
+
+  <span class="c"># Send like to PredictionIO.</span>
+  client.acreate_event<span class="o">(</span>
+    <span class="s1">'like'</span>,
+    <span class="s1">'user'</span>,
+    user_id,
+    <span class="o">{</span> <span class="s1">'targetEntityType'</span> <span 
class="o">=</span>&gt; <span class="s1">'item'</span>, <span 
class="s1">'targetEntityId'</span> <span class="o">=</span>&gt; episode_id 
<span class="o">}</span>
+  <span class="o">)</span>
+
+  puts <span class="s2">"Sent user ID #{user_id} liked episode ID 
#{episode_id} to PredictionIO. Action #{</span><span 
class="nv">$INPUT_LINE_NUMBER</span><span class="s2">} of #{line_count}."</span>
+end
+</pre></td></tr></tbody></table> </div> <p>In total, the script takes about 4 
minutes to run on a basic laptop. At this point, all the data has been imported into 
Apache PredictionIO.</p><p><img alt="Import" 
src="/images/demo/tapster/pio-import-predictionio-1ecd11fd.png"/></p><h3 
id='engine-training' class='header-anchors'>Engine Training</h3><p>We train the 
engine with the following command:</p><div class="highlight shell"><table 
style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: 
right"><pre class="lineno">1
+2</pre></td><td class="code"><pre><span class="gp">$ </span><span 
class="nb">cd </span>tapster-episode-similar
+<span class="gp">$ </span>pio train -- --driver-memory 4g
+</pre></td></tr></tbody></table> </div> <p><img alt="PIO Train" 
src="/images/demo/tapster/pio-train-7edffad4.png"/></p><p>We use the 
<code>--driver-memory</code> option to limit the memory used by Apache PredictionIO. Without 
this option, Apache PredictionIO can consume too much memory, leading to a crash. You 
can adjust the <code>4g</code> value up or down depending on your system specs.</p><p>You can set 
up a job to periodically retrain the engine so the model is updated with the 
latest dataset.</p><h3 id='deploy-model' class='header-anchors'>Deploy 
Model</h3><p>You can deploy the model with: <code>$ pio deploy</code> from the 
<code>tapster-episode-similar</code> directory.</p><p>At this point, you have 
a demo app with data and an Apache PredictionIO server with a trained model all 
set up. Next, we will connect the two so you can log live interaction 
(like) events into the Apache PredictionIO event server and query the engine 
server for recommendations.</p><h2 
id='connect-demo-app-with-apache-predictionio' class='header-anchors'>Connect Demo app with Apache PredictionIO</h2><h3 id='overview' 
class='header-anchors'>Overview</h3><p>At a high level, the application keeps a 
record of each like and dislike. It uses jQuery to send an array of both likes 
and dislikes to the server on each click. The server then queries Apache 
PredictionIO for a similar episode which is relayed to jQuery and displayed to 
the user.</p><p>Data flow:</p> <ul> <li>The user likes an episode.</li> 
<li>Tapster sends the &quot;Like&quot; event to Apache PredictionIO event 
server.</li> <li>Tapster queries Apache PredictionIO engine with all the 
episodes the user has rated (likes and dislikes) in this session.</li> 
<li>Apache PredictionIO returns 1 recommended episode.</li> </ul> <h3 
id='javascript' class='header-anchors'>JavaScript</h3><p>All the important code 
lives in <code>app/assets/javascripts/application.js</code> <a 
href="https://github.com/PredictionIO/Demo-Tapster/blob/master/app/assets/javascripts/application.js">View
 on GitHub</a></p><p>Most of this file consists of click handlers, code that displays 
the loading dialog, and similar UI plumbing.</p><p>The most important function is 
the one that queries the Rails server for results from Apache PredictionIO.</p><div 
class="highlight shell"><table style="border-spacing: 0"><tbody><tr><td 
class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14</pre></td><td class="code"><pre>// Query the server <span class="k">for 
</span>a comic based on previous likes. See episodes#query.
+queryPIO: <span class="k">function</span><span class="o">()</span> <span 
class="o">{</span>
+  var _this <span class="o">=</span> this; // For closure.
+  <span class="nv">$.</span>ajax<span class="o">({</span>
+    url: <span class="s1">'/episodes/query'</span>,
+    <span class="nb">type</span>: <span class="s1">'POST'</span>,
+    data: <span class="o">{</span>
+      likes: JSON.stringify<span class="o">(</span>_this.likes<span 
class="o">)</span>,
+      dislikes: JSON.stringify<span class="o">(</span>_this.dislikes<span 
class="o">)</span>,
+    <span class="o">}</span>
+  <span class="o">})</span>.done<span class="o">(</span><span 
class="k">function</span><span class="o">(</span>data<span class="o">)</span> 
<span class="o">{</span>
+    _this.setComic<span class="o">(</span>data<span class="o">)</span>;
+  <span class="o">})</span>;
+<span class="o">}</span>
+</pre></td></tr></tbody></table> </div> <h3 id='rails' 
class='header-anchors'>Rails</h3><p>On the Rails side, all the fun things happen 
in the episodes controller located at: 
<code>app/controllers/episodes_controller.rb</code> <a 
href="https://github.com/PredictionIO/Demo-Tapster/blob/master/app/controllers/episodes_controller.rb">View
 on GitHub</a>.</p><div class="highlight shell"><table style="border-spacing: 
0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre 
class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14
+15
+16
+17
+18
+19
+20
+21
+22
+23
+24
+25
+26
+27
+28
+29
+30
+31
+32</pre></td><td class="code"><pre>def query
+  <span class="c"># Create PredictionIO client.</span>
+  client <span class="o">=</span> PredictionIO::EngineClient.new<span 
class="o">(</span>ENV[<span class="s1">'PIO_ENGINE_URL'</span><span 
class="o">])</span>
+
+  <span class="c"># Get posted likes and dislikes.</span>
+  likes <span class="o">=</span> ActiveSupport::JSON.decode<span 
class="o">(</span>params[:likes]<span class="o">)</span>
+  dislikes <span class="o">=</span> ActiveSupport::JSON.decode<span 
class="o">(</span>params[:dislikes]<span class="o">)</span>
+
+  <span class="k">if </span>likes.empty?
+    <span class="c"># We can't query PredictionIO with no likes so</span>
+    <span class="c"># we will return a random comic instead.</span>
+    @episode <span class="o">=</span> random_episode
+
+    render json: @episode
+    <span class="k">return
+  </span>end
+
+  <span class="c"># Query PredictionIO.</span>
+  <span class="c"># Here we black list the disliked items so they are not 
shown again!</span>
+  response <span class="o">=</span> client.send_query<span 
class="o">(</span>items: likes, blackList: dislikes, num: 1<span 
class="o">)</span>
+
+  <span class="c"># With a real application you would want to do some</span>
+  <span class="c"># better sanity checking of the response here!</span>
+
+  <span class="c"># Get ID of response.</span>
+  id <span class="o">=</span> response[<span 
class="s1">'itemScores'</span><span class="o">][</span>0][<span 
class="s1">'item'</span><span class="o">]</span>
+
+  <span class="c"># Find episode in database.</span>
+  @episode <span class="o">=</span> Episode.where<span 
class="o">(</span>episode_id: id<span class="o">)</span>.take
+
+  render json: @episode
+end
+</pre></td></tr></tbody></table> </div> <p>On the first line, we create a client for the 
Apache PredictionIO engine server. You will need to set the 
<code>PIO_ENGINE_URL</code> environment variable. This can be done in the <code>.env</code> file. 
The default URL is: <a 
href="http://localhost:8000">http://localhost:8000</a>.</p><p>Next, we decode 
the JSON sent from the browser.</p><p>After that, we check whether the user 
has liked anything yet. If not, we just return a random episode.</p><p>If the 
user has likes, we send them to the Apache PredictionIO engine server as a 
query.</p><p>We also blacklist the dislikes so that they are not 
returned.</p><p>With the response from Apache PredictionIO, it’s just a matter 
of looking up the returned episode ID in the database and rendering that episode as JSON.</p><p>Once 
the response is sent to the browser, JavaScript replaces the existing 
comic and hides the loading message.</p><p>That’s it. You’re done! If Ruby is 
not your language of choice, check out our other <a 
href="http://predictionio.apache.org/sdk/">SDKs</a>, and remember that you can always interact with the Event 
Server through its native JSON API.</p><h2 id='links' 
class='header-anchors'>Links</h2><p>Source code is on GitHub at: <a 
href="https://github.com/PredictionIO/Demo-Tapster">github.com/PredictionIO/Demo-Tapster</a></p><h2
 id='conclusion' class='header-anchors'>Conclusion</h2><p>Love this tutorial 
and Apache PredictionIO? Both are open source (Apache 2 License). <a 
href="https://github.com/PredictionIO/Demo-Tapster">Fork</a> this demo and 
build upon it. If you produce something cool, shoot us an email and we will link 
to it from here.</p><p>Found a typo? Think something should be explained 
better? This tutorial (and all our other documentation) lives in the main repo 
<a 
href="https://github.com/apache/predictionio/blob/livedoc/docs/manual/source/demo/tapster.html.md">here</a>.
 Our documentation is in the <code>livedoc</code> branch. Find out how to 
contribute documentation at <a href="http://predictionio.apache.org/community/contribute-documentation/">http://predictionio.apache.org/community/contribute-documentation/</a>.</p><p>We
 &hearts; pull requests!</p></div></div></div></div><footer><div 
class="container"><div class="seperator"></div><div class="row"><div 
class="col-md-6 footer-link-column"><div 
class="footer-link-column-row"><h4>Community</h4><ul><li><a 
href="//predictionio.apache.org/install/" 
target="blank">Download</a></li><li><a href="//predictionio.apache.org/" 
target="blank">Docs</a></li><li><a href="//github.com/apache/predictionio" 
target="blank">GitHub</a></li><li><a 
href="mailto:user-subscr...@predictionio.apache.org" target="blank">Subscribe 
to User Mailing List</a></li><li><a 
href="//stackoverflow.com/questions/tagged/predictionio" 
target="blank">Stackoverflow</a></li></ul></div></div><div class="col-md-6 
footer-link-column"><div 
class="footer-link-column-row"><h4>Contribute</h4><ul><li><a 
href="//predictionio.apache.org/community/contribute-code/" 
target="blank">Contribute</a></li><li><a href="//github.com/apache/predictionio" 
target="blank">Source Code</a></li><li><a 
href="//issues.apache.org/jira/browse/PIO" target="blank">Bug 
Tracker</a></li><li><a href="mailto:dev-subscr...@predictionio.apache.org" 
target="blank">Subscribe to Development Mailing 
List</a></li></ul></div></div></div><div class="row"><div class="col-md-12 
footer-link-column"><p>Apache PredictionIO, PredictionIO, Apache, the Apache 
feather logo, and the Apache PredictionIO project logo are either registered 
trademarks or trademarks of The Apache Software Foundation in the United States 
and other countries.</p><p>All other marks mentioned may be trademarks or 
registered trademarks of their respective owners.</p></div></div></div><div 
id="footer-bottom"><div class="container"><div class="row"><div 
class="col-md-12"><div id="footer-logo-wrapper"><img alt="PredictionIO" 
src="/images/logos/logo-white-d1e9c6e6.png"/><span>®</span></div><div 
id="social-icons-wrapper"><a class="github-button" href="https://github.com/apache/predictionio" data-icon="octicon-star" 
data-show-count="true" aria-label="Star apache/predictionio on GitHub">Star</a> 
<a class="github-button" href="https://github.com/apache/predictionio/fork" 
data-icon="octicon-repo-forked" data-show-count="true" aria-label="Fork 
apache/predictionio on GitHub">Fork</a> <script id="github-bjs" async="" 
defer="" src="https://buttons.github.io/buttons.js"></script><a 
href="https://twitter.com/predictionio" target="blank"><img alt="PredictionIO 
on Twitter" src="/images/icons/twitter-ea9dc152.png"/></a> <a 
href="https://www.facebook.com/predictionio" target="blank"><img 
alt="PredictionIO on Facebook" src="/images/icons/facebook-5c57939c.png"/></a> 
</div></div></div></div></div></footer></div><script>(function(w,d,t,u,n,s,e){w['SwiftypeObject']=n;w[n]=w[n]||function(){
+(w[n].q=w[n].q||[]).push(arguments);};s=d.createElement(t);
+e=d.getElementsByTagName(t)[0];s.async=1;s.src=u;e.parentNode.insertBefore(s,e);
+})(window,document,'script','//s.swiftypecdn.com/install/v1/st.js','_st');
+
+_st('install','HaUfpXXV87xoB_zzCQ45');</script><script 
src="/javascripts/application-d943a254.js"></script></body></html>
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/predictionio-site/blob/765e178c/batchpredict/index.html
----------------------------------------------------------------------
diff --git a/batchpredict/index.html b/batchpredict/index.html
index c8e4073..fd62add 100644
--- a/batchpredict/index.html
+++ b/batchpredict/index.html
@@ -1,4 +1,4 @@
-<!DOCTYPE html><html><head><title>Batch Predictions</title><meta 
charset="utf-8"/><meta content="IE=edge,chrome=1" 
http-equiv="X-UA-Compatible"/><meta name="viewport" 
content="width=device-width, initial-scale=1.0"/><meta class="swiftype" 
name="title" data-type="string" content="Batch Predictions"/><link 
rel="canonical" href="https://predictionio.apache.org/batchpredict/"/><link 
href="/images/favicon/normal-b330020a.png" rel="shortcut icon"/><link 
href="/images/favicon/apple-c0febcf2.png" rel="apple-touch-icon"/><link 
href="//fonts.googleapis.com/css?family=Open+Sans:300italic,400italic,600italic,700italic,800italic,400,300,600,700,800"
 rel="stylesheet"/><link 
href="//maxcdn.bootstrapcdn.com/font-awesome/4.2.0/css/font-awesome.min.css" 
rel="stylesheet"/><link href="/stylesheets/application-eccfc6cb.css" 
rel="stylesheet" type="text/css"/><script 
src="//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.2/html5shiv.min.js"></script><script
 src="//cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script><script 
src="//use.typekit.net/pqo0itb.js"></script><script>try{Typekit.load({ async: 
true });}catch(e){}</script></head><body><div id="global"><header><div 
class="container" id="header-wrapper"><div class="row"><div 
class="col-sm-12"><div id="logo-wrapper"><span id="drawer-toggle"></span><a 
href="#"></a><a href="http://predictionio.apache.org/"><img alt="Apache 
PredictionIO" id="logo" 
src="/images/logos/logo-ee2b9bb3.png"/></a><span>®</span></div><div 
id="menu-wrapper"><div id="pill-wrapper"><a class="pill left" 
href="/gallery/template-gallery">TEMPLATES</a> <a class="pill right" 
href="//github.com/apache/predictionio/">OPEN SOURCE</a></div></div><img 
class="mobile-search-bar-toggler hidden-md hidden-lg" 
src="/images/icons/search-glass-704bd4ff.png"/></div></div></div></header><div 
id="search-bar-row-wrapper"><div class="container-fluid" 
id="search-bar-row"><div class="row"><div class="col-md-9 col-sm-11 
col-xs-11"><div class="hidden-md hidden-lg" id="mobile-page-heading-wrapper"><p>PredictionIO Docs</p><h4>Batch 
Predictions</h4></div><h4 class="hidden-sm hidden-xs">PredictionIO 
Docs</h4></div><div class="col-md-3 col-sm-1 col-xs-1 hidden-md hidden-lg"><img 
id="left-menu-indicator" 
src="/images/icons/down-arrow-dfe9f7fe.png"/></div><div class="col-md-3 
col-sm-12 col-xs-12 swiftype-wrapper"><div class="swiftype"><form 
class="search-form"><img class="search-box-toggler hidden-xs hidden-sm" 
src="/images/icons/search-glass-704bd4ff.png"/><div class="search-box"><img 
src="/images/icons/search-glass-704bd4ff.png"/><input type="text" 
id="st-search-input" class="st-search-input" placeholder="Search 
Doc..."/></div><img class="swiftype-row-hider hidden-md hidden-lg" 
src="/images/icons/drawer-toggle-active-fcbef12a.png"/></form></div></div><div 
class="mobile-left-menu-toggler hidden-md 
hidden-lg"></div></div></div></div><div id="page" class="container-fluid"><div 
class="row"><div id="left-menu-wrapper" class="col-md-3"><nav id="nav-main"><ul><li class="level-1"><a class="expandible" href="/"><span>Apache 
PredictionIO® Documentation</span></a><ul><li class="level-2"><a class="final" 
href="/"><span>Welcome to Apache PredictionIO®</span></a></li></ul></li><li 
class="level-1"><a class="expandible" href="#"><span>Getting 
Started</span></a><ul><li class="level-2"><a class="final" 
href="/start/"><span>A Quick Intro</span></a></li><li class="level-2"><a 
class="final" href="/install/"><span>Installing Apache 
PredictionIO</span></a></li><li class="level-2"><a class="final" 
href="/start/download/"><span>Downloading an Engine Template</span></a></li><li 
class="level-2"><a class="final" href="/start/deploy/"><span>Deploying Your 
First Engine</span></a></li><li class="level-2"><a class="final" 
href="/start/customize/"><span>Customizing the 
Engine</span></a></li></ul></li><li class="level-1"><a class="expandible" 
href="#"><span>Integrating with Your App</span></a><ul><li class="level-2"><a 
class="final" href="/appintegration/"><span>App Integration Overview</span></a></li><li class="level-2"><a 
class="expandible" href="/sdk/"><span>List of SDKs</span></a><ul><li 
class="level-3"><a class="final" href="/sdk/java/"><span>Java & Android 
SDK</span></a></li><li class="level-3"><a class="final" 
href="/sdk/php/"><span>PHP SDK</span></a></li><li class="level-3"><a 
class="final" href="/sdk/python/"><span>Python SDK</span></a></li><li 
class="level-3"><a class="final" href="/sdk/ruby/"><span>Ruby 
SDK</span></a></li><li class="level-3"><a class="final" 
href="/sdk/community/"><span>Community Powered 
SDKs</span></a></li></ul></li></ul></li><li class="level-1"><a 
class="expandible" href="#"><span>Deploying an Engine</span></a><ul><li 
class="level-2"><a class="final" href="/deploy/"><span>Deploying as a Web 
Service</span></a></li><li class="level-2"><a class="final active" 
href="/batchpredict/"><span>Batch Predictions</span></a></li><li 
class="level-2"><a class="final" href="/deploy/monitoring/"><span>Monitoring 
Engine</span></a></li><li class="level-2"><a class="final" 
href="/deploy/engineparams/"><span>Setting Engine Parameters</span></a></li><li 
class="level-2"><a class="final" href="/deploy/enginevariants/"><span>Deploying 
Multiple Engine Variants</span></a></li><li class="level-2"><a class="final" 
href="/deploy/plugin/"><span>Engine Server Plugin</span></a></li></ul></li><li 
class="level-1"><a class="expandible" href="#"><span>Customizing an 
Engine</span></a><ul><li class="level-2"><a class="final" 
href="/customize/"><span>Learning DASE</span></a></li><li class="level-2"><a 
class="final" href="/customize/dase/"><span>Implement DASE</span></a></li><li 
class="level-2"><a class="final" 
href="/customize/troubleshooting/"><span>Troubleshooting Engine 
Development</span></a></li><li class="level-2"><a class="final" 
href="/api/current/#package"><span>Engine Scala 
APIs</span></a></li></ul></li><li class="level-1"><a class="expandible" 
href="#"><span>Collecting and Analyzing Data</span></a><ul><li class="level-2"><a class="final" href="/datacollection/"><span>Event Server 
Overview</span></a></li><li class="level-2"><a class="final" 
href="/datacollection/eventapi/"><span>Collecting Data with 
REST/SDKs</span></a></li><li class="level-2"><a class="final" 
href="/datacollection/eventmodel/"><span>Events Modeling</span></a></li><li 
class="level-2"><a class="final" 
href="/datacollection/webhooks/"><span>Unifying Multichannel Data with 
Webhooks</span></a></li><li class="level-2"><a class="final" 
href="/datacollection/channel/"><span>Channel</span></a></li><li 
class="level-2"><a class="final" 
href="/datacollection/batchimport/"><span>Importing Data in 
Batch</span></a></li><li class="level-2"><a class="final" 
href="/datacollection/analytics/"><span>Using Analytics 
Tools</span></a></li><li class="level-2"><a class="final" 
href="/datacollection/plugin/"><span>Event Server 
Plugin</span></a></li></ul></li><li class="level-1"><a class="expandible" 
href="#"><span>Choosing an Algorithm</span></a><ul><li class="level-2"><a class="final" href="/algorithm/"><span>Built-in 
Algorithm Libraries</span></a></li><li class="level-2"><a class="final" 
href="/algorithm/switch/"><span>Switching to Another 
Algorithm</span></a></li><li class="level-2"><a class="final" 
href="/algorithm/multiple/"><span>Combining Multiple 
Algorithms</span></a></li><li class="level-2"><a class="final" 
href="/algorithm/custom/"><span>Adding Your Own 
Algorithms</span></a></li></ul></li><li class="level-1"><a class="expandible" 
href="#"><span>Tuning and Evaluation</span></a><ul><li class="level-2"><a 
class="final" href="/evaluation/"><span>Overview</span></a></li><li 
class="level-2"><a class="final" 
href="/evaluation/paramtuning/"><span>Hyperparameter Tuning</span></a></li><li 
class="level-2"><a class="final" 
href="/evaluation/evaluationdashboard/"><span>Evaluation 
Dashboard</span></a></li><li class="level-2"><a class="final" 
href="/evaluation/metricchoose/"><span>Choosing Evaluation 
Metrics</span></a></li><li class="level-2"><a class="final" 
href="/evaluation/metricbuild/"><span>Building Evaluation 
Metrics</span></a></li></ul></li><li class="level-1"><a class="expandible" 
href="#"><span>System Architecture</span></a><ul><li class="level-2"><a 
class="final" href="/system/"><span>Architecture Overview</span></a></li><li 
class="level-2"><a class="final" href="/system/anotherdatastore/"><span>Using 
Another Data Store</span></a></li></ul></li><li class="level-1"><a 
class="expandible" href="#"><span>PredictionIO® Official 
Templates</span></a><ul><li class="level-2"><a class="final" 
href="/templates/"><span>Intro</span></a></li><li class="level-2"><a 
class="expandible" href="#"><span>Recommendation</span></a><ul><li 
class="level-3"><a class="final" 
href="/templates/recommendation/quickstart/"><span>Quick 
Start</span></a></li><li class="level-3"><a class="final" 
href="/templates/recommendation/dase/"><span>DASE</span></a></li><li 
class="level-3"><a class="final" href="/templates/recommendation/evaluation/"><span>Evaluation Explained</span></a></li><li 
class="level-3"><a class="final" 
href="/templates/recommendation/how-to/"><span>How-To</span></a></li><li 
class="level-3"><a class="final" 
href="/templates/recommendation/reading-custom-events/"><span>Read Custom 
Events</span></a></li><li class="level-3"><a class="final" 
href="/templates/recommendation/customize-data-prep/"><span>Customize Data 
Preparator</span></a></li><li class="level-3"><a class="final" 
href="/templates/recommendation/customize-serving/"><span>Customize 
Serving</span></a></li><li class="level-3"><a class="final" 
href="/templates/recommendation/training-with-implicit-preference/"><span>Train 
with Implicit Preference</span></a></li><li class="level-3"><a class="final" 
href="/templates/recommendation/blacklist-items/"><span>Filter Recommended 
Items by Blacklist in Query</span></a></li><li class="level-3"><a class="final" 
href="/templates/recommendation/batch-evaluator/"><span>Batch Persistable 
Evaluator</span></a></li></ul></li><li class="level-2"><a class="expandible" 
href="#"><span>E-Commerce Recommendation</span></a><ul><li class="level-3"><a 
class="final" href="/templates/ecommercerecommendation/quickstart/"><span>Quick 
Start</span></a></li><li class="level-3"><a class="final" 
href="/templates/ecommercerecommendation/dase/"><span>DASE</span></a></li><li 
class="level-3"><a class="final" 
href="/templates/ecommercerecommendation/how-to/"><span>How-To</span></a></li><li
 class="level-3"><a class="final" 
href="/templates/ecommercerecommendation/train-with-rate-event/"><span>Train 
with Rate Event</span></a></li><li class="level-3"><a class="final" 
href="/templates/ecommercerecommendation/adjust-score/"><span>Adjust 
Score</span></a></li></ul></li><li class="level-2"><a class="expandible" 
href="#"><span>Similar Product</span></a><ul><li class="level-3"><a 
class="final" href="/templates/similarproduct/quickstart/"><span>Quick 
Start</span></a></li><li class="level-3"><a class="final" href="/templates/similarproduct/dase/"><span>DASE</span></a></li><li 
class="level-3"><a class="final" 
href="/templates/similarproduct/how-to/"><span>How-To</span></a></li><li 
class="level-3"><a class="final" 
href="/templates/similarproduct/multi-events-multi-algos/"><span>Multiple 
Events and Multiple Algorithms</span></a></li><li class="level-3"><a 
class="final" 
href="/templates/similarproduct/return-item-properties/"><span>Returns Item 
Properties</span></a></li><li class="level-3"><a class="final" 
href="/templates/similarproduct/train-with-rate-event/"><span>Train with Rate 
Event</span></a></li><li class="level-3"><a class="final" 
href="/templates/similarproduct/rid-user-set-event/"><span>Get Rid of Events 
for Users</span></a></li><li class="level-3"><a class="final" 
href="/templates/similarproduct/recommended-user/"><span>Recommend 
Users</span></a></li></ul></li><li class="level-2"><a class="expandible" 
href="#"><span>Classification</span></a><ul><li class="level-3"><a 
class="final" href="/templates/classification/quickstart/"><span>Quick Start</span></a></li><li 
class="level-3"><a class="final" 
href="/templates/classification/dase/"><span>DASE</span></a></li><li 
class="level-3"><a class="final" 
href="/templates/classification/how-to/"><span>How-To</span></a></li><li 
class="level-3"><a class="final" 
href="/templates/classification/add-algorithm/"><span>Use Alternative 
Algorithm</span></a></li><li class="level-3"><a class="final" 
href="/templates/classification/reading-custom-properties/"><span>Read Custom 
Properties</span></a></li></ul></li></ul></li><li class="level-1"><a 
class="expandible" href="#"><span>Engine Template Gallery</span></a><ul><li 
class="level-2"><a class="final" 
href="/gallery/template-gallery/"><span>Browse</span></a></li><li 
class="level-2"><a class="final" 
href="/community/submit-template/"><span>Submit your Engine as a 
Template</span></a></li></ul></li><li class="level-1"><a class="expandible" 
href="#"><span>Demo Tutorials</span></a><ul><li class="level-2"><a class="final" href="/demo/tapster/"><span>Comics 
Recommendation Demo</span></a></li><li class="level-2"><a class="final" 
href="/demo/community/"><span>Community Contributed Demo</span></a></li><li 
class="level-2"><a class="final" href="/demo/textclassification/"><span>Text 
Classification Engine Tutorial</span></a></li></ul></li><li class="level-1"><a 
class="expandible" href="/community/"><span>Getting Involved</span></a><ul><li 
class="level-2"><a class="final" 
href="/community/contribute-code/"><span>Contribute Code</span></a></li><li 
class="level-2"><a class="final" 
href="/community/contribute-documentation/"><span>Contribute 
Documentation</span></a></li><li class="level-2"><a class="final" 
href="/community/contribute-sdk/"><span>Contribute a SDK</span></a></li><li 
class="level-2"><a class="final" 
href="/community/contribute-webhook/"><span>Contribute a 
Webhook</span></a></li><li class="level-2"><a class="final" 
href="/community/projects/"><span>Community Projects</span></a></li></ul></li><li class="level-1"><a class="expandible" 
href="#"><span>Getting Help</span></a><ul><li class="level-2"><a class="final" 
href="/resources/faq/"><span>FAQs</span></a></li><li class="level-2"><a 
class="final" href="/support/"><span>Support</span></a></li></ul></li><li 
class="level-1"><a class="expandible" 
href="#"><span>Resources</span></a><ul><li class="level-2"><a class="final" 
href="/cli/"><span>Command-line Interface</span></a></li><li class="level-2"><a 
class="final" href="/resources/release/"><span>Release 
Cadence</span></a></li><li class="level-2"><a class="final" 
href="/resources/intellij/"><span>Developing Engines with IntelliJ 
IDEA</span></a></li><li class="level-2"><a class="final" 
href="/resources/upgrade/"><span>Upgrade Instructions</span></a></li><li 
class="level-2"><a class="final" 
href="/resources/glossary/"><span>Glossary</span></a></li></ul></li><li 
class="level-1"><a class="expandible" href="#"><span>Apache Software 
Foundation</span></a><ul><li class="level-2"><a class="final" 
href="https://www.apache.org/"><span>Apache Homepage</span></a></li><li 
class="level-2"><a class="final" 
href="https://www.apache.org/licenses/"><span>License</span></a></li><li 
class="level-2"><a class="final" 
href="https://www.apache.org/foundation/sponsorship.html"><span>Sponsorship</span></a></li><li
 class="level-2"><a class="final" 
href="https://www.apache.org/foundation/thanks.html"><span>Thanks</span></a></li><li
 class="level-2"><a class="final" 
href="https://www.apache.org/security/"><span>Security</span></a></li></ul></li></ul></nav></div><div
 class="col-md-9 col-sm-12"><div class="content-header hidden-md 
hidden-lg"><div id="breadcrumbs" class="hidden-sm hidden xs"><ul><li><a 
href="#">Deploying an Engine</a><span 
class="spacer">&gt;</span></li><li><span class="last">Batch 
Predictions</span></li></ul></div><div id="page-title"><h1>Batch 
Predictions</h1></div></div><div id="table-of-content-wrapper"><h5>On this 
page</h5><aside id="table-of-contents"><ul> <li> <a href="#overview">Overview</a> </li> <li> <a 
href="#compatibility">Compatibility</a> </li> <li> <a href="#usage">Usage</a> 
</li> <li> <a href="#example">Example</a> </li> </ul> </aside><hr/><a 
id="edit-page-link" 
href="https://github.com/apache/predictionio/tree/livedoc/docs/manual/source/batchpredict/index.html.md"><img
 src="/images/icons/edit-pencil-d6c1bb3d.png"/>Edit this page</a></div><div 
class="content-header hidden-sm hidden-xs"><div id="breadcrumbs" 
class="hidden-sm hidden xs"><ul><li><a href="#">Deploying an Engine</a><span 
class="spacer">&gt;</span></li><li><span class="last">Batch 
Predictions</span></li></ul></div><div id="page-title"><h1>Batch 
Predictions</h1></div></div><div class="content"> <h2 id='overview' 
class='header-anchors'>Overview</h2><p>Process predictions for many queries 
using efficient parallelization through Spark. Useful for mass auditing of 
predictions and for generating predictions to push into other 
systems.</p><p>Batch predict reads and writes multi-object JSON files similar to the <a 
href="/datacollection/batchimport/">batch import</a> format. JSON objects are 
separated by newlines and cannot themselves contain unencoded newlines.</p><h2 
id='compatibility' class='header-anchors'>Compatibility</h2><p><code>pio 
batchpredict</code> loads the engine and processes queries exactly like 
<code>pio deploy</code>. There is only one additional requirement for engines 
to utilize batch predict:</p><div class="alert-message warning"><p>All 
algorithm classes used in the engine must be <a 
href="https://www.scala-lang.org/api/2.11.8/index.html#scala.Serializable">serializable</a>.
 <strong>This is already true for PredictionIO&#39;s base algorithm 
classes</strong>, but may be broken by including non-serializable fields in 
their constructor. Using the <a 
href="http://fdahms.com/2015/10/14/scala-and-the-transient-lazy-val-pattern/"><code>@transient</code>
 annotation</a> may help in these cases.</p></div><p>This requirement is due to processing the input queries as a <a 
href="https://spark.apache.org/docs/latest/rdd-programming-guide.html#resilient-distributed-datasets-rdds">Spark
 RDD</a> which enables high-performance parallelization, even on a single 
machine.</p><h2 id='usage' class='header-anchors'>Usage</h2><h3 
id='<code>pio-batchpredict</code>' class='header-anchors' ><code>pio 
batchpredict</code></h3><p>Command to process bulk predictions. Takes the same 
options as <code>pio deploy</code> plus:</p><h3 
id='<code>--input-&lt;value&gt;</code>' class='header-anchors' ><code>--input 
&lt;value&gt;</code></h3><p>Path to file containing queries; a multi-object 
JSON file with one query object per line. Accepts any valid Hadoop file 
URL.</p><p>Default: <code>batchpredict-input.json</code></p><h3 
id='<code>--output-&lt;value&gt;</code>' class='header-anchors' ><code>--output 
&lt;value&gt;</code></h3><p>Path to file to receive results; a multi-object 
JSON file with one object per line, each containing the prediction plus the original query. Accepts any valid Hadoop file URL. Actual output will be written 
as Hadoop partition files in a directory with the output name.</p><p>Default: 
<code>batchpredict-output.json</code></p><h3 
id='<code>--query-partitions-&lt;value&gt;</code>' class='header-anchors' 
><code>--query-partitions &lt;value&gt;</code></h3><p>Configure the concurrency 
of predictions by setting the number of partitions used internally for the RDD 
of queries. This will directly affect the number of resulting 
<code>part-*</code> output files. While setting to <code>1</code> may seem 
appealing to get a single output file, this will remove parallelization for the 
batch process, reducing performance and possibly exhausting 
memory.</p><p>Default: number created by Spark context&#39;s 
<code>textFile</code> (probably the number of cores available on the local 
machine)</p><h3 id='<code>--engine-instance-id-&lt;value&gt;</code>' 
class='header-anchors' ><code>--engine-instance-id &lt;value&gt;</code></h3><p>Identifier for the trained instance to use for batch predict.</p><p>Default: 
the latest trained instance.</p><h2 id='example' 
class='header-anchors'>Example</h2><h3 id='input' 
class='header-anchors'>Input</h3><p>A multi-object JSON file of queries as they 
would be sent to the engine&#39;s HTTP Queries API.</p><div 
class="alert-message note"><p>Read via <a 
href="https://spark.apache.org/docs/latest/rdd-programming-guide.html#external-datasets">SparkContext&#39;s
 <code>textFile</code></a> and so may be a single file or any supported Hadoop 
format.</p></div><p>File: <code>batchpredict-input.json</code></p><div 
class="highlight json"><table style="border-spacing: 0"><tbody><tr><td 
class="gutter gl" style="text-align: right"><pre class="lineno">1
+<!DOCTYPE html><html><head><title>Batch Predictions</title><meta 
charset="utf-8"/><meta content="IE=edge,chrome=1" 
http-equiv="X-UA-Compatible"/><meta name="viewport" 
content="width=device-width, initial-scale=1.0"/><meta class="swiftype" 
name="title" data-type="string" content="Batch Predictions"/><link 
rel="canonical" href="https://predictionio.apache.org/batchpredict/"/><link 
href="/images/favicon/normal-b330020a.png" rel="shortcut icon"/><link 
href="/images/favicon/apple-c0febcf2.png" rel="apple-touch-icon"/><link 
href="//fonts.googleapis.com/css?family=Open+Sans:300italic,400italic,600italic,700italic,800italic,400,300,600,700,800"
 rel="stylesheet"/><link 
href="//maxcdn.bootstrapcdn.com/font-awesome/4.2.0/css/font-awesome.min.css" 
rel="stylesheet"/><link href="/stylesheets/application-eccfc6cb.css" 
rel="stylesheet" type="text/css"/><script 
src="//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.2/html5shiv.min.js"></script><script
 src="//cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script><script 
src="//use.typekit.net/pqo0itb.js"></script><script>try{Typekit.load({ async: 
true });}catch(e){}</script></head><body><div id="global"><header><div 
class="container" id="header-wrapper"><div class="row"><div 
class="col-sm-12"><div id="logo-wrapper"><span id="drawer-toggle"></span><a 
href="#"></a><a href="http://predictionio.apache.org/"><img alt="Apache 
PredictionIO" id="logo" 
src="/images/logos/logo-ee2b9bb3.png"/></a><span>®</span></div><div 
id="menu-wrapper"><div id="pill-wrapper"><a class="pill left" 
href="/gallery/template-gallery">TEMPLATES</a> <a class="pill right" 
href="//github.com/apache/predictionio/">OPEN SOURCE</a></div></div><img 
class="mobile-search-bar-toggler hidden-md hidden-lg" 
src="/images/icons/search-glass-704bd4ff.png"/></div></div></div></header><div 
id="search-bar-row-wrapper"><div class="container-fluid" 
id="search-bar-row"><div class="row"><div class="col-md-9 col-sm-11 
col-xs-11"><div class="hidden-md hidden-lg" 
id="mobile-page-heading-wrapper"><p>PredictionIO Docs</p><h4>Batch 
Predictions</h4></div><h4 class="hidden-sm hidden-xs">PredictionIO 
Docs</h4></div><div class="col-md-3 col-sm-1 col-xs-1 hidden-md hidden-lg"><img 
id="left-menu-indicator" 
src="/images/icons/down-arrow-dfe9f7fe.png"/></div><div class="col-md-3 
col-sm-12 col-xs-12 swiftype-wrapper"><div class="swiftype"><form 
class="search-form"><img class="search-box-toggler hidden-xs hidden-sm" 
src="/images/icons/search-glass-704bd4ff.png"/><div class="search-box"><img 
src="/images/icons/search-glass-704bd4ff.png"/><input type="text" 
id="st-search-input" class="st-search-input" placeholder="Search 
Doc..."/></div><img class="swiftype-row-hider hidden-md hidden-lg" 
src="/images/icons/drawer-toggle-active-fcbef12a.png"/></form></div></div><div 
class="mobile-left-menu-toggler hidden-md 
hidden-lg"></div></div></div></div><div id="page" class="container-fluid"><div 
class="row"><div id="left-menu-wrapper" class="col-md-3"><nav id="nav-main"><ul><li 
class="level-1"><a class="expandible" href="/"><span>Apache 
PredictionIO® Documentation</span></a><ul><li class="level-2"><a class="final" 
href="/"><span>Welcome to Apache PredictionIO®</span></a></li></ul></li><li 
class="level-1"><a class="expandible" href="#"><span>Getting 
Started</span></a><ul><li class="level-2"><a class="final" 
href="/start/"><span>A Quick Intro</span></a></li><li class="level-2"><a 
class="final" href="/install/"><span>Installing Apache 
PredictionIO</span></a></li><li class="level-2"><a class="final" 
href="/start/download/"><span>Downloading an Engine Template</span></a></li><li 
class="level-2"><a class="final" href="/start/deploy/"><span>Deploying Your 
First Engine</span></a></li><li class="level-2"><a class="final" 
href="/start/customize/"><span>Customizing the 
Engine</span></a></li></ul></li><li class="level-1"><a class="expandible" 
href="#"><span>Integrating with Your App</span></a><ul><li class="level-2"><a 
class="final" href="/appintegration/"><span>App Integration 
Overview</span></a></li><li class="level-2"><a 
class="expandible" href="/sdk/"><span>List of SDKs</span></a><ul><li 
class="level-3"><a class="final" href="/sdk/java/"><span>Java & Android 
SDK</span></a></li><li class="level-3"><a class="final" 
href="/sdk/php/"><span>PHP SDK</span></a></li><li class="level-3"><a 
class="final" href="/sdk/python/"><span>Python SDK</span></a></li><li 
class="level-3"><a class="final" href="/sdk/ruby/"><span>Ruby 
SDK</span></a></li><li class="level-3"><a class="final" 
href="/community/projects/#sdks"><span>Community Powered 
SDKs</span></a></li></ul></li></ul></li><li class="level-1"><a 
class="expandible" href="#"><span>Deploying an Engine</span></a><ul><li 
class="level-2"><a class="final" href="/deploy/"><span>Deploying as a Web 
Service</span></a></li><li class="level-2"><a class="final active" 
href="/batchpredict/"><span>Batch Predictions</span></a></li><li 
class="level-2"><a class="final" href="/deploy/monitoring/"><span>Monitoring 
Engine</span></a></li><li class="level-2"><a class="final" 
href="/deploy/engineparams/"><span>Setting Engine Parameters</span></a></li><li 
class="level-2"><a class="final" href="/deploy/enginevariants/"><span>Deploying 
Multiple Engine Variants</span></a></li><li class="level-2"><a class="final" 
href="/deploy/plugin/"><span>Engine Server Plugin</span></a></li></ul></li><li 
class="level-1"><a class="expandible" href="#"><span>Customizing an 
Engine</span></a><ul><li class="level-2"><a class="final" 
href="/customize/"><span>Learning DASE</span></a></li><li class="level-2"><a 
class="final" href="/customize/dase/"><span>Implement DASE</span></a></li><li 
class="level-2"><a class="final" 
href="/customize/troubleshooting/"><span>Troubleshooting Engine 
Development</span></a></li><li class="level-2"><a class="final" 
href="/api/current/#package"><span>Engine Scala 
APIs</span></a></li></ul></li><li class="level-1"><a class="expandible" 
href="#"><span>Collecting and Analyzing Data</span></a><ul><li 
class="level-2"><a class="final" href="/datacollection/"><span>Event 
Server Overview</span></a></li><li class="level-2"><a class="final" 
href="/datacollection/eventapi/"><span>Collecting Data with 
REST/SDKs</span></a></li><li class="level-2"><a class="final" 
href="/datacollection/eventmodel/"><span>Events Modeling</span></a></li><li 
class="level-2"><a class="final" 
href="/datacollection/webhooks/"><span>Unifying Multichannel Data with 
Webhooks</span></a></li><li class="level-2"><a class="final" 
href="/datacollection/channel/"><span>Channel</span></a></li><li 
class="level-2"><a class="final" 
href="/datacollection/batchimport/"><span>Importing Data in 
Batch</span></a></li><li class="level-2"><a class="final" 
href="/datacollection/analytics/"><span>Using Analytics 
Tools</span></a></li><li class="level-2"><a class="final" 
href="/datacollection/plugin/"><span>Event Server 
Plugin</span></a></li></ul></li><li class="level-1"><a class="expandible" 
href="#"><span>Choosing an Algorithm</span></a><ul><li class="level-2"><a class="final" 
href="/algorithm/"><span>Built-in Algorithm Libraries</span></a></li><li 
class="level-2"><a class="final" href="/algorithm/switch/"><span>Switching to 
Another Algorithm</span></a></li><li class="level-2"><a class="final" 
href="/algorithm/multiple/"><span>Combining Multiple 
Algorithms</span></a></li><li class="level-2"><a class="final" 
href="/algorithm/custom/"><span>Adding Your Own 
Algorithms</span></a></li></ul></li><li class="level-1"><a class="expandible" 
href="#"><span>Tuning and Evaluation</span></a><ul><li class="level-2"><a 
class="final" href="/evaluation/"><span>Overview</span></a></li><li 
class="level-2"><a class="final" 
href="/evaluation/paramtuning/"><span>Hyperparameter Tuning</span></a></li><li 
class="level-2"><a class="final" 
href="/evaluation/evaluationdashboard/"><span>Evaluation 
Dashboard</span></a></li><li class="level-2"><a class="final" 
href="/evaluation/metricchoose/"><span>Choosing Evaluation 
Metrics</span></a></li><li class="level-2"><a class="final" 
href="/evaluation/metricbuild/"><span>Building Evaluation 
Metrics</span></a></li></ul></li><li class="level-1"><a class="expandible" 
href="#"><span>System Architecture</span></a><ul><li class="level-2"><a 
class="final" href="/system/"><span>Architecture Overview</span></a></li><li 
class="level-2"><a class="final" href="/system/anotherdatastore/"><span>Using 
Another Data Store</span></a></li></ul></li><li class="level-1"><a 
class="expandible" href="#"><span>PredictionIO® Official 
Templates</span></a><ul><li class="level-2"><a class="final" 
href="/templates/"><span>Intro</span></a></li><li class="level-2"><a 
class="expandible" href="#"><span>Recommendation</span></a><ul><li 
class="level-3"><a class="final" 
href="/templates/recommendation/quickstart/"><span>Quick 
Start</span></a></li><li class="level-3"><a class="final" 
href="/templates/recommendation/dase/"><span>DASE</span></a></li><li 
class="level-3"><a class="final" 
href="/templates/recommendation/evaluation/"><span>Evaluation Explained</span></a></li><li 
class="level-3"><a class="final" 
href="/templates/recommendation/how-to/"><span>How-To</span></a></li><li 
class="level-3"><a class="final" 
href="/templates/recommendation/reading-custom-events/"><span>Read Custom 
Events</span></a></li><li class="level-3"><a class="final" 
href="/templates/recommendation/customize-data-prep/"><span>Customize Data 
Preparator</span></a></li><li class="level-3"><a class="final" 
href="/templates/recommendation/customize-serving/"><span>Customize 
Serving</span></a></li><li class="level-3"><a class="final" 
href="/templates/recommendation/training-with-implicit-preference/"><span>Train 
with Implicit Preference</span></a></li><li class="level-3"><a class="final" 
href="/templates/recommendation/blacklist-items/"><span>Filter Recommended 
Items by Blacklist in Query</span></a></li><li class="level-3"><a class="final" 
href="/templates/recommendation/batch-evaluator/"><span>Batch Persistable 
Evaluator</span></a></li></ul></li><li class="level-2"><a class="expandible" 
href="#"><span>E-Commerce Recommendation</span></a><ul><li class="level-3"><a 
class="final" href="/templates/ecommercerecommendation/quickstart/"><span>Quick 
Start</span></a></li><li class="level-3"><a class="final" 
href="/templates/ecommercerecommendation/dase/"><span>DASE</span></a></li><li 
class="level-3"><a class="final" 
href="/templates/ecommercerecommendation/how-to/"><span>How-To</span></a></li><li
 class="level-3"><a class="final" 
href="/templates/ecommercerecommendation/train-with-rate-event/"><span>Train 
with Rate Event</span></a></li><li class="level-3"><a class="final" 
href="/templates/ecommercerecommendation/adjust-score/"><span>Adjust 
Score</span></a></li></ul></li><li class="level-2"><a class="expandible" 
href="#"><span>Similar Product</span></a><ul><li class="level-3"><a 
class="final" href="/templates/similarproduct/quickstart/"><span>Quick 
Start</span></a></li><li class="level-3"><a class="final" 
href="/templates/similarproduct/dase/"><span>DASE</span></a></li><li 
class="level-3"><a class="final" 
href="/templates/similarproduct/how-to/"><span>How-To</span></a></li><li 
class="level-3"><a class="final" 
href="/templates/similarproduct/multi-events-multi-algos/"><span>Multiple 
Events and Multiple Algorithms</span></a></li><li class="level-3"><a 
class="final" 
href="/templates/similarproduct/return-item-properties/"><span>Returns Item 
Properties</span></a></li><li class="level-3"><a class="final" 
href="/templates/similarproduct/train-with-rate-event/"><span>Train with Rate 
Event</span></a></li><li class="level-3"><a class="final" 
href="/templates/similarproduct/rid-user-set-event/"><span>Get Rid of Events 
for Users</span></a></li><li class="level-3"><a class="final" 
href="/templates/similarproduct/recommended-user/"><span>Recommend 
Users</span></a></li></ul></li><li class="level-2"><a class="expandible" 
href="#"><span>Classification</span></a><ul><li class="level-3"><a 
class="final" href="/templates/classification/quickstart/"><span>Quick 
Start</span></a></li><li class="level-3"><a class="final" 
href="/templates/classification/dase/"><span>DASE</span></a></li><li 
class="level-3"><a class="final" 
href="/templates/classification/how-to/"><span>How-To</span></a></li><li 
class="level-3"><a class="final" 
href="/templates/classification/add-algorithm/"><span>Use Alternative 
Algorithm</span></a></li><li class="level-3"><a class="final" 
href="/templates/classification/reading-custom-properties/"><span>Read Custom 
Properties</span></a></li></ul></li></ul></li><li class="level-1"><a 
class="expandible" href="#"><span>Engine Template Gallery</span></a><ul><li 
class="level-2"><a class="final" 
href="/gallery/template-gallery/"><span>Browse</span></a></li><li 
class="level-2"><a class="final" 
href="/community/submit-template/"><span>Submit your Engine as a 
Template</span></a></li></ul></li><li class="level-1"><a class="expandible" 
href="#"><span>Demo Tutorials</span></a><ul><li class="level-2"><a class="final" 
href="/community/projects/#demos"><span>Community Contributed 
Demo</span></a></li><li class="level-2"><a class="final" 
href="/demo/textclassification/"><span>Text Classification Engine 
Tutorial</span></a></li></ul></li><li class="level-1"><a class="expandible" 
href="/community/"><span>Getting Involved</span></a><ul><li 
class="level-2"><a class="final" 
href="/community/contribute-code/"><span>Contribute Code</span></a></li><li 
class="level-2"><a class="final" 
href="/community/contribute-documentation/"><span>Contribute 
Documentation</span></a></li><li class="level-2"><a class="final" 
href="/community/contribute-sdk/"><span>Contribute a SDK</span></a></li><li 
class="level-2"><a class="final" 
href="/community/contribute-webhook/"><span>Contribute a 
Webhook</span></a></li><li class="level-2"><a class="final" 
href="/community/projects/"><span>Community 
Projects</span></a></li></ul></li><li class="level-1"><a class="expandible" 
href="#"><span>Getting Help</span></a><ul><li class="level-2"><a class="final" 
href="/resources/faq/"><span>FAQs</span></a></li><li class="level-2"><a 
class="final" href="/support/"><span>Support</span></a></li></ul></li><li 
class="level-1"><a class="expandible" 
href="#"><span>Resources</span></a><ul><li class="level-2"><a class="final" 
href="/cli/"><span>Command-line Interface</span></a></li><li class="level-2"><a 
class="final" href="/resources/release/"><span>Release 
Cadence</span></a></li><li class="level-2"><a class="final" 
href="/resources/intellij/"><span>Developing Engines with IntelliJ 
IDEA</span></a></li><li class="level-2"><a class="final" 
href="/resources/upgrade/"><span>Upgrade Instructions</span></a></li><li 
class="level-2"><a class="final" 
href="/resources/glossary/"><span>Glossary</span></a></li></ul></li><li 
class="level-1"><a class="expandible" href="#"><span>Apache Software 
Foundation</span></a><ul><li class="level-2"><a class="final" 
href="https://www.apache.org/"><span>Apache 
Homepage</span></a></li><li class="level-2"><a class="final" 
href="https://www.apache.org/licenses/"><span>License</span></a></li><li 
class="level-2"><a class="final" 
href="https://www.apache.org/foundation/sponsorship.html"><span>Sponsorship</span></a></li><li
 class="level-2"><a class="final" 
href="https://www.apache.org/foundation/thanks.html"><span>Thanks</span></a></li><li
 class="level-2"><a class="final" 
href="https://www.apache.org/security/"><span>Security</span></a></li></ul></li></ul></nav></div><div
 class="col-md-9 col-sm-12"><div class="content-header hidden-md 
hidden-lg"><div id="breadcrumbs" class="hidden-sm hidden xs"><ul><li><a 
href="#">Deploying an Engine</a><span class="spacer">&gt;</span></li><li><span 
class="last">Batch Predictions</span></li></ul></div><div 
id="page-title"><h1>Batch Predictions</h1></div></div><div 
id="table-of-content-wrapper"><h5>On this page</h5><aside 
id="table-of-contents"><ul> <li> <a href="#overview">Overview</a> </li> <li> <a 
href="#compatibility">Compatibility</a> </li> <li> <a href="#usage">Usage</a> </li> <li> <a 
href="#example">Example</a> </li> </ul> </aside><hr/><a id="edit-page-link" 
href="https://github.com/apache/predictionio/tree/livedoc/docs/manual/source/batchpredict/index.html.md"><img
 src="/images/icons/edit-pencil-d6c1bb3d.png"/>Edit this page</a></div><div 
class="content-header hidden-sm hidden-xs"><div id="breadcrumbs" 
class="hidden-sm hidden xs"><ul><li><a href="#">Deploying an Engine</a><span 
class="spacer">&gt;</span></li><li><span class="last">Batch 
Predictions</span></li></ul></div><div id="page-title"><h1>Batch 
Predictions</h1></div></div><div class="content"> <h2 id='overview' 
class='header-anchors'>Overview</h2><p>Process predictions for many queries 
using efficient parallelization through Spark. Useful for mass auditing of 
predictions and for generating predictions to push into other 
systems.</p><p>Batch predict reads and writes multi-object JSON files similar 
to the <a href="/datacollection/batchimport/">batch import</a> format. JSON objects are separated by newlines and 
cannot themselves contain unencoded newlines.</p><h2 id='compatibility' 
class='header-anchors'>Compatibility</h2><p><code>pio batchpredict</code> loads 
the engine and processes queries exactly like <code>pio deploy</code>. There is 
only one additional requirement for engines to utilize batch predict:</p><div 
class="alert-message warning"><p>All algorithm classes used in the engine must 
be <a 
href="https://www.scala-lang.org/api/2.11.8/index.html#scala.Serializable">serializable</a>.
 <strong>This is already true for PredictionIO&#39;s base algorithm 
classes</strong>, but may be broken by including non-serializable fields in 
their constructor. Using the <a 
href="http://fdahms.com/2015/10/14/scala-and-the-transient-lazy-val-pattern/"><code>@transient</code>
 annotation</a> may help in these cases.</p></div><p>This requirement is due to 
processing the input queries as a <a 
href="https://spark.apache.org/docs/latest/rdd-programming-guide.html#resilient-distributed-datasets-rdds">Spark 
RDD</a> which enables high-performance parallelization, even on a single 
machine.</p><h2 id='usage' class='header-anchors'>Usage</h2><h3 
id='<code>pio-batchpredict</code>' class='header-anchors' ><code>pio 
batchpredict</code></h3><p>Command to process bulk predictions. Takes the same 
options as <code>pio deploy</code> plus:</p><h3 
id='<code>--input-&lt;value&gt;</code>' class='header-anchors' ><code>--input 
&lt;value&gt;</code></h3><p>Path to file containing queries; a multi-object 
JSON file with one query object per line. Accepts any valid Hadoop file 
URL.</p><p>Default: <code>batchpredict-input.json</code></p><h3 
id='<code>--output-&lt;value&gt;</code>' class='header-anchors' ><code>--output 
&lt;value&gt;</code></h3><p>Path to file to receive results; a multi-object 
JSON file with one object per line, each containing the prediction plus the original query. Accepts 
any valid Hadoop file URL. Actual output will be written as Hadoop 
partition files in a directory with the output name.</p><p>Default: 
<code>batchpredict-output.json</code></p><h3 
id='<code>--query-partitions-&lt;value&gt;</code>' class='header-anchors' 
><code>--query-partitions &lt;value&gt;</code></h3><p>Configure the concurrency 
of predictions by setting the number of partitions used internally for the RDD 
of queries. This will directly affect the number of resulting 
<code>part-*</code> output files. While setting to <code>1</code> may seem 
appealing to get a single output file, this will remove parallelization for the 
batch process, reducing performance and possibly exhausting 
memory.</p><p>Default: number created by Spark context&#39;s 
<code>textFile</code> (probably the number of cores available on the local 
machine)</p><h3 id='<code>--engine-instance-id-&lt;value&gt;</code>' 
class='header-anchors' ><code>--engine-instance-id 
&lt;value&gt;</code></h3><p>Identifier for the trained instance to use for 
batch predict.</p><p>Default: the latest 
 trained instance.</p><h2 id='example' class='header-anchors'>Example</h2><h3 
id='input' class='header-anchors'>Input</h3><p>A multi-object JSON file of 
queries as they would be sent to the engine&#39;s HTTP Queries API.</p><div 
class="alert-message note"><p>Read via <a 
href="https://spark.apache.org/docs/latest/rdd-programming-guide.html#external-datasets">SparkContext&#39;s
 <code>textFile</code></a> and so may be a single file or any supported Hadoop 
format.</p></div><p>File: <code>batchpredict-input.json</code></p><div 
class="highlight json"><table style="border-spacing: 0"><tbody><tr><td 
class="gutter gl" style="text-align: right"><pre class="lineno">1
 2
 3
 4
