Repository: incubator-predictionio Updated Branches: refs/heads/develop 9bb85ab3c -> e1e71280c
http://git-wip-us.apache.org/repos/asf/incubator-predictionio/blob/e1e71280/docs/manual/obsolete/tutorials/engines/quickstart.html.md ---------------------------------------------------------------------- diff --git a/docs/manual/obsolete/tutorials/engines/quickstart.html.md b/docs/manual/obsolete/tutorials/engines/quickstart.html.md deleted file mode 100644 index f2747d1..0000000 --- a/docs/manual/obsolete/tutorials/engines/quickstart.html.md +++ /dev/null @@ -1,503 +0,0 @@ ---- -title: Quick Start - Using a Built-in Engine ---- - -# Quick Start - Using a Built-in Engine -<code>This doc is applicable to 0.8.0 only. Updated version for 0.8.2 will be available soon.</code> - - -This is a quick start guide of using a PredictionIO's built-in engine and its -SDKs to write a very simple app. It assumes that you have [installed -PredictionIO server](/install). - -Let's start with a classic example in Machine Learning - build a ranking -engine. We are going to launch a ranking engine instance that can: - -* collect *real-time event data* from your app through REST API or SDKs; -* update the predictive model with *new data* regularly and automatically; -* answer *prediction query* through REST API or SDKs. - -> **Notes about HADOOP_CONF_DIR** - -> Before you begin this tutorial, make sure your environment does not have the -variable `HADOOP_CONF_DIR` set. When this is set, PredictionIO will -automatically pick it up and some functionality will expect an operational -Hadoop 2 environment. - -# Create a Simple App Project - -Create a new project directory for a simple app that will use the engine. - -``` -$ mkdir quickstartapp -$ cd quickstartapp -``` - -# Install SDK - -To communicate with PredictionIO server, we can use a PredictionIO SDK of a -specific programming language: - -<div class="tabs"> - <div data-tab="PHP SDK" data-lang="php"> -<p>To use the PredictionIO PHP SDK, we are going to install it with Composer:</p> -<p>1. 
Create a file called ``composer.json`` in your project directory, which adds predictionorg.apache.predictionioio as a dependency. It should look like this:</p> - -```json -{ - "require": { - "predictionorg.apache.predictionioio": "~0.8.0" - } -} -``` - -<p>2. Install Composer:</p> - -```bash -$ curl -sS https://getcomposer.org/installer | php -d detect_unicode=Off -``` - -<p>3. Use Composer to install your dependencies:</p> - -```bash -$ php composer.phar install -``` - -<p>Now you are ready to write the actual PHP code.</p> - </div> - <div data-tab="Python SDK" data-lang="python"> -```bash -$ pip install predictionio -``` -or -```bash -$ easy_install predictionio -``` - </div> - <div data-tab="Ruby SDK" data-lang="ruby"> -```ruby -$ gem install predictionio -``` - </div> - <div data-tab="Java SDK" data-lang="java"> -To use PredictionIO in your project, add this to the <code>dependencies</code> -section of your project's <code>pom.xml</code> file: -```bash -<dependencies> - <dependency> - <groupId>org.apache.predictionio</groupId> - <artifactId>client</artifactId> - <version>0.8.0</version> - </dependency> -</dependencies> -``` - -To run examples in PredictionIO Java SDK, clone the PredictionIO-Java-SDK -repository and build it using Maven: -```bash -$ cd ~ -$ git clone git://github.com/PredictionIO/PredictionIO-Java-SDK.git -$ cd PredictionIO-Java-SDK -$ mvn clean install -``` -Javadoc appears in client/target/apidocs/index.html. - </div> -</div> - - -# Collect Data into PredictionIO - -## Launch the Event Server - -```bash -$ $PIO_HOME/bin/pio eventserver -``` - -where `$PIO_HOME` is the installation directory of PredictionIO. As long as the -Event Server is running, PredictionIO keeps listening to new data. - -To bind to a different address, -```bash -$ $PIO_HOME/bin/pio eventserver --ip <IP> -``` - -## Collecting Data - -We are going to write a script that generates some random data and simulates -data collection. 
With the *EventClient* of one of the PredictionIO SDKs, your -application can send data to the Event Server in real-time easily through the -[EventAPI](/eventapi.html). In the *quickstartapp* directory: - -<div class="tabs"> - <div data-tab="PHP SDK" data-lang="php"> -<p>Create <em>import.php</em> as below. Replace <code>your_app_id</code> with -your app id (integer).</p> -```php -<?php - // use composer's autoloader to load PredictionIO PHP SDK - require_once("vendor/autoload.php"); - use predictionio\EventClient; - - $client = new EventClient(your_app_id); - - // generate 10 users, with user ids 1,2,....,10 - for ($i=1; $i<=10; $i++) { - echo "Add user ". $i . "\n"; - $response=$client->setUser($i); - } - - // generate 50 items, with item ids 1,2,....,50 - // assign type id 1 to all of them - for ($i=1; $i<=50; $i++) { - echo "Add item ". $i . "\n"; - $response=$client->setItem($i, array('pio_itypes'=>array('1'))); - } - - // each user randomly views 10 items - for ($u=1; $u<=10; $u++) { - for ($count=0; $count<10; $count++) { - $i = rand(1, 50); // randomly pick an item - echo "User ". $u . " views item ". $i ."\n"; - $response=$client->recordUserActionOnItem('view', $u, $i); - } - } -?> -``` -and run it: -```php -$ php import.php -``` - </div> - <div data-tab="Python SDK" data-lang="python"> -<p>Create <em>import.py</em> as below. 
Replace <code>your_app_id</code> with -your app id (integer).</p> - -```python -import predictionio -import random - -random.seed() - -client = predictionio.EventClient(app_id=your_app_id) - -# generate 10 users, with user ids 1,2,....,10 -user_ids = [str(i) for i in range(1, 11)] -for user_id in user_ids: - print "Set user", user_id - client.set_user(user_id) - -# generate 50 items, with item ids 1,2,....,50 -# assign type id 1 to all of them -item_ids = [str(i) for i in range(1, 51)] -for item_id in item_ids: - print "Set item", item_id - client.set_item(item_id, { - "pio_itypes" : ['1'] - }) - -# each user randomly views 10 items -for user_id in user_ids: - for viewed_item in random.sample(item_ids, 10): - print "User", user_id ,"views item", viewed_item - client.record_user_action_on_item("view", user_id, viewed_item) - -client.close() - -``` -and run it: -```bash -$ python import.py -``` - - </div> - <div data-tab="Ruby SDK" data-lang="ruby"> -<p>Create <em>import.rb</em> as below. Replace <code>your_app_id</code> with -your app id (integer).</p> - -```ruby -require 'predictionio' - -# Instantiate an EventClient -client = PredictionIO::EventClient.new(your_app_id) - -# Generate 10 users, with user IDs 1 to 10. -(1..10).each do |uid| - puts "Add user #{uid}" - client.set_user(uid) -end - -# Generate 50 items, with item IDs 1 to 50. -(1..50).each do |iid| - puts "Add item #{iid}" - client.set_item(iid, 'properties' => { 'pio_itypes' => %w(1) }) -end - -# Each user randomly views 10 items. -(1..10).each do |uid| - (1..10).each do |count| - # Pick an existing item ID (1 to 50); Random.rand(51) could return 0. - iid = Random.rand(1..50) - puts "User #{uid} views item #{iid}" - client.record_user_action_on_item('view', uid.to_s, iid.to_s) - end -end -``` -and run it: -```bash -$ ruby import.rb -``` - </div> - <div data-tab="Java SDK" data-lang="java"> -<p><em>QuickstartImport.java</em> is located under -PredictionIO-Java-SDK/examples/quickstart_import/src/main/java/org.apache.predictionio/samples/. 
-Replace <code>your_app_id</code> with your app id (integer).</p> - -```java -package org.apache.predictionio.samples; - -import com.google.common.collect.ImmutableList; -import com.google.common.collect.ImmutableMap; - -import org.apache.predictionio.EventClient; - -import java.io.IOException; -import java.util.Map; -import java.util.Random; -import java.util.concurrent.ExecutionException; - -public class QuickstartImport { - public static void main(String[] args) - throws ExecutionException, InterruptedException, IOException { - EventClient client = new EventClient(your_app_id); - Random rand = new Random(); - Map<String, Object> emptyProperty = ImmutableMap.of(); - - // generate 10 users, with user ids 1 to 10 - for (int user = 1; user <= 10; user++) { - System.out.println("Add user " + user); - client.setUser(""+user, emptyProperty); - } - - // generate 50 items, with item ids 1 to 50 - // assign type id 1 to all of them - Map<String, Object> itemProperty = ImmutableMap.<String, Object>of( - "pio_itypes", ImmutableList.of("1")); - for (int item = 1; item <= 50; item++) { - System.out.println("Add item " + item); - client.setItem(""+item, itemProperty); - } - - // each user randomly views 10 items - for (int user = 1; user <= 10; user++) { - for (int i = 1; i <= 10; i++) { - int item = rand.nextInt(50) + 1; - System.out.println("User " + user + " views item " + item); - client.userActionItem("view", ""+user, ""+item, emptyProperty); - } - } - - client.close(); - } -} -``` -To compile and run it: -```bash -$ cd PredictionIO-Java-SDK/examples/quickstart_import -$ mvn clean compile assembly:single -$ java -jar target/quickstart-import-<latest version>-jar-with-dependencies.jar -``` - </div> -</div> - - - - -# Deploying an Engine Instance - -Each engine deals with one type of Machine Learning task. For instance, Item -Ranking Engine (itemrank) makes personalized item (e.g. product or content) -ranking to your users. 
- -> **What is an Engine Instance?** -> -> You can deploy one or more *engine instances* from an engine. It means that -you can run multiple ranking *engine instances* at the same time with different -settings, or even for different applications. - -To deploy an engine instance for *quickstartapp*, first create an engine -instance project: - -```bash -$ $PIO_HOME/bin/pio instance org.apache.predictionio.engines.itemrank -$ cd org.apache.predictionio.engines.itemrank -$ $PIO_HOME/bin/pio register -``` - -Edit `params/datasource.json` and modify the value of `appId` to fit your app. - -Now, you can kick start the predictive model training with: - -INFO: If you are using **Linux**, Apache Spark local mode, which is the default -operation mode without further configuration, may not work. In that case, -configure your Apache Spark to run in [standalone cluster -mode](http://spark.apache.org/docs/latest/spark-standalone.html). - -```bash -$ $PIO_HOME/bin/pio train -... -2014-09-11 16:25:44,591 INFO spark.SparkContext - Job finished: collect at Workflow.scala:674, took 0.078664 s -2014-09-11 16:25:44,737 INFO workflow.CoreWorkflow$ - Saved engine instance with ID: KxOsC2FRSdGGe1lv0oaHiw -``` - -> **Notes for Apache Spark in Cluster Mode** - -> If you are using an Apache Spark cluster, you will need to pass the cluster's -master URL to the `pio train` command, e.g. - -> ```bash -$ $PIO_HOME/bin/pio train -- --master spark://`hostname`:7077 -``` - -> You may replace the command `hostname` with your hostname, which can be found -> on [Spark's UI](http://localhost:8080). - -If your training was successful, you should see the lines shown above. Now you are ready to deploy the instance: - -```bash -$ $PIO_HOME/bin/pio deploy -... 
-[INFO] [09/11/2014 16:26:16.525] [pio-server-akka.actor.default-dispatcher-2] [akka://pio-server/user/IO-HTTP/listener-0] Bound to localhost/127.0.0.1:8000 -[INFO] [09/11/2014 16:26:16.526] [pio-server-akka.actor.default-dispatcher-5] [akka://pio-server/user/master] Bind successful. Ready to serve. -``` - -Notice that the `deploy` command runs the engine instance in the foreground. You can also use the --ip option to bind to a different ip address. Now we are ready to take a look at the results! - - -# Retrieve Prediction Results - -With the *EngineClients* of a PredictionIO SDK, your application can send -queries to a deployed engine instance through the Engine API. In the -*quickstartapp* directory: - -<div class="tabs"> -<div data-tab="PHP SDK" data-lang="php"> -<p>Create a file <em>show.php</em> with this code:</p> -```php -<?php - // use composer's autoloader to load PredictionIO PHP SDK - require_once("vendor/autoload.php"); - use predictionio\EngineClient; - - $client = new EngineClient(); - - // Rank item 1 to 5 for each user - for ($i=1; $i<=10; $i++) { - $response=$client->sendQuery(array('uid'=>$i, - 'iids'=>array(1,2,3,4,5))); - print_r($response); - } -?> -``` -and run it: -```bash -$ php show.php -``` - </div> - <div data-tab="Python SDK" data-lang="python"> -<p>Create a file <em>show.py</em> with this code:</p> - -```python -import predictionio - -client = predictionio.EngineClient() - -# Rank item 1 to 5 for each user -item_ids = [str(i) for i in range(1, 6)] -user_ids = [str(x) for x in range(1, 11)] -for user_id in user_ids: - print "Rank item 1 to 5 for user", user_id - try: - response = client.send_query({ - "uid": user_id, - "iids": item_ids - }) - print response - except predictionio.PredictionIOAPIError as e: - print 'Caught exception:', e.strerror() - -client.close() - -``` - -and run it: - -```bash -$ python show.py -``` - </div> - <div data-tab="Ruby SDK" data-lang="ruby"> -<p>Create a file <em>show.rb</em> with this code:</p> -```ruby 
-require 'predictionio' - -client = PredictionIO::EngineClient.new - -(1..10).each do |uid| - predictions = client.send_query('uid' => uid.to_s, 'iids' => %w(1 2 3 4 5)) - puts predictions -end -``` - -and run it: - -```bash -$ ruby show.rb -``` - </div> - <div data-tab="Java SDK" data-lang="java"> -<p><em>QuickstartShow.java</em> is located under -PredictionIO-Java-SDK/examples/quickstart_show/src/main/java/org.apache.predictionio/samples/.</p> - -```java -package org.apache.predictionio.samples; - -import com.google.common.collect.ImmutableList; - -import org.apache.predictionio.EngineClient; - -import java.io.IOException; -import java.util.HashMap; -import java.util.Map; -import java.util.concurrent.ExecutionException; - -public class QuickstartShow { - public static void main(String[] args) - throws ExecutionException, InterruptedException, IOException { - EngineClient client = new EngineClient(); - - // rank item 1 to 5 for each user - Map<String, Object> query = new HashMap<>(); - query.put("iids", ImmutableList.of("1", "2", "3", "4", "5")); - for (int user = 1; user <= 10; user++) { - query.put("uid", user); - System.out.println("Rank item 1 to 5 for user " + user); - System.out.println(client.sendQuery(query)); - } - - client.close(); - } -} -``` - -To compile and run it: -```bash -$ cd PredictionIO-Java-SDK/examples/quickstart_show -$ mvn clean compile assembly:single -$ java -jar target/quickstart-show-<latest version>-jar-with-dependencies.jar -``` - </div> -</div> - -Well done! You have created a simple, but production-ready app with PredictionIO -ranking engine. - -Next: Learn more about [collecting data through Event API](/eventapi.html). 
http://git-wip-us.apache.org/repos/asf/incubator-predictionio/blob/e1e71280/docs/manual/obsolete/tutorials/recommendation/movielens.html.md ---------------------------------------------------------------------- diff --git a/docs/manual/obsolete/tutorials/recommendation/movielens.html.md b/docs/manual/obsolete/tutorials/recommendation/movielens.html.md deleted file mode 100644 index ac94757..0000000 --- a/docs/manual/obsolete/tutorials/recommendation/movielens.html.md +++ /dev/null @@ -1,158 +0,0 @@ ---- -title: Tutorial on Item Recommendation Engine - Movie Recommendation ---- - -# Building Movie Recommendation App with Item Recommendation Engine -<code>This doc is applicable to 0.8.0 only. Updated version for 0.8.2 will be available soon.</code> - -## Importing Movie-Lens Data - -Clone our -[Python-SDK](https://github.com/PredictionIO/PredictionIO-Python-SDK) and -switch to the develop branch to get the latest changes. - -``` -$ git clone https://github.com/PredictionIO/PredictionIO-Python-SDK.git -$ git checkout develop -``` - -Install external dependencies. - -``` -python setup.py install -``` - -Download Movie-Lens data into the /PredictionIO-Python-SDK folder. - -``` -$ curl -o ml-100k.zip http://files.grouplens.org/datasets/movielens/ml-100k.zip -$ unzip ml-100k.zip -``` - -Launch EventServer. $PIO_HOME is the installation directory of PredictionIO. - -``` -$ $PIO_HOME/bin/pio eventserver -``` - -Import data. You should have at least 2GB of driver memory to accommodate the dataset. The import script takes two parameters: `<app_id> <url>`. -`<app_id>` is an integer that identifies your address space; `<url>` is the -EventServer URL (default: http://localhost:7070). We will use the same -`<app_id>` throughout this tutorial. - -``` -$ cd PredictionIO-Python-SDK -$ python -m examples.demo-movielens.batch_import <app_id> http://localhost:7070 -``` - -The import takes a minute or two. 
At the end you should see the following -output: - -``` -{u'status': u'alive'} -[Info] Initializing users... -[Info] 943 users were initialized. -... -[Info] Importing rate actions to PredictionIO... -[Info] 100000 rate actions were imported. -``` - -> You may delete *all* data belonging to a specific `<app_id>` with this -> request. There is no way to undo this delete, so use it cautiously! -``` -$ curl -i -X DELETE http://localhost:7070/events.json?appId=<app_id> -``` - -## Deploying the Item Recommendation engine -Create an engine instance project based on the default Item Recommendation -Engine. - -``` -$ $PIO_HOME/bin/pio instance org.apache.predictionio.engines.itemrec -$ cd org.apache.predictionio.engines.itemrec -$ $PIO_HOME/bin/pio register -``` -where `$PIO_HOME` is your installation path of PredictionIO. -Under the directory `org.apache.predictionio.engines.itemrec`, you will see a -self-contained set of configuration files for an instance of Item Recommendation -Engine. - -### Specify the Target App - -PredictionIO uses `<app_id>` to distinguish data between different applications. -Engines usually use data from one application. Inside the engine instance project, -the file `params/datasource.json` defines how data are read from the Event Server. -Change the value of `appId` to the `<app_id>` you used for importing. - -```json -{ - "appId": <app_id>, - "actions": [ - "view", - "like", - ... - ], - ... -} -``` - -### Train and deploy - -Call `pio train` to kick start training. - -``` -$ $PIO_HOME/bin/pio train -... -2014-09-20 01:17:39,997 INFO spark.ContextCleaner - Cleaned broadcast 9 -2014-09-20 01:17:40,194 INFO workflow.CoreWorkflow$ - Saved engine instance with ID: RWTien4GSeCrl3fpZwhJDA -``` - -Once the training is successful, you can deploy the instance. It will start an -Engine Server at `http://localhost:8000`; you can change to another port by -specifying `--port <port>`. - -``` -$ $PIO_HOME/bin/pio deploy -... 
-[INFO] [09/20/2014 01:19:45.428] [pio-server-akka.actor.default-dispatcher-3] [akka://pio-server/user/IO-HTTP/listener-0] Bound to localhost/127.0.0.1:8000 -[INFO] [09/20/2014 01:19:45.429] [pio-server-akka.actor.default-dispatcher-5] [akka://pio-server/user/master] Bind successful. Ready to serve. -``` - -Another way of making sure an Engine Server is deployed successfully is by -visiting its status page [http://localhost:8000]. You will see information -associated with engine instance like when it is started, trained, its component -classes and parameters. - -### Retrieving Prediction Results -With the EngineClients of a PredictionIO SDK, your application can send queries -to a deployed engine instance through the Engine API. - -To get 3 personalized item recommendations for user "100". - -<div class="tabs"> - <div data-tab="Raw HTTP" data-lang="bash"> -<p>Line breaks are added for illustration in the response.</p> - -```bash -$ curl -i -X POST -d '{"uid": "100", "n": 3}' http://localhost:8000/queries.json -{"items":[ - {"272":9.929327011108398}, - {"313":9.92607593536377}, - {"347":9.92170524597168}]} -``` - </div> - <div data-tab="Python SDK" data-lang="python"> -```python -from predictionio import EngineClient -client = EngineClient(url="http://localhost:8000") -prediction = client.send_query({"uid": "100", "n": 3}) -print prediction -``` - -<p>Output:</p> -```bash -{u'items': [{u'272': 9.929327011108398}, {u'313': 9.92607593536377}, {u'347': -9.92170524597168}]} -``` - </div> -</div> http://git-wip-us.apache.org/repos/asf/incubator-predictionio/blob/e1e71280/docs/manual/obsolete/tutorials/recommendation/yelp.html.md ---------------------------------------------------------------------- diff --git a/docs/manual/obsolete/tutorials/recommendation/yelp.html.md b/docs/manual/obsolete/tutorials/recommendation/yelp.html.md deleted file mode 100644 index 4c69547..0000000 --- a/docs/manual/obsolete/tutorials/recommendation/yelp.html.md +++ /dev/null @@ -1,355 +0,0 @@ 
---- -title: Build a Sample Rails Application with Yelp Data ---- -# Introduction - -WARNING: This doc is applicable to 0.8.0 only. Updated version for 0.8.2 will be available soon. - -In this tutorial we are going to create a business recommendation app using -the item [recommendation engine](/engines/itemrec/) -and a data set from [Yelp](https://www.kaggle.com/c/yelp-recsys-2013/data). - -The Yelp [data set](https://www.kaggle.com/c/yelp-recsys-2013/data) contains data on 43k users, -12k businesses, and 230k reviews using a five-star system. - -At the end you will have a fully functional app that shows the top five recommended businesses for -each user based on the businesses they have rated in the past. - -## Guide Assumptions - -This is a beginner guide; however, it assumes you have already installed -[PredictionIO](/install) and that you have a basic familiarity with -[Rails](http://rubyonrails.org/). Some other prerequisites that will be needed include: - -* [Ruby 2.1](https://www.ruby-lang.org) installed (earlier versions will probably work as well) -* [Rails 4.1](http://rubygems.org/gems/rails) installed -* [Postgres](http://www.postgresql.org/) or equivalent database - -# Setting Up the Application - -We are going to set up a basic Rails application. -**Experts** feel free to [skip ahead](#predictionio-setup) to the good stuff. - -First we will create the sample application. - -```$ rails new predictionio_rails``` - -## Setup Database ## - -You will need a database set up for this application. -Make the appropriate changes to your `Gemfile` and `config/database.yml` files. -In our case we are going to use Postgres but any database should work. - -At this point your Rails app should work. Run `$ rails s` and open [http://localhost:3000](http://localhost:3000/) -in your browser. You should see the standard Rails welcome message. - - - -## Importing Data Into the Application Database - -We will be importing the users, businesses, and reviews into Postgres. 
- -Later in the tutorial we will loop through each record in the database and send it to the PredictionIO Event Server. - -Importing all this data can take some time and requires a fair bit of code. Look through the -code in `lib/tasks/import/*.rake` on [GitHub](https://github.com/ramaboo/predictionio_rails/tree/master/lib/tasks/import) -if you are interested in the exact process. - -Otherwise, you can just [download](https://s3.amazonaws.com/predictionio-david/predictionio_rails.dump) a Postgres dump and import -it into your database. - -# PredictionIO Setup - -This guide assumes you have set `$PIO_HOME` to the path of your PredictionIO installation. -If you have not already done this you can do so with the following command: - -``` -$ export PIO_HOME=/path/to/your/PredictionIO-0.8.0 -``` - -## Add PredictionIO Gem - -To easily communicate with PredictionIO we will use the official -[PredictionIO gem](https://github.com/PredictionIO/PredictionIO-Ruby-SDK). - -Add the following to your Gemfile: - -```gem 'predictionio'``` - -and run `$ bundle install`. - -## Checking PredictionIO - -**Before continuing you want to check that PredictionIO is running correctly.** - -You can use the [ps](http://en.wikipedia.org/wiki/Ps_(Unix)) command -and [grep](http://en.wikipedia.org/wiki/Grep) to check that each component is currently running. - -PredictionIO requires the following components to be running: - -* PredictionIO [Event Server](/eventapi.html) -* [Elasticsearch](http://www.elasticsearch.org/) -* [HBase](http://hbase.apache.org/) - -To check if the Event Server is running you can -type `$ ps aux | grep eventserver` which should output something like this: - - - -For Elasticsearch use `$ ps aux | grep elasticsearch`. For HBase use `$ ps aux | grep hbase`. - -For additional help read [installing PredictionIO](/install/). - -## Creating the Engine - -An engine represents a type of prediction. For our purposes we will be using the -[item recommendation engine](/engines/itemrec/). 
- -``` -$ $PIO_HOME/bin/pio instance org.apache.predictionio.engines.itemrec -$ cd org.apache.predictionio.engines.itemrec -$ $PIO_HOME/bin/pio register -``` - -Which should output something like this: - - - -## Specify the Target App - -Inside the engine instance folder edit -`params/datasource.json` and change the value of `appId` to fit your app - in our case 1. - -The `appId` is a **numeric** ID that uniquely identifies your application. - - - -We will also need to configure the engine to recognize our ranking scores as numeric values. -Edit `params/algorithms.json` and change `booleanData` to false. - - - -## Sending Data to PredictionIO - -We will be using a rake task `lib/tasks/import/predictionio.rake` -([show source](https://github.com/ramaboo/predictionio_rails/blob/master/lib/tasks/import/predictionio.rake)). - -Run with `$ rake import:predictionio`. - -## Train the Engine Instance - -Before you can deploy your engine you need to train it. - -First you need to **change into the engine instance folder:** - -``` -$ cd $PIO_HOME/org.apache.predictionio.engines.itemrec -``` - -Train the engine with the imported data: - -``` -$ $PIO_HOME/bin/pio train -``` - -If it works you should see output that looks like this: - - - -Each training command will produce an unique engine instance. - -## Launch the Engine Instance - -Now it is time to launch the engine. - -First you need to **change into the engine instance folder:** - -``` -$ cd $PIO_HOME/org.apache.predictionio.engines.itemrec -``` - -Then you can deploy with: - -``` -$ $PIO_HOME/bin/pio deploy -``` - -This will deploy the **most recent** engine instance. - -You should see output that looks like this: - - - -To check the engine is running correctly open a browser and navigate to [http://localhost:8000](http://localhost:8000/). - -You should see something like this: - - - -If not check out the [engine documentation](/engines/) for additional troubleshooting options. 
- -You can also run a quick query to double check that everything is good to go: - -``` -$ curl -i -X POST -d '{"uid": "261", "n": 5}' http://localhost:8000/queries.json -``` - -If you imported the data into a clean database then user 261 should have plenty -of entries to return a response that looks like this: - - - -If you need to **delete all** your data you can do so with the following command: - -``` -$ curl -i -X DELETE http://localhost:7070/events.json?appId=<your_appId> -``` - -You will have to train and deploy again after this as well to completely remove everything! - -## Model Retraining - -In a production environment you will want to periodically re-train and re-deploy your model. - -With Linux the easiest way to accomplish this is with [Cron](http://en.wikipedia.org/wiki/Cron) though any scheduler will work. - -For our example we could train and deploy every 6 hours with the following: - -``` -$ crontab -e - -0 */6 * * * cd $PIO_HOME/org.apache.predictionio.engines.itemrec; $PIO_HOME/bin/pio train; $PIO_HOME/bin/pio deploy -``` - -It is not necessary to undeploy, the deploy command will do that automatically. - -## Application Scaffolding - -Now that our data is in PredictionIO it's time to build out our application a little. - -First we are going to add a [counter cache column](http://guides.rubyonrails.org/association_basics.html#counter-cache) -to the users table so we can query users with lots of reviews. - -You will need to change one line on `app/models/review.rb` to this: - -``` -belongs_to :user, counter_cache: true -``` - -And create a migration that updates each users counts ([show source](https://github.com/ramaboo/predictionio_rails/blob/master/db/migrate/20141008013504_add_counter_cache_to_users.rb)). - -``` -$ rails g migration add_counter_cache_to_users reviews_count:integer - -``` - -and then `$ rake db:migrate`. 
Because we loop through the entire users table with this migration, -it may take 20 minutes or more to run. The [downloadable](https://s3.amazonaws.com/predictionio-david/predictionio_rails.dump) database dump already includes this migration. - - -Next create a simple controller at `app/controllers/users_controller.rb` with an index action. - -``` -class UsersController < ApplicationController - def index - @users = User.order('reviews_count DESC').limit(20) - end -end -``` - -And set up routes in `config/routes.rb`. - -``` -resources :users, only: [:index, :show] -root 'users#index' -``` - -Next we will create a basic view at `app/views/users/index.html.erb` -([view source](https://github.com/ramaboo/predictionio_rails/blob/master/app/views/users/index.html.erb)). - -We will also make a few cosmetic changes to `app/views/layouts/application.html.erb` -([view source](https://github.com/ramaboo/predictionio_rails/blob/master/app/views/layouts/application.html.erb)). - -At this point if you open a browser and navigate to [http://localhost:3000](http://localhost:3000/) you should see this: - - - -## Querying PredictionIO - -Now the fun stuff. We are going to query PredictionIO for business recommendations based on the user's rating history. - -For our user page we want to display three things: - -* Basic information about the user. -* A list of 10 recent reviews by the user. -* A list of 5 **recommended** businesses for the user based on PredictionIO. - -First let's create a show action in `app/controllers/users_controller.rb` -([view source](https://github.com/ramaboo/predictionio_rails/blob/master/app/controllers/users_controller.rb)). - -The first part of the code finds the correct user object. Then we find some -recent reviews for that user and finally we query PredictionIO for recommended businesses. -We then loop through the query results and load the corresponding businesses. - -``` -def show - # Find the correct user. 
- @user = User.find(params[:id]) - - # Find 10 recent reviews by the user. We use eager loading here to reduce database queries. - @recent_reviews = @user.reviews.includes(:business).order('created_at DESC').limit(10) - - # Create new PredictionIO client. - client = PredictionIO::EngineClient.new - - # Query PredictionIO for 5 recommendations! - object = client.send_query('uid' => @user.id, 'n' => 5) - - # Initialize empty recommendations array. - @recommendations = [] - - # Loop though item recommendations returned from PredictionIO. - object['items'].each do |item| - # Initialize empty recommendation hash. - recommendation = {} - - # Each item hash has only one key value pair so the first key is the item ID (in our case the business ID). - business_id = item.keys.first - - # Find the business. - business = Business.find(business_id) - recommendation[:business] = business - - # The value of the hash is the predicted preference score. - score = item.values.first - recommendation[:score] = score - - # Add to the array of recommendations. - @recommendations << recommendation - end -end -``` - -And finally we create view at `app/views/users/show.html.erb` -([view source](https://github.com/ramaboo/predictionio_rails/blob/master/app/views/users/show.html.erb)) to display the information -to the user with a little styling help from [Bootstrap](http://getbootstrap.com/). - -At this point if you open a browser and navigate to -[http://localhost:3000/users/6279](http://localhost:3000/users/6279) you should see something like this: - - - -# Conclusion - -Now you have learned how to use PredictionIO to offer recommendations to -users based on previous actions. Personalized recommendations can be a great addition to any application. -Next, you can try [adding additional engines](/tutorials/enginebuilders/local-helloworld.html) or -play with different input features to see how the results change. 
- -If you have any questions about PredictionIO check out our [documentation](http://docs.prediction.io/) and feel -free to ask for help in our [Google Group](http://groups.google.com/group/predictionio-dev). - -As always the code in this tutorial is open source (MIT) so feel free to -[fork](https://github.com/ramaboo/predictionio_rails/) it on GitHub. -Let us know [@PredictionIO](http://twitter.com/PredictionIO) if you create something cool with it. - http://git-wip-us.apache.org/repos/asf/incubator-predictionio/blob/e1e71280/docs/manual/obsolete/upgrade.html.md ---------------------------------------------------------------------- diff --git a/docs/manual/obsolete/upgrade.html.md b/docs/manual/obsolete/upgrade.html.md deleted file mode 100644 index b2ba875..0000000 --- a/docs/manual/obsolete/upgrade.html.md +++ /dev/null @@ -1,60 +0,0 @@ -<!-- -Licensed to the Apache Software Foundation (ASF) under one or more -contributor license agreements. See the NOTICE file distributed with -this work for additional information regarding copyright ownership. -The ASF licenses this file to You under the Apache License, Version 2.0 -(the "License"); you may not use this file except in compliance with -the License. You may obtain a copy of the License at - - http://www.apache.org/licenses/LICENSE-2.0 - -Unless required by applicable law or agreed to in writing, software -distributed under the License is distributed on an "AS IS" BASIS, -WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -See the License for the specific language governing permissions and -limitations under the License. ---> - ---- -title: Version Upgrade ---- - -# Version Upgrade - -The 0.8.x series has been rewritten from the ground up to facilitate easy -building of various types of machine learning engines. The changes are -fundamental and require a migration to properly upgrade from 0.7.x to 0.8.x. 
- -## Conceptual Changes - -Before upgrading, it is necessary to understand some fundamental changes and -limitations between the 0.7.x and the 0.8.x series. - -### Event-based Data - -In 0.7.x, users and items are stored separately from user-to-item actions. In -0.8.x, users, items, and user-to-item actions are all recorded as events. - -In 0.8.x, creating, updating, and deleting users and items are recorded as -events. When an engine is trained, information about users and items is built -from an aggregation of these events to form the most recent view of users and -items. The most recent event about a user or an item will always take -precedence. - -The concept of user-to-item action maps directly to the event-based data model -in 0.8.x and requires almost no change during migration. - -### Web UI Users - -0.8.x assumes a trusted environment and no longer associates apps with a -particular web-based user. This data does not need to be migrated. - -### Apps - -Apps and their access keys can now be created using the command line interface, -which can be conveniently scripted. - -### Engines and Algorithms - -Engine and algorithm settings in 0.7.x are now stored in individual engine -variant JSON files which allow easy version control and programmatic access. http://git-wip-us.apache.org/repos/asf/incubator-predictionio/blob/e1e71280/tests/.rat-excludes ---------------------------------------------------------------------- diff --git a/tests/.rat-excludes b/tests/.rat-excludes index 1a816f9..840a3e9 100644 --- a/tests/.rat-excludes +++ b/tests/.rat-excludes @@ -19,7 +19,6 @@ Gemfile.lock templates.yaml semver.sh -obsolete/* PredictionIO-.*/* target/* /source
