Repository: predictionio
Updated Branches: refs/heads/livedoc c9e564dde -> 6c607aa23

Fix typos in Markdown files

Closes #464

Project: http://git-wip-us.apache.org/repos/asf/predictionio/repo
Commit: http://git-wip-us.apache.org/repos/asf/predictionio/commit/6c607aa2
Tree: http://git-wip-us.apache.org/repos/asf/predictionio/tree/6c607aa2
Diff: http://git-wip-us.apache.org/repos/asf/predictionio/diff/6c607aa2

Branch: refs/heads/livedoc
Commit: 6c607aa23f2ffaf70f5ba50bdc9bff11f5ebc345
Parents: c9e564d
Author: Naoki Takezoe <take...@apache.org>
Authored: Thu Sep 20 11:58:52 2018 -0700
Committer: Donald Szeto <don...@apache.org>
Committed: Thu Sep 20 12:00:08 2018 -0700

----------------------------------------------------------------------
 docs/manual/source/appintegration/index.html.md | 2 +-
 .../community/contribute-documentation.html.md | 2 +-
 .../source/community/contribute-webhook.html.md | 2 +-
 docs/manual/source/customize/dase.html.md.erb | 2 +-
 .../source/datacollection/eventapi.html.md | 2 +-
 .../datacollection/eventmodel.html.md.erb | 10 ++++-----
 docs/manual/source/demo/tapster.html.md | 2 +-
 .../source/demo/textclassification.html.md.erb | 6 +++---
 docs/manual/source/deploy/monitoring.html.md | 2 +-
 docs/manual/source/evaluation/index.html.md | 4 ++--
 .../source/evaluation/metricbuild.html.md | 4 ++--
 .../source/evaluation/paramtuning.html.md | 2 +-
 .../source/install/install-vagrant.html.md.erb | 4 ++--
 .../source/install/launch-aws.html.md.erb | 2 +-
 .../dimensionalityreduction.html.md | 2 +-
 docs/manual/source/resources/faq.html.md | 2 +-
 docs/manual/source/resources/glossary.html.md | 2 +-
 .../source/resources/intellij.html.md.erb | 2 +-
 docs/manual/source/support/index.html.md.erb | 4 ++--
 .../classification/quickstart.html.md.erb | 2 +-
 .../complementarypurchase/dase.html.md.erb | 2 +-
 .../quickstart.html.md.erb | 4 ++--
 .../ecommercerecommendation/dase.html.md.erb | 22 ++++++++++----------
 .../quickstart.html.md.erb | 6 +++---
 .../dase.html.md.erb | 4 ++--
 .../templates/leadscoring/dase.html.md.erb | 6 +++---
.../leadscoring/quickstart.html.md.erb | 2 +- .../templates/productranking/dase.html.md.erb | 2 +- .../recommendation/batch-evaluator.html.md | 2 +- .../templates/recommendation/dase.html.md.erb | 2 +- .../recommendation/evaluation.html.md.erb | 8 +++---- .../templates/similarproduct/dase.html.md.erb | 2 +- .../multi-events-multi-algos.html.md.erb | 6 +++--- .../src/main/scala/ECommAlgorithm.scala | 2 +- .../src/main/scala/ECommAlgorithm.scala | 2 +- 35 files changed, 66 insertions(+), 66 deletions(-) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/docs/manual/source/appintegration/index.html.md ---------------------------------------------------------------------- diff --git a/docs/manual/source/appintegration/index.html.md b/docs/manual/source/appintegration/index.html.md index a087318..55f8b58 100644 --- a/docs/manual/source/appintegration/index.html.md +++ b/docs/manual/source/appintegration/index.html.md @@ -34,7 +34,7 @@ Overview](/images/overview-singleengine.png) ## Sending Event Data Apache PredictionIO's Event Server receives event data from your -application. The data can be used by engines as training data to build preditive +application. The data can be used by engines as training data to build predictive models. Event Server listens to port 7070 by default. 
You can change the port with the http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/docs/manual/source/community/contribute-documentation.html.md ---------------------------------------------------------------------- diff --git a/docs/manual/source/community/contribute-documentation.html.md b/docs/manual/source/community/contribute-documentation.html.md index 5f1ebf4..1645066 100644 --- a/docs/manual/source/community/contribute-documentation.html.md +++ b/docs/manual/source/community/contribute-documentation.html.md @@ -85,7 +85,7 @@ Please follow this styleguide for any documentation contributions. ### Text -View our [Sample Typography](/samples/) page for all posible styles. +View our [Sample Typography](/samples/) page for all possible styles. ### Headings http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/docs/manual/source/community/contribute-webhook.html.md ---------------------------------------------------------------------- diff --git a/docs/manual/source/community/contribute-webhook.html.md b/docs/manual/source/community/contribute-webhook.html.md index 14fda55..bcafca2 100644 --- a/docs/manual/source/community/contribute-webhook.html.md +++ b/docs/manual/source/community/contribute-webhook.html.md @@ -197,7 +197,7 @@ and tests should be in data/src/test/scala/org.apache.predictionio/data/webhooks/segmentio/ ``` -**For form-submission data**, you can find the comple example [the GitHub +**For form-submission data**, you can find the complete example [the GitHub repo](https://github.com/apache/predictionio/blob/develop/data/src/main/scala/org/apache/predictionio/data/webhooks/exampleform/ExampleFormConnector.scala) and how to write [tests for the connector](https://github.com/apache/predictionio/blob/develop/data/src/test/scala/org/apache/predictionio/data/webhooks/exampleform/ExampleFormConnectorSpec.scala). 
http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/docs/manual/source/customize/dase.html.md.erb ---------------------------------------------------------------------- diff --git a/docs/manual/source/customize/dase.html.md.erb b/docs/manual/source/customize/dase.html.md.erb index c06d86c..25f036a 100644 --- a/docs/manual/source/customize/dase.html.md.erb +++ b/docs/manual/source/customize/dase.html.md.erb @@ -27,7 +27,7 @@ DataSource reads and selects useful data from the Event Store (data store of the ## readTraining() -You need to implment readTraining() of [PDataSource](https://predictionio.apache.org/api/current/#org.apache.predictionio.controller.PDataSource), where you can use the [PEventStore Engine API](https://predictionio.apache.org/api/current/#org.apache.predictionio.data.store.PEventStore$) to read the events and create the TrainingData based on the events. +You need to implement readTraining() of [PDataSource](https://predictionio.apache.org/api/current/#org.apache.predictionio.controller.PDataSource), where you can use the [PEventStore Engine API](https://predictionio.apache.org/api/current/#org.apache.predictionio.data.store.PEventStore$) to read the events and create the TrainingData based on the events. The following code example reads user "view" and "buy" item events, filters specific type of events for future processing and returns TrainingData accordingly. http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/docs/manual/source/datacollection/eventapi.html.md ---------------------------------------------------------------------- diff --git a/docs/manual/source/datacollection/eventapi.html.md b/docs/manual/source/datacollection/eventapi.html.md index 2c4b740..dea3782 100644 --- a/docs/manual/source/datacollection/eventapi.html.md +++ b/docs/manual/source/datacollection/eventapi.html.md @@ -328,7 +328,7 @@ Field | Type | Description | | are reserved and shouldn't be used. 
`targetEntityId` | String | (Optional) The target entity ID. `properties` | JSON | (Optional) See **Note About Properties** below - | | **Note**: All peroperty names start with "$" and "pio_" + | | **Note**: All property names start with "$" and "pio_" | | are reserved and shouldn't be used as keys inside `properties`. `eventTime` | String | (Optional) The time of the event. Although Event Server's | | current system time and UTC timezone will be used if this is http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/docs/manual/source/datacollection/eventmodel.html.md.erb ---------------------------------------------------------------------- diff --git a/docs/manual/source/datacollection/eventmodel.html.md.erb b/docs/manual/source/datacollection/eventmodel.html.md.erb index 814f6b2..ec8e5a8 100644 --- a/docs/manual/source/datacollection/eventmodel.html.md.erb +++ b/docs/manual/source/datacollection/eventmodel.html.md.erb @@ -25,7 +25,7 @@ This section explains how to model your application data as events. For example, your application may have users and some items which the user can interact with. Then you can model them as two entity types: **user** and **item** and the entityId can uniquely identify the entity within each entityType (e.g. user with ID 1, item with ID 1). -An entity may peform some events (e.g user 1 does something), and entity may have properties associated with it (e.g. user may have gender, age, email etc). Hence, **events** involve **entities** and there are three types of events, respectively: +An entity may perform some events (e.g user 1 does something), and entity may have properties associated with it (e.g. user may have gender, age, email etc). Hence, **events** involve **entities** and there are three types of events, respectively: 1. Generic events performed by an entity. 2. Special events for recording changes of an entity's properties @@ -78,7 +78,7 @@ The following are some simple examples: ## 2. 
Special events for recording changes of an entity's properties -The generic events described above are used to record general actions performed by the entity. However, an entity may have properties (or attributes) associated with it. Morever, the properties of the entity may change over time (for example, user may have new address, item may have new categories). In order to record such changes of an entity's properties. Special events `$set` , `$unset` and `$delete` are introduced. +The generic events described above are used to record general actions performed by the entity. However, an entity may have properties (or attributes) associated with it. Moreover, the properties of the entity may change over time (for example, user may have new address, item may have new categories). In order to record such changes of an entity's properties. Special events `$set` , `$unset` and `$delete` are introduced. The following special events are reserved for updating entities and their properties: @@ -108,7 +108,7 @@ For example, setting entity `user-1`'s properties `birthday` and `address`: NOTE: Although it doesn't hurt to import duplicated special events for an entity (exactly same properties) into event server (it just means that the entity changes to the same state as before and new duplicated event provides no new information about the user), it could waste storage space. -To demonstrate the concept of these special events, we are going to import a sequence of events and see how it affects the retrieved entitiy's properties. +To demonstrate the concept of these special events, we are going to import a sequence of events and see how it affects the retrieved entity's properties. Assuming you have created the App (named "MyTestApp") for testing and Event Server is started. @@ -151,7 +151,7 @@ After this eventTime, user-2 is created and has properties of a = 3 and b = 4. #### Event 2 -Then, on `2014-09-10T...`, let's say the user has updated the properties b = 5 and c = 6. 
To record such propertiy change, create another `$set` event. Run the following command: +Then, on `2014-09-10T...`, let's say the user has updated the properties b = 5 and c = 6. To record such property change, create another `$set` event. Run the following command: ```bash $ curl -i -X POST http://localhost:7070/events.json?accessKey=$ACCESS_KEY \ @@ -283,7 +283,7 @@ scala> import org.joda.time.DateTime scala> PEventStore.aggregateProperties(appName=appName, entityType="user", untilTime=Some(new DateTime(2014, 9, 11, 0, 0)))(sc).collect() ``` -You should see the following ouptut and the aggregated properties matches what we expected as described earlier (right befor event 3): user-2 has properties of a = 3, b = 5 and c = 6. +You should see the following ouptut and the aggregated properties matches what we expected as described earlier (right before event 3): user-2 has properties of a = 3, b = 5 and c = 6. ``` res2: Array[(String, org.apache.predictionio.data.storage.PropertyMap)] = http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/docs/manual/source/demo/tapster.html.md ---------------------------------------------------------------------- diff --git a/docs/manual/source/demo/tapster.html.md b/docs/manual/source/demo/tapster.html.md index 54f5b59..7d3178b 100644 --- a/docs/manual/source/demo/tapster.html.md +++ b/docs/manual/source/demo/tapster.html.md @@ -417,7 +417,7 @@ demo and build upon it. If you produce something cool shoot us an email and we will link to it from here. Found a typo? Think something should be explained better? This tutorial (and all -our other documenation) live in the main repo +our other documentation) live in the main repo [here](https://github.com/apache/predictionio/blob/livedoc/docs/manual/source/demo/tapster.html.md). Our documentation is in the `livedoc` branch. 
Find out how to contribute documentation at http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/docs/manual/source/demo/textclassification.html.md.erb ---------------------------------------------------------------------- diff --git a/docs/manual/source/demo/textclassification.html.md.erb b/docs/manual/source/demo/textclassification.html.md.erb index f638a44..8b88855 100644 --- a/docs/manual/source/demo/textclassification.html.md.erb +++ b/docs/manual/source/demo/textclassification.html.md.erb @@ -23,7 +23,7 @@ limitations under the License. ## Introduction -In the real world, there are many applications that collect text as data. For example, spam detectors take email and header content to automatically determine what is or is not spam; applications can gague the general sentiment in a geographical area by analyzing Twitter data; and news articles can be automatically categorized based solely on the text content.There are a wide array of machine learning models you can use to create, or train, a predictive model to assign an incoming article, or query, to an existing category. Before you can use these techniques you must first transform the text data (in this case the set of news articles) into numeric vectors, or feature vectors, that can be used to train your model. +In the real world, there are many applications that collect text as data. For example, spam detectors take email and header content to automatically determine what is or is not spam; applications can gauge the general sentiment in a geographical area by analyzing Twitter data; and news articles can be automatically categorized based solely on the text content.There are a wide array of machine learning models you can use to create, or train, a predictive model to assign an incoming article, or query, to an existing category. 
Before you can use these techniques you must first transform the text data (in this case the set of news articles) into numeric vectors, or feature vectors, that can be used to train your model. The purpose of this tutorial is to illustrate how you can go about doing this using PredictionIO's platform. The advantages of using this platform include: a dynamic engine that responds to queries in real-time; [separation of concerns](http://en.wikipedia.org/wiki/Separation_of_concerns), which offers code re-use and maintainability, and distributed computing capabilities for scalability and efficiency. Moreover, it is easy to incorporate non-trivial data modeling tasks into the DASE architecture allowing Data Scientists to focus on tasks related to modeling. This tutorial will exemplify some of these ideas by guiding you through PredictionIO's [text classification template](/gallery/template-gallery/#natural-language-processing). @@ -91,7 +91,7 @@ $ pio import --appid *** --input data/emails.json ### 3. Set the engine parameters in the file `engine.json`. -The default settings are shown below. By default, it uses the algorithm name "lr" which is logstic regression. Please see later section for more detailed explanation of engine.json setting. +The default settings are shown below. By default, it uses the algorithm name "lr" which is logistic regression. Please see later section for more detailed explanation of engine.json setting. Make sure the "appName" is same as the app you created in step1. @@ -272,7 +272,7 @@ Note that `readEventData` and `readStopWords` use different entity types and eve Now, the default dataset used for training is contained in the file `data/emails.json` and contains a set of e-mail spam data. If we want to switch over to one of the other data sets we must make sure that the `eventNames` and `entityType` fields are changed accordingly. 
-In the data/ directory, you will find different sets of data files for different types of text classifcaiton application. The following show one observation from each of the provided data files: +In the data/ directory, you will find different sets of data files for different types of text classificaiton application. The following show one observation from each of the provided data files: - `emails.json`: http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/docs/manual/source/deploy/monitoring.html.md ---------------------------------------------------------------------- diff --git a/docs/manual/source/deploy/monitoring.html.md b/docs/manual/source/deploy/monitoring.html.md index 0be7589..191898b 100644 --- a/docs/manual/source/deploy/monitoring.html.md +++ b/docs/manual/source/deploy/monitoring.html.md @@ -28,7 +28,7 @@ sudo apt-get install monit ``` ##Configure Basics -Now we can configure monit by the configuration file `/etc/monit/monitrc` with your favorite editor. You will notice that this file contains quite a bit already, most of which is commented intructions/examples. +Now we can configure monit by the configuration file `/etc/monit/monitrc` with your favorite editor. You will notice that this file contains quite a bit already, most of which is commented instructions/examples. First, choose the interval on which you want monit to check the status of your system. Use the `set daemon` command for this, it should already exist in the configuration file. http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/docs/manual/source/evaluation/index.html.md ---------------------------------------------------------------------- diff --git a/docs/manual/source/evaluation/index.html.md b/docs/manual/source/evaluation/index.html.md index 1cf01f4..bce9da5 100644 --- a/docs/manual/source/evaluation/index.html.md +++ b/docs/manual/source/evaluation/index.html.md @@ -21,7 +21,7 @@ limitations under the License. 
PredictionIO's evaluation module allows you to streamline the process of testing lots of knobs in engine parameters and deploy the best one out -of it using statisically sound cross-validation methods. +of it using statistically sound cross-validation methods. There are two key components: @@ -51,6 +51,6 @@ We will discuss various aspects of evaluation with PredictionIO. where you can see a detailed breakdown of all previous evaluations. - [Choosing Evaluation Metrics](/evaluation/metricchoose/) - we cover some basic machine learning metrics -- [Bulding Evaluation Metrics](/evaluation/metricbuild/) - we illustrate how to +- [Building Evaluation Metrics](/evaluation/metricbuild/) - we illustrate how to implement a custom metric with as few as one line of code (plus some boilerplates). http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/docs/manual/source/evaluation/metricbuild.html.md ---------------------------------------------------------------------- diff --git a/docs/manual/source/evaluation/metricbuild.html.md b/docs/manual/source/evaluation/metricbuild.html.md index 7560a28..ea9759c 100644 --- a/docs/manual/source/evaluation/metricbuild.html.md +++ b/docs/manual/source/evaluation/metricbuild.html.md @@ -97,11 +97,11 @@ negative cases. PredictionIO provides a helper class `OptionAverageMetric` allows user to specify *don't care* values as `None`. It only aggregates the non-None values. -Lines 3 to 4 is the method signature of `calcuate` method. The key difference +Lines 3 to 4 is the method signature of `calculate` method. The key difference is that the return value is a `Option[Double]`, in contrast to `Double` for `AverageMetric`. This class only computes the average of `Some(.)` results. Lines 5 to 13 are the actual logic. The first `if` factors out the -positively predicted case, and the computation is similiar to the accuracy +positively predicted case, and the computation is similar to the accuracy metric. 
The negatively predicted case are the *don't cares*, which we return `None`. http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/docs/manual/source/evaluation/paramtuning.html.md ---------------------------------------------------------------------- diff --git a/docs/manual/source/evaluation/paramtuning.html.md b/docs/manual/source/evaluation/paramtuning.html.md index 7046bed..8c28486 100644 --- a/docs/manual/source/evaluation/paramtuning.html.md +++ b/docs/manual/source/evaluation/paramtuning.html.md @@ -278,7 +278,7 @@ validation set, `EvaluationInfo` can be used to hold some global evaluation data ; it is not used in the current example. Lines 11 to 41 is the logic of reading and transforming data from the -datastore; it is equvialent to the existing `readTraining` method. After line +datastore; it is equivalent to the existing `readTraining` method. After line 41, the variable `labeledPoints` contains the complete dataset with which we use to generate the (training, validation) sequence. http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/docs/manual/source/install/install-vagrant.html.md.erb ---------------------------------------------------------------------- diff --git a/docs/manual/source/install/install-vagrant.html.md.erb b/docs/manual/source/install/install-vagrant.html.md.erb index 558d9ba..9122d56 100644 --- a/docs/manual/source/install/install-vagrant.html.md.erb +++ b/docs/manual/source/install/install-vagrant.html.md.erb @@ -64,7 +64,7 @@ INFO: When you run `vagrant up` for the first time, it will download the base box ubuntu/trusty64 if you don't have it. Then it will also install all necessary libraries and setup PredictionIO in the virtual machine. -When it finishes successfully, you should see somthing like the following: +When it finishes successfully, you should see something like the following: ``` ==> default: Installation done! 
@@ -112,7 +112,7 @@ $ vagrant halt ``` WARNING: If you didn't shut down VM properly or you ran `vagrant suspend`, the -VM may go to suspend state. HBase may not be running propoerly next time when +VM may go to suspend state. HBase may not be running properly next time when you run `vagrant up.` In this case, you can always run `vagrant halt` to do a clean shutdown first before run `vagrant up` again. http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/docs/manual/source/install/launch-aws.html.md.erb ---------------------------------------------------------------------- diff --git a/docs/manual/source/install/launch-aws.html.md.erb b/docs/manual/source/install/launch-aws.html.md.erb index 3c2ed76..47f8f8f 100644 --- a/docs/manual/source/install/launch-aws.html.md.erb +++ b/docs/manual/source/install/launch-aws.html.md.erb @@ -40,7 +40,7 @@ You should see the following screen after you have logged in. ![alt text](../images/awsm-product.png) -Under the big yellow "Continue" botton, select the region where you want to +Under the big yellow "Continue" button, select the region where you want to launch the PredictionIO EC2 instance, then click "Continue". ![alt text](../images/awsm-1click.png) http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/docs/manual/source/machinelearning/dimensionalityreduction.html.md ---------------------------------------------------------------------- diff --git a/docs/manual/source/machinelearning/dimensionalityreduction.html.md b/docs/manual/source/machinelearning/dimensionalityreduction.html.md index 4c8f57e..f496bb4 100644 --- a/docs/manual/source/machinelearning/dimensionalityreduction.html.md +++ b/docs/manual/source/machinelearning/dimensionalityreduction.html.md @@ -176,7 +176,7 @@ The data is now in the event server. ## Principal Component Analysis -PCA begins with the data matrix \\(\bf X\\) whose rows are feature vectors corresponding to a set of observations. 
In our case, each row represents the pixel information of the corresponding hand-written numerc digit image. The model then computes the [covariance matrix](https://en.wikipedia.org/wiki/Covariance_matrix) estimated from the data matrix \\(\bf X\\). The algorithm then takes the covariance matrix and computes the [eigenvectors](https://en.wikipedia.org/wiki/Eigenvalues_and_eigenvectors) that correspond to its \\(k\\) (some integer) largest [eigenvalues](https://en.wikipedia.org/wiki/Eigenvalues_and_eigenvectors). The data matrix is then mapped to the space generated by these \\(k\\) vectors, which are called the \\(k\\) **ptincipal components** of \\(\bf X\\). What this is doing is mapping the data observations into a lower-dimensional space that explains the largest variability in the data (contains the most information). The algorithm for implementing PCA is listed as follows: +PCA begins with the data matrix \\(\bf X\\) whose rows are feature vectors corresponding to a set of observations. In our case, each row represents the pixel information of the corresponding hand-written numeric digit image. The model then computes the [covariance matrix](https://en.wikipedia.org/wiki/Covariance_matrix) estimated from the data matrix \\(\bf X\\). The algorithm then takes the covariance matrix and computes the [eigenvectors](https://en.wikipedia.org/wiki/Eigenvalues_and_eigenvectors) that correspond to its \\(k\\) (some integer) largest [eigenvalues](https://en.wikipedia.org/wiki/Eigenvalues_and_eigenvectors). The data matrix is then mapped to the space generated by these \\(k\\) vectors, which are called the \\(k\\) **principal components** of \\(\bf X\\). What this is doing is mapping the data observations into a lower-dimensional space that explains the largest variability in the data (contains the most information). 
The algorithm for implementing PCA is listed as follows: ### PCA Algorithm http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/docs/manual/source/resources/faq.html.md ---------------------------------------------------------------------- diff --git a/docs/manual/source/resources/faq.html.md b/docs/manual/source/resources/faq.html.md index b999f58..e9bcf99 100644 --- a/docs/manual/source/resources/faq.html.md +++ b/docs/manual/source/resources/faq.html.md @@ -81,7 +81,7 @@ Storage Backend Connections 2015-02-03 18:40:04,812 ERROR zookeeper.ZooKeeperWatcher - hconnection-0x1e4075ce, quorum=localhost:2181, baseZNode=/hbase Received unexpected KeeperException, re-throwing exception org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid ... -2015-02-03 18:40:07,021 ERROR hbase.StorageClient - Failed to connect to HBase. Plase check if HBase is running properly. +2015-02-03 18:40:07,021 ERROR hbase.StorageClient - Failed to connect to HBase. Please check if HBase is running properly. 2015-02-03 18:40:07,026 ERROR storage.Storage$ - Error initializing storage client for source HBASE 2015-02-03 18:40:07,027 ERROR storage.Storage$ - Can't connect to ZooKeeper java.util.NoSuchElementException: None.get http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/docs/manual/source/resources/glossary.html.md ---------------------------------------------------------------------- diff --git a/docs/manual/source/resources/glossary.html.md b/docs/manual/source/resources/glossary.html.md index 2667417..700c5ab 100644 --- a/docs/manual/source/resources/glossary.html.md +++ b/docs/manual/source/resources/glossary.html.md @@ -34,7 +34,7 @@ Algorithm, [S] Serving, [E] Evaluation Metrics. **EngineClient** - Part of PredictionSDK. It sends queries to a deployed engine instance through -the Engine API and retrives prediction results. +the Engine API and retrieves prediction results. 
**Event API** - Please see Event Server. http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/docs/manual/source/resources/intellij.html.md.erb ---------------------------------------------------------------------- diff --git a/docs/manual/source/resources/intellij.html.md.erb b/docs/manual/source/resources/intellij.html.md.erb index aa10743..c54c440 100644 --- a/docs/manual/source/resources/intellij.html.md.erb +++ b/docs/manual/source/resources/intellij.html.md.erb @@ -220,7 +220,7 @@ the following. You can execute a query with the correct SDK. For a recommender that has been trained with the sample MovieLens dataset perhaps the easiest query is a `curl` -one. Start by running or debuging your `pio deploy` config so the service is +one. Start by running or debugging your `pio deploy` config so the service is waiting for the query. Then go to the "Terminal" tab at the very bottom of the IntelliJ IDEA window and enter the `curl` request: http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/docs/manual/source/support/index.html.md.erb ---------------------------------------------------------------------- diff --git a/docs/manual/source/support/index.html.md.erb b/docs/manual/source/support/index.html.md.erb index ab8691c..d7be67a 100644 --- a/docs/manual/source/support/index.html.md.erb +++ b/docs/manual/source/support/index.html.md.erb @@ -21,8 +21,8 @@ limitations under the License. ## Community Support -Apahce PredictionIO has a welcoming and active community. We are -here to support you and make sure that you can use Apahce PredictionIO +Apache PredictionIO has a welcoming and active community. We are +here to support you and make sure that you can use Apache PredictionIO successfully. If you are a user, please subscribe to our user mailing list. 
http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/docs/manual/source/templates/classification/quickstart.html.md.erb ---------------------------------------------------------------------- diff --git a/docs/manual/source/templates/classification/quickstart.html.md.erb b/docs/manual/source/templates/classification/quickstart.html.md.erb index f596645..56141f2 100644 --- a/docs/manual/source/templates/classification/quickstart.html.md.erb +++ b/docs/manual/source/templates/classification/quickstart.html.md.erb @@ -191,7 +191,7 @@ client.createEvent(event); Note that you can also set the properties for the user with multiple `$set` events (They will be aggregated during engine training). -To set properties "attr0", "attr1" and "attr2", and "plan" for user "u1" at different time, you can send follwing `$set` events for the user. To send these events, run the following `curl` command: +To set properties "attr0", "attr1" and "attr2", and "plan" for user "u1" at different time, you can send following `$set` events for the user. To send these events, run the following `curl` command: <div class="tabs"> <div data-tab="REST API" data-lang="json"> http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/docs/manual/source/templates/complementarypurchase/dase.html.md.erb ---------------------------------------------------------------------- diff --git a/docs/manual/source/templates/complementarypurchase/dase.html.md.erb b/docs/manual/source/templates/complementarypurchase/dase.html.md.erb index 75a9193..884c4a1 100644 --- a/docs/manual/source/templates/complementarypurchase/dase.html.md.erb +++ b/docs/manual/source/templates/complementarypurchase/dase.html.md.erb @@ -238,7 +238,7 @@ case class AlgorithmParams( Parameter description: - **basketWindow**: The buy event is considered as the same basket as previous one if the time difference is within this window (in unit of seconds). 
For example, if it's set to 120, it means that if the user buys item B within 2 minutes of previous purchase (item A), then the item set [A, B] is considered as the same basket. The purchase of this *basket* is referred as one *transaction*.
-- **maxRuleLength**: The maximum length of the association rule length. Must be at least 2. For example, rule of "A implies B" has length of 2 while rule "A, B implies C" has a length of 3. Increasing this number will incrase the training time significantly because more combinations are considered.
+- **maxRuleLength**: The maximum length of the association rule length. Must be at least 2. For example, rule of "A implies B" has length of 2 while rule "A, B implies C" has a length of 3. Increasing this number will increase the training time significantly because more combinations are considered.
 - **minSupport**: The minimum required *support* for the item set to be considered as rule (valid range is 0 to 1). It's the percentage of the item set appearing among all transactions. This is used to filter out infrequent item set. For example, setting to 0.1 means that the item set must appear in 10 % of all transactions.
 - **minConfidence**: The minimum *confidence* required for the rules (valid range is 0 to 1). The confidence indicates the probability of the condition and conseuquence appear in the same transaction. For example, if A appears in 30 transactions and the item set [A, B] appears in 20 transactions, then the rule "A implies B" has confidence of 0.66.
 - **minLift**: The minimum *lift* required for the rule. It should be set to 1 to find high quality rule. It's the confidence of the rule divided by the support of the consequence. It is used to filter out rules that the consequence is very frequent anyway regardless of the condition.
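
The three thresholds in the parameter list above (`minSupport`, `minConfidence`, `minLift`) can be illustrated with a small Scala sketch. This is not the template's code: `RuleMetrics`, its method names, and the sample data are hypothetical, chosen so that A appears in 30 transactions and [A, B] in 20, as in the `minConfidence` example.

```scala
// Hypothetical helper for illustration only; not part of the template.
object RuleMetrics {
  type Basket = Set[String]

  // Assumed sample data: 100 transactions in total.
  // 20 contain [A, B], 10 contain only A, 30 contain only B, 40 contain only C.
  val sample: Seq[Basket] =
    Seq.fill(20)(Set("A", "B")) ++
      Seq.fill(10)(Set("A")) ++
      Seq.fill(30)(Set("B")) ++
      Seq.fill(40)(Set("C"))

  /** Fraction of transactions that contain every item in `items`. */
  def support(transactions: Seq[Basket], items: Set[String]): Double =
    transactions.count(t => items.subsetOf(t)).toDouble / transactions.size

  /** support(condition and consequence together) / support(condition). */
  def confidence(transactions: Seq[Basket], cond: Set[String], cons: Set[String]): Double =
    support(transactions, cond ++ cons) / support(transactions, cond)

  /** confidence of the rule / support of the consequence. */
  def lift(transactions: Seq[Basket], cond: Set[String], cons: Set[String]): Double =
    confidence(transactions, cond, cons) / support(transactions, cons)

  def main(args: Array[String]): Unit = {
    val c = confidence(sample, Set("A"), Set("B")) // 20 / 30, about 0.67
    val l = lift(sample, Set("A"), Set("B"))       // about 0.67 / 0.50 = 1.33
    println(f"confidence(A => B) = $c%.2f")
    println(f"lift(A => B) = $l%.2f")
  }
}
```

With this data the rule "A implies B" would pass a `minLift` of 1, because B follows A more often than B's overall frequency (0.50) alone would suggest.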
http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/docs/manual/source/templates/complementarypurchase/quickstart.html.md.erb
----------------------------------------------------------------------
diff --git a/docs/manual/source/templates/complementarypurchase/quickstart.html.md.erb b/docs/manual/source/templates/complementarypurchase/quickstart.html.md.erb
index 09a4e15..6c28825 100644
--- a/docs/manual/source/templates/complementarypurchase/quickstart.html.md.erb
+++ b/docs/manual/source/templates/complementarypurchase/quickstart.html.md.erb
@@ -211,7 +211,7 @@ client.createEvent(buyEvent);
 <%= partial 'shared/quickstart/import_sample_data' %>
-A Python import script `import_eventserver.py` is provided to import sample data. The script generates some frequent item sets (prefix with "s"), some other random items (prefix with "i") and a few popular items (prefix with "p"). Then each user (with user ID "u1" to "u10") performs 5 buy transactions (buy events are within 10 seconds in each transcation). In each transcation, the user may or may not buy some random items, always buy one of the popular items and buy 2 or more items in one of the frequent item sets.
+A Python import script `import_eventserver.py` is provided to import sample data. The script generates some frequent item sets (prefix with "s"), some other random items (prefix with "i") and a few popular items (prefix with "p"). Then each user (with user ID "u1" to "u10") performs 5 buy transactions (buy events are within 10 seconds in each transaction). In each transaction, the user may or may not buy some random items, always buy one of the popular items and buy 2 or more items in one of the frequent item sets.
 <%= partial 'shared/quickstart/install_python_sdk' %>
@@ -329,7 +329,7 @@ JsonObject response = engineClient.sendQuery(ImmutableMap.<String, Object>of(
 </div>
 </div>
-The following is sample JSON response.
The `cond` field is one of the combination of query items used as condition to determine other frequently bought items with this condition, followed by top items. If there are multiple conditions with recommended items found, the `rules` array will contain mutliple elements, and each correspond to the condition.
+The following is sample JSON response. The `cond` field is one of the combination of query items used as condition to determine other frequently bought items with this condition, followed by top items. If there are multiple conditions with recommended items found, the `rules` array will contain multiple elements, and each correspond to the condition.
 ```
 {

http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/docs/manual/source/templates/ecommercerecommendation/dase.html.md.erb
----------------------------------------------------------------------
diff --git a/docs/manual/source/templates/ecommercerecommendation/dase.html.md.erb b/docs/manual/source/templates/ecommercerecommendation/dase.html.md.erb
index baa0017..0c1e399 100644
--- a/docs/manual/source/templates/ecommercerecommendation/dase.html.md.erb
+++ b/docs/manual/source/templates/ecommercerecommendation/dase.html.md.erb
@@ -153,7 +153,7 @@ In ***engine.json***:
 }
 ```
-In `readTraining()`, `PEventStore` is an object which provides function to access dataa that is collected by PredictionIO Event Server.
+In `readTraining()`, `PEventStore` is an object which provides function to access data that is collected by PredictionIO Event Server.
 This E-Commerce Recommendation Engine Template requires "user" and "item" entities that are set by events.
@@ -364,9 +364,9 @@ case class ECommAlgorithmParams(
 Parameter description:
 - **appName**: Your App name. Events defined by "seenEvents" and "similarEvents" will be read from this app during `predict`.
-- **unseenOnly**: true or false. Set to true if you want to recommmend unseen items only.
Seen items are defined by *seenEvents* which mean if the user has these events on the items, then it's treated as *seen*.
+- **unseenOnly**: true or false. Set to true if you want to recommend unseen items only. Seen items are defined by *seenEvents* which mean if the user has these events on the items, then it's treated as *seen*.
 - **seenEvents**: A list of user-to-item events which will be treated as *seen* events. Used when *unseenOnly* is set to true.
-- **similarEvents**: A list of user-item-item events which will be used to find similar items to the items which the user has performend these events on.
+- **similarEvents**: A list of user-item-item events which will be used to find similar items to the items which the user has performed these events on.
 - **rank**: Parameter of the MLlib ALS algorithm. Number of latent features.
 - **numIterations**: Parameter of the MLlib ALS algorithm. Number of iterations.
 - **lambda**: Regularization parameter of the MLlib ALS algorithm.
@@ -480,7 +480,7 @@ Then convert the user and item String ID in each ViewEvent to Int with these BiM
 ```
-NOTE: You can customize this function if you want to convert other events to MLlibRating or need different ways to aggreagte the events into MLlibRating.
+NOTE: You can customize this function if you want to convert other events to MLlibRating or need different ways to aggregate the events into MLlibRating.
 In addition to `RDD[MLlibRating]`, `ALS.trainImplicit` takes the following parameters: *rank*, *iterations*, *lambda* and *seed*.
@@ -514,11 +514,11 @@ The parameters `appName`, `unseenOnly`, `seenEvents` and `similarEvents` are use
 PredictionIO will automatically loads these values into the constructor `ap`, which has a corresponding case class `ECommAlgorithmParams`.
-The `seed` parameter is an optional parameter, which is used by MLlib ALS algorithm internally to generate random values.
If the `seed` is not specified, current system time would be used and hence each train may produce different reuslts. Specify a fixed value for the `seed` if you want to have deterministic result (For example, when you are testing).
+The `seed` parameter is an optional parameter, which is used by MLlib ALS algorithm internally to generate random values. If the `seed` is not specified, current system time would be used and hence each train may produce different results. Specify a fixed value for the `seed` if you want to have deterministic result (For example, when you are testing).
 `ALS.trainImplicit()` returns a `MatrixFactorizationModel` model which contains two RDDs: userFeatures and productFeatures. They correspond to the user X latent features matrix and item X latent features matrix, respectively.
-In addition to the latent feature vector, the item properties (e.g. categories) and popular count are also used during `predict()`. Hence, we also save these data along with the feature vector by joining them and then collect the data as local Map. Each item is represented by a `ProductModel` class, which cosists of the `item` information, `features` calculated by ALS, and `count` returned by `trainDefault()`.
+In addition to the latent feature vector, the item properties (e.g. categories) and popular count are also used during `predict()`. Hence, we also save these data along with the feature vector by joining them and then collect the data as local Map. Each item is represented by a `ProductModel` class, which consists of the `item` information, `features` calculated by ALS, and `count` returned by `trainDefault()`.
 ```scala
@@ -581,16 +581,16 @@ http://localhost:8000/queries.json. PredictionIO converts the query, such as `{
 We can use the userFeatures and productFeatures stored in ECommModel to calculate the scores of items for the user.
-This template also supports additional business logic features, such as filtering items by categories, recommending items in the white list, excluding items in the black list, recommend unseen items only, and exclude unavaiable items defined in constraint event.
+This template also supports additional business logic features, such as filtering items by categories, recommending items in the white list, excluding items in the black list, recommend unseen items only, and exclude unavailable items defined in constraint event.
 The `predict()` function does the following:
-1. Convert the item in query's whilteList from string ID to integer index
-2. Get a list seen items by the user (defined by parmater `seenEvents`)
+1. Convert the item in query's whiteList from string ID to integer index
+2. Get a list seen items by the user (defined by parameter `seenEvents`)
 3. Get the latest unavailableItems which is used to exclude unavailable items for all users
 4. Combine query's blackList, seenItems, and unavailableItems into a final black list of items to be excluded from recommendation.
 5. Get the user feature vector from the ECommModel.
-6. If there is feature vector for the user, recommend top N items based on the user feature and prodcut features.
+6. If there is feature vector for the user, recommend top N items based on the user feature and product features.
 7. If there is no feature vector for the user, use the recent items acted by the user (defined by `similarEvents` parameter) to recommend similar items.
 8. If there is no recent `similarEvents` available for the user, popular items are then recommended (added in template version 0.4.0).
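
Steps 4 to 8 of the list above amount to a set union followed by a three-way fallback. The Scala sketch below shows only that control flow, under stated assumptions: `PredictFlow`, `Strategy`, and both function names are hypothetical, and the template's real `predict()` works on its own model and query types rather than on plain sets.

```scala
// Hypothetical sketch of the decision order only; not the template's code.
object PredictFlow {
  sealed trait Strategy
  case object KnownUser extends Strategy       // step 6: score with user and product features
  case object SimilarToRecent extends Strategy // step 7: items similar to recent ones
  case object Popular extends Strategy         // step 8: fall back to popular items

  /** Step 4: everything that must never be recommended, as integer indices. */
  def finalBlackList(
      queryBlackList: Set[Int],
      seenItems: Set[Int],
      unavailableItems: Set[Int]): Set[Int] =
    queryBlackList ++ seenItems ++ unavailableItems

  /** Steps 5 to 8: pick the recommendation path for one user. */
  def chooseStrategy(
      userFeature: Option[Array[Double]],
      recentItems: Set[Int]): Strategy =
    userFeature match {
      case Some(_)                      => KnownUser
      case None if recentItems.nonEmpty => SimilarToRecent
      case None                         => Popular
    }
}
```

Modelling the missing feature vector as an `Option` keeps the fallback order explicit: a new user with no events at all ends up in the `Popular` branch.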
@@ -616,7 +616,7 @@ INFO: You can easily modify `isCandidate()` checking or related logic if you hav
 // generate final blackList based on additional constraints
 val finalBlackList: Set[Int] = genBlackList(query = query)
- // convert seen Items list from String ID to interger Index
+ // convert seen Items list from String ID to integer Index
 .flatMap(x => model.itemStringIntMap.get(x))
 // look up user feature from model

http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/docs/manual/source/templates/ecommercerecommendation/quickstart.html.md.erb
----------------------------------------------------------------------
diff --git a/docs/manual/source/templates/ecommercerecommendation/quickstart.html.md.erb b/docs/manual/source/templates/ecommercerecommendation/quickstart.html.md.erb
index 2f2670c..0f0fdd9 100644
--- a/docs/manual/source/templates/ecommercerecommendation/quickstart.html.md.erb
+++ b/docs/manual/source/templates/ecommercerecommendation/quickstart.html.md.erb
@@ -46,7 +46,7 @@ INFO: This template can easily be customized to consider more user events such a
 The *view* events are used as Training Data to train the model. The algorithm has a parameter *unseenOnly*; when this parameter is set to true, the engine would recommend unseen items only. You can specify a list of events which are considered as *seen* events with the algorithm parameter *seenEvents*. The default values are *view* and *buy* events, which means that the engine by default recommends un-viewed and un-bought items only. You can also define your own events which are considered as *seen*.
-The constraint *unavailableItems* set events are used to exclude a list of unaviable items (such as out of stock) for all users in real time.
+The constraint *unavailableItems* set events are used to exclude a list of unavailable items (such as out of stock) for all users in real time.
 ### Input Query
@@ -80,7 +80,7 @@ Likewise, if a blacklist is provided, the engine will exclude those products in
 Next, let's collect training data for this Engine. By default, the E-Commerce Recommendation Engine Template supports 2 types of entities and 2 events: **user** and
-**item**; events **view** and **buy**. An item has the **categories** property, which is a list of category names (String). A user can view and buy an item. The specical **constraint** entiy with entityId **unavailableItems** defines a list of unavailable items and is taken into account in realtime during serving.
+**item**; events **view** and **buy**. An item has the **categories** property, which is a list of category names (String). A user can view and buy an item. The special **constraint** entity with entityId **unavailableItems** defines a list of unavailable items and is taken into account in realtime during serving.
 In summary, this template requires '$set' user event, '$set' item event, user-view-item events, user-buy-item event and '$set' constraint event.
@@ -620,7 +620,7 @@ client.createEvent(itemEvent)
 </div>
-Try to get recommendation for user *u1* again, the unavaiable items (e.g. i4, i14, i11). won't be recommended anymore:
+Try to get recommendation for user *u1* again, the unavailable items (e.g. i4, i14, i11).
won't be recommended anymore:
 ```
 $ curl -H "Content-Type: application/json" \

http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/docs/manual/source/templates/javaecommercerecommendation/dase.html.md.erb
----------------------------------------------------------------------
diff --git a/docs/manual/source/templates/javaecommercerecommendation/dase.html.md.erb b/docs/manual/source/templates/javaecommercerecommendation/dase.html.md.erb
index 69ef622..4c16cb0 100644
--- a/docs/manual/source/templates/javaecommercerecommendation/dase.html.md.erb
+++ b/docs/manual/source/templates/javaecommercerecommendation/dase.html.md.erb
@@ -327,9 +327,9 @@ public class AlgorithmParams implements Params{
 Parameter description:
 - **appName**: Your App name. Events defined by "seenItemEvents" and "similarItemEvents" will be read from this app during `predict`.
-- **unseenOnly**: true or false. Set to true if you want to recommmend unseen items only. Seen items are defined by *seenItemEvents* which mean if the user has these events on the items, then it's treated as *seen*.
+- **unseenOnly**: true or false. Set to true if you want to recommend unseen items only. Seen items are defined by *seenItemEvents* which mean if the user has these events on the items, then it's treated as *seen*.
 - **seenItemEvents**: A list of user-to-item events which will be treated as *seen* events. Used when *unseenOnly* is set to true.
-- **similarItemEvents**: A list of user-item-item events which will be used to find similar items to the items which the user has performend these events on.
+- **similarItemEvents**: A list of user-item-item events which will be used to find similar items to the items which the user has performed these events on.
 - **rank**: Parameter of the MLlib ALS algorithm. Number of latent features.
 - **iteration**: Parameter of the MLlib ALS algorithm. Number of iterations.
 - **lambda**: Regularization parameter of the MLlib ALS algorithm.
http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/docs/manual/source/templates/leadscoring/dase.html.md.erb
----------------------------------------------------------------------
diff --git a/docs/manual/source/templates/leadscoring/dase.html.md.erb b/docs/manual/source/templates/leadscoring/dase.html.md.erb
index 6cee64c..27fd3bc 100644
--- a/docs/manual/source/templates/leadscoring/dase.html.md.erb
+++ b/docs/manual/source/templates/leadscoring/dase.html.md.erb
@@ -180,7 +180,7 @@ In `readTraining()`, `PEventStore` is an object which provides function to acces
 This Lead Scoring Engine Template requires "view" and "buy" events with `sessionId` in event property.
-`PEventStore.find(...)` specifies the events that you want to read. In this case, "user view page" and "user buy item" events are read and then each is mapped to tuple of (sessionId, event). The event are then "cogrouped" by sessionId to find out the information in the session, such as first page view (landing page view), and whether the user converts (buy event), to craete a RDD of Session as TrainingData:
+`PEventStore.find(...)` specifies the events that you want to read. In this case, "user view page" and "user buy item" events are read and then each is mapped to tuple of (sessionId, event). The event are then "cogrouped" by sessionId to find out the information in the session, such as first page view (landing page view), and whether the user converts (buy event), to create a RDD of Session as TrainingData:
 ```scala
 case class Session(
@@ -223,7 +223,7 @@ The `LabeledPoint` class is defined in Spark MLlib and it's required for the Ran
 By default, the feature used for classification is "landingPage", "referrer" and "browser". Since these features contain categorical values, we need to create a map of categorical values to the integer values for the algorithm to use.
-NOTE: You can customize the tempate to use other features.
+NOTE: You can customize the template to use other features.
 For example, if the feature "landingPage" can be any of the following values: "page1", "page2", "page3", "page4". We can create a categorical Int value Map, such as:
@@ -238,7 +238,7 @@ Map(
 Instead of manually create such Map, a helper method `createCategoricalIntMap()` is defined in **Prepraator.scala** for this purpose.
-Each `labeledPoint` is a label and a feature vector. The element index of the vector for the coresponding feature is defined by `featureIndex` Map. By default, it's defined as
+Each `labeledPoint` is a label and a feature vector. The element index of the vector for the corresponding feature is defined by `featureIndex` Map. By default, it's defined as
 ```scala
 val featureIndex = Map(

http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/docs/manual/source/templates/leadscoring/quickstart.html.md.erb
----------------------------------------------------------------------
diff --git a/docs/manual/source/templates/leadscoring/quickstart.html.md.erb b/docs/manual/source/templates/leadscoring/quickstart.html.md.erb
index 5df6e13..5f93020 100644
--- a/docs/manual/source/templates/leadscoring/quickstart.html.md.erb
+++ b/docs/manual/source/templates/leadscoring/quickstart.html.md.erb
@@ -192,7 +192,7 @@ client.createEvent(viewEvent);
 </div>
-In the same browing session "akdj230fj8ass", the user with ID u0 buys an item i0 on time `2014-11-02T09:42:00.123-08:00` (current time will be used if eventTime is not specified), you can send the following buy event. Run the following `curl` command:
+In the same browsing session "akdj230fj8ass", the user with ID u0 buys an item i0 on time `2014-11-02T09:42:00.123-08:00` (current time will be used if eventTime is not specified), you can send the following buy event.
Run the following `curl` command:
 <div class="tabs">
 <div data-tab="REST API" data-lang="json">

http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/docs/manual/source/templates/productranking/dase.html.md.erb
----------------------------------------------------------------------
diff --git a/docs/manual/source/templates/productranking/dase.html.md.erb b/docs/manual/source/templates/productranking/dase.html.md.erb
index ee42b95..f6f19bc 100644
--- a/docs/manual/source/templates/productranking/dase.html.md.erb
+++ b/docs/manual/source/templates/productranking/dase.html.md.erb
@@ -443,7 +443,7 @@ case class ALSAlgorithmParams(
 seed: Option[Long]) extends Params
 ```
-The `seed` parameter is an optional parameter, which is used by MLlib ALS algorithm internally to generate random values. If the `seed` is not specified, current system time would be used and hence each train may produce different reuslts. Specify a fixed value for the `seed` if you want to have deterministic result (For example, when you are testing).
+The `seed` parameter is an optional parameter, which is used by MLlib ALS algorithm internally to generate random values. If the `seed` is not specified, current system time would be used and hence each train may produce different results. Specify a fixed value for the `seed` if you want to have deterministic result (For example, when you are testing).
 `ALS.trainImplicit()` then returns a `MatrixFactorizationModel` model which contains two RDDs: userFeatures and productFeatures. They correspond to the user X latent features matrix and item X latent features matrix, respectively. In this case, we will make use of both userFeatures and productFeatures matrix to rank the items for the user. These matrixes are stored as local model.
You could see the `ALSModel` class is defined as:

http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/docs/manual/source/templates/recommendation/batch-evaluator.html.md
----------------------------------------------------------------------
diff --git a/docs/manual/source/templates/recommendation/batch-evaluator.html.md b/docs/manual/source/templates/recommendation/batch-evaluator.html.md
index d5eb8b2..e754f29 100644
--- a/docs/manual/source/templates/recommendation/batch-evaluator.html.md
+++ b/docs/manual/source/templates/recommendation/batch-evaluator.html.md
@@ -19,7 +19,7 @@ See the License for the specific language governing permissions and
 limitations under the License.
 -->
-This how-to tutorial would explain how you can also use `$pio eval` to persist predicted result for a batch of queries. Please read the [Evaluation](/templates/recommendation/evaluation/) to understand the usage of DataSoure's `readEval()` and the Evaluation component first.
+This how-to tutorial would explain how you can also use `$pio eval` to persist predicted result for a batch of queries. Please read the [Evaluation](/templates/recommendation/evaluation/) to understand the usage of DataSource's `readEval()` and the Evaluation component first.
 WARNING: This tutorial is based on some experimental and developer features, which may be changed in future release.
http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/docs/manual/source/templates/recommendation/dase.html.md.erb
----------------------------------------------------------------------
diff --git a/docs/manual/source/templates/recommendation/dase.html.md.erb b/docs/manual/source/templates/recommendation/dase.html.md.erb
index 32f1f8b..5735a97 100644
--- a/docs/manual/source/templates/recommendation/dase.html.md.erb
+++ b/docs/manual/source/templates/recommendation/dase.html.md.erb
@@ -350,7 +350,7 @@ case class ALSAlgorithmParams(
 seed: Option[Long]) extends Params
 ```
-The `seed` parameter is an optional parameter, which is used by MLlib ALS algorithm internally to generate random values. If the `seed` is not specified, current system time would be used and hence each train may produce different reuslts. Specify a fixed value for the `seed` if you want to have deterministic result (For example, when you are testing).
+The `seed` parameter is an optional parameter, which is used by MLlib ALS algorithm internally to generate random values. If the `seed` is not specified, current system time would be used and hence each train may produce different results. Specify a fixed value for the `seed` if you want to have deterministic result (For example, when you are testing).
 `ALS.train` then returns a `MatrixFactorizationModel` model which contains RDD data. RDD is a distributed collection of items which *does not* persist.
To

http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/docs/manual/source/templates/recommendation/evaluation.html.md.erb
----------------------------------------------------------------------
diff --git a/docs/manual/source/templates/recommendation/evaluation.html.md.erb b/docs/manual/source/templates/recommendation/evaluation.html.md.erb
index fc4ca43..6131790 100644
--- a/docs/manual/source/templates/recommendation/evaluation.html.md.erb
+++ b/docs/manual/source/templates/recommendation/evaluation.html.md.erb
@@ -115,8 +115,8 @@ Metrics:
 ```
-The console prints out the evaluation meric score of each engine params, and finally
-pretty print the optimal engine params. Amongs the 3 engine params we evaluate,
+The console prints out the evaluation metric score of each engine params, and finally
+pretty print the optimal engine params. Amongst the 3 engine params we evaluate,
 the best Prediction@k has a score of ~0.1521.
@@ -145,7 +145,7 @@ recommendation engine:
 - Definition of 'good'. We want to quantify if the engine is able to recommend items which the user likes, we need to define what is meant by 'good'. In this
-examle, we have two kinds of events: 'rate' and 'buy'. The 'rate' event is
+example, we have two kinds of events: 'rate' and 'buy'. The 'rate' event is
 associated with a rating value which ranges between 1 to 4, and the 'buy' event is mapped to a rating of 4. When we implement the metric, we have to specify a rating threshold, only the rating
@@ -160,7 +160,7 @@ that the final metric is only an approximation of the actual result.
 - Recommendation affects user behavior. Suppose you are a e-commerce company and would like to use the recommendation engine to personalize the landing page,
-the item you show in the langing page directly impacts what the user is going to
+the item you show in the landing page directly impacts what the user is going to
 purchase.
This is different from weather prediction, whatever the weather forecast engine predicts, tomorrow's weather won't be affected. Therefore, when we conduct offline evaluation for recommendation engines, it is possible that

http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/docs/manual/source/templates/similarproduct/dase.html.md.erb
----------------------------------------------------------------------
diff --git a/docs/manual/source/templates/similarproduct/dase.html.md.erb b/docs/manual/source/templates/similarproduct/dase.html.md.erb
index 030ee75..363a4c9 100644
--- a/docs/manual/source/templates/similarproduct/dase.html.md.erb
+++ b/docs/manual/source/templates/similarproduct/dase.html.md.erb
@@ -452,7 +452,7 @@ case class ALSAlgorithmParams(
 seed: Option[Long]) extends Params
 ```
-The `seed` parameter is an optional parameter, which is used by MLlib ALS algorithm internally to generate random values. If the `seed` is not specified, current system time would be used and hence each train may produce different reuslts. Specify a fixed value for the `seed` if you want to have deterministic result (For example, when you are testing).
+The `seed` parameter is an optional parameter, which is used by MLlib ALS algorithm internally to generate random values. If the `seed` is not specified, current system time would be used and hence each train may produce different results. Specify a fixed value for the `seed` if you want to have deterministic result (For example, when you are testing).
 `ALS.trainImplicit()` then returns a `MatrixFactorizationModel` model which contains two RDDs: userFeatures and productFeatures. They correspond to the user X latent features matrix and item X latent features matrix, respectively. In this case, we will make use of the productFeatures matrix to find similar products by comparing the similarity of the latent features.
Hence, we store this productFeatures as defined in `ALSModel` class:

http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/docs/manual/source/templates/similarproduct/multi-events-multi-algos.html.md.erb
----------------------------------------------------------------------
diff --git a/docs/manual/source/templates/similarproduct/multi-events-multi-algos.html.md.erb b/docs/manual/source/templates/similarproduct/multi-events-multi-algos.html.md.erb
index 1e21b49..7d3e06f 100644
--- a/docs/manual/source/templates/similarproduct/multi-events-multi-algos.html.md.erb
+++ b/docs/manual/source/templates/similarproduct/multi-events-multi-algos.html.md.erb
@@ -33,7 +33,7 @@ This example will demonstrate the following:
 - Use positive and negative implicit events such as like and dislike with MLlib ALS algorithm
 - Integrate multiple algorithms into one engine
-The complete source code of this examlpe can be found in [here](https://github.com/apache/predictionio/tree/develop/examples/scala-parallel-similarproduct/multi-events-multi-algos).
+The complete source code of this example can be found in [here](https://github.com/apache/predictionio/tree/develop/examples/scala-parallel-similarproduct/multi-events-multi-algos).
 ### Step 1. Read "like" and "dislike" events as TrainingData
@@ -41,7 +41,7 @@ Modify the following in DataSource.scala:
 - In addition to the original `ViewEvent` class, add a new class `LikeEvent` which has a boolean `like` field to represent it's like or dislike event.
 - Add a new field `likeEvents` into `TrainingData` class to store the `RDD[LikeEvent]`.
-- Modidy DataSource's `readTraining()` function to read "like" and "dislike" events from the Event Store.
+- Modify DataSource's `readTraining()` function to read "like" and "dislike" events from the Event Store.
 The modification is shown below:
@@ -397,7 +397,7 @@ Next, in order to train and deploy two algorithms for this engine, we also need
 ```
-INFO: You may notice that the parameters of the new `"likealgo"` contains the same fields as `"als"`. It is just becasuse the `LikeAlgorithm` class extends the original `ALSAlgorithm` class and shares the same algorithm parameter class definition. If the other algorithm you add has its own parameter class, you just need to specify them inside its `params` field accordingly.
+INFO: You may notice that the parameters of the new `"likealgo"` contains the same fields as `"als"`. It is just because the `LikeAlgorithm` class extends the original `ALSAlgorithm` class and shares the same algorithm parameter class definition. If the other algorithm you add has its own parameter class, you just need to specify them inside its `params` field accordingly.
 That's it! Now you have a engine configured with two algorithms.

http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/examples/scala-parallel-ecommercerecommendation/adjust-score/src/main/scala/ECommAlgorithm.scala
----------------------------------------------------------------------
diff --git a/examples/scala-parallel-ecommercerecommendation/adjust-score/src/main/scala/ECommAlgorithm.scala b/examples/scala-parallel-ecommercerecommendation/adjust-score/src/main/scala/ECommAlgorithm.scala
index ef49dc2..d63b090 100644
--- a/examples/scala-parallel-ecommercerecommendation/adjust-score/src/main/scala/ECommAlgorithm.scala
+++ b/examples/scala-parallel-ecommercerecommendation/adjust-score/src/main/scala/ECommAlgorithm.scala
@@ -250,7 +250,7 @@ class ECommAlgorithm(val ap: ECommAlgorithmParams)
 )
 val finalBlackList: Set[Int] = genBlackList(query = query)
- // convert seen Items list from String ID to interger Index
+ // convert seen Items list from String ID to integer Index
 .flatMap(x => model.itemStringIntMap.get(x))
 // ADDED
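
The comment corrected in the hunk above describes a common Scala idiom: `flatMap` over a lookup that returns `Option`, so that IDs missing from the map are dropped instead of throwing. A minimal sketch, assuming a plain immutable `Map` as a stand-in for `model.itemStringIntMap` (the `IdLookup` object and the sample IDs are hypothetical):

```scala
// Hypothetical stand-in; the template's model keeps its own String-to-Int mapping.
object IdLookup {
  val itemStringIntMap: Map[String, Int] = Map("i1" -> 0, "i2" -> 1, "i3" -> 2)

  /** Convert String IDs to integer indices, silently skipping unknown IDs. */
  def toIndices(ids: Set[String]): Set[Int] =
    ids.flatMap(x => itemStringIntMap.get(x)) // get returns Option[Int]; None vanishes

  def main(args: Array[String]): Unit = {
    // "i9" was never seen at training time, so it has no index and is dropped.
    println(toIndices(Set("i1", "i9", "i3")))
  }
}
```

Dropping unknown IDs is a reasonable choice for a black list: an item the model has never seen cannot be recommended anyway, so there is nothing to exclude.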
http://git-wip-us.apache.org/repos/asf/predictionio/blob/6c607aa2/examples/scala-parallel-ecommercerecommendation/train-with-rate-event/src/main/scala/ECommAlgorithm.scala
----------------------------------------------------------------------
diff --git a/examples/scala-parallel-ecommercerecommendation/train-with-rate-event/src/main/scala/ECommAlgorithm.scala b/examples/scala-parallel-ecommercerecommendation/train-with-rate-event/src/main/scala/ECommAlgorithm.scala
index 597bb50..e2c7224 100644
--- a/examples/scala-parallel-ecommercerecommendation/train-with-rate-event/src/main/scala/ECommAlgorithm.scala
+++ b/examples/scala-parallel-ecommercerecommendation/train-with-rate-event/src/main/scala/ECommAlgorithm.scala
@@ -251,7 +251,7 @@ class ECommAlgorithm(val ap: ECommAlgorithmParams)
 )
 val finalBlackList: Set[Int] = genBlackList(query = query)
- // convert seen Items list from String ID to interger Index
+ // convert seen Items list from String ID to integer Index
 .flatMap(x => model.itemStringIntMap.get(x))
 val userFeature: Option[Array[Double]] =