Ananda Verma created AMBARI-18622:
Summary: Integrate PredictionIO (Machine Learning Engine) With
Issue Type: New Feature
Affects Versions: 2.1.0
Reporter: Ananda Verma
Feature includes adding support for Apache PredictionIO cluster provisioning
In general, PredictionIO (pio) can be defined as a service in HDP which has the following
1) Event Server - stores events (data)
2) Engine - The engine is responsible for making predictions. It contains one or
more machine learning algorithms. An engine reads training data and builds
predictive model(s). It is then deployed as a web service. A deployed engine
responds to prediction queries from your application through a REST API in
PredictionIO also has external dependencies on the following -
1. HBase: Event Server uses Apache HBase as the data store. It stores imported
events. If you are not using the PredictionIO Event Server, you do not need to
2. Apache Spark: Spark is a large-scale data processing engine that powers the
algorithm, training, and serving processing.
3. HDFS: The output of training has two parts: a model and its metadata. The
model is then stored in HDFS or a local file system.
4. Elasticsearch: It stores metadata such as model versions, engine versions,
access key and app id mappings, evaluation results, etc.
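
To illustrate how an application might talk to the two components described above, here is a minimal Python sketch. It only builds the HTTP requests and does not send them. The host names, the ports (7070 for the Event Server, 8000 for a deployed engine), the access key, and the example event and query fields are assumptions chosen for illustration and are not taken from this issue; the query fields in particular depend on the engine template in use.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoints for illustration only; a real cluster provisioned
# through Ambari may use different hosts and ports.
EVENT_SERVER_URL = "http://localhost:7070"
ENGINE_URL = "http://localhost:8000"


def _json_post(url, payload):
    """Build (but do not send) a JSON POST request."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_event_request(access_key, event):
    """Request that would import one event into the Event Server."""
    query_string = urllib.parse.urlencode({"accessKey": access_key})
    return _json_post(f"{EVENT_SERVER_URL}/events.json?{query_string}", event)


def build_query_request(query):
    """Request that would send a prediction query to a deployed engine."""
    return _json_post(f"{ENGINE_URL}/queries.json", query)


# Hypothetical event and query; field values are placeholders.
example_event = {
    "event": "rate",
    "entityType": "user",
    "entityId": "u1",
    "targetEntityType": "item",
    "targetEntityId": "i1",
    "properties": {"rating": 4},
}
example_query = {"user": "u1", "num": 4}

event_request = build_event_request("HYPOTHETICAL_ACCESS_KEY", example_event)
query_request = build_query_request(example_query)
```

Passing either request object to urllib.request.urlopen would actually send it, which requires a running Event Server or deployed engine at the assumed addresses.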
This message was sent by Atlassian JIRA