[DOC] Merge doc source

* Merge doc source from: https://github.com/anyway1021/eagle-doc
* Uploaded to http://eagle.apache.org/docs/latest/

This allows contributors to send code patches along with the related docs in a single PR.

Author: Hao Chen <[email protected]>

Closes #901 from haoch/MergeDocSource.


Project: http://git-wip-us.apache.org/repos/asf/eagle/repo
Commit: http://git-wip-us.apache.org/repos/asf/eagle/commit/ee55054a
Tree: http://git-wip-us.apache.org/repos/asf/eagle/tree/ee55054a
Diff: http://git-wip-us.apache.org/repos/asf/eagle/diff/ee55054a

Branch: refs/heads/master
Commit: ee55054a7a5de8e20abecf0dc233c312a70d2e98
Parents: dd2c098
Author: Hao Chen <[email protected]>
Authored: Mon Apr 3 19:46:19 2017 +0800
Committer: Hao Chen <[email protected]>
Committed: Mon Apr 3 19:46:19 2017 +0800

----------------------------------------------------------------------
 .gitignore                                      |    1 +
 docs/README.md                                  |    2 +
 docs/bin/demo-service.sh                        |  127 ++
 docs/bin/doc-env.conf                           |    7 +
 docs/docs/applications.md                       |  378 ++++
 docs/docs/developing-application.md             |  285 +++
 docs/docs/getting-started.md                    |  233 +++
 docs/docs/hadoop-jmx-metrics-list.txt           | 1936 ++++++++++++++++++
 docs/docs/include/images/add_publisher.png      |  Bin 0 -> 24212 bytes
 docs/docs/include/images/alert_alerts.png       |  Bin 0 -> 153760 bytes
 .../docs/include/images/alert_define_policy.png |  Bin 0 -> 109258 bytes
 docs/docs/include/images/alert_details.png      |  Bin 0 -> 72058 bytes
 docs/docs/include/images/alert_engine.png       |  Bin 0 -> 142662 bytes
 .../images/alert_engine_coordination.png        |  Bin 0 -> 141897 bytes
 .../include/images/alert_engine_policy_spec.png |  Bin 0 -> 146211 bytes
 docs/docs/include/images/alert_policies.png     |  Bin 0 -> 52873 bytes
 docs/docs/include/images/configure_site.png     |  Bin 0 -> 91752 bytes
 docs/docs/include/images/dashboard.png          |  Bin 0 -> 35788 bytes
 .../include/images/define_jmx_alert_policy.png  |  Bin 0 -> 318973 bytes
 docs/docs/include/images/delete_icon.png        |  Bin 0 -> 4330 bytes
 docs/docs/include/images/eagle_arch_v0.5.0.png  |  Bin 0 -> 408403 bytes
 docs/docs/include/images/eagle_ecosystem.png    |  Bin 0 -> 274201 bytes
 .../docs/include/images/eagle_web_interface.png |  Bin 0 -> 72605 bytes
 docs/docs/include/images/edit_icon.png          |  Bin 0 -> 4224 bytes
 docs/docs/include/images/favicon.png            |  Bin 0 -> 4209 bytes
 .../include/images/hadoop_queue_monitor_1.png   |  Bin 0 -> 90402 bytes
 .../include/images/hadoop_queue_monitor_2.png   |  Bin 0 -> 250649 bytes
 .../include/images/hadoop_queue_monitor_3.png   |  Bin 0 -> 156044 bytes
 .../include/images/hadoop_queue_monitor_4.png   |  Bin 0 -> 162481 bytes
 .../include/images/hadoop_queue_monitor_5.png   |  Bin 0 -> 156116 bytes
 .../include/images/hadoop_queue_monitor_6.png   |  Bin 0 -> 93578 bytes
 .../include/images/hadoop_queue_monitor_7.png   |  Bin 0 -> 300928 bytes
 docs/docs/include/images/hdfs_audit_log.png     |  Bin 0 -> 183176 bytes
 docs/docs/include/images/hdfs_install_1.png     |  Bin 0 -> 397610 bytes
 docs/docs/include/images/hdfs_install_2.png     |  Bin 0 -> 292470 bytes
 docs/docs/include/images/hdfs_install_3.png     |  Bin 0 -> 264754 bytes
 docs/docs/include/images/hdfs_policy_1.png      |  Bin 0 -> 301293 bytes
 .../images/health_check_installation.png        |  Bin 0 -> 49680 bytes
 .../docs/include/images/health_check_policy.png |  Bin 0 -> 85461 bytes
 .../include/images/health_check_settings.png    |  Bin 0 -> 38682 bytes
 .../docs/include/images/health_check_stream.png |  Bin 0 -> 43105 bytes
 docs/docs/include/images/install_jmx_2.png      |  Bin 0 -> 301008 bytes
 docs/docs/include/images/install_jmx_3.png      |  Bin 0 -> 239960 bytes
 docs/docs/include/images/install_jmx_6.png      |  Bin 0 -> 226890 bytes
 .../include/images/integration_applications.png |  Bin 0 -> 98769 bytes
 docs/docs/include/images/integration_sites.png  |  Bin 0 -> 67355 bytes
 docs/docs/include/images/jpm.jpg                |  Bin 0 -> 40457 bytes
 docs/docs/include/images/jpm_configure.png      |  Bin 0 -> 360754 bytes
 docs/docs/include/images/jpm_define_policy.png  |  Bin 0 -> 769852 bytes
 docs/docs/include/images/jpm_streams.png        |  Bin 0 -> 333994 bytes
 docs/docs/include/images/new_site.png           |  Bin 0 -> 25714 bytes
 docs/docs/include/images/overview.png           |  Bin 0 -> 98222 bytes
 docs/docs/include/images/site_list.png          |  Bin 0 -> 76104 bytes
 docs/docs/include/images/start_icon.png         |  Bin 0 -> 4091 bytes
 docs/docs/include/images/stop_icon.png          |  Bin 0 -> 3872 bytes
 docs/docs/include/images/storage_engine.png     |  Bin 0 -> 70546 bytes
 docs/docs/index.html                            |    6 +
 docs/docs/index.md                              |   93 +
 docs/docs/reference.md                          |  325 +++
 docs/docs/underlying-design.md                  |  231 +++
 docs/docs/using-eagle.md                        |  347 ++++
 docs/eagle-theme/__init__.py                    |    0
 docs/eagle-theme/base.html                      |  118 ++
 docs/eagle-theme/breadcrumbs.html               |   25 +
 docs/eagle-theme/css/highlight.css              |  125 ++
 docs/eagle-theme/css/theme.css                  |   12 +
 docs/eagle-theme/css/theme_extra.css            |  150 ++
 docs/eagle-theme/fonts/fontawesome-webfont.eot  |  Bin 0 -> 37405 bytes
 docs/eagle-theme/fonts/fontawesome-webfont.svg  |  399 ++++
 docs/eagle-theme/fonts/fontawesome-webfont.ttf  |  Bin 0 -> 79076 bytes
 docs/eagle-theme/fonts/fontawesome-webfont.woff |  Bin 0 -> 43572 bytes
 docs/eagle-theme/footer.html                    |   23 +
 docs/eagle-theme/img/favicon.ico                |  Bin 0 -> 1150 bytes
 docs/eagle-theme/js/highlight.pack.js           |    2 +
 docs/eagle-theme/js/jquery-2.1.1.min.js         |    4 +
 docs/eagle-theme/js/modernizr-2.8.3.min.js      |    1 +
 docs/eagle-theme/js/theme.js                    |   55 +
 docs/eagle-theme/license/highlight.js/LICENSE   |   24 +
 docs/eagle-theme/search.html                    |   21 +
 docs/eagle-theme/searchbox.html                 |    5 +
 docs/eagle-theme/toc.html                       |   83 +
 docs/eagle-theme/versions.html                  |   15 +
 docs/mkdocs.yml                                 |   20 +
 83 files changed, 5053 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/eagle/blob/ee55054a/.gitignore
----------------------------------------------------------------------
diff --git a/.gitignore b/.gitignore
index b7bff37..173da0c 100644
--- a/.gitignore
+++ b/.gitignore
@@ -83,3 +83,4 @@ logs/
 **/*.pyc
 
 **/*.db
+docs/site

http://git-wip-us.apache.org/repos/asf/eagle/blob/ee55054a/docs/README.md
----------------------------------------------------------------------
diff --git a/docs/README.md b/docs/README.md
new file mode 100644
index 0000000..0487b61
--- /dev/null
+++ b/docs/README.md
@@ -0,0 +1,2 @@
+# eagle-doc
+Temporarily holds the new Eagle documentation, built with MkDocs.
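+
+A minimal local preview, assuming MkDocs is installed (e.g. via `pip install mkdocs`; `mkdocs.yml` lives under `docs/`):
+
+    cd docs
+    mkdocs serve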

http://git-wip-us.apache.org/repos/asf/eagle/blob/ee55054a/docs/bin/demo-service.sh
----------------------------------------------------------------------
diff --git a/docs/bin/demo-service.sh b/docs/bin/demo-service.sh
new file mode 100755
index 0000000..b62ccfa
--- /dev/null
+++ b/docs/bin/demo-service.sh
@@ -0,0 +1,127 @@
+#!/bin/bash
+
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+function print_help() {
+       echo "Usage: $0 {start | stop | restart | status}"
+       exit 1
+}
+
+if [ $# != 1 ]
+then
+       print_help
+fi
+
+BASE_DIR="$(dirname $0)"
+ROOT_DIR=$(cd "${BASE_DIR}/../"; pwd)
+BASE_NAME="$(basename $0)"
+SHELL_NAME="${BASE_NAME%.*}"
+CONF_FILE="${BASE_DIR}/doc-env.conf"
+
+if [ ! -f ${CONF_FILE} ]; then
+	echo "file missing: ${CONF_FILE}"
+	exit 1
+fi
+
+source ${CONF_FILE}
+
+LOG_DIR="log"
+TEMP_DIR="temp"
+FULL_NAME="${PROGRAM}-${SHELL_NAME}-${PORT}"
+LOG_FILE="${ROOT_DIR}/${LOG_DIR}/${FULL_NAME}.out"
+PID_FILE="${ROOT_DIR}/${TEMP_DIR}/${FULL_NAME}-pid"
+
+CURR_USER="$(whoami)"
+# Read the sudo password into SUDO_PASS; do not use PWD, which is the shell's working-directory variable
+echo -n "[sudo] password for ${CURR_USER}: "
+read -s SUDO_PASS
+echo
+
+if [ ! -e ${ROOT_DIR}/${LOG_DIR} ]; then
+	echo ${SUDO_PASS} | sudo -S mkdir -p ${ROOT_DIR}/${LOG_DIR}
+	echo ${SUDO_PASS} | sudo -S chown -R ${USER}:${GROUP} ${ROOT_DIR}/${LOG_DIR}
+	echo ${SUDO_PASS} | sudo -S chmod -R ${FILE_MOD} ${ROOT_DIR}/${LOG_DIR}
+fi
+
+if [ ! -e ${ROOT_DIR}/${TEMP_DIR} ]; then
+	echo ${SUDO_PASS} | sudo -S mkdir -p ${ROOT_DIR}/${TEMP_DIR}
+	echo ${SUDO_PASS} | sudo -S chown -R ${USER}:${GROUP} ${ROOT_DIR}/${TEMP_DIR}
+	echo ${SUDO_PASS} | sudo -S chmod -R ${FILE_MOD} ${ROOT_DIR}/${TEMP_DIR}
+fi
+
+cd ${ROOT_DIR}
+
+start() {
+       echo "Starting ${FULL_NAME} ..."
+       nohup ${COMMAND} 1> ${LOG_FILE} & echo $! > $PID_FILE
+       if [ $? != 0 ];then
+               echo "Error: failed starting"
+               exit 1
+       fi
+       echo "Started successfully"
+}
+
+stop() {
+	echo "Stopping ${FULL_NAME} ..."
+	if [[ ! -f ${PID_FILE} ]]; then
+		echo "No ${PROGRAM} running"
+		exit 1
+	fi
+
+	PID=$(cat ${PID_FILE})
+	kill ${PID}
+	if [ $? != 0 ]; then
+		echo "Error: failed stopping"
+		rm -rf ${PID_FILE}
+		exit 1
+	fi
+
+	rm ${PID_FILE}
+	echo "Stopped successfully"
+}
+
+case $1 in
+"start")
+	start
+	;;
+"stop")
+	stop
+	;;
+"restart")
+	echo "Restarting ${FULL_NAME} ..."
+	stop; sleep 1; start
+	echo "Restart completed"
+	;;
+"status")
+	echo "Checking ${FULL_NAME} status ..."
+	if [[ -e ${PID_FILE} ]]; then
+		PID=$(cat ${PID_FILE})
+	fi
+	if [[ -z ${PID} ]]; then
+		echo "Error: ${FULL_NAME} is not running (missing PID)"
+		exit 0
+	elif ps -p ${PID} > /dev/null; then
+		echo "${FULL_NAME} is running with PID: ${PID}"
+		exit 0
+	else
+		echo "${FULL_NAME} is not running (tested PID: ${PID})"
+		exit 0
+	fi
+	;;
+*)
+	print_help
+	;;
+esac
+
+exit 0

http://git-wip-us.apache.org/repos/asf/eagle/blob/ee55054a/docs/bin/doc-env.conf
----------------------------------------------------------------------
diff --git a/docs/bin/doc-env.conf b/docs/bin/doc-env.conf
new file mode 100755
index 0000000..e1b0caa
--- /dev/null
+++ b/docs/bin/doc-env.conf
@@ -0,0 +1,7 @@
+export GROUP=jenkins
+export USER=jenkins
+export FILE_MOD=770
+export PROGRAM=mkdocs
+export ADDRESS=0.0.0.0
+export PORT=8000
+export COMMAND="${PROGRAM} serve -a ${ADDRESS}:${PORT}"
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/eagle/blob/ee55054a/docs/docs/applications.md
----------------------------------------------------------------------
diff --git a/docs/docs/applications.md b/docs/docs/applications.md
new file mode 100644
index 0000000..74efcc6
--- /dev/null
+++ b/docs/docs/applications.md
@@ -0,0 +1,378 @@
+# HDFS Data Activity Monitoring
+
+## Monitor Requirements
+
+This application aims to monitor user activities on HDFS via the HDFS audit log. Once an abnormal user activity is detected, an alert is sent within seconds. The whole pipeline of this application is:
+
+* Kafka ingest: this application consumes data from Kafka; in other words, users have to stream the log into Kafka first.
+
+* Data re-processing, which includes the raw log parser, IP zone joiner, and sensitivity information joiner.
+
+* Kafka sink: parsed data flows into Kafka again, where it is consumed by the alert engine.
+
+* Policy evaluation: the alert engine (hosted in the Alert Engine app) evaluates each data event to check whether it violates any user-defined policy. An alert is generated if the data matches a policy.
+
+![HDFSAUDITLOG](include/images/hdfs_audit_log.png)
+
+
+## Setup & Installation
+
+* Choose a site to install this application. For example 'sandbox'
+
+* Install "Hdfs Audit Log Monitor" app step by step
+
+    ![Install Step 2](include/images/hdfs_install_1.png)
+
+    ![Install Step 3](include/images/hdfs_install_2.png)
+
+    ![Install Step 4](include/images/hdfs_install_3.png)
+
+
+## How to collect the log
+
+To collect the raw audit log on namenode servers, a log collector is needed. Users can choose any tool they like. Some common solutions are available: [logstash](https://www.elastic.co/guide/en/logstash/current/getting-started-with-logstash.html), [filebeat](https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-getting-started.html), a log4j appender, etc.
+
+For detailed instruction, refer to: [How to stream audit log into 
Kafka](using-eagle/#how-to-stream-audit-log-into-kafka)
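+
+As an illustration, a minimal Logstash configuration might look like the following; the log path, broker address, and topic name are assumptions for the sandbox setup, not fixed values:
+
+```
+input {
+    file {
+        # assumed namenode audit log location
+        path => "/var/log/hadoop/hdfs/hdfs-audit.log"
+        start_position => "beginning"
+    }
+}
+output {
+    kafka {
+        # assumed broker and topic for the sandbox site
+        bootstrap_servers => "localhost:9092"
+        topic_id => "sandbox_hdfs_audit_log"
+        codec => plain { format => "%{message}" }
+    }
+}
+```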
+
+## Sample policies
+
+### 1. Monitor file/folder operations
+
+Delete a file/folder on HDFS. 
+
+```
+from HDFS_AUDIT_LOG_ENRICHED_STREAM_SANDBOX[str:contains(src,'/tmp/test/subtest') and ((cmd=='rename' and str:contains(dst, '.Trash')) or cmd=='delete')] select * group by user insert into hdfs_audit_log_enriched_stream_out
+```
+
+HDFS_AUDIT_LOG_ENRICHED_STREAM_SANDBOX is the input stream name, and hdfs_audit_log_enriched_stream_out is the output stream name; the content between [] is the monitoring condition. `cmd`, `src`, and `dst` are fields of the HDFS audit logs.
+
+   ![Policy 1](include/images/hdfs_policy_1.png)
+
+### 2. Classify the file/folder on HDFS
+
+Users may want to mark some folders/files on HDFS as sensitive content. For example, by marking '/sys/soj' as "SOJ", users can monitor any operations they care about on '/sys/soj' and its subfolders/files.
+
+```
+from HDFS_AUDIT_LOG_ENRICHED_STREAM_SANDBOX[sensitivityType=='SOJ' and cmd=='delete'] select * group by user insert into hdfs_audit_log_enriched_stream_out
+```
+The example policy monitors the 'delete' operation on files/subfolders under '/sys/soj'.
+
+### 3. Classify the IP Zone 
+
+In some cases, IPs are classified into different zones, and some zones may require higher secrecy. Eagle provides ways to monitor user activities at the IP level.
+
+```
+from HDFS_AUDIT_LOG_ENRICHED_STREAM_SANDBOX[securityZone=='SECURITY' and cmd=='delete'] select * group by user insert into hdfs_audit_log_enriched_stream_out
+```
+
+The example policy monitors the 'delete' operation on hosts in the 'SECURITY' zone.
+
+## Questions on this application
+
+---
+
+# JMX Monitoring
+
+* The application "**HADOOP_JMX_METRIC_MONITOR**" provides an embedded collector script to ingest Hadoop/HBase JMX metrics as an Eagle stream, and supports defining alert policies to detect anomalies from the metrics in real time.
+
+    |   Fields   ||
+    | :---: | :---: |
+    | **Type**    | *HADOOP_JMX_METRIC_MONITOR* |
+    | **Version** | *0.5.0-version* |
+    | **Description** | *Collect JMX Metric and monitor in real-time* |
+    | **Streams** | *HADOOP_JMX_METRIC_STREAM* |
+    | **Configuration** | *JMX Metric Kafka Topic (default: 
hadoop_jmx_metric_{SITE_ID})*<br/><br/>*Kafka Broker List (default: 
localhost:6667)* |
+
+## Setup & Installation
+
+* Make sure a site has already been set up (this guide uses a demo site named "sandbox").
+
+* Install the "Hadoop JMX Monitor" app in the Eagle server.
+
+    ![Install Step 2](include/images/install_jmx_2.png)
+
+* Configure Application settings.
+
+    ![Install Step 3](include/images/install_jmx_3.png)
+
+* Ensure that a Kafka topic named hadoop_jmx_metric_{SITE_ID} exists (in this guide, hadoop_jmx_metric_sandbox); a creation sketch follows below.
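+
+    A minimal sketch, assuming Kafka's bundled CLI and a local ZooKeeper (adjust hosts and ports to your cluster):
+
+        kafka-topics.sh --create --zookeeper localhost:2181 \
+            --replication-factor 1 --partitions 1 \
+            --topic hadoop_jmx_metric_sandbox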
+
+* Set up the metric collector for the monitored Hadoop/HBase services using hadoop_jmx_collector, and modify the configuration.
+
+    * Collector scripts: 
[hadoop_jmx_collector](https://github.com/apache/incubator-eagle/tree/master/eagle-external/hadoop_jmx_collector)
+
+    * Rename config-sample.json to config.json: 
[config-sample.json](https://github.com/apache/incubator-eagle/blob/master/eagle-external/hadoop_jmx_collector/config-sample.json)
+
+            {
+                "env": {
+                    "site": "sandbox",
+                    "name_node": {
+                        "hosts": [
+                            "sandbox.hortonworks.com"
+                        ],
+                        "port": 50070,
+                        "https": false
+                    },
+                    "resource_manager": {
+                        "hosts": [
+                            "sandbox.hortonworks.com"
+                        ],
+                        "port": 50030,
+                        "https": false
+                    }
+                },
+                "inputs": [{
+                    "component": "namenode",
+                    "host": "server.eagle.apache.org",
+                    "port": "50070",
+                    "https": false,
+                    "kafka_topic": "nn_jmx_metric_sandbox"
+                }, {
+                    "component": "resourcemanager",
+                    "host": "server.eagle.apache.org",
+                    "port": "8088",
+                    "https": false,
+                    "kafka_topic": "rm_jmx_metric_sandbox"
+                }, {
+                    "component": "datanode",
+                    "host": "server.eagle.apache.org",
+                    "port": "50075",
+                    "https": false,
+                    "kafka_topic": "dn_jmx_metric_sandbox"
+                }],
+                "filter": {
+                    "monitoring.group.selected": [
+                        "hadoop",
+                        "java.lang"
+                    ]
+                },
+                "output": {
+                    "kafka": {
+                        "brokerList": [
+                            "localhost:9092"
+                        ]
+                    }
+                }
+            }
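+
+    * Run the collector with the configuration in place (the entry script name here is an assumption; check the collector's README):
+
+            python hadoop_jmx_kafka.py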
+
+
+* Click "Install" button then you will see the 
HADOOP_JMX_METRIC_STREAM_{SITE_ID} in Streams.
+
+    ![Install Step 6](include/images/install_jmx_6.png)
+
+## Define JMX Alert Policy
+
+1. Go to "Define Policy".
+
+2. Select HADOOP_JMX_METRIC_MONITOR related streams.
+
+3. Define SQL-Like policy, for example
+
+        from HADOOP_JMX_METRIC_STREAM_SANDBOX[metric=="cpu.usage" and value > 0.9]
+        select site,host,component,value
+        insert into HADOOP_CPU_USAGE_GT_90_ALERT;
+
+    As shown in the screenshot below:
+
+![Define JMX Alert Policy](include/images/define_jmx_alert_policy.png)
+
+## Stream Schema
+
+* Schema
+
+    | Stream Name | Stream Schema | Time Series |
+    | :---------: | :-----------: | :---------: |
+    | HADOOP_JMX_METRIC_MONITOR | **host**: STRING<br/><br/>**timestamp**: 
LONG<br/><br/>**metric**: STRING<br/><br/>**component**: 
STRING<br/><br/>**site**: STRING<br/><br/>**value**: DOUBLE | True |
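+
+    For illustration, a single event on this stream might look like the following (the values are hypothetical):
+
+        {
+            "host": "server.eagle.apache.org",
+            "timestamp": 1480319109000,
+            "metric": "hadoop.namenode.fsnamesystemstate.capacityused",
+            "component": "namenode",
+            "site": "sandbox",
+            "value": 0.98
+        }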
+
+## Metrics List
+
+* Please refer to the [Hadoop JMX Metrics List](hadoop-jmx-metrics-list.txt) to see which metrics you're interested in.
+
+---
+
+# Job Performance Monitoring
+
+## Monitor Requirements
+
+* Finished/Running Job Details
+* Job Metrics (Job Counter/Statistics) Aggregation
+* Alerts (Job failure/Job slow)
+
+## Applications
+
+* Application Table
+
+    | application | responsibility |
+    | :---: | :---: |
+    | Map Reduce History Job Monitoring | parse MR history job logs from HDFS |
+    | Map Reduce Running Job Monitoring | get MR running job details from the resource manager |
+    | Map Reduce Metrics Aggregation | aggregate metrics generated by the applications above |
+
+## Data Ingestion And Process
+
+* We build a Storm topology for each application to fulfill these requirements.
+
+    ![topology figures](include/images/jpm.jpg)
+
+* Map Reduce History Job Monitoring (Figure 1)
+    * **Read Spout**
+        * read/parse history job logs from HDFS and flush to the Eagle service (storage is HBase)
+    * **Sink Bolt**
+        * convert parsed jobs to streams and write to the data sink
+* Map Reduce Running Job Monitoring (Figure 2)
+    * **Read Spout**
+        * fetch the running job list from the resource manager and emit to the Parse Bolt
+    * **Parse Bolt**
+        * for each running job, fetch job detail/job counter/job configuration/tasks from the resource manager
+* Map Reduce Metrics Aggregation (Figure 3)
+    * **Divide Spout**
+        * divide the time period (to be aggregated) into small pieces and emit to the Aggregate Bolt
+    * **Aggregate Bolt**
+        * aggregate metrics for the given time period received from the Divide Spout
+
+## Setup & Installation
+* Make sure a site has already been set up (this guide uses a demo site named "sandbox").
+
+* Install the "Map Reduce History Job" app in the Eagle server (we take this application as an example).
+
+* Configure Application settings
+
+    ![application configures](include/images/jpm_configure.png)
+
+* Ensure that a Kafka topic named {SITE_ID}_map_reduce_failed_job (in this guide, sandbox_map_reduce_failed_job) will be created.
+
+* Click the "Install" button; you will then see MAP_REDUCE_FAILED_JOB_STREAM_{SITE_ID} under Alert->Streams.
+    ![application configures](include/images/jpm_streams.png)
+  This application will write stream data to the Kafka topic created in the last step; you can verify the flow as sketched below.
+  
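+  A quick way to verify that events arrive (assuming Kafka's bundled console consumer; adjust the ZooKeeper address to your cluster):
+
+        kafka-console-consumer.sh --zookeeper localhost:2181 --topic sandbox_map_reduce_failed_job
+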
+## Integration With Alert Engine
+
+In order to integrate applications with the alert engine and send alerts, follow the steps below (taking the Map Reduce History Job application as an example):
+
+* **define stream and configure data sink**
+    * define the stream in resource/META-INF/providers/xxxProviders.xml, for example MAP_REDUCE_FAILED_JOB_STREAM_{SITE_ID}
+    * configure the data sink, for example create the Kafka topic {SITE_ID}_map_reduce_failed_job
+
+* **define policy**
+
+For example, if you want to receive MapReduce job failure alerts, you can define policies (SiddhiQL) as follows:
+```sql
+from map_reduce_failed_job_stream[site=="sandbox" and currentState=="FAILED"]
+select site, queue, user, jobType, jobId, submissionTime, trackingUrl, startTime, endTime
+group by jobId insert into map_reduce_failed_job_stream_out
+```
+    
+   ![define policy](include/images/jpm_define_policy.png)
+   
+* **view alerts**
+
+You can view alerts on the Alert->Alerts page.
+
+## Stream Schema
+All the columns above are predefined in the stream map_reduce_failed_job_stream, defined in
+
+    eagle-jpm/eagle-jpm-mr-history/src/main/resources/META-INF/providers/org.apache.eagle.jpm.mr.history.MRHistoryJobApplicationProvider.xml
+
+Then, enable the policy in the web UI after it's created. Eagle will schedule it automatically.
+
+---
+
+# Topology Health Check
+
+* The application "TOPOLOGY HEALTH CHECK" aims to monitor services with a master-slave structured topology and provides metrics at the host level.
+
+    |   Fields   ||
+    | :---: | :---: |
+    | **Type**    | *TOPOLOGY_HEALTH_CHECK* |
+    | **Version** | *0.5.0-version* |
+    | **Description** | *Collect MR, HBase, HDFS node status and cluster ratio* |
+    | **Streams** | *TOPOLOGY_HEALTH_CHECK_STREAM* |
+    | **Configuration** | *Topology Health Check Topic (default: topology_health_check)*<br/><br/>*Kafka Broker List (default: sandbox.hortonworks.com:6667)* |
+
+## Setup & Installation
+
+* Make sure a site has already been set up (this guide uses a demo site named "sandbox").
+
+* Install the "Topology Health Check" app in the Eagle server.
+
+    ![Health Check Installation](include/images/health_check_installation.png)
+
+* Configure Application settings.
+
+    ![Health Check Settings](include/images/health_check_settings.png)
+
+* Ensure the existence of a Kafka topic named topology_health_check.
+
+* Click "Install" button then you will see the 
TOPOLOGY_HEALTH_CHECK_STREAM_{SITE_ID} on "Streams" page (Streams could be 
navigated in left-nav).
+
+    ![Health Check Stream](include/images/health_check_stream.png)
+
+## Define Health Check Alert Policy
+
+* Go to "Define Policy".
+
+* Select TOPOLOGY_HEALTH_CHECK related streams.
+
+* Define SQL-Like policy, for example
+
+        from TOPOLOGY_HEALTH_CHECK_STREAM_SANDBOX[status=='dead'] select * insert into topology_health_check_stream_out;
+
+    ![Health Check Policy](include/images/health_check_policy.png)
+
+---
+
+# Hadoop Queue Monitoring
+
+* This application collects Resource Manager metrics from the following endpoints:
+
+    * Scheduler Info of the cluster: 
http://{RM_HTTP_ADDRESS}:{PORT}/ws/v1/cluster/scheduler
+
+    * Applications of the cluster: 
http://{RM_HTTP_ADDRESS}:{PORT}/ws/v1/cluster/apps
+
+    * Overall metrics of the cluster: http://{RM_HTTP_ADDRESS}:{PORT}/ws/v1/cluster/metrics
+
+        As of version 0.5-incubating, this mainly focuses on the metrics `appsPending`, `allocatedMB`, `totalMB`, `availableMB`, `reservedMB`, and `allocatedVirtualCores`.
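+
+    For example, the cluster metrics endpoint can be queried directly (this is the standard YARN ResourceManager REST API; the host and port are assumptions for the sandbox):
+
+        curl http://sandbox.hortonworks.com:8088/ws/v1/cluster/metrics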
+
+## Setup & Installation
+
+* Make sure a site has already been set up (this guide uses a demo site named "sandbox").
+
+* From the left navigation, go to the application management page via "**Integration**" > "**Sites**", and click the "**sandbox**" link on the right.
+
+    ![Navigate to app mgmt](include/images/hadoop_queue_monitor_1.png)
+
+* Install the "Hadoop Queue Monitor" application by clicking its "install" button.
+
+    ![Install Hadoop Queue Monitor 
App](include/images/hadoop_queue_monitor_2.png)
+
+* In the pop-up dialog, select the running mode: `Local` or `Cluster`.
+
+    ![Select Running Mode](include/images/hadoop_queue_monitor_3.png)
+
+* Set the target jar to Eagle's topology assembly jar that already exists on the Eagle server, giving its absolute path, as in the following screenshot:
+
+    ![Set Jar Path](include/images/hadoop_queue_monitor_4.png)
+
+* Set the Resource Manager endpoint URLs field; separate values with commas if there is more than one URL (e.g. a secondary node for HA).
+
+    ![Set RM Endpoint](include/images/hadoop_queue_monitor_5.png)
+
+* Set the fields "**Storm Worker Number**", "**Parallel Tasks Per Bolt**", and "**Fetching Metric Interval in Seconds**", or leave them at their defaults if those fit your needs.
+
+    ![Set Advanced Fields](include/images/hadoop_queue_monitor_6.png)
+
+* Finally, hit the "**Install**" button to complete the installation.
+
+## Use of the application
+
+* There is no need to define policies for this application to work. It can be integrated with the "**Job Performance Monitoring Web**" application and consequently shown on the cluster dashboard, as long as the latter application is installed too. See an example in the following screenshot:
+
+    ![In Dashboard](include/images/hadoop_queue_monitor_7.png)

http://git-wip-us.apache.org/repos/asf/eagle/blob/ee55054a/docs/docs/developing-application.md
----------------------------------------------------------------------
diff --git a/docs/docs/developing-application.md 
b/docs/docs/developing-application.md
new file mode 100644
index 0000000..ec6833e
--- /dev/null
+++ b/docs/docs/developing-application.md
@@ -0,0 +1,285 @@
+# Introduction
+
+[Applications](applications) in Eagle include a process component and a view component. The process component normally refers to a Storm topology or Spark streaming job which processes incoming data, while the view component normally refers to a GUI hosted in the Eagle UI.
+
+[Application Framework](getting-started/#eagle-framework) targets solving the problem of managing the application lifecycle and presenting uniform views to end users.
+
+Eagle's application framework is designed for the end-to-end lifecycle of applications, including:
+
+* **Development**: application development and framework development
+
+* **Testing**.
+
+* **Installation**: package management with SPI/Providers.xml
+
+* **Management**: manage applications through REST API
+
+---
+
+# Quick Start
+
+* Fork and clone the Eagle source code repository using Git.
+
+        git clone https://github.com/apache/incubator-eagle.git
+
+* Run the Eagle server: execute "org.apache.eagle.server.ServerDebug" under eagle-server, in an IDE or via the Maven command line.
+
+        org.apache.eagle.server.ServerDebug
+
+* Access the currently available applications through the API.
+
+        curl -XGET  http://localhost:9090/rest/apps/providers
+
+* Create Site through API.
+
+        curl -H "Content-Type: application/json" -X POST http://localhost:9090/rest/sites --data '{
+             "siteId":"test_site",
+             "siteName":"Test Site",
+             "description":"This is a sample site for test",
+             "context":{
+                  "type":"FAKE_CLUSTER",
+                  "url":"http://localhost:9090",
+                  "version":"2.6.4",
+                  "additional_attr":"Some information about the fake cluster site"
+             }
+        }'
+
+* Install Application through API.
+
+        curl -H "Content-Type: application/json" -X POST 
http://localhost:9090/rest/apps/install --data '{
+             "siteId":"test_site",
+             "appType":"EXAMPLE_APPLICATION",
+             "mode":"LOCAL"
+        }'
+
+* Start Application (uuid is the installed application's uuid).
+
+        curl -H "Content-Type: application/json" -X POST http://localhost:9090/rest/apps/start --data '{
+             "uuid":"9acf6792-60e8-46ea-93a6-160fb6ef0b3f"
+        }'
+
+* Stop Application (uuid is the installed application's uuid).
+
+        curl -XPOST http://localhost:9090/rest/apps/stop --data '{
+         "uuid": "9acf6792-60e8-46ea-93a6-160fb6ef0b3f"
+        }'
+
+* Uninstall Application (uuid is the installed application's uuid).
+
+        curl -XDELETE http://localhost:9090/rest/apps/uninstall --data '{
+         "uuid": "9acf6792-60e8-46ea-93a6-160fb6ef0b3f"
+        }'
+
+---
+
+# Create Application
+
+Each application should be developed as an independent module (including backend code and front-end code).
+
+Here is the typical code structure of a new application:
+
+```
+eagle-app-example/
+├── pom.xml
+├── src
+│   ├── main
+│   │   ├── java
+│   │   │   └── org
+│   │   │       └── apache
+│   │   │           └── eagle
+│   │   │               └── app
+│   │   │                   └── example
+│   │   │                       ├── ExampleApplicationProvider.java
+│   │   │                       ├── ExampleStormApplication.java
+│   │   ├── resources
+│   │   │   └── META-INF
+│   │   │       ├── providers
+│   │   │       │   └── 
org.apache.eagle.app.example.ExampleApplicationProvider.xml
+│   │   │       └── services
+│   │   │           └── 
org.apache.eagle.app.spi.ApplicationProvider
+│   │   └── webapp
+│   │       ├── app
+│   │       │   └── apps
+│   │       │       └── example
+│   │       │           └── index.html
+│   │       └── package.json
+│   └── test
+│       ├── java
+│       │   └── org
+│       │       └── apache
+│       │           └── eagle
+│       │               └── app
+│       │                   ├── example
+│       │                   │   ├── 
ExampleApplicationProviderTest.java
+│       │                   │   └── ExampleApplicationTest.java
+│       └── resources
+│           └── application.conf
+```
+
+**Eagle Example Application** - 
[eagle-app-example](https://github.com/haoch/incubator-eagle/tree/master/eagle-examples/eagle-app-example)
+
+**Description** - A typical Eagle application mainly consists of:
+
+* **Application**: defines the core execution process logic, inheriting from org.apache.eagle.app.Application; it also implements ApplicationTool so the application can run as a standalone process (such as a Storm topology) from the command line.
+
+* **ApplicationProvider**: the interface to package an application with its descriptor metadata; also used as the application SPI to dynamically load new application types.
+
+* **META-INF/providers/${APP_PROVIDER_CLASS_NAME}.xml**: describes the application's descriptor in declarative XML, for example:
+
+        <application>
+           <type>EXAMPLE_APPLICATION</type>
+           <name>Example Monitoring Application</name>
+           <version>0.5.0-incubating</version>
+           <configuration>
+               <property>
+                   <name>message</name>
+                   <displayName>Message</displayName>
+                   <value>Hello, example application!</value>
+                   <description>Just a sample configuration property</description>
+               </property>
+           </configuration>
+           <streams>
+               <stream>
+                   <streamId>SAMPLE_STREAM_1</streamId>
+                   <description>Sample output stream #1</description>
+                   <validate>true</validate>
+                   <timeseries>true</timeseries>
+                   <columns>
+                       <column>
+                           <name>metric</name>
+                           <type>string</type>
+                       </column>
+                       <column>
+                           <name>source</name>
+                           <type>string</type>
+                       </column>
+                       <column>
+                           <name>value</name>
+                           <type>double</type>
+                           <defaultValue>0.0</defaultValue>
+                       </column>
+                   </columns>
+               </stream>
+               <stream>
+                   <streamId>SAMPLE_STREAM_2</streamId>
+                   <description>Sample output stream #2</description>
+                   <validate>true</validate>
+                   <timeseries>true</timeseries>
+                   <columns>
+                       <column>
+                           <name>metric</name>
+                           <type>string</type>
+                       </column>
+                       <column>
+                           <name>source</name>
+                           <type>string</type>
+                       </column>
+                       <column>
+                           <name>value</name>
+                           <type>double</type>
+                           <defaultValue>0.0</defaultValue>
+                       </column>
+                   </columns>
+               </stream>
+           </streams>
+        </application>
+
+* **META-INF/services/org.apache.eagle.app.spi.ApplicationProvider**: supports dynamically scanning and loading extensible application providers via the Java service provider mechanism.
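+
+    For the example application above, this file contains a single line, the provider's fully qualified class name:
+
+        org.apache.eagle.app.example.ExampleApplicationProvider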
+
+* **webapp/app/apps/${APP_TYPE}**: if the application has a web portal, add the web code under this directory and make sure it is built as follows in pom.xml:
+
+        <build>
+           <resources>
+               <resource>
+                   <directory>src/main/webapp/app</directory>
+                   <targetPath>assets/</targetPath>
+               </resource>
+               <resource>
+                   <directory>src/main/resources</directory>
+               </resource>
+           </resources>
+           <testResources>
+               <testResource>
+                   <directory>src/test/resources</directory>
+               </testResource>
+           </testResources>
+        </build>
+
+---
+
+# Test Application
+
+* Extend **org.apache.eagle.app.test.ApplicationTestBase** and initialize the injector context.
+
+* Access shared services with **@Inject**.
+
+* Test the application lifecycle with the related web resources.
+
+        @Inject private SiteResource siteResource;
+        @Inject private ApplicationResource applicationResource;
+
+        // Create local site
+        SiteEntity siteEntity = new SiteEntity();
+        siteEntity.setSiteId("test_site");
+        siteEntity.setSiteName("Test Site");
+        siteEntity.setDescription("Test Site for 
ExampleApplicationProviderTest");
+        siteResource.createSite(siteEntity);
+        Assert.assertNotNull(siteEntity.getUuid());
+
+        ApplicationOperations.InstallOperation installOperation = new 
ApplicationOperations.InstallOperation(
+               "test_site", 
+               "EXAMPLE_APPLICATION", 
+               ApplicationEntity.Mode.LOCAL);
+        installOperation.setConfiguration(getConf());
+        // Install application
+        ApplicationEntity applicationEntity = applicationResource
+            .installApplication(installOperation)
+            .getData();
+        // Start application
+        applicationResource.startApplication(new 
ApplicationOperations.StartOperation(applicationEntity.getUuid()));
+        // Stop application
+        applicationResource.stopApplication(new 
ApplicationOperations.StopOperation(applicationEntity.getUuid()));
+        // Uninstall application
+        applicationResource.uninstallApplication(
+               new 
ApplicationOperations.UninstallOperation(applicationEntity.getUuid()));
+        try {
+           
applicationResource.getApplicationEntityByUUID(applicationEntity.getUuid());
+           Assert.fail("Application instance (UUID: " + 
applicationEntity.getUuid() + ") should have been uninstalled");
+        } catch (Exception ex) {
+           // Expected exception
+        }
+
+---
+
+# Management & REST API
+
+## ApplicationProviderSPILoader
+
+The default behavior is to load automatically from the class path using SPI:
+
+* By default, Eagle loads application providers from the current class loader.
+
+* If `application.provider.dir` is defined, Eagle loads providers from the external jars' class loader.
+
+## Application REST API
+
+* API Table
+
+    | Type       | Uri + Class |
+    | :--------: | :---------- |
+    | **DELETE** | /rest/sites 
(org.apache.eagle.metadata.resource.SiteResource) |
+    | **DELETE** | /rest/sites/{siteId} 
(org.apache.eagle.metadata.resource.SiteResource) |
+    | **GET**    | /rest/sites 
(org.apache.eagle.metadata.resource.SiteResource) |
+    | **GET**    | /rest/sites/{siteId} 
(org.apache.eagle.metadata.resource.SiteResource) |
+    | **POST**   | /rest/sites 
(org.apache.eagle.metadata.resource.SiteResource) |
+    | **PUT**    | /rest/sites 
(org.apache.eagle.metadata.resource.SiteResource) |
+    | **PUT**    | /rest/sites/{siteId} 
(org.apache.eagle.metadata.resource.SiteResource) |
+    | **DELETE** | /rest/apps/uninstall 
(org.apache.eagle.app.resource.ApplicationResource) |
+    | **GET**    | /rest/apps 
(org.apache.eagle.app.resource.ApplicationResource) |
+    | **GET**    | /rest/apps/providers 
(org.apache.eagle.app.resource.ApplicationResource) |
+    | **GET**    | /rest/apps/providers/{type} 
(org.apache.eagle.app.resource.ApplicationResource) |
+    | **GET**    | /rest/apps/{appUuid} 
(org.apache.eagle.app.resource.ApplicationResource) |
+    | **POST**   | /rest/apps/install 
(org.apache.eagle.app.resource.ApplicationResource) |
+    | **POST**   | /rest/apps/start 
(org.apache.eagle.app.resource.ApplicationResource) |
+    | **POST**   | /rest/apps/stop 
(org.apache.eagle.app.resource.ApplicationResource) |
+    | **PUT**    | /rest/apps/providers/reload 
(org.apache.eagle.app.resource.ApplicationResource) |

http://git-wip-us.apache.org/repos/asf/eagle/blob/ee55054a/docs/docs/getting-started.md
----------------------------------------------------------------------
diff --git a/docs/docs/getting-started.md b/docs/docs/getting-started.md
new file mode 100644
index 0000000..0799934
--- /dev/null
+++ b/docs/docs/getting-started.md
@@ -0,0 +1,233 @@
+# Architecture
+
+![Eagle 0.5.0 Architecture](include/images/eagle_arch_v0.5.0.png)
+
+### Eagle Apps
+
+* Security
+* Hadoop
+* Operational Intelligence
+
+For more applications, see [Applications](applications).
+
+### Eagle Interface
+
+* REST Service
+* Management UI
+* Customizable Analytics Visualization
+
+### Eagle Integration
+
+* [Apache Ambari](https://ambari.apache.org)
+* [Docker](https://www.docker.com)
+* [Apache Ranger](http://ranger.apache.org)
+* [Dataguise](https://www.dataguise.com)
+
+### Eagle Framework
+
+Eagle has multiple distributed real-time frameworks for efficiently developing 
highly scalable monitoring applications.
+       
+#### Alert Engine
+
+![Eagle Alert Engine](include/images/alert_engine.png)
+
+* Real-time: Apache Storm (Execution Engine) + Kafka (Message Bus)
+* Declarative Policy: SQL (CEP) on Streaming
+
+        from hadoopJmxMetricEventStream
+        [metric == "hadoop.namenode.fsnamesystemstate.capacityused" and value > 0.9]
+        select metric, host, value, timestamp, component, site
+        insert into alertStream;
+
+* Dynamic onboarding & correlation
+* No-downtime migration and upgrading
+
+#### Storage Engine
+
+![Eagle Storage Engine](include/images/storage_engine.png)
+
+
+* Light-weight ORM Framework for HBase/RDBMS
+
+        @Table("HbaseTableName")
+        @ColumnFamily("ColumnFamily")
+        @Prefix("RowkeyPrefix")
+        @Service("UniqueEntityServiceName")
+        @JsonIgnoreProperties(ignoreUnknown = true)
+        @TimeSeries(false)
+        @Indexes({
+            @Index(name="Index_1_alertExecutorId", columns = { "alertExecutorID" }, unique = true)})
+        public class AlertDefinitionAPIEntity extends TaggedLogAPIEntity{
+            @Column("a")
+            private String desc;
+        }
+
+* Full-function SQL-Like REST Query
+
+        Query=UniqueEntityServiceName[@site="sandbox"]{*}
+
+* Optimized rowkey design for time-series data, tailored to different storage types (metric/entity/log, etc.)
+
+        Rowkey ::= Prefix | Partition Keys | timestamp | tagName | tagValue | …
+
+
+* Secondary Index Support
+
+        @Indexes({@Index(name="INDEX_NAME", columns = { "SECONDARY_INDEX_COLUMN_NAME" }, unique = true/false)})
+
+* Native HBase Coprocessor
+
+        org.apache.eagle.storage.hbase.query.coprocessor.AggregateProtocolEndPoint
+
+
+#### UI Framework
+
+Eagle UI consists of the following parts:
+
+* Eagle Main UI
+* Eagle App Portal/Dashboard/Widgets
+* Eagle Customized Dashboard 
+
+#### Application Framework
+
+##### Application
+
+An "Application" or "App" is composed of data integration, policies and 
insights for one data source.
+
+##### Application Descriptor 
+
+An "Application Descriptor" is a static packaged metadata information consist 
of basic information like type, name, version, description, and application 
process, configuration, streams, docs, policies and so on. 
+
+Here is an example ApplicationDesc of `JPM_WEB_APP`
+
+        {
+            "type": "JPM_WEB_APP",
+            "name": "Job Performance Monitoring Web",
+            "version": "0.5.0-incubating",
+            "description": null,
+            "appClass": "org.apache.eagle.app.StaticApplication",
+            "jarPath": "/opt/eagle/0.5.0-incubating-SNAPSHOT-build-20161103T0332/eagle-0.5.0-incubating-SNAPSHOT/lib/eagle-topology-0.5.0-incubating-SNAPSHOT-hadoop-2.4.1-11-assembly.jar",
+            "viewPath": "/apps/jpm",
+            "providerClass": "org.apache.eagle.app.jpm.JPMWebApplicationProvider",
+            "configuration": {
+                "properties": [{
+                    "name": "service.host",
+                    "displayName": "Eagle Service Host",
+                    "value": "localhost",
+                    "description": "Eagle Service Host, default: localhost",
+                    "required": false
+                }, {
+                    "name": "service.port",
+                    "displayName": "Eagle Service Port",
+                    "value": "8080",
+                    "description": "Eagle Service Port, default: 8080",
+                    "required": false
+                }]
+            },
+            "streams": null,
+            "docs": null,
+            "executable": false,
+            "dependencies": [{
+                "type": "MR_RUNNING_JOB_APP",
+                "version": "0.5.0-incubating",
+                "required": true
+            }, {
+                "type": "MR_HISTORY_JOB_APP",
+                "version": "0.5.0-incubating",
+                "required": true
+            }]
+        }
+    
+
+##### Application Provider
+
+An Application Provider is a package management and loading mechanism leveraging [Java SPI](https://docs.oracle.com/javase/tutorial/ext/basics/spi.html).
+       
+For example, in file 
`META-INF/services/org.apache.eagle.app.spi.ApplicationProvider`, place the 
full class name of an application provider:
+
+       org.apache.eagle.app.jpm.JPMWebApplicationProvider
+
+
+---
+
+# Concepts
+
+* Here are some terms used in Apache Eagle (incubating; called Eagle in the following). They are basic Eagle concepts that will help you understand Eagle well.
+
+## Site
+
+* A site can be considered a physical data center. A big data platform, e.g. Hadoop, may be deployed across multiple data centers in an enterprise.
+
+## Application
+
+* An "Application" or "App" is composed of data integration, policies and 
insights for one data source.
+
+## Policy
+
+* A "Policy" defines the rule to alert. Policy can be simply a filter 
expression or a complex window based aggregation rules etc.
+
+## Alerts
+
+* An "Alert" is an real-time event detected with certain alert policy or 
correlation logic, with different severity levels like INFO/WARNING/DANGER.
+
+## Data Source
+
+* A "Data Source" is a monitoring target data. Eagle supports many data 
sources HDFS audit logs, Hive2 query, MapReduce job etc.
+
+## Stream
+
+* A "Stream" is the streaming data from a data source. Each data source has 
its own stream.
+
+---
+
+# Quick Start
+
+## Deployment
+
+### Prerequisites
+
+Eagle requires the following dependencies:
+
+* For streaming platform dependencies
+    * Storm: 0.9.3 or later
+    * Hadoop: 2.6.x or later
+    * HBase: 0.98.x or later
+    * Kafka: 0.8.x or later
+    * Zookeeper: 3.4.6 or later
+    * Java: 1.8.x
+* For metadata database dependencies (choose one of them)
+    * MongoDB 3.2.2 or later
+        * Installation is required
+    * MySQL 5.1.x or later
+        * Installation is required
+
+Notice:
+>     Storm 0.9.x does NOT support JDK8. You can replace asm-4.0.jar with asm-all-5.0.jar in the Storm lib directory,
+>     then restart the other services (nimbus/ui/supervisor), as sketched below.
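+
+        # A minimal sketch; STORM_HOME and the jar location are assumptions.
+        cd $STORM_HOME/lib
+        mv asm-4.0.jar asm-4.0.jar.bak      # set aside the JDK8-incompatible asm
+        cp /path/to/asm-all-5.0.jar .       # drop in asm-all 5.0
+        # then restart nimbus/ui/supervisor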
+
+
+### Installation
+
+##### Build Eagle
+
+* Download the latest version of Eagle source code.
+
+        git clone https://github.com/apache/incubator-eagle.git
+        
+* Build the source code; a tar.gz package will be generated under eagle-server-assembly/target.
+
+        mvn clean install -DskipTests
+        
+##### Deploy Eagle
+* Copy the binary package to your server machine. In the package, you should find:
+    * __bin/__: scripts used to start the Eagle server
+    * __conf/__: default configurations for Eagle server setup
+    * __lib/__: all included software packages for the Eagle server
+* Change configurations under `conf/`
+    * __eagle.conf__
+    * __server.yml__
+* Run eagle-server.sh
+
+        ./bin/eagle-server.sh start
+
+* Check the Eagle server
+    * Visit http://host:port/ in your web browser.
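+
+    For example, probe the REST API (the server listens on localhost:9090 in the earlier examples in this guide):
+
+        curl -XGET http://localhost:9090/rest/apps/providers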
+
+## Setup Your Monitoring Case
+`Placeholder for topic: Setup Your Monitoring Case`
\ No newline at end of file
