Repository: incubator-griffin

Updated Branches:
  refs/heads/master 95e45dca4 -> 4e0f25d2c
[GRIFFIN-138] update readme.md, highlighted docker guide

update readme.md, describe docker guide, debug guide and deploy guide in order for specific users

Author: Lionel Liu <bhlx3l...@163.com>

Closes #248 from bhlx3lyx7/tmst.

Project: http://git-wip-us.apache.org/repos/asf/incubator-griffin/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-griffin/commit/4e0f25d2
Tree: http://git-wip-us.apache.org/repos/asf/incubator-griffin/tree/4e0f25d2
Diff: http://git-wip-us.apache.org/repos/asf/incubator-griffin/diff/4e0f25d2

Branch: refs/heads/master
Commit: 4e0f25d2c9fd64c56a128e3ddde7c5c7addd916c
Parents: 95e45dc
Author: Lionel Liu <bhlx3l...@163.com>
Authored: Sun Apr 8 12:52:21 2018 +0800
Committer: Lionel Liu <bhlx3l...@163.com>
Committed: Sun Apr 8 12:52:21 2018 +0800

----------------------------------------------------------------------
 README.md                          | 174 ++++----------------------
 griffin-doc/deploy/deploy-guide.md | 160 +++++++++++++++++++++++++++++
 2 files changed, 179 insertions(+), 155 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-griffin/blob/4e0f25d2/README.md
----------------------------------------------------------------------
diff --git a/README.md b/README.md
index 5bc0e1c..37987d0 100644
--- a/README.md
+++ b/README.md
@@ -27,176 +27,40 @@ Apache Griffin is a model driven data quality solution for modern data systems.
 ## Getting Started
+### First Try of Griffin
-You can try Griffin in docker following the [docker guide](https://github.com/apache/incubator-griffin/blob/master/griffin-doc/docker/griffin-docker-guide.md).
-
-To run Griffin at local, you can follow instructions below.
-
-### Prerequisites
-You need to install following items
-- jdk (1.8 or later versions).
-- mysql.
-- Postgresql.
-- npm (version 6.0.0+).
-- [Hadoop](http://apache.claz.org/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz) (2.6.0 or later), you can get some help [here](https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/SingleCluster.html).
-- [Spark](http://spark.apache.org/downloads.html) (version 1.6.x, griffin does not support 2.0.x at current), if you want to install Pseudo Distributed/Single Node Cluster, you can get some help [here](http://why-not-learn-something.blogspot.com/2015/06/spark-installation-pseudo.html).
-- [Hive](http://apache.claz.org/hive/hive-1.2.1/apache-hive-1.2.1-bin.tar.gz) (version 1.2.1 or later), you can get some help [here](https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-RunningHive).
-  You need to make sure that your spark cluster could access your HiveContext.
-- [Livy](http://archive.cloudera.com/beta/livy/livy-server-0.3.0.zip), you can get some help [here](http://livy.io/quickstart.html).
-  Griffin need to schedule spark jobs by server, we use livy to submit our jobs.
-  For some issues of Livy for HiveContext, we need to download 3 files, and put them into HDFS.
-  ```
-  datanucleus-api-jdo-3.2.6.jar
-  datanucleus-core-3.2.10.jar
-  datanucleus-rdbms-3.2.9.jar
-  ```
-- ElasticSearch.
-  ElasticSearch works as a metrics collector, Griffin produces metrics to it, and our default UI get metrics from it, you can use your own way as well.
-
-### Configuration
-
-Create database 'quartz' in mysql
-```
-mysql -u username -e "create database quartz" -p
-```
-Init quartz tables in mysql by service/src/main/resources/Init_quartz.sql
-```
-mysql -u username -p quartz < service/src/main/resources/Init_quartz.sql
-```
-
-
-You should also modify some configurations of Griffin for your environment.
-
-- <b>service/src/main/resources/application.properties</b>
-
-  ```
-  # jpa
-  spring.datasource.url = jdbc:postgresql://<your IP>:5432/quartz?autoReconnect=true&useSSL=false
-  spring.datasource.username = <user name>
-  spring.datasource.password = <password>
-  spring.jpa.generate-ddl=true
-  spring.datasource.driverClassName = org.postgresql.Driver
-  spring.jpa.show-sql = true
-
-  # hive metastore
-  hive.metastore.uris = thrift://<your IP>:9083
-  hive.metastore.dbname = <hive database name>    # default is "default"
-
-  # external properties directory location, ignore it if not required
-  external.config.location =
-
-  # login strategy, default is "default"
-  login.strategy = <default or ldap>
-
-  # ldap properties, ignore them if ldap is not enabled
-  ldap.url = ldap://hostname:port
-  ldap.email = @example.com
-  ldap.searchBase = DC=org,DC=example
-  ldap.searchPattern = (sAMAccountName={0})
-
-  # hdfs, ignore it if you do not need predicate job
-  fs.defaultFS = hdfs://<hdfs-default-name>
-
-  # elasticsearch
-  elasticsearch.host = <your IP>
-  elasticsearch.port = <your elasticsearch rest port>
-  # authentication properties, uncomment if basic authentication is enabled
-  # elasticsearch.user = user
-  # elasticsearch.password = password
-  ```
-
-- <b>measure/src/main/resources/env.json</b>
-  ```
-  "persist": [
-    ...
-    {
-      "type": "http",
-      "config": {
-        "method": "post",
-        "api": "http://<your ES IP>:<ES rest port>/griffin/accuracy"
-      }
-    }
-  ]
-  ```
-  Put the modified env.json file into HDFS.
-
-- <b>service/src/main/resources/sparkJob.properties</b>
-  ```
-  sparkJob.file = hdfs://<griffin measure path>/griffin-measure.jar
-  sparkJob.args_1 = hdfs://<griffin env path>/env.json
-
-  sparkJob.jars = hdfs://<datanucleus path>/spark-avro_2.11-2.0.1.jar\
-                  hdfs://<datanucleus path>/datanucleus-api-jdo-3.2.6.jar\
-                  hdfs://<datanucleus path>/datanucleus-core-3.2.10.jar\
-                  hdfs://<datanucleus path>/datanucleus-rdbms-3.2.9.jar
-
-  spark.yarn.dist.files = hdfs:///<spark conf path>/hive-site.xml
-
-  livy.uri = http://<your IP>:8998/batches
-  spark.uri = http://<your IP>:8088
-  ```
-  - \<griffin measure path> is the location you should put the jar file of measure module.
-  - \<griffin env path> is the location you should put the env.json file.
-  - \<datanucleus path> is the location you should put the 3 jar files of livy, and the spark avro jar file if you need.
-  - \<spark conf path> is the location of spark conf directory.
-
-### Build and Run
-
-Build the whole project and deploy. (NPM should be installed)
-
-  ```
-  mvn clean install
-  ```
-
-Put jar file of measure module into \<griffin measure path> in HDFS
-
-```
-cp measure/target/measure-<version>-incubating-SNAPSHOT.jar measure/target/griffin-measure.jar
-hdfs dfs -put measure/target/griffin-measure.jar <griffin measure path>/
-```
-
-After all environment services startup, we can start our server.
-
-  ```
-  java -jar service/target/service.jar
-  ```
-
-After a few seconds, we can visit our default UI of Griffin (by default the port of spring boot is 8080).
-
-  ```
-  http://<your IP>:8080
-  ```
-
-You can use UI following the steps [here](https://github.com/apache/incubator-griffin/blob/master/griffin-doc/ui/user-guide.md).
-
-**Note**: The front-end UI is still under development, you can only access some basic features currently.
-
-
-### Build and Debug
+You can try Griffin in Docker by following the [docker guide](griffin-doc/docker/griffin-docker-guide.md).
+
+### Environment for Dev
 If you want to develop Griffin, please follow [this document](griffin-doc/dev/dev-env-build.md), to skip complex environment building work.
+### Local Deployment
-## Community
+If you want to deploy Griffin in your local environment, please follow [this document](griffin-doc/deploy/deploy-guide.md).
-You can contact us via email: <a href="mailto:d...@griffin.incubator.apache.org">d...@griffin.incubator.apache.org</a>
+## Community
-You can also subscribe this mail by sending a email to [here](mailto:dev-subscr...@griffin.incubator.apache.org).
+You can visit the [Griffin home page](http://griffin.apache.org).
-You can access our issues jira page [here](https://issues.apache.org/jira/browse/GRIFFIN)
+You can contact us via email:
+- dev-list: <a href="mailto:d...@griffin.incubator.apache.org">d...@griffin.incubator.apache.org</a>
+- user-list: <a href="mailto:u...@griffin.incubator.apache.org">u...@griffin.incubator.apache.org</a>
+You can also subscribe to these lists by sending an email to [subscribe dev-list](mailto:dev-subscr...@griffin.incubator.apache.org) and [subscribe user-list](mailto:user-subscr...@griffin.incubator.apache.org).
+You can browse our issues on the [JIRA page](https://issues.apache.org/jira/browse/GRIFFIN).
 ## Contributing
-See [Contributing Guide](./CONTRIBUTING.md) for details on how to contribute code, documentation, etc.
+See [How to Contribute](http://griffin.apache.org/2017/03/04/community) for details on how to contribute code, documentation, etc.
 ## References
 - [Home Page](http://griffin.incubator.apache.org/)
 - [Wiki](https://cwiki.apache.org/confluence/display/GRIFFIN/Apache+Griffin)
 - Documents:
-  - [Measure](https://github.com/apache/incubator-griffin/tree/master/griffin-doc/measure)
-  - [Service](https://github.com/apache/incubator-griffin/tree/master/griffin-doc/service)
-  - [UI](https://github.com/apache/incubator-griffin/tree/master/griffin-doc/ui)
-  - [Docker usage](https://github.com/apache/incubator-griffin/tree/master/griffin-doc/docker)
-  - [Postman API](https://github.com/apache/incubator-griffin/tree/master/griffin-doc/service/postman)
\ No newline at end of file
+  - [Measure](griffin-doc/measure)
+  - [Service](griffin-doc/service)
+  - [UI](griffin-doc/ui)
+  - [Docker usage](griffin-doc/docker)
+  - [Postman API](griffin-doc/service/postman)
\ No newline at end of file


http://git-wip-us.apache.org/repos/asf/incubator-griffin/blob/4e0f25d2/griffin-doc/deploy/deploy-guide.md
----------------------------------------------------------------------
diff --git a/griffin-doc/deploy/deploy-guide.md b/griffin-doc/deploy/deploy-guide.md
new file mode 100644
index 0000000..0693c25
--- /dev/null
+++ b/griffin-doc/deploy/deploy-guide.md
@@ -0,0 +1,160 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Apache Griffin Deployment Guide
+As a Griffin user, you can deploy Griffin together with its dependencies in your own environment by following the instructions below.
+
+### Prerequisites
+You need to install the following items:
+- jdk (1.8 or later versions).
+- MySQL or PostgreSQL.
+- npm (version 6.0.0+).
+- [Hadoop](http://apache.claz.org/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz) (2.6.0 or later), you can get some help [here](https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/SingleCluster.html).
+- [Spark](http://spark.apache.org/downloads.html) (version 1.6.x, Griffin does not support 2.0.x currently), if you want to install a Pseudo Distributed/Single Node Cluster, you can get some help [here](http://why-not-learn-something.blogspot.com/2015/06/spark-installation-pseudo.html).
+- [Hive](http://apache.claz.org/hive/hive-1.2.1/apache-hive-1.2.1-bin.tar.gz) (version 1.2.1 or later), you can get some help [here](https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-RunningHive).
+  You need to make sure that your Spark cluster can access your HiveContext.
+- [Livy](http://archive.cloudera.com/beta/livy/livy-server-0.3.0.zip), you can get some help [here](http://livy.io/quickstart.html).
+  Griffin needs to schedule Spark jobs from the server, so we use Livy to submit our jobs.
+  To work around some Livy issues with HiveContext, we need to download the following 3 files, or copy them from the Spark lib `$SPARK_HOME/lib/`, and put them into HDFS.
+  ```
+  datanucleus-api-jdo-3.2.6.jar
+  datanucleus-core-3.2.10.jar
+  datanucleus-rdbms-3.2.9.jar
+  ```
+- ElasticSearch.
+  ElasticSearch works as a metrics collector: Griffin publishes metrics to it, and the default UI reads metrics from it; you can also collect metrics in your own way.
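The jar upload above can be scripted. A minimal dry-run sketch, assuming the jars sit in `$SPARK_HOME/lib/` and an HDFS target directory of `/griffin/jars` (both placeholders, not values mandated by Griffin); it only prints the upload commands so you can review them before running against a real cluster:

```shell
# Dry-run sketch: print the HDFS upload command for each datanucleus jar
# that Livy needs. SPARK_HOME and /griffin/jars are assumed placeholders.
SPARK_HOME=${SPARK_HOME:-/opt/spark}

for jar in datanucleus-api-jdo-3.2.6.jar \
           datanucleus-core-3.2.10.jar \
           datanucleus-rdbms-3.2.9.jar; do
  echo "hdfs dfs -put ${SPARK_HOME}/lib/${jar} /griffin/jars/"
done
```

Once the printed paths look right for your environment, run the commands directly.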
+
+### Configuration
+
+Create database 'quartz' in MySQL:
+```
+mysql -u username -e "create database quartz" -p
+```
+Initialize the quartz tables in MySQL with service/src/main/resources/Init_quartz.sql:
+```
+mysql -u username -p quartz < service/src/main/resources/Init_quartz.sql
+```
+
+
+You should also modify some configurations of Griffin for your environment.
+
+- <b>service/src/main/resources/application.properties</b>
+
+  ```
+  # jpa
+  spring.datasource.url = jdbc:postgresql://<your IP>:5432/quartz?autoReconnect=true&useSSL=false
+  spring.datasource.username = <user name>
+  spring.datasource.password = <password>
+  spring.jpa.generate-ddl=true
+  spring.datasource.driverClassName = org.postgresql.Driver
+  spring.jpa.show-sql = true
+
+  # hive metastore
+  hive.metastore.uris = thrift://<your IP>:9083
+  hive.metastore.dbname = <hive database name>    # default is "default"
+
+  # external properties directory location, ignore it if not required
+  external.config.location =
+
+  # login strategy, default is "default"
+  login.strategy = <default or ldap>
+
+  # ldap properties, ignore them if ldap is not enabled
+  ldap.url = ldap://hostname:port
+  ldap.email = @example.com
+  ldap.searchBase = DC=org,DC=example
+  ldap.searchPattern = (sAMAccountName={0})
+
+  # hdfs, ignore it if you do not need predicate job
+  fs.defaultFS = hdfs://<hdfs-default-name>
+
+  # elasticsearch
+  elasticsearch.host = <your IP>
+  elasticsearch.port = <your elasticsearch rest port>
+  # authentication properties, uncomment if basic authentication is enabled
+  # elasticsearch.user = user
+  # elasticsearch.password = password
+  ```
+
+- <b>measure/src/main/resources/env.json</b>
+  ```
+  "persist": [
+    ...
+    {
+      "type": "http",
+      "config": {
+        "method": "post",
+        "api": "http://<your ES IP>:<ES rest port>/griffin/accuracy"
+      }
+    }
+  ]
+  ```
+  Put the modified env.json file into HDFS.
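Rather than hand-editing env.json, the ES persist entry above can be rendered from shell variables. A minimal sketch, where `10.0.0.5` and `9200` are placeholder values for your ES host and REST port, and the rest of env.json is omitted:

```shell
# Sketch: render the ElasticSearch "persist" entry of env.json from variables.
# ES_HOST and ES_PORT are placeholders; merge this into your full env.json.
ES_HOST=10.0.0.5
ES_PORT=9200

cat > env-persist.json <<EOF
{
  "persist": [
    {
      "type": "http",
      "config": {
        "method": "post",
        "api": "http://${ES_HOST}:${ES_PORT}/griffin/accuracy"
      }
    }
  ]
}
EOF

grep '"api"' env-persist.json
```

Upload the resulting file to HDFS afterwards, as described above.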
+
+- <b>service/src/main/resources/sparkJob.properties</b>
+  ```
+  sparkJob.file = hdfs://<griffin measure path>/griffin-measure.jar
+  sparkJob.args_1 = hdfs://<griffin env path>/env.json
+
+  sparkJob.jars = hdfs://<datanucleus path>/spark-avro_2.11-2.0.1.jar\
+                  hdfs://<datanucleus path>/datanucleus-api-jdo-3.2.6.jar\
+                  hdfs://<datanucleus path>/datanucleus-core-3.2.10.jar\
+                  hdfs://<datanucleus path>/datanucleus-rdbms-3.2.9.jar
+
+  spark.yarn.dist.files = hdfs:///<spark conf path>/hive-site.xml
+
+  livy.uri = http://<your IP>:8998/batches
+  spark.uri = http://<your IP>:8088
+  ```
+  - \<griffin measure path> is the HDFS location where you should put the measure module's jar file.
+  - \<griffin env path> is the HDFS location where you should put the env.json file.
+  - \<datanucleus path> is the HDFS location where you should put the 3 jar files required by Livy, plus the spark-avro jar file if you need to support Avro data.
+  - \<spark conf path> is the location of the Spark conf directory.
+
+### Build and Run
+
+Build the whole project and deploy (npm must be installed):
+
+  ```
+  mvn clean install
+  ```
+
+Put the measure module's jar file into \<griffin measure path> in HDFS:
+
+```
+cp measure/target/measure-<version>-incubating-SNAPSHOT.jar measure/target/griffin-measure.jar
+hdfs dfs -put measure/target/griffin-measure.jar <griffin measure path>/
+```
+
+After all the environment services have started up, we can start the Griffin server.
+
+  ```
+  java -jar service/target/service.jar
+  ```
+
+After a few seconds, we can visit the default UI of Griffin (by default, Spring Boot serves on port 8080).
+
+  ```
+  http://<your IP>:8080
+  ```
+
+You can use the UI by following the steps [here](../ui/user-guide.md).
+
+**Note**: The front-end UI is still under development; only some basic features are accessible currently.
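Put together, the build-and-run steps above can be sketched as one script. This is a dry run: it echoes each command instead of executing it, and both the version number and the `<griffin measure path>` value are placeholders you must adjust:

```shell
# Dry-run sketch of the deploy steps above; swap "echo" for real execution
# once the paths match your cluster.
VERSION=0.2.0                          # placeholder project version
MEASURE_PATH=hdfs:///griffin/measure   # placeholder <griffin measure path>

run() { echo "$@"; }

run mvn clean install
run cp measure/target/measure-${VERSION}-incubating-SNAPSHOT.jar \
       measure/target/griffin-measure.jar
run hdfs dfs -put measure/target/griffin-measure.jar ${MEASURE_PATH}/
run java -jar service/target/service.jar
```

Redefining `run` as plain execution turns the reviewed sequence into the actual deployment.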