Repository: zeppelin
Updated Branches:
  refs/heads/master b24491baf -> c7ce709f3

[ZEPPELIN-1279] Zeppelin with CDH5.x docker document.

### What is this PR for?
This PR is for the documentation of running zeppelin with CDH docker environment.
and This PR is the part of https://issues.apache.org/jira/browse/ZEPPELIN-1198.

Tested CDH5.7 on ubuntu.

### What type of PR is it?
Documentation

### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-1281

### Questions:
* Does the licenses files need update? no
* Is there breaking changes for older versions? no
* Does this needs documentation? no

Author: astroshim
Author: AhyoungRyu
Author: HyungSung

Closes #1451 from astroshim/ZEPPELIN-1281 and squashes the following commits:

5dcb8c1 [astroshim] move configurations to right path and add excluding rat-plugin
09408e3 [HyungSung] Merge pull request #11 from AhyoungRyu/ZEPPELIN-1281-ahyoung
850119c [AhyoungRyu] Generate TOC & change some sentences
e687a53 [AhyoungRyu] Replace zeppelin_with_cdh.png to crop the url part
cc9a023 [AhyoungRyu] Remove main title link anchor
b525f68 [astroshim] separate cdh doc with spark_cluster_mode.md
e66993f [astroshim] fix doc
a7b5b2d [astroshim] cdh docker environment

Project: http://git-wip-us.apache.org/repos/asf/zeppelin/repo
Commit: http://git-wip-us.apache.org/repos/asf/zeppelin/commit/c7ce709f
Tree: http://git-wip-us.apache.org/repos/asf/zeppelin/tree/c7ce709f
Diff: http://git-wip-us.apache.org/repos/asf/zeppelin/diff/c7ce709f

Branch: refs/heads/master
Commit: c7ce709f356c5d007e12824ff9214e9e95905d84
Parents: b24491b
Author: astroshim
Authored: Tue Sep 27 11:24:34 2016 +0900
Committer: AhyoungRyu
Committed: Thu Sep 29 21:01:31 2016 +0900

----------------------------------------------------------------------
 docs/_includes/themes/zeppelin/_navigation.html |   1 +
 .../img/docs-img/cdh_yarn_applications.png      | Bin 0 -> 124719 bytes
 .../zeppelin/img/docs-img/zeppelin_with_cdh.png | Bin 0 -> 41727 bytes
 docs/index.md                                   |   1 +
 docs/install/cdh.md                             | 100 +++++++++++++++++++
 pom.xml                                         |   1 +
 .../cdh/hdfs_conf/core-site.xml                 |   6 ++
 .../cdh/hdfs_conf/hdfs-site.xml                 |  64 ++++++++++++
 .../cdh/hdfs_conf/mapred-site.xml               |   6 ++
 .../cdh/hdfs_conf/yarn-site.xml                 |  26 +++++
 10 files changed, 205 insertions(+)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/c7ce709f/docs/_includes/themes/zeppelin/_navigation.html
----------------------------------------------------------------------
diff --git a/docs/_includes/themes/zeppelin/_navigation.html b/docs/_includes/themes/zeppelin/_navigation.html
index e86ffb7..4a7e75b 100644
--- a/docs/_includes/themes/zeppelin/_navigation.html
+++ b/docs/_includes/themes/zeppelin/_navigation.html
@@ -108,6 +108,7 @@
         <li><a href="{{BASE_PATH}}/install/spark_cluster_mode.html#spark-standalone-mode">Zeppelin on Spark Cluster Mode (Standalone)</a></li>
         <li><a href="{{BASE_PATH}}/install/spark_cluster_mode.html#spark-on-yarn-mode">Zeppelin on Spark Cluster Mode (YARN)</a></li>
         <li><a href="{{BASE_PATH}}/install/spark_cluster_mode.html#spark-on-mesos-mode">Zeppelin on Spark Cluster Mode (Mesos)</a></li>
+        <li><a href="{{BASE_PATH}}/install/cdh.html">Zeppelin on CDH</a></li>
         <li role="separator" class="divider"></li>
         <li class="title"><span><b>Contibute</b><span></li>
         <li><a href="{{BASE_PATH}}/development/writingzeppelininterpreter.html">Writing Zeppelin Interpreter</a></li>

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/c7ce709f/docs/assets/themes/zeppelin/img/docs-img/cdh_yarn_applications.png
----------------------------------------------------------------------
diff --git a/docs/assets/themes/zeppelin/img/docs-img/cdh_yarn_applications.png b/docs/assets/themes/zeppelin/img/docs-img/cdh_yarn_applications.png
new file mode 100644
index 0000000..980ea5b
Binary files /dev/null and b/docs/assets/themes/zeppelin/img/docs-img/cdh_yarn_applications.png differ

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/c7ce709f/docs/assets/themes/zeppelin/img/docs-img/zeppelin_with_cdh.png
----------------------------------------------------------------------
diff --git a/docs/assets/themes/zeppelin/img/docs-img/zeppelin_with_cdh.png b/docs/assets/themes/zeppelin/img/docs-img/zeppelin_with_cdh.png
new file mode 100644
index 0000000..9dae220
Binary files /dev/null and b/docs/assets/themes/zeppelin/img/docs-img/zeppelin_with_cdh.png differ

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/c7ce709f/docs/index.md
----------------------------------------------------------------------
diff --git a/docs/index.md b/docs/index.md
index 8c2ce95..0f25750 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -172,6 +172,7 @@ Join to our [Mailing list](https://zeppelin.apache.org/community.html) and repor
   * [Zeppelin on Spark Cluster Mode (Standalone via Docker)](./install/spark_cluster_mode.html#spark-standalone-mode)
   * [Zeppelin on Spark Cluster Mode (YARN via Docker)](./install/spark_cluster_mode.html#spark-on-yarn-mode)
   * [Zeppelin on Spark Cluster Mode (Mesos via Docker)](./install/spark_cluster_mode.html#spark-on-mesos-mode)
+  * [Zeppelin on CDH (via Docker)](./install/cdh.html)
 * Contribute
   * [Writing Zeppelin Interpreter](./development/writingzeppelininterpreter.html)
   * [Writing Zeppelin Application (Experimental)](./development/writingzeppelinapplication.html)

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/c7ce709f/docs/install/cdh.md
----------------------------------------------------------------------
diff --git a/docs/install/cdh.md b/docs/install/cdh.md
new file mode 100644
index 0000000..f661417
--- /dev/null
+++ b/docs/install/cdh.md
@@ -0,0 +1,100 @@
+---
+layout: page
+title: "Apache Zeppelin on CDH"
+description: "This document will guide you how you can build and configure the environment on CDH with Apache Zeppelin using docker scripts."
+group: install
+---
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+{% include JB/setup %}
+
+# Apache Zeppelin on CDH
+
+<div id="toc"></div>
+
+### 1. Import Cloudera QuickStart Docker image
+
+>[Cloudera](http://www.cloudera.com/) has officially provided CDH Docker Hub in their own container. Please check [this guide page](http://www.cloudera.com/documentation/enterprise/latest/topics/quickstart_docker_container.html#cloudera_docker_container) for more information.
+
+You can import the Docker image by pulling it from Cloudera Docker Hub.
+
+```
+docker pull cloudera/quickstart:latest
+```
+
+
+### 2. Run docker
+
+```
+docker run -it \
+ -p 80:80 \
+ -p 4040:4040 \
+ -p 8020:8020 \
+ -p 8022:8022 \
+ -p 8030:8030 \
+ -p 8032:8032 \
+ -p 8033:8033 \
+ -p 8040:8040 \
+ -p 8042:8042 \
+ -p 8088:8088 \
+ -p 8480:8480 \
+ -p 8485:8485 \
+ -p 8888:8888 \
+ -p 9083:9083 \
+ -p 10020:10020 \
+ -p 10033:10033 \
+ -p 18088:18088 \
+ -p 19888:19888 \
+ -p 25000:25000 \
+ -p 25010:25010 \
+ -p 25020:25020 \
+ -p 50010:50010 \
+ -p 50020:50020 \
+ -p 50070:50070 \
+ -p 50075:50075 \
+ -h quickstart.cloudera --privileged=true \
+ agitated_payne_backup /usr/bin/docker-quickstart;
+```
+
+### 3. Verify running CDH
+
+To verify the application is running well, check the web UI for HDFS on `http://<hostname>:50070/` and YARN on `http://<hostname>:8088/cluster`.
+
+
+### 4. Configure Spark interpreter in Zeppelin
+Set following configurations to `conf/zeppelin-env.sh`.
+
+```
+export MASTER=yarn-client
+export HADOOP_CONF_DIR=[your_hadoop_conf_path]
+export SPARK_HOME=[your_spark_home_path]
+```
+
+`HADOOP_CONF_DIR`(Hadoop configuration path) is defined in `/scripts/docker/spark-cluster-managers/cdh/hdfs_conf`.
+
+Don't forget to set Spark `master` as `yarn-client` in Zeppelin **Interpreters** setting page like below.
+
+<img src="../assets/themes/zeppelin/img/docs-img/zeppelin_yarn_conf.png" />
+
+### 5. Run Zeppelin with Spark interpreter
+After running a single paragraph with Spark interpreter in Zeppelin,
+
+<img src="../assets/themes/zeppelin/img/docs-img/zeppelin_with_cdh.png" />
+
+<br/>
+
+browse `http://<hostname>:8088/cluster/apps` to check Zeppelin application is running well or not.
+
+<img src="../assets/themes/zeppelin/img/docs-img/cdh_yarn_applications.png" />
+

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/c7ce709f/pom.xml
----------------------------------------------------------------------
diff --git a/pom.xml b/pom.xml
index c93f4b8..03b2263 100644
--- a/pom.xml
+++ b/pom.xml
@@ -753,6 +753,7 @@
               <exclude>.spark-dist/**</exclude>
               <exclude>**/interpreter-setting.json</exclude>
               <exclude>**/constants.json</exclude>
+              <exclude>scripts/**</exclude>
 
               <!-- bundled from bootstrap -->
               <exclude>docs/assets/themes/zeppelin/bootstrap/**</exclude>

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/c7ce709f/scripts/docker/spark-cluster-managers/cdh/hdfs_conf/core-site.xml
----------------------------------------------------------------------
diff --git a/scripts/docker/spark-cluster-managers/cdh/hdfs_conf/core-site.xml b/scripts/docker/spark-cluster-managers/cdh/hdfs_conf/core-site.xml
new file mode 100644
index 0000000..6cdbc7f
--- /dev/null
+++ b/scripts/docker/spark-cluster-managers/cdh/hdfs_conf/core-site.xml
@@ -0,0 +1,6 @@
+<configuration>
+  <property>
+    <name>fs.defaultFS</name>
+    <value>hdfs://0.0.0.0:8020</value>
+  </property>
+</configuration>

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/c7ce709f/scripts/docker/spark-cluster-managers/cdh/hdfs_conf/hdfs-site.xml
----------------------------------------------------------------------
diff --git a/scripts/docker/spark-cluster-managers/cdh/hdfs_conf/hdfs-site.xml b/scripts/docker/spark-cluster-managers/cdh/hdfs_conf/hdfs-site.xml
new file mode 100644
index 0000000..ce031cf
--- /dev/null
+++ b/scripts/docker/spark-cluster-managers/cdh/hdfs_conf/hdfs-site.xml
@@ -0,0 +1,64 @@
+<configuration>
+  <property>
+    <name>dfs.replication</name>
+    <value>1</value>
+  </property>
+
+
+  <property>
+    <name>dfs.data.dir</name>
+    <value>/data/hdfs</value>
+    <final>true</final>
+  </property>
+
+  <property>
+    <name>dfs.permissions</name>
+    <value>false</value>
+  </property>
+
+
+  <property>
+    <name>dfs.client.use.datanode.hostname</name>
+    <value>true</value>
+    <description>Whether clients should use datanode hostnames when
+    connecting to datanodes.
+    </description>
+  </property>
+
+  <property>
+    <name>dfs.datanode.use.datanode.hostname</name>
+    <value>true</value>
+    <description>Whether datanodes should use datanode hostnames when
+    connecting to other datanodes for data transfer.
+    </description>
+  </property>
+
+  <property>
+    <name>dfs.datanode.address</name>
+    <value>0.0.0.0:50010</value>
+    <description>
+    The address where the datanode server will listen to.
+    If the port is 0 then the server will start on a free port.
+    </description>
+  </property>
+
+  <property>
+    <name>dfs.datanode.http.address</name>
+    <value>0.0.0.0:50075</value>
+    <description>
+    The datanode http server address and port.
+    If the port is 0 then the server will start on a free port.
+    </description>
+  </property>
+
+  <property>
+    <name>dfs.datanode.ipc.address</name>
+    <value>0.0.0.0:50020</value>
+    <description>
+    The datanode ipc server address and port.
+    If the port is 0 then the server will start on a free port.
+    </description>
+  </property>
+
+</configuration>
+

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/c7ce709f/scripts/docker/spark-cluster-managers/cdh/hdfs_conf/mapred-site.xml
----------------------------------------------------------------------
diff --git a/scripts/docker/spark-cluster-managers/cdh/hdfs_conf/mapred-site.xml b/scripts/docker/spark-cluster-managers/cdh/hdfs_conf/mapred-site.xml
new file mode 100644
index 0000000..6dc557d
--- /dev/null
+++ b/scripts/docker/spark-cluster-managers/cdh/hdfs_conf/mapred-site.xml
@@ -0,0 +1,6 @@
+<configuration>
+  <property>
+    <name>mapreduce.framework.name</name>
+    <value>yarn</value>
+  </property>
+</configuration>

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/c7ce709f/scripts/docker/spark-cluster-managers/cdh/hdfs_conf/yarn-site.xml
----------------------------------------------------------------------
diff --git a/scripts/docker/spark-cluster-managers/cdh/hdfs_conf/yarn-site.xml b/scripts/docker/spark-cluster-managers/cdh/hdfs_conf/yarn-site.xml
new file mode 100644
index 0000000..4fce42f9
--- /dev/null
+++ b/scripts/docker/spark-cluster-managers/cdh/hdfs_conf/yarn-site.xml
@@ -0,0 +1,26 @@
+<configuration>
+  <property>
+    <name>yarn.resourcemanager.scheduler.address</name>
+    <value>0.0.0.0:8030</value>
+  </property>
+  <property>
+    <name>yarn.resourcemanager.address</name>
+    <value>0.0.0.0:8032</value>
+  </property>
+  <property>
+    <name>yarn.resourcemanager.webapp.address</name>
+    <value>0.0.0.0:8088</value>
+  </property>
+  <property>
+    <name>yarn.resourcemanager.resource-tracker.address</name>
+    <value>0.0.0.0:8031</value>
+  </property>
+  <property>
+    <name>yarn.resourcemanager.admin.address</name>
+    <value>0.0.0.0:8033</value>
+  </property>
+  <property>
+    <name>yarn.application.classpath</name>
+    <value>/usr/local/hadoop/etc/hadoop, /usr/local/hadoop/share/hadoop/common/*, /usr/local/hadoop/share/hadoop/common/lib/*, /usr/local/hadoop/share/hadoop/hdfs/*, /usr/local/hadoop/share/hadoop/hdfs/lib/*, /usr/local/hadoop/share/hadoop/mapreduce/*, /usr/local/hadoop/share/hadoop/mapreduce/lib/*, /usr/local/hadoop/share/hadoop/yarn/*, /usr/local/hadoop/share/hadoop/yarn/lib/*, /usr/local/hadoop/share/spark/*</value>
+  </property>
+</configuration>
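
Step 3 of the added `docs/install/cdh.md` asks the reader to check the HDFS and YARN web UIs by hand. That check could be scripted roughly as follows. This is a minimal sketch, assuming `curl` is installed on the host; the function names `cdh_ui_urls` and `check_cdh_uis` are hypothetical and not part of this commit, and only the two ports (50070, 8088) come from the document.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the commit) for step 3 of cdh.md.

# Print the two web UI URLs named in the doc for a given host.
# Falls back to "localhost" when no host is passed.
cdh_ui_urls() {
  host="${1:-localhost}"
  printf 'http://%s:50070/\n' "$host"        # HDFS NameNode web UI
  printf 'http://%s:8088/cluster\n' "$host"  # YARN ResourceManager web UI
}

# Probe each URL with curl and report whether it answered successfully.
# -f fails on HTTP errors, -sS is quiet but still prints errors,
# --max-time bounds each probe to 5 seconds.
check_cdh_uis() {
  cdh_ui_urls "$1" | while read -r url; do
    if curl -fsS -o /dev/null --max-time 5 "$url"; then
      echo "OK   $url"
    else
      echo "FAIL $url"
    fi
  done
}
```

After the container from step 2 is up, running `check_cdh_uis <hostname>` would print one `OK` or `FAIL` line per UI. The sketch only defines functions, so sourcing it makes no network calls.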
