Repository: zeppelin
Updated Branches:
  refs/heads/master c4319b775 -> eccfe0076


[ZEPPELIN-1280][Spark on Yarn] Documents for running zeppelin on production environments using docker.

### What is this PR for?
This PR adds documentation for running Zeppelin in production environments, especially Spark on YARN.
The related issue is https://github.com/apache/zeppelin/pull/1227, and I got a lot of hints from https://github.com/sequenceiq/hadoop-docker.
Tested on Ubuntu.

### What type of PR is it?
Documentation

### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-1280

### Questions:
* Do the license files need to be updated? no
* Are there breaking changes for older versions? no
* Does this need documentation? no

Author: astroshim <hss...@nflabs.com>
Author: AhyoungRyu <fbdkdu...@hanmail.net>
Author: HyungSung <hss...@nflabs.com>

Closes #1318 from astroshim/ZEPPELIN-1280 and squashes the following commits:

60958cd [astroshim] small changes for doc
6c44b7b [astroshim] Merge branch 'master' into ZEPPELIN-1280
dad297c [astroshim] update version
4c8d72d [astroshim] merge with Ayoung's
8c62cf1 [astroshim] fixed felixcheung pointed out.
86ca513 [HyungSung] Merge pull request #9 from AhyoungRyu/ZEPPELIN-1280-ahyoung
cde5f8d [AhyoungRyu] Modify document description so that this docs can be searched
9e9390c [AhyoungRyu] Minor update for spark_cluster_mode.md
633c930 [astroshim] running zeppelin on yarn


Project: http://git-wip-us.apache.org/repos/asf/zeppelin/repo
Commit: http://git-wip-us.apache.org/repos/asf/zeppelin/commit/eccfe007
Tree: http://git-wip-us.apache.org/repos/asf/zeppelin/tree/eccfe007
Diff: http://git-wip-us.apache.org/repos/asf/zeppelin/diff/eccfe007

Branch: refs/heads/master
Commit: eccfe0076b42f65f9b4da2065f734f746b00f2c0
Parents: c4319b7
Author: astroshim <hss...@nflabs.com>
Authored: Thu Aug 18 17:57:12 2016 +0900
Committer: Alexander Bezzubov <b...@apache.org>
Committed: Mon Aug 29 16:05:14 2016 +0900

----------------------------------------------------------------------
 docs/_includes/themes/zeppelin/_navigation.html |   1 +
 .../zeppelin/img/docs-img/yarn_applications.png | Bin 0 -> 97514 bytes
 .../img/docs-img/zeppelin_yarn_conf.png         | Bin 0 -> 216859 bytes
 docs/index.md                                   |   1 +
 docs/install/spark_cluster_mode.md              |  74 ++++++++++++-
 .../spark_standalone/Dockerfile                 |   1 -
 .../spark_yarn_cluster/Dockerfile               | 107 +++++++++++++++++++
 .../spark_yarn_cluster/entrypoint.sh            |  60 +++++++++++
 .../spark_yarn_cluster/hdfs_conf/core-site.xml  |  22 ++++
 .../spark_yarn_cluster/hdfs_conf/hdfs-site.xml  |  78 ++++++++++++++
 .../hdfs_conf/mapred-site.xml                   |  22 ++++
 .../spark_yarn_cluster/hdfs_conf/yarn-site.xml  |  42 ++++++++
 .../spark_yarn_cluster/ssh_config               |  18 ++++
 13 files changed, 422 insertions(+), 4 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/zeppelin/blob/eccfe007/docs/_includes/themes/zeppelin/_navigation.html
----------------------------------------------------------------------
diff --git a/docs/_includes/themes/zeppelin/_navigation.html b/docs/_includes/themes/zeppelin/_navigation.html
index a396d56..4bbe64e 100644
--- a/docs/_includes/themes/zeppelin/_navigation.html
+++ b/docs/_includes/themes/zeppelin/_navigation.html
@@ -105,6 +105,7 @@
                 <li class="title"><span><b>Advanced</b><span></li>
                 <li><a href="{{BASE_PATH}}/install/virtual_machine.html">Zeppelin on Vagrant VM</a></li>
                 <li><a href="{{BASE_PATH}}/install/spark_cluster_mode.html#spark-standalone-mode">Zeppelin on Spark Cluster Mode (Standalone)</a></li>
+                <li><a href="{{BASE_PATH}}/install/spark_cluster_mode.html#spark-yarn-mode">Zeppelin on Spark Cluster Mode (YARN)</a></li>
                 <li role="separator" class="divider"></li>
                 <li class="title"><span><b>Contribute</b><span></li>
                 <li><a href="{{BASE_PATH}}/development/writingzeppelininterpreter.html">Writing Zeppelin Interpreter</a></li>

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/eccfe007/docs/assets/themes/zeppelin/img/docs-img/yarn_applications.png
----------------------------------------------------------------------
diff --git a/docs/assets/themes/zeppelin/img/docs-img/yarn_applications.png b/docs/assets/themes/zeppelin/img/docs-img/yarn_applications.png
new file mode 100644
index 0000000..06c5296
Binary files /dev/null and b/docs/assets/themes/zeppelin/img/docs-img/yarn_applications.png differ

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/eccfe007/docs/assets/themes/zeppelin/img/docs-img/zeppelin_yarn_conf.png
----------------------------------------------------------------------
diff --git a/docs/assets/themes/zeppelin/img/docs-img/zeppelin_yarn_conf.png b/docs/assets/themes/zeppelin/img/docs-img/zeppelin_yarn_conf.png
new file mode 100644
index 0000000..435193a
Binary files /dev/null and b/docs/assets/themes/zeppelin/img/docs-img/zeppelin_yarn_conf.png differ

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/eccfe007/docs/index.md
----------------------------------------------------------------------
diff --git a/docs/index.md b/docs/index.md
index 70931e5..bff5253 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -170,6 +170,7 @@ Join to our [Mailing list](https://zeppelin.apache.org/community.html) and repor
 * Advanced
   * [Apache Zeppelin on Vagrant VM](./install/virtual_machine.html)
   * [Zeppelin on Spark Cluster Mode (Standalone via Docker)](./install/spark_cluster_mode.html#spark-standalone-mode)
+  * [Zeppelin on Spark Cluster Mode (YARN via Docker)](./install/spark_cluster_mode.html#spark-yarn-mode)
 * Contribute
   * [Writing Zeppelin Interpreter](./development/writingzeppelininterpreter.html)
   * [Writing Zeppelin Application (Experimental)](./development/writingzeppelinapplication.html)

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/eccfe007/docs/install/spark_cluster_mode.md
----------------------------------------------------------------------
diff --git a/docs/install/spark_cluster_mode.md b/docs/install/spark_cluster_mode.md
index d2517bd..47f688c 100644
--- a/docs/install/spark_cluster_mode.md
+++ b/docs/install/spark_cluster_mode.md
@@ -1,7 +1,7 @@
 ---
 layout: page
 title: "Apache Zeppelin on Spark cluster mode"
-description: ""
+description: "This guide explains how to build and configure an Apache Zeppelin environment on three types of Spark cluster managers using Docker scripts."
 group: install
 ---
 <!--
@@ -56,12 +56,12 @@ spark_standalone bash;
 ```
 
 ### 3. Configure Spark interpreter in Zeppelin
-Set Spark master as `spark://localhost:7077` in Zeppelin **Interpreters** setting page.
+Set Spark master as `spark://<hostname>:7077` in the Zeppelin **Interpreters** setting page.
 
 <img src="../assets/themes/zeppelin/img/docs-img/standalone_conf.png" />
 
 ### 4. Run Zeppelin with Spark interpreter
-After running single paragraph with Spark interpreter in Zeppelin, browse `https://localhost:8080` and check whether Spark cluster is running well or not.
+After running a single paragraph with the Spark interpreter in Zeppelin, browse `https://<hostname>:8080` and check whether the Spark cluster is running well.
 
 <img src="../assets/themes/zeppelin/img/docs-img/spark_ui.png" />
 
@@ -72,3 +72,71 @@ ps -ef | grep spark
 ```
 
 
+## Spark on YARN mode
+You can simply set up a [Spark on YARN](http://spark.apache.org/docs/latest/running-on-yarn.html) Docker environment with the steps below.
+
+> **Note :** Since Apache Zeppelin and Spark use the same `8080` port for their web UIs, you might need to change `zeppelin.server.port` in `conf/zeppelin-site.xml`.
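
For illustration, the port change mentioned in the note could look like the fragment below in `conf/zeppelin-site.xml` (a sketch; `8180` is an arbitrary free port chosen here, not one mandated by this commit):

```xml
<property>
  <name>zeppelin.server.port</name>
  <value>8180</value>
</property>
```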
+
+### 1. Build the Docker file
+You can find the Docker script files under `scripts/docker/spark-cluster-managers`.
+
+```
+cd $ZEPPELIN_HOME/scripts/docker/spark-cluster-managers/spark_yarn_cluster
+docker build -t "spark_yarn" .
+```
+
+### 2. Run docker
+
+```
+docker run -it \
+ -p 5000:5000 \
+ -p 9000:9000 \
+ -p 9001:9001 \
+ -p 8088:8088 \
+ -p 8042:8042 \
+ -p 8030:8030 \
+ -p 8031:8031 \
+ -p 8032:8032 \
+ -p 8033:8033 \
+ -p 8080:8080 \
+ -p 7077:7077 \
+ -p 8888:8888 \
+ -p 8081:8081 \
+ -p 50010:50010 \
+ -p 50075:50075 \
+ -p 50020:50020 \
+ -p 50070:50070 \
+ --name spark_yarn \
+ -h sparkmaster \
+ spark_yarn bash;
+```
+
+### 3. Verify that Spark on YARN is running
+
+You can verify that the Spark and YARN processes are running inside Docker with the command below.
+
+```
+ps -ef
+```
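
As a rough sketch, you could narrow the `ps -ef` output down to the daemons this image starts (the HDFS NameNode/DataNode, the YARN ResourceManager/NodeManager, and the Spark Master/Worker); the process names used in the filter are the standard Hadoop/Spark ones, not something this commit defines:

```shell
# Filter the process list for the expected Hadoop, YARN, and Spark daemons.
# If one of these names never appears, that daemon failed to start.
ps -ef | grep -E 'NameNode|DataNode|ResourceManager|NodeManager|Master|Worker' | grep -v grep
```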
+
+You can also check each application's web UI: HDFS at `http://<hostname>:50070/`, YARN at `http://<hostname>:8088/cluster`, and Spark at `http://<hostname>:8080/`.
+
+### 4. Configure Spark interpreter in Zeppelin
+Add the following configuration to `conf/zeppelin-env.sh`.
+
+```
+export MASTER=yarn-client
+export HADOOP_CONF_DIR=[your_hadoop_conf_path]
+export SPARK_HOME=[your_spark_home_path]
+```
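
Filled in for the image built from this commit's Dockerfile (which installs Hadoop under `/usr/local/hadoop` and symlinks Spark at `/usr/local/spark`), the exports would look roughly like this sketch:

```shell
# Paths taken from the spark_yarn_cluster Dockerfile in this commit;
# adjust them if your image lays things out differently.
export MASTER=yarn-client
export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
export SPARK_HOME=/usr/local/spark
```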
+
+`HADOOP_CONF_DIR` (the Hadoop configuration path) is defined in `/scripts/docker/spark-cluster-managers/spark_yarn_cluster/hdfs_conf`.
+
+Don't forget to set the Spark `master` to `yarn-client` in the Zeppelin **Interpreters** setting page, as shown below.
+
+<img src="../assets/themes/zeppelin/img/docs-img/zeppelin_yarn_conf.png" />
+
+### 5. Run Zeppelin with Spark interpreter
+After running a single paragraph with the Spark interpreter in Zeppelin, browse `http://<hostname>:8088/cluster/apps` and check whether the Zeppelin application is running well.
+
+<img src="../assets/themes/zeppelin/img/docs-img/yarn_applications.png" />

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/eccfe007/scripts/docker/spark-cluster-managers/spark_standalone/Dockerfile
----------------------------------------------------------------------
diff --git a/scripts/docker/spark-cluster-managers/spark_standalone/Dockerfile b/scripts/docker/spark-cluster-managers/spark_standalone/Dockerfile
index a7bae23..fa3078b 100644
--- a/scripts/docker/spark-cluster-managers/spark_standalone/Dockerfile
+++ b/scripts/docker/spark-cluster-managers/spark_standalone/Dockerfile
@@ -13,7 +13,6 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 FROM centos:centos6
-MAINTAINER hss...@nflabs.com
 
 ENV SPARK_PROFILE 1.6
 ENV SPARK_VERSION 1.6.2

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/eccfe007/scripts/docker/spark-cluster-managers/spark_yarn_cluster/Dockerfile
----------------------------------------------------------------------
diff --git a/scripts/docker/spark-cluster-managers/spark_yarn_cluster/Dockerfile b/scripts/docker/spark-cluster-managers/spark_yarn_cluster/Dockerfile
new file mode 100644
index 0000000..712c5b2
--- /dev/null
+++ b/scripts/docker/spark-cluster-managers/spark_yarn_cluster/Dockerfile
@@ -0,0 +1,107 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+FROM centos:centos6
+
+ENV SPARK_PROFILE 2.0
+ENV SPARK_VERSION 2.0.0
+ENV HADOOP_PROFILE 2.7
+ENV HADOOP_VERSION 2.7.0
+
+# Update the image with the latest packages
+RUN yum update -y; yum clean all
+
+# Get utils
+RUN yum install -y \
+wget \
+tar \
+curl \
+&& \
+yum clean all
+
+# Remove old jdk
+RUN yum remove java; yum remove jdk
+
+# install jdk7
+RUN yum install -y java-1.7.0-openjdk-devel
+ENV JAVA_HOME /usr/lib/jvm/java
+ENV PATH $PATH:$JAVA_HOME/bin
+
+# install hadoop 
+RUN yum install -y curl which tar sudo openssh-server openssh-clients rsync
+
+# hadoop
+RUN curl -s https://archive.apache.org/dist/hadoop/core/hadoop-$HADOOP_VERSION/hadoop-$HADOOP_VERSION.tar.gz | tar -xz -C /usr/local/
+RUN cd /usr/local && ln -s ./hadoop-$HADOOP_VERSION hadoop
+
+ENV HADOOP_PREFIX /usr/local/hadoop
+ENV HADOOP_COMMON_HOME /usr/local/hadoop
+ENV HADOOP_HDFS_HOME /usr/local/hadoop
+ENV HADOOP_MAPRED_HOME /usr/local/hadoop
+ENV HADOOP_YARN_HOME /usr/local/hadoop
+ENV HADOOP_CONF_DIR /usr/local/hadoop/etc/hadoop
+
+RUN sed -i '/^export JAVA_HOME/ s:.*:export JAVA_HOME=/usr/lib/jvm/jre-1.7.0-openjdk.x86_64\nexport HADOOP_PREFIX=/usr/local/hadoop\nexport HADOOP_HOME=/usr/local/hadoop\n:' $HADOOP_PREFIX/etc/hadoop/hadoop-env.sh
+RUN sed -i '/^export HADOOP_CONF_DIR/ s:.*:export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop/:' $HADOOP_PREFIX/etc/hadoop/hadoop-env.sh
+
+RUN mkdir $HADOOP_PREFIX/input
+RUN cp $HADOOP_PREFIX/etc/hadoop/*.xml $HADOOP_PREFIX/input
+
+# hadoop configurations
+ADD hdfs_conf/core-site.xml $HADOOP_PREFIX/etc/hadoop/core-site.xml
+ADD hdfs_conf/hdfs-site.xml $HADOOP_PREFIX/etc/hadoop/hdfs-site.xml
+ADD hdfs_conf/mapred-site.xml $HADOOP_PREFIX/etc/hadoop/mapred-site.xml
+ADD hdfs_conf/yarn-site.xml $HADOOP_PREFIX/etc/hadoop/yarn-site.xml
+
+RUN mkdir /data/
+RUN chmod 777 /data/
+RUN $HADOOP_PREFIX/bin/hdfs namenode -format
+
+RUN rm  /usr/local/hadoop/lib/native/*
+RUN curl -Ls http://dl.bintray.com/sequenceiq/sequenceiq-bin/hadoop-native-64-$HADOOP_VERSION.tar|tar -x -C /usr/local/hadoop/lib/native/
+
+# install spark
+RUN curl -s http://archive.apache.org/dist/spark/spark-$SPARK_VERSION/spark-$SPARK_VERSION-bin-hadoop$HADOOP_PROFILE.tgz | tar -xz -C /usr/local/
+RUN cd /usr/local && ln -s spark-$SPARK_VERSION-bin-hadoop$HADOOP_PROFILE spark
+ENV SPARK_HOME /usr/local/spark
+
+ENV YARN_CONF_DIR $HADOOP_PREFIX/etc/hadoop
+ENV PATH $PATH:$SPARK_HOME/bin:$HADOOP_PREFIX/bin
+
+# passwordless ssh
+RUN ssh-keygen -q -N "" -t dsa -f /etc/ssh/ssh_host_dsa_key
+RUN ssh-keygen -q -N "" -t rsa -f /etc/ssh/ssh_host_rsa_key
+RUN ssh-keygen -q -N "" -t rsa -f /root/.ssh/id_rsa
+RUN cp /root/.ssh/id_rsa.pub /root/.ssh/authorized_keys
+
+ADD ssh_config /root/.ssh/config
+RUN chmod 600 /root/.ssh/config
+RUN chown root:root /root/.ssh/config
+RUN chmod +x /usr/local/hadoop/etc/hadoop/*-env.sh
+
+# update boot script
+COPY entrypoint.sh /etc/entrypoint.sh
+RUN chown root.root /etc/entrypoint.sh
+RUN chmod 700 /etc/entrypoint.sh
+
+# Hdfs ports
+EXPOSE 50010 50020 50070 50075 50090
+# Mapred ports
+EXPOSE 9000 9001
+#Yarn ports
+EXPOSE 8030 8031 8032 8033 8040 8042 8088
+#spark
+EXPOSE 8080 7077 8888 8081
+
+ENTRYPOINT ["/etc/entrypoint.sh"]

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/eccfe007/scripts/docker/spark-cluster-managers/spark_yarn_cluster/entrypoint.sh
----------------------------------------------------------------------
diff --git a/scripts/docker/spark-cluster-managers/spark_yarn_cluster/entrypoint.sh b/scripts/docker/spark-cluster-managers/spark_yarn_cluster/entrypoint.sh
new file mode 100755
index 0000000..85b335d
--- /dev/null
+++ b/scripts/docker/spark-cluster-managers/spark_yarn_cluster/entrypoint.sh
@@ -0,0 +1,60 @@
+#!/bin/bash
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+echo 'hadoop' |passwd root --stdin
+
+: ${HADOOP_PREFIX:=/usr/local/hadoop}
+
+$HADOOP_PREFIX/etc/hadoop/hadoop-env.sh
+
+rm /tmp/*.pid
+
+# installing libraries if any - (resource urls added comma separated to the ACP system variable)
+cd $HADOOP_PREFIX/share/hadoop/common ; for cp in ${ACP//,/ }; do  echo == $cp; curl -LO $cp ; done; cd -
+
+cp $SPARK_HOME/conf/metrics.properties.template $SPARK_HOME/conf/metrics.properties
+
+# start hadoop
+service sshd start
+$HADOOP_PREFIX/sbin/start-dfs.sh
+$HADOOP_PREFIX/sbin/start-yarn.sh
+
+$HADOOP_PREFIX/bin/hdfs dfsadmin -safemode leave && $HADOOP_PREFIX/bin/hdfs dfs -put $SPARK_HOME-$SPARK_VERSION-bin-hadoop$HADOOP_PROFILE/lib /spark
+
+# start spark
+export SPARK_MASTER_OPTS="-Dspark.driver.port=7001 -Dspark.fileserver.port=7002 -Dspark.broadcast.port=7003 -Dspark.replClassServer.port=7004 -Dspark.blockManager.port=7005 -Dspark.executor.port=7006 -Dspark.ui.port=4040 -Dspark.broadcast.factory=org.apache.spark.broadcast.HttpBroadcastFactory"
+export SPARK_WORKER_OPTS="-Dspark.driver.port=7001 -Dspark.fileserver.port=7002 -Dspark.broadcast.port=7003 -Dspark.replClassServer.port=7004 -Dspark.blockManager.port=7005 -Dspark.executor.port=7006 -Dspark.ui.port=4040 -Dspark.broadcast.factory=org.apache.spark.broadcast.HttpBroadcastFactory"
+
+export SPARK_MASTER_PORT=7077
+
+cd /usr/local/spark/sbin
+./start-master.sh
+./start-slave.sh spark://`hostname`:$SPARK_MASTER_PORT
+
+CMD=${1:-"exit 0"}
+if [[ "$CMD" == "-d" ]];
+then
+       service sshd stop
+       /usr/sbin/sshd -D -d
+else
+       /bin/bash -c "$*"
+fi

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/eccfe007/scripts/docker/spark-cluster-managers/spark_yarn_cluster/hdfs_conf/core-site.xml
----------------------------------------------------------------------
diff --git a/scripts/docker/spark-cluster-managers/spark_yarn_cluster/hdfs_conf/core-site.xml b/scripts/docker/spark-cluster-managers/spark_yarn_cluster/hdfs_conf/core-site.xml
new file mode 100644
index 0000000..8744633
--- /dev/null
+++ b/scripts/docker/spark-cluster-managers/spark_yarn_cluster/hdfs_conf/core-site.xml
@@ -0,0 +1,22 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to You under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+<configuration>
+  <property>
+    <name>fs.defaultFS</name>
+    <value>hdfs://0.0.0.0:9000</value>
+  </property>
+</configuration>

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/eccfe007/scripts/docker/spark-cluster-managers/spark_yarn_cluster/hdfs_conf/hdfs-site.xml
----------------------------------------------------------------------
diff --git a/scripts/docker/spark-cluster-managers/spark_yarn_cluster/hdfs_conf/hdfs-site.xml b/scripts/docker/spark-cluster-managers/spark_yarn_cluster/hdfs_conf/hdfs-site.xml
new file mode 100644
index 0000000..b3f88af
--- /dev/null
+++ b/scripts/docker/spark-cluster-managers/spark_yarn_cluster/hdfs_conf/hdfs-site.xml
@@ -0,0 +1,78 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to You under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+<configuration>
+  <property>
+    <name>dfs.replication</name>
+    <value>1</value>
+  </property>
+
+  <property>
+    <name>dfs.data.dir</name>
+    <value>/data/hdfs</value>
+    <final>true</final>
+  </property>
+
+  <property>
+    <name>dfs.permissions</name>
+    <value>false</value>
+  </property>
+
+  <property>
+    <name>dfs.client.use.datanode.hostname</name>
+    <value>true</value>
+    <description>Whether clients should use datanode hostnames when
+      connecting to datanodes.
+    </description>
+  </property>
+
+  <property>
+    <name>dfs.datanode.use.datanode.hostname</name>
+    <value>true</value>
+    <description>Whether datanodes should use datanode hostnames when
+      connecting to other datanodes for data transfer.
+    </description>
+  </property>
+
+  <property>
+    <name>dfs.datanode.address</name>
+    <value>0.0.0.0:50010</value>
+    <description>
+      The address where the datanode server will listen to.
+      If the port is 0 then the server will start on a free port.
+    </description>
+  </property>
+
+  <property>
+    <name>dfs.datanode.http.address</name>
+    <value>0.0.0.0:50075</value>
+    <description>
+      The datanode http server address and port.
+      If the port is 0 then the server will start on a free port.
+    </description>
+  </property>
+
+  <property>
+    <name>dfs.datanode.ipc.address</name>
+    <value>0.0.0.0:50020</value>
+    <description>
+      The datanode ipc server address and port.
+      If the port is 0 then the server will start on a free port.
+    </description>
+  </property>
+
+</configuration>
+

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/eccfe007/scripts/docker/spark-cluster-managers/spark_yarn_cluster/hdfs_conf/mapred-site.xml
----------------------------------------------------------------------
diff --git a/scripts/docker/spark-cluster-managers/spark_yarn_cluster/hdfs_conf/mapred-site.xml b/scripts/docker/spark-cluster-managers/spark_yarn_cluster/hdfs_conf/mapred-site.xml
new file mode 100644
index 0000000..f8280f7
--- /dev/null
+++ b/scripts/docker/spark-cluster-managers/spark_yarn_cluster/hdfs_conf/mapred-site.xml
@@ -0,0 +1,22 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to You under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+<configuration>
+  <property>
+    <name>mapreduce.framework.name</name>
+       <value>yarn</value>
+  </property>
+</configuration>

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/eccfe007/scripts/docker/spark-cluster-managers/spark_yarn_cluster/hdfs_conf/yarn-site.xml
----------------------------------------------------------------------
diff --git a/scripts/docker/spark-cluster-managers/spark_yarn_cluster/hdfs_conf/yarn-site.xml b/scripts/docker/spark-cluster-managers/spark_yarn_cluster/hdfs_conf/yarn-site.xml
new file mode 100644
index 0000000..8984816
--- /dev/null
+++ b/scripts/docker/spark-cluster-managers/spark_yarn_cluster/hdfs_conf/yarn-site.xml
@@ -0,0 +1,42 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to You under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+<configuration>
+  <property>
+    <name>yarn.resourcemanager.scheduler.address</name>
+    <value>0.0.0.0:8030</value>
+  </property>
+  <property>
+    <name>yarn.resourcemanager.address</name>
+    <value>0.0.0.0:8032</value>
+  </property>
+  <property>
+    <name>yarn.resourcemanager.webapp.address</name>
+    <value>0.0.0.0:8088</value>
+  </property>
+  <property>
+    <name>yarn.resourcemanager.resource-tracker.address</name>
+    <value>0.0.0.0:8031</value>
+  </property>
+  <property>
+    <name>yarn.resourcemanager.admin.address</name>
+    <value>0.0.0.0:8033</value>
+  </property>
+  <property>
+      <name>yarn.application.classpath</name>
+      <value>/usr/local/hadoop/etc/hadoop, /usr/local/hadoop/share/hadoop/common/*, /usr/local/hadoop/share/hadoop/common/lib/*, /usr/local/hadoop/share/hadoop/hdfs/*, /usr/local/hadoop/share/hadoop/hdfs/lib/*, /usr/local/hadoop/share/hadoop/mapreduce/*, /usr/local/hadoop/share/hadoop/mapreduce/lib/*, /usr/local/hadoop/share/hadoop/yarn/*, /usr/local/hadoop/share/hadoop/yarn/lib/*, /usr/local/hadoop/share/spark/*</value>
+   </property>
+</configuration>

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/eccfe007/scripts/docker/spark-cluster-managers/spark_yarn_cluster/ssh_config
----------------------------------------------------------------------
diff --git a/scripts/docker/spark-cluster-managers/spark_yarn_cluster/ssh_config b/scripts/docker/spark-cluster-managers/spark_yarn_cluster/ssh_config
new file mode 100644
index 0000000..537a95f
--- /dev/null
+++ b/scripts/docker/spark-cluster-managers/spark_yarn_cluster/ssh_config
@@ -0,0 +1,18 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+Host *
+  UserKnownHostsFile /dev/null
+  StrictHostKeyChecking no
+  LogLevel quiet
\ No newline at end of file
