Author: moon
Date: Sat Dec 17 15:48:31 2016
New Revision: 1774779

URL: http://svn.apache.org/viewvc?rev=1774779&view=rev
Log:
ZEPPELIN-1798

Modified:
    zeppelin/site/docs/0.7.0-SNAPSHOT/atom.xml
    zeppelin/site/docs/0.7.0-SNAPSHOT/interpreter/flink.html
    zeppelin/site/docs/0.7.0-SNAPSHOT/quickstart/install_with_flink_and_spark_cluster.html
    zeppelin/site/docs/0.7.0-SNAPSHOT/rss.xml
    zeppelin/site/docs/0.7.0-SNAPSHOT/search_data.json

Modified: zeppelin/site/docs/0.7.0-SNAPSHOT/atom.xml
URL: http://svn.apache.org/viewvc/zeppelin/site/docs/0.7.0-SNAPSHOT/atom.xml?rev=1774779&r1=1774778&r2=1774779&view=diff
==============================================================================
--- zeppelin/site/docs/0.7.0-SNAPSHOT/atom.xml (original)
+++ zeppelin/site/docs/0.7.0-SNAPSHOT/atom.xml Sat Dec 17 15:48:31 2016
@@ -4,7 +4,7 @@
  <title>Apache Zeppelin</title>
 <link href="http://zeppelin.apache.org/" rel="self"/>
  <link href="http://zeppelin.apache.org"/>
- <updated>2016-12-17T21:52:14+09:00</updated>
+ <updated>2016-12-17T07:48:11-08:00</updated>
  <id>http://zeppelin.apache.org</id>
  <author>
    <name>The Apache Software Foundation</name>

Modified: zeppelin/site/docs/0.7.0-SNAPSHOT/interpreter/flink.html
URL: http://svn.apache.org/viewvc/zeppelin/site/docs/0.7.0-SNAPSHOT/interpreter/flink.html?rev=1774779&r1=1774778&r2=1774779&view=diff
==============================================================================
--- zeppelin/site/docs/0.7.0-SNAPSHOT/interpreter/flink.html (original)
+++ zeppelin/site/docs/0.7.0-SNAPSHOT/interpreter/flink.html Sat Dec 17 15:48:31 2016
@@ -250,7 +250,7 @@ wget http://www.gutenberg.org/ebooks/10.
 </code></pre></div>
 <div class="highlight"><pre><code class="scala"><span class="o">%</span><span 
class="n">flink</span>
 <span class="k">case</span> <span class="k">class</span> <span 
class="nc">WordCount</span><span class="o">(</span><span 
class="n">word</span><span class="k">:</span> <span 
class="kt">String</span><span class="o">,</span> <span 
class="n">frequency</span><span class="k">:</span> <span 
class="kt">Int</span><span class="o">)</span>
-<span class="k">val</span> <span class="n">bible</span><span 
class="k">:</span><span class="kt">DataSet</span><span class="o">[</span><span 
class="kt">String</span><span class="o">]</span> <span class="k">=</span> <span 
class="n">env</span><span class="o">.</span><span 
class="n">readTextFile</span><span class="o">(</span><span 
class="s">&quot;10.txt.utf-8&quot;</span><span class="o">)</span>
+<span class="k">val</span> <span class="n">bible</span><span 
class="k">:</span><span class="kt">DataSet</span><span class="o">[</span><span 
class="kt">String</span><span class="o">]</span> <span class="k">=</span> <span 
class="n">benv</span><span class="o">.</span><span 
class="n">readTextFile</span><span class="o">(</span><span 
class="s">&quot;10.txt.utf-8&quot;</span><span class="o">)</span>
 <span class="k">val</span> <span class="n">partialCounts</span><span 
class="k">:</span> <span class="kt">DataSet</span><span class="o">[</span><span 
class="kt">WordCount</span><span class="o">]</span> <span class="k">=</span> 
<span class="n">bible</span><span class="o">.</span><span 
class="n">flatMap</span><span class="o">{</span>
     <span class="n">line</span> <span class="k">=&gt;</span>
         <span 
class="s">&quot;&quot;&quot;\b\w+\b&quot;&quot;&quot;</span><span 
class="o">.</span><span class="n">r</span><span class="o">.</span><span 
class="n">findAllIn</span><span class="o">(</span><span 
class="n">line</span><span class="o">).</span><span class="n">map</span><span 
class="o">(</span><span class="n">word</span> <span class="k">=&gt;</span> 
<span class="nc">WordCount</span><span class="o">(</span><span 
class="n">word</span><span class="o">,</span> <span class="mi">1</span><span 
class="o">))</span>
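The substantive change in this hunk replaces `env` with `benv` in the tutorial's word-count snippet; `benv` is the name under which Zeppelin's Flink interpreter exposes the batch execution environment. Stripped of the highlighting markup, the logic of the corrected example can be sketched in plain Scala (no Flink runtime needed; `lines` is a hypothetical local stand-in for the DataSet that `benv.readTextFile("10.txt.utf-8")` would yield):

```scala
// Plain-Scala sketch of the word count from the diff above.
// WordCount and the \b\w+\b regex mirror the tutorial; `lines` stands in
// for the DataSet[String] produced by benv.readTextFile("10.txt.utf-8").
case class WordCount(word: String, frequency: Int)

val lines = Seq("In the beginning", "the Word")

// One WordCount(word, 1) per token, as in the tutorial's flatMap
val partialCounts: Seq[WordCount] =
  lines.flatMap { line =>
    """\b\w+\b""".r.findAllIn(line).map(word => WordCount(word, 1))
  }

// Aggregate partial counts per word (the DataSet version uses
// groupBy("word").reduce(...) instead)
val wordCounts: Map[String, Int] =
  partialCounts.groupBy(_.word).map { case (w, cs) => w -> cs.map(_.frequency).sum }
```

In the notebook itself the same flatMap runs on a `DataSet[String]`, with `groupBy("word").reduce(...)` doing the aggregation, as the surrounding diff shows.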

Modified: zeppelin/site/docs/0.7.0-SNAPSHOT/quickstart/install_with_flink_and_spark_cluster.html
URL: http://svn.apache.org/viewvc/zeppelin/site/docs/0.7.0-SNAPSHOT/quickstart/install_with_flink_and_spark_cluster.html?rev=1774779&r1=1774778&r2=1774779&view=diff
==============================================================================
--- zeppelin/site/docs/0.7.0-SNAPSHOT/quickstart/install_with_flink_and_spark_cluster.html (original)
+++ zeppelin/site/docs/0.7.0-SNAPSHOT/quickstart/install_with_flink_and_spark_cluster.html Sat Dec 17 15:48:31 2016
@@ -277,13 +277,15 @@ Clone Zeppelin.</p>
 <div class="highlight"><pre><code class="text language-text" 
data-lang="text">cd zeppelin
 </code></pre></div>
 <p>Package Zeppelin.</p>
-<div class="highlight"><pre><code class="text language-text" 
data-lang="text">mvn clean package -DskipTests -Pspark-1.6 -Dflink.version=1.1.2
+<div class="highlight"><pre><code class="text language-text" 
data-lang="text">mvn clean package -DskipTests -Pspark-1.6 
-Dflink.version=1.1.3 -Pscala-2.10
 </code></pre></div>
 <p><code>-DskipTests</code> skips build tests- you&#39;re not developing 
(yet), so you don&#39;t need to do tests, the clone version <em>should</em> 
build.</p>
 
 <p><code>-Pspark-1.6</code> tells maven to build a Zeppelin with Spark 1.6.  
This is important because Zeppelin has its own Spark interpreter and the 
versions must be the same.</p>
 
-<p><code>-Dflink.version=1.1.2</code> tells maven specifically to build 
Zeppelin with Flink version 1.1.2.</p>
+<p><code>-Dflink.version=1.1.3</code> tells maven specifically to build 
Zeppelin with Flink version 1.1.3.</p>
+
+<p><code>-Pscala-2.10</code> tells maven to build with Scala v2.10.</p>
 
 <p><strong>Note:</strong> You may wish to include additional build flags such 
as <code>-Ppyspark</code> or <code>-Psparkr</code>.  See <a 
href="https://github.com/apache/zeppelin#build">the build section of github for 
more details</a>.</p>
 
@@ -312,7 +314,7 @@ As long as you didn&#39;t edit any code,
 <p>Create a new notebook named &quot;Flink Test&quot; and copy and paste the 
following code.</p>
 <div class="highlight"><pre><code class="scala language-scala" 
data-lang="scala"><span class="o">%</span><span class="n">flink</span>  <span 
class="c1">// let Zeppelin know what interpreter to use.</span>
 
-<span class="k">val</span> <span class="n">text</span> <span 
class="k">=</span> <span class="n">env</span><span class="o">.</span><span 
class="n">fromElements</span><span class="o">(</span><span class="s">&quot;In 
the time of chimpanzees, I was a monkey&quot;</span><span class="o">,</span>   
<span class="c1">// some lines of text to analyze</span>
+<span class="k">val</span> <span class="n">text</span> <span 
class="k">=</span> <span class="n">benv</span><span class="o">.</span><span 
class="n">fromElements</span><span class="o">(</span><span class="s">&quot;In 
the time of chimpanzees, I was a monkey&quot;</span><span class="o">,</span>   
<span class="c1">// some lines of text to analyze</span>
 <span class="s">&quot;Butane in my veins and I&#39;m out to cut the 
junkie&quot;</span><span class="o">,</span>
 <span class="s">&quot;With the plastic eyeballs, spray paint the 
vegetables&quot;</span><span class="o">,</span>
 <span class="s">&quot;Dog food stalls with the beefcake 
pantyhose&quot;</span><span class="o">,</span>
@@ -393,13 +395,13 @@ As long as you didn&#39;t edit any code,
 <p>Building from source is recommended where possible; for simplicity, in this tutorial we will download the Flink and Spark binaries.</p>
 
 <p>To download the Flink Binary use <code>wget</code></p>
-<div class="highlight"><pre><code class="bash language-bash" 
data-lang="bash">wget <span 
class="s2">&quot;http://mirror.cogentco.com/pub/apache/flink/flink-1.0.3/flink-1.0.3-bin-hadoop24-scala_2.10.tgz&quot;</span>
-tar -xzvf flink-1.0.3-bin-hadoop24-scala_2.10.tgz
+<div class="highlight"><pre><code class="bash language-bash" 
data-lang="bash">wget <span 
class="s2">&quot;http://mirror.cogentco.com/pub/apache/flink/flink-1.1.3/flink-1.1.3-bin-hadoop24-scala_2.10.tgz&quot;</span>
+tar -xzvf flink-1.1.3-bin-hadoop24-scala_2.10.tgz
 </code></pre></div>
-<p>This will download Flink 1.0.3, compatible with Hadoop 2.4.  You do not 
have to install Hadoop for this binary to work, but if you are using Hadoop, 
please change <code>24</code> to your appropriate version.</p>
+<p>This will download Flink 1.1.3, compatible with Hadoop 2.4.  You do not 
have to install Hadoop for this binary to work, but if you are using Hadoop, 
please change <code>24</code> to your appropriate version.</p>
 
 <p>Start the Flink Cluster.</p>
-<div class="highlight"><pre><code class="bash language-bash" 
data-lang="bash">flink-1.0.3/bin/start-cluster.sh
+<div class="highlight"><pre><code class="bash language-bash" 
data-lang="bash">flink-1.1.3/bin/start-cluster.sh
 </code></pre></div>
 <h6>Building From source</h6>
 
@@ -407,11 +409,11 @@ tar -xzvf flink-1.0.3-bin-hadoop24-scala
 
 <p>See the <a 
href="https://github.com/apache/flink/blob/master/README.md">Flink Installation 
guide</a> for more detailed instructions.</p>
 
-<p>Return to the directory where you have been downloading, this tutorial 
assumes that is <code>$HOME</code>. Clone Flink,  check out release-1.0, and 
build.</p>
+<p>Return to the directory where you have been downloading, this tutorial 
assumes that is <code>$HOME</code>. Clone Flink,  check out release-1.1.3-rc2, 
and build.</p>
 <div class="highlight"><pre><code class="text language-text" 
data-lang="text">cd $HOME
 git clone https://github.com/apache/flink.git
 cd flink
-git checkout release-1.0
+git checkout release-1.1.3-rc2
 mvn clean install -DskipTests
 </code></pre></div>
 <p>Start the Flink Cluster in stand-alone mode</p>
@@ -427,8 +429,8 @@ mvn clean install -DskipTests
 
 <p>(if binaries)
 <code>
-flink-1.0.3/bin/stop-cluster.sh
-flink-1.0.3/bin/start-cluster.sh
+flink-1.1.3/bin/stop-cluster.sh
+flink-1.1.3/bin/start-cluster.sh
 </code></p>
 
 <p>(if built from source)
@@ -446,11 +448,11 @@ build-target/bin/start-cluster.sh
 <p>Using binaries is also an option.</p>
 
 <p>To download the Spark Binary use <code>wget</code></p>
-<div class="highlight"><pre><code class="bash language-bash" 
data-lang="bash">wget <span 
class="s2">&quot;http://mirrors.koehn.com/apache/spark/spark-1.6.1/spark-1.6.1-bin-hadoop2.4.tgz&quot;</span>
-tar -xzvf spark-1.6.1-bin-hadoop2.4.tgz
-mv spark-1.6.1-bin-hadoop4.4 spark
+<div class="highlight"><pre><code class="bash language-bash" 
data-lang="bash">wget <span 
class="s2">&quot;http://d3kbcqa49mib13.cloudfront.net/spark-1.6.3-bin-hadoop2.6.tgz&quot;</span>
+tar -xzvf spark-1.6.3-bin-hadoop2.6.tgz
+mv spark-1.6.3-bin-hadoop2.6 spark
 </code></pre></div>
-<p>This will download Spark 1.6.1, compatible with Hadoop 2.4.  You do not 
have to install Hadoop for this binary to work, but if you are using Hadoop, 
please change <code>2.4</code> to your appropriate version.</p>
+<p>This will download Spark 1.6.3, compatible with Hadoop 2.6.  You do not 
have to install Hadoop for this binary to work, but if you are using Hadoop, 
please change <code>2.6</code> to your appropriate version.</p>
 
 <h6>Building From source</h6>
 
@@ -460,7 +462,7 @@ mv spark-1.6.1-bin-hadoop4.4 spark
 
 <p>Return to the directory where you have been downloading, this tutorial 
assumes that is $HOME. Clone Spark, check out branch-1.6, and build.
 <strong>Note:</strong> Recall, we&#39;re only checking out 1.6 because it is 
the most recent Spark for which a Zeppelin profile exists at
-  the time of writing. You are free to check out other version, just make sure 
you build Zeppelin against the correct version of Spark.</p>
+  the time of writing. You are free to check out other versions, just make sure you build Zeppelin against the correct version of Spark. However, if you use Spark 2.0, the word count example will need to be changed, as Spark 2.0 is not compatible with the following examples.</p>
 <div class="highlight"><pre><code class="text language-text" 
data-lang="text">cd $HOME
 </code></pre></div>
 <p>Clone, check out, and build Spark version 1.6.x.</p>
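The same `env` → `benv` rename also touches the quickstart's "Flink Test" snippet in this page. As a plain-Scala sketch of what that paragraph computes (standard library only; a hypothetical local `Seq` replaces `benv.fromElements(...)`, since the real snippet needs a running Flink interpreter):

```scala
// Local stand-in for benv.fromElements(...): two of the tutorial's lines.
val text = Seq(
  "In the time of chimpanzees, I was a monkey",
  "Butane in my veins and I'm out to cut the junkie")

// Mirrors the notebook's
//   text.flatMap { _.toLowerCase.split("\\W+") }.map { (_, 1) }.groupBy(0).sum(1)
val counts: Map[String, Int] = text
  .flatMap(_.toLowerCase.split("\\W+")) // lowercase, split on non-word characters
  .filter(_.nonEmpty)                   // drop empty tokens left by punctuation
  .groupBy(identity)
  .map { case (word, occurrences) => word -> occurrences.size }
```

The DataSet version differs only in that `groupBy(0).sum(1)` aggregates tuple fields by position rather than grouping a local collection.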

Modified: zeppelin/site/docs/0.7.0-SNAPSHOT/rss.xml
URL: http://svn.apache.org/viewvc/zeppelin/site/docs/0.7.0-SNAPSHOT/rss.xml?rev=1774779&r1=1774778&r2=1774779&view=diff
==============================================================================
--- zeppelin/site/docs/0.7.0-SNAPSHOT/rss.xml (original)
+++ zeppelin/site/docs/0.7.0-SNAPSHOT/rss.xml Sat Dec 17 15:48:31 2016
@@ -5,8 +5,8 @@
         <description>Apache Zeppelin - The Apache Software 
Foundation</description>
         <link>http://zeppelin.apache.org</link>
         <link>http://zeppelin.apache.org</link>
-        <lastBuildDate>2016-12-17T21:52:14+09:00</lastBuildDate>
-        <pubDate>2016-12-17T21:52:14+09:00</pubDate>
+        <lastBuildDate>2016-12-17T07:48:11-08:00</lastBuildDate>
+        <pubDate>2016-12-17T07:48:11-08:00</pubDate>
         <ttl>1800</ttl>
 
 

Modified: zeppelin/site/docs/0.7.0-SNAPSHOT/search_data.json
URL: http://svn.apache.org/viewvc/zeppelin/site/docs/0.7.0-SNAPSHOT/search_data.json?rev=1774779&r1=1774778&r2=1774779&view=diff
==============================================================================
--- zeppelin/site/docs/0.7.0-SNAPSHOT/search_data.json (original)
+++ zeppelin/site/docs/0.7.0-SNAPSHOT/search_data.json Sat Dec 17 15:48:31 2016
@@ -226,7 +226,7 @@
 
     "/interpreter/flink.html": {
       "title": "Flink Interpreter for Apache Zeppelin",
-      "content"  : "&lt;!--Licensed under the Apache License, Version 2.0 (the 
&quot;License&quot;);you may not use this file except in compliance with the 
License.You may obtain a copy of the License 
athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law 
or agreed to in writing, softwaredistributed under the License is distributed 
on an &quot;AS IS&quot; BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, 
either express or implied.See the License for the specific language governing 
permissions andlimitations under the License.--&gt;Flink interpreter for Apache 
ZeppelinOverviewApache Flink is an open source platform for distributed stream 
and batch data processing. Flink’s core is a streaming dataflow engine that 
provides data distribution, communication, and fault tolerance for distributed 
computations over data streams. Flink also builds batch processing on top of 
the streaming engine, overlaying native iteration support, managed memory, and 
program opt
 imization.How to start local Flink cluster, to test the interpreterZeppelin 
comes with pre-configured flink-local interpreter, which starts Flink in a 
local mode on your machine, so you do not need to install anything.How to 
configure interpreter to point to Flink clusterAt the 
&amp;quot;Interpreters&amp;quot; menu, you have to create a new Flink 
interpreter and provide next properties:      property    value    Description  
      host    local    host name of running JobManager. &#39;local&#39; runs 
flink in local mode (default)        port    6123    port of running JobManager 
 For more information about Flink configuration, you can find it here.How to 
test it&amp;#39;s workingIn example, by using the Zeppelin notebook is from 
Till Rohrmann&amp;#39;s presentation Interactive data analysis with Apache 
Flink for Apache Flink Meetup.%shrm 10.txt.utf-8wget 
http://www.gutenberg.org/ebooks/10.txt.utf-8%flinkcase class WordCount(word: 
String, frequency: Int)val bible:DataSet[String] = en
 v.readTextFile(&amp;quot;10.txt.utf-8&amp;quot;)val partialCounts: 
DataSet[WordCount] = bible.flatMap{    line =&amp;gt;        
&amp;quot;&amp;quot;&amp;quot;bw+b&amp;quot;&amp;quot;&amp;quot;.r.findAllIn(line).map(word
 =&amp;gt; WordCount(word, 1))//        line.split(&amp;quot; 
&amp;quot;).map(word =&amp;gt; WordCount(word, 1))}val wordCounts = 
partialCounts.groupBy(&amp;quot;word&amp;quot;).reduce{    (left, right) 
=&amp;gt; WordCount(left.word, left.frequency + right.frequency)}val result10 = 
wordCounts.first(10).collect()",
+      "content"  : "&lt;!--Licensed under the Apache License, Version 2.0 (the 
&quot;License&quot;);you may not use this file except in compliance with the 
License.You may obtain a copy of the License 
athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law 
or agreed to in writing, softwaredistributed under the License is distributed 
on an &quot;AS IS&quot; BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, 
either express or implied.See the License for the specific language governing 
permissions andlimitations under the License.--&gt;Flink interpreter for Apache 
ZeppelinOverviewApache Flink is an open source platform for distributed stream 
and batch data processing. Flink’s core is a streaming dataflow engine that 
provides data distribution, communication, and fault tolerance for distributed 
computations over data streams. Flink also builds batch processing on top of 
the streaming engine, overlaying native iteration support, managed memory, and 
program opt
 imization.How to start local Flink cluster, to test the interpreterZeppelin 
comes with pre-configured flink-local interpreter, which starts Flink in a 
local mode on your machine, so you do not need to install anything.How to 
configure interpreter to point to Flink clusterAt the 
&amp;quot;Interpreters&amp;quot; menu, you have to create a new Flink 
interpreter and provide next properties:      property    value    Description  
      host    local    host name of running JobManager. &#39;local&#39; runs 
flink in local mode (default)        port    6123    port of running JobManager 
 For more information about Flink configuration, you can find it here.How to 
test it&amp;#39;s workingIn example, by using the Zeppelin notebook is from 
Till Rohrmann&amp;#39;s presentation Interactive data analysis with Apache 
Flink for Apache Flink Meetup.%shrm 10.txt.utf-8wget 
http://www.gutenberg.org/ebooks/10.txt.utf-8%flinkcase class WordCount(word: 
String, frequency: Int)val bible:DataSet[String] = be
 nv.readTextFile(&amp;quot;10.txt.utf-8&amp;quot;)val partialCounts: 
DataSet[WordCount] = bible.flatMap{    line =&amp;gt;        
&amp;quot;&amp;quot;&amp;quot;bw+b&amp;quot;&amp;quot;&amp;quot;.r.findAllIn(line).map(word
 =&amp;gt; WordCount(word, 1))//        line.split(&amp;quot; 
&amp;quot;).map(word =&amp;gt; WordCount(word, 1))}val wordCounts = 
partialCounts.groupBy(&amp;quot;word&amp;quot;).reduce{    (left, right) 
=&amp;gt; WordCount(left.word, left.frequency + right.frequency)}val result10 = 
wordCounts.first(10).collect()",
       "url": " /interpreter/flink.html",
       "group": "interpreter",
       "excerpt": "Apache Flink is an open source platform for distributed 
stream and batch data processing."
@@ -557,7 +557,7 @@
 
     "/quickstart/install_with_flink_and_spark_cluster.html": {
       "title": "Install Zeppelin with Flink and Spark in cluster mode",
-      "content"  : "&lt;!--Licensed under the Apache License, Version 2.0 (the 
&quot;License&quot;);you may not use this file except in compliance with the 
License.You may obtain a copy of the License 
athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law 
or agreed to in writing, softwaredistributed under the License is distributed 
on an &quot;AS IS&quot; BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, 
either express or implied.See the License for the specific language governing 
permissions andlimitations under the License.--&gt;Install with flink and spark 
clusterThis tutorial is extremely entry-level. It assumes no prior knowledge of 
Linux, git, or other tools. If you carefully type what I tell you when I tell 
you, you should be able to get Zeppelin running.Installing Zeppelin with Flink 
and Spark in cluster modeThis tutorial assumes the user has a machine (real or 
virtual with a fresh, minimal installation of Ubuntu 14.04.3 Server.Note: On 
the size requ
 irements of the Virtual Machine, some users reported trouble when using the 
default virtual machine sizes, specifically that the hard drive needed to be at 
least 16GB- other users did not have this issue.There are many good tutorials 
on how to install Ubuntu Server on a virtual box, here is one of themRequired 
ProgramsAssuming the minimal install, there are several programs that we will 
need to install before Zeppelin, Flink, and Spark.gitopenssh-serverOpenJDK 
7Maven 3.1+For git, openssh-server, and OpenJDK 7 we will be using the apt 
package manager.gitFrom the command prompt:sudo apt-get install 
gitopenssh-serversudo apt-get install openssh-serverOpenJDK 7sudo apt-get 
install openjdk-7-jdk openjdk-7-jre-libA note for those using Ubuntu 16.04: To 
install openjdk-7 on Ubuntu 16.04, one must add a repository.  Sourcesudo 
add-apt-repository ppa:openjdk-r/ppasudo apt-get updatesudo apt-get install 
openjdk-7-jdk openjdk-7-jre-libMaven 3.1+Zeppelin requires maven version 3.x.  
The version
  available in the repositories at the time of writing is 2.x, so maven must be 
installed manually.Purge any existing versions of maven.sudo apt-get purge 
maven maven2Download the maven 3.3.9 binary.wget 
&amp;quot;http://www.us.apache.org/dist/maven/maven-3/3.3.9/binaries/apache-maven-3.3.9-bin.tar.gz&amp;quot;Unarchive
 the binary and move to the /usr/local directory.tar -zxvf 
apache-maven-3.3.9-bin.tar.gzsudo mv ./apache-maven-3.3.9 /usr/localCreate 
symbolic links in /usr/bin.sudo ln -s /usr/local/apache-maven-3.3.9/bin/mvn 
/usr/bin/mvnInstalling ZeppelinThis provides a quick overview of Zeppelin 
installation from source, however the reader is encouraged to review the 
Zeppelin Installation GuideFrom the command prompt:Clone Zeppelin.git clone 
https://github.com/apache/zeppelin.gitEnter the Zeppelin root directory.cd 
zeppelinPackage Zeppelin.mvn clean package -DskipTests -Pspark-1.6 
-Dflink.version=1.1.2-DskipTests skips build tests- you&amp;#39;re not 
developing (yet), so you don&am
 p;#39;t need to do tests, the clone version should build.-Pspark-1.6 tells 
maven to build a Zeppelin with Spark 1.6.  This is important because Zeppelin 
has its own Spark interpreter and the versions must be the 
same.-Dflink.version=1.1.2 tells maven specifically to build Zeppelin with 
Flink version 1.1.2.Note: You may wish to include additional build flags such 
as -Ppyspark or -Psparkr.  See the build section of github for more 
details.Note: You can build against any version of Spark that has a Zeppelin 
build profile available. The key is to make sure you check out the matching 
version of Spark to build. At the time of this writing, Spark 1.6 was the most 
recent Spark version available.Note: On build failures. Having installed 
Zeppelin close to 30 times now, I will tell you that sometimes the build fails 
for seemingly no reason.As long as you didn&amp;#39;t edit any code, it is 
unlikely the build is failing because of something you did. What does tend to 
happen, is some dependency 
 that maven is trying to download is unreachable.  If your build fails on this 
step here are some tips:- Don&amp;#39;t get discouraged.- Scroll up and read 
through the logs. There will be clues there.- Retry (that is, run the mvn clean 
package -DskipTests -Pspark-1.6 again)- If there were clues that a dependency 
couldn&amp;#39;t be downloaded wait a few hours or even days and retry again. 
Open source software when compiling is trying to download all of the 
dependencies it needs, if a server is off-line there is nothing you can do but 
wait for it to come back.- Make sure you followed all of the steps carefully.- 
Ask the community to help you. Go here and join the user mailing list. People 
are there to help you. Make sure to copy and paste the build output (everything 
that happened in the console) and include that in your message.Start the 
Zeppelin daemon.bin/zeppelin-daemon.sh startUse ifconfig to determine the host 
machine&amp;#39;s IP address. If you are not familiar with how to do 
 this, a fairly comprehensive post can be found here.Open a web-browser on a 
machine connected to the same network as the host (or in the host operating 
system if using a virtual machine).  Navigate to http://yourip:8080, where 
yourip is the IP address you found in ifconfig.See the Zeppelin tutorial for 
basic Zeppelin usage. It is also advised that you take a moment to check out 
the tutorial notebook that is included with each Zeppelin install, and to 
familiarize yourself with basic notebook functionality.Flink TestCreate a new 
notebook named &amp;quot;Flink Test&amp;quot; and copy and paste the following 
code.%flink  // let Zeppelin know what interpreter to use.val text = 
env.fromElements(&amp;quot;In the time of chimpanzees, I was a 
monkey&amp;quot;,   // some lines of text to analyze&amp;quot;Butane in my 
veins and I&amp;#39;m out to cut the junkie&amp;quot;,&amp;quot;With the 
plastic eyeballs, spray paint the vegetables&amp;quot;,&amp;quot;Dog food 
stalls with the beefcake pantyh
 ose&amp;quot;,&amp;quot;Kill the headlights and put it in 
neutral&amp;quot;,&amp;quot;Stock car flamin&amp;#39; with a loser in the 
cruise control&amp;quot;,&amp;quot;Baby&amp;#39;s in Reno with the Vitamin 
D&amp;quot;,&amp;quot;Got a couple of couches, sleep on the love 
seat&amp;quot;,&amp;quot;Someone came in sayin&amp;#39; I&amp;#39;m insane to 
complain&amp;quot;,&amp;quot;About a shotgun wedding and a stain on my 
shirt&amp;quot;,&amp;quot;Don&amp;#39;t believe everything that you 
breathe&amp;quot;,&amp;quot;You get a parking violation and a maggot on your 
sleeve&amp;quot;,&amp;quot;So shave your face with some mace in the 
dark&amp;quot;,&amp;quot;Savin&amp;#39; all your food stamps and 
burnin&amp;#39; down the trailer park&amp;quot;,&amp;quot;Yo, cut 
it&amp;quot;)/*  The meat and potatoes:        this tells Flink to iterate 
through the elements, in this case strings,        transform the string to 
lower case and split the string at white space into individual words        
then f
 inally aggregate the occurrence of each word.        This creates the count 
variable which is a list of tuples of the form (word, 
occurances)counts.collect().foreach(println(_))  // execute the script and 
print each element in the counts list*/val counts = text.flatMap{ 
_.toLowerCase.split(&amp;quot;W+&amp;quot;) }.map { (_,1) 
}.groupBy(0).sum(1)counts.collect().foreach(println(_))  // execute the script 
and print each element in the counts listRun the code to make sure the built-in 
Zeppelin Flink interpreter is working properly.Spark TestCreate a new notebook 
named &amp;quot;Spark Test&amp;quot; and copy and paste the following 
code.%spark // let Zeppelin know what interpreter to use.val text = 
sc.parallelize(List(&amp;quot;In the time of chimpanzees, I was a 
monkey&amp;quot;,  // some lines of text to analyze&amp;quot;Butane in my veins 
and I&amp;#39;m out to cut the junkie&amp;quot;,&amp;quot;With the plastic 
eyeballs, spray paint the vegetables&amp;quot;,&amp;quot;Dog food stall
 s with the beefcake pantyhose&amp;quot;,&amp;quot;Kill the headlights and put 
it in neutral&amp;quot;,&amp;quot;Stock car flamin&amp;#39; with a loser in the 
cruise control&amp;quot;,&amp;quot;Baby&amp;#39;s in Reno with the Vitamin 
D&amp;quot;,&amp;quot;Got a couple of couches, sleep on the love 
seat&amp;quot;,&amp;quot;Someone came in sayin&amp;#39; I&amp;#39;m insane to 
complain&amp;quot;,&amp;quot;About a shotgun wedding and a stain on my 
shirt&amp;quot;,&amp;quot;Don&amp;#39;t believe everything that you 
breathe&amp;quot;,&amp;quot;You get a parking violation and a maggot on your 
sleeve&amp;quot;,&amp;quot;So shave your face with some mace in the 
dark&amp;quot;,&amp;quot;Savin&amp;#39; all your food stamps and 
burnin&amp;#39; down the trailer park&amp;quot;,&amp;quot;Yo, cut 
it&amp;quot;))/*  The meat and potatoes:        this tells spark to iterate 
through the elements, in this case strings,        transform the string to 
lower case and split the string at white space into ind
 ividual words        then finally aggregate the occurrence of each word.       
 This creates the count variable which is a list of tuples of the form (word, 
occurances)*/val counts = text.flatMap { 
_.toLowerCase.split(&amp;quot;W+&amp;quot;) }                 .map { (_,1) }    
             .reduceByKey(_ + _)counts.collect().foreach(println(_))  // 
execute the script and print each element in the counts listRun the code to 
make sure the built-in Zeppelin Flink interpreter is working properly.Finally, 
stop the Zeppelin daemon.  From the command prompt run:bin/zeppelin-daemon.sh 
stopInstalling ClustersFlink ClusterDownload BinariesBuilding from source is 
recommended  where possible, for simplicity in this tutorial we will download 
Flink and Spark Binaries.To download the Flink Binary use wgetwget 
&amp;quot;http://mirror.cogentco.com/pub/apache/flink/flink-1.0.3/flink-1.0.3-bin-hadoop24-scala_2.10.tgz&amp;quot;tar
 -xzvf flink-1.0.3-bin-hadoop24-scala_2.10.tgzThis will download Flink 1.
 0.3, compatible with Hadoop 2.4.  You do not have to install Hadoop for this 
binary to work, but if you are using Hadoop, please change 24 to your 
appropriate version.Start the Flink 
Cluster.flink-1.0.3/bin/start-cluster.shBuilding From sourceIf you wish to 
build Flink from source, the following will be instructive.  Note that if you 
have downloaded and used the binary version this should be skipped.  The 
changing nature of build tools and versions across platforms makes this section 
somewhat precarious.  For example, Java8 and Maven 3.0.3 are recommended for 
building Flink, which are not recommended for Zeppelin at the time of writing.  
If the user wishes to attempt to build from source, this section will provide 
some reference.  If errors are encountered, please contact the Apache Flink 
community.See the Flink Installation guide for more detailed 
instructions.Return to the directory where you have been downloading, this 
tutorial assumes that is $HOME. Clone Flink,  check out relea
 se-1.0, and build.cd $HOMEgit clone https://github.com/apache/flink.gitcd 
flinkgit checkout release-1.0mvn clean install -DskipTestsStart the Flink 
Cluster in stand-alone modebuild-target/bin/start-cluster.shEnsure the cluster 
is upIn a browser, navigate to http://yourip:8082 to see the Flink Web-UI.  
Click on &amp;#39;Task Managers&amp;#39; in the left navigation bar. Ensure 
there is at least one Task Manager present.If no task managers are present, 
restart the Flink cluster with the following commands:(if 
binaries)flink-1.0.3/bin/stop-cluster.shflink-1.0.3/bin/start-cluster.sh(if 
built from 
source)build-target/bin/stop-cluster.shbuild-target/bin/start-cluster.shSpark 
1.6 ClusterDownload BinariesBuilding from source is recommended  where 
possible, for simplicity in this tutorial we will download Flink and Spark 
Binaries.Using binaries is alsoTo download the Spark Binary use wgetwget 
&amp;quot;http://mirrors.koehn.com/apache/spark/spark-1.6.1/spark-1.6.1-bin-hadoop2.4.tgz&amp;quot;t
 ar -xzvf spark-1.6.1-bin-hadoop2.4.tgzmv spark-1.6.1-bin-hadoop4.4 sparkThis 
will download Spark 1.6.1, compatible with Hadoop 2.4.  You do not have to 
install Hadoop for this binary to work, but if you are using Hadoop, please 
change 2.4 to your appropriate version.Building From sourceSpark is an 
extraordinarily large project, which takes considerable time to download and 
build. It is also prone to build failures for similar reasons listed in the 
Flink section.  If the user wishes to attempt to build from source, this 
section will provide some reference.  If errors are encountered, please contact 
the Apache Spark community.See the Spark Installation guide for more detailed 
instructions.Return to the directory where you have been downloading, this 
tutorial assumes that is $HOME. Clone Spark, check out branch-1.6, and 
build.Note: Recall, we&amp;#39;re only checking out 1.6 because it is the most 
recent Spark for which a Zeppelin profile exists at  the time of writing. You 
are free to check out other versions; just make sure you build Zeppelin against the
correct version of Spark.cd $HOMEClone, check out, and build Spark version 
1.6.x.git clone https://github.com/apache/spark.gitcd sparkgit checkout 
branch-1.6mvn clean package -DskipTestsStart the Spark clusterReturn to the 
$HOME directory.cd $HOMEStart the Spark cluster in stand alone mode, specifying 
the webui-port as some port other than 8080 (the webui-port of 
Zeppelin).spark/sbin/start-master.sh --webui-port 8082Note: Why --webui-port 
8082? There is a digression toward the end of this document that explains 
this.Open a browser and navigate to http://yourip:8082 to ensure the Spark 
master is running.Toward the top of the page there will be a URL: 
spark://yourhost:7077.  Note this URL, the Spark master URI; it will be needed
in subsequent steps.Start the slave using the URI from the Spark master 
WebUI:spark/sbin/start-slave.sh spark://yourhostname:7077Return to the root 
directory and start the Zeppelin daemon.
 cd $HOMEzeppelin/bin/zeppelin-daemon.sh startConfigure InterpretersOpen a web 
browser and go to the Zeppelin web-ui at http://yourip:8080.Now go back to the 
Zeppelin web-ui at http://yourip:8080 and this time click on anonymous at the 
top right, which will open a drop-down menu, select Interpreters to enter 
interpreter configuration.In the Spark section, click the edit button in the 
top right corner to make the property values editable (looks like a pencil).The 
only field that needs to be edited in the Spark interpreter is the master 
field. Change this value from local[*] to the URL you used to start the slave, 
mine was spark://ubuntu:7077.Click Save to update the parameters, and click OK 
when it asks you about restarting the interpreter.Now scroll down to the Flink 
section. Click the edit button and change the value of host from local to 
localhost. Click Save again.Reopen the examples and execute them again (i.e. click the play button at the top of the screen, or the button on the paragraph).You should be able to check the Flink and Spark webuis (at
something like http://yourip:8081, http://yourip:8082, http://yourip:8083) and 
see that jobs have been run against the clusters.Digression Sorry to be vague 
and use terms such as &amp;#39;something like&amp;#39;, but exactly what web-ui 
is at what port is going to depend on what order you started things. What is 
really going on here is you are pointing your browser at specific ports, namely 
8081, 8082, and 8083.  Flink and Spark all want to put their web-ui on port 
8080, but are well behaved and will take the next port available. Since 
Zeppelin started first, it will get port 8080.  When Flink starts (assuming you 
started Flink first), it will try to bind to port 8080, see that it is already 
taken, and go to the next one available, hopefully 8081.  Spark has a webui for 
the master and the slave, so when they start they will try to bind to 8080 (already taken by Zeppelin), then 8081 (already taken by Flink&amp;#39;s webui), then 8082. If everything goes smoothly and you followed the directions precisely, the webuis should be 8081 and 8082. It is possible to specify the port you want the webui to bind to at the command line by passing the --webui-port &amp;lt;port&amp;gt; flag when you start Flink or Spark, where &amp;lt;port&amp;gt; is the port you want to see that webui on.  You
can also set the default webui port of Spark and Flink (and Zeppelin) in the 
configuration files, but this is a tutorial for novices and slightly out of 
scope.Next StepsCheck out the tutorial for more cool things you can do with 
your new toy!Join the community, ask questions and contribute! Every little bit 
helps.",
+      "content"  : "&lt;!--Licensed under the Apache License, Version 2.0 (the 
&quot;License&quot;);you may not use this file except in compliance with the 
License.You may obtain a copy of the License 
athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law 
or agreed to in writing, softwaredistributed under the License is distributed 
on an &quot;AS IS&quot; BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, 
either express or implied.See the License for the specific language governing 
permissions andlimitations under the License.--&gt;Install with flink and spark 
clusterThis tutorial is extremely entry-level. It assumes no prior knowledge of 
Linux, git, or other tools. If you carefully type what I tell you when I tell 
you, you should be able to get Zeppelin running.Installing Zeppelin with Flink 
and Spark in cluster modeThis tutorial assumes the user has a machine (real or 
virtual) with a fresh, minimal installation of Ubuntu 14.04.3 Server.Note: On
the size requirements of the virtual machine: some users reported trouble when using the default virtual machine sizes, specifically that the hard drive needed to be at least 16GB; other users did not have this issue.There are many good tutorials on how to install Ubuntu Server in VirtualBox; here is one of themRequired
ProgramsAssuming the minimal install, there are several programs that we will 
need to install before Zeppelin, Flink, and Spark.gitopenssh-serverOpenJDK 
7Maven 3.1+For git, openssh-server, and OpenJDK 7 we will be using the apt 
package manager.gitFrom the command prompt:sudo apt-get install 
gitopenssh-serversudo apt-get install openssh-serverOpenJDK 7sudo apt-get 
install openjdk-7-jdk openjdk-7-jre-libA note for those using Ubuntu 16.04: To 
install openjdk-7 on Ubuntu 16.04, one must add a repository.  Sourcesudo 
add-apt-repository ppa:openjdk-r/ppasudo apt-get updatesudo apt-get install 
openjdk-7-jdk openjdk-7-jre-libMaven 3.1+Zeppelin requires maven version 3.x.  
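The Maven 3.1+ requirement can be checked before building; a minimal sketch (the banner string below is a sample — substitute the real first line of your own mvn -version output):

```shell
# Check the "Maven 3.1+" requirement against a `mvn -version` banner line.
banner="Apache Maven 3.3.9 (2015-11-10)"          # sample banner; use `mvn -version | head -n 1`
version=$(printf '%s\n' "$banner" | awk '{print $3}')   # third field is the version number
major=${version%%.*}                              # text before the first dot
rest=${version#*.}
minor=${rest%%.*}                                 # text between the first and second dots
if [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 1 ]; }; then
    echo "Maven $version satisfies the 3.1+ requirement"
else
    echo "Maven $version is too old; install 3.3.9 manually"
fi
```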
The version
  available in the repositories at the time of writing is 2.x, so maven must be 
installed manually.Purge any existing versions of maven.sudo apt-get purge 
maven maven2Download the maven 3.3.9 binary.wget 
&amp;quot;http://www.us.apache.org/dist/maven/maven-3/3.3.9/binaries/apache-maven-3.3.9-bin.tar.gz&amp;quot;Unarchive
 the binary and move to the /usr/local directory.tar -zxvf 
apache-maven-3.3.9-bin.tar.gzsudo mv ./apache-maven-3.3.9 /usr/localCreate 
symbolic links in /usr/bin.sudo ln -s /usr/local/apache-maven-3.3.9/bin/mvn 
/usr/bin/mvnInstalling ZeppelinThis provides a quick overview of Zeppelin 
installation from source, however the reader is encouraged to review the 
Zeppelin Installation GuideFrom the command prompt:Clone Zeppelin.git clone 
https://github.com/apache/zeppelin.gitEnter the Zeppelin root directory.cd 
zeppelinPackage Zeppelin.mvn clean package -DskipTests -Pspark-1.6 
-Dflink.version=1.1.3 -Pscala-2.10-DskipTests skips build tests- you&amp;#39;re 
not developing (yet), 
so you don&amp;#39;t need to run tests; the cloned version should build.-Pspark-1.6 tells maven to build Zeppelin with Spark 1.6.  This is
important because Zeppelin has its own Spark interpreter and the versions must 
be the same.-Dflink.version=1.1.3 tells maven specifically to build Zeppelin 
with Flink version 1.1.3.-Pscala-2.10 tells maven to build with Scala
v2.10.Note: You may wish to include additional build flags such as -Ppyspark or 
-Psparkr.  See the build section of github for more details.Note: You can build 
against any version of Spark that has a Zeppelin build profile available. The 
key is to make sure you check out the matching version of Spark to build. At 
the time of this writing, Spark 1.6 was the most recent Spark version 
available.Note: On build failures. Having installed Zeppelin close to 30 times 
now, I will tell you that sometimes the build fails for seemingly no reason.As 
long as you didn&amp;#39;t edit any code, it is unlikely the build is failing 
because of
  something you did. What does tend to happen, is some dependency that maven is 
trying to download is unreachable.  If your build fails on this step here are 
some tips:- Don&amp;#39;t get discouraged.- Scroll up and read through the 
logs. There will be clues there.- Retry (that is, run the mvn clean package 
-DskipTests -Pspark-1.6 again)- If there were clues that a dependency 
couldn&amp;#39;t be downloaded, wait a few hours or even days and retry. When compiling, open-source software tries to download all of the dependencies it needs; if a server is offline there is nothing you can do but
wait for it to come back.- Make sure you followed all of the steps carefully.- 
Ask the community to help you. Go here and join the user mailing list. People 
are there to help you. Make sure to copy and paste the build output (everything 
that happened in the console) and include that in your message.Start the 
Zeppelin daemon.bin/zeppelin-daemon.sh startUse ifconfig to determine the host 
machine&amp;#39;s IP address. If you are not familiar with how to do this, a
fairly comprehensive post can be found here.Open a web-browser on a machine 
connected to the same network as the host (or in the host operating system if 
using a virtual machine).  Navigate to http://yourip:8080, where yourip is the 
IP address you found in ifconfig.See the Zeppelin tutorial for basic Zeppelin 
usage. It is also advised that you take a moment to check out the tutorial 
notebook that is included with each Zeppelin install, and to familiarize 
yourself with basic notebook functionality.Flink TestCreate a new notebook 
named &amp;quot;Flink Test&amp;quot; and copy and paste the following 
code.%flink  // let Zeppelin know what interpreter to use.val text = 
benv.fromElements(&amp;quot;In the time of chimpanzees, I was a 
monkey&amp;quot;,   // some lines of text to analyze&amp;quot;Butane in my 
veins and I&amp;#39;m out to cut the junkie&amp;quot;,&amp;quot;With the 
plastic eyeballs, spray paint the vegetables&amp;quot;,&amp;quot;Dog food stalls with the beefcake
pantyhose&amp;quot;,&amp;quot;Kill the headlights and put it in 
neutral&amp;quot;,&amp;quot;Stock car flamin&amp;#39; with a loser in the 
cruise control&amp;quot;,&amp;quot;Baby&amp;#39;s in Reno with the Vitamin 
D&amp;quot;,&amp;quot;Got a couple of couches, sleep on the love 
seat&amp;quot;,&amp;quot;Someone came in sayin&amp;#39; I&amp;#39;m insane to 
complain&amp;quot;,&amp;quot;About a shotgun wedding and a stain on my 
shirt&amp;quot;,&amp;quot;Don&amp;#39;t believe everything that you 
breathe&amp;quot;,&amp;quot;You get a parking violation and a maggot on your 
sleeve&amp;quot;,&amp;quot;So shave your face with some mace in the 
dark&amp;quot;,&amp;quot;Savin&amp;#39; all your food stamps and 
burnin&amp;#39; down the trailer park&amp;quot;,&amp;quot;Yo, cut 
it&amp;quot;)/*  The meat and potatoes:        this tells Flink to iterate 
through the elements, in this case strings,        transform the string to 
lower case and split the string at white space into individual words        then finally
aggregate the occurrence of each word.        This creates the count variable 
which is a list of tuples of the form (word, occurrences).*/val counts = text.flatMap{ _.toLowerCase.split(&amp;quot;\\W+&amp;quot;) }.map { (_,1) }.groupBy(0).sum(1)counts.collect().foreach(println(_))  // execute the script and print each element in the counts listRun the code to make sure the built-in
Zeppelin Flink interpreter is working properly.Spark TestCreate a new notebook 
named &amp;quot;Spark Test&amp;quot; and copy and paste the following 
code.%spark // let Zeppelin know what interpreter to use.val text = 
sc.parallelize(List(&amp;quot;In the time of chimpanzees, I was a 
monkey&amp;quot;,  // some lines of text to analyze&amp;quot;Butane in my veins 
and I&amp;#39;m out to cut the junkie&amp;quot;,&amp;quot;With the plastic eyeballs, spray paint the vegetables&amp;quot;,&amp;quot;Dog food stalls with the
beefcake pantyhose&amp;quot;,&amp;quot;Kill the headlights and put it in 
neutral&amp;quot;,&amp;quot;Stock car flamin&amp;#39; with a loser in the 
cruise control&amp;quot;,&amp;quot;Baby&amp;#39;s in Reno with the Vitamin 
D&amp;quot;,&amp;quot;Got a couple of couches, sleep on the love 
seat&amp;quot;,&amp;quot;Someone came in sayin&amp;#39; I&amp;#39;m insane to 
complain&amp;quot;,&amp;quot;About a shotgun wedding and a stain on my 
shirt&amp;quot;,&amp;quot;Don&amp;#39;t believe everything that you 
breathe&amp;quot;,&amp;quot;You get a parking violation and a maggot on your 
sleeve&amp;quot;,&amp;quot;So shave your face with some mace in the 
dark&amp;quot;,&amp;quot;Savin&amp;#39; all your food stamps and 
burnin&amp;#39; down the trailer park&amp;quot;,&amp;quot;Yo, cut 
it&amp;quot;))/*  The meat and potatoes:        this tells spark to iterate 
through the elements, in this case strings,        transform the
  string to lower case and split the string at white space into individual 
words        then finally aggregate the occurrence of each word.        This 
creates the count variable which is a list of tuples of the form (word, 
occurrences)*/val counts = text.flatMap { _.toLowerCase.split(&amp;quot;\\W+&amp;quot;) }.map { (_,1) }
             .reduceByKey(_ + _)counts.collect().foreach(println(_))  // 
execute the script and print each element in the counts listRun the code to 
make sure the built-in Zeppelin Spark interpreter is working properly.Finally,
stop the Zeppelin daemon.  From the command prompt run:bin/zeppelin-daemon.sh 
stopInstalling ClustersFlink ClusterDownload BinariesBuilding from source is recommended where possible; for simplicity in this tutorial we will download the Flink and Spark binaries.To download the Flink binary, use wgetwget
&amp;quot;http://mirror.cogentco.com/pub/apache/flink/flink-1.1.3/flink-1.1.3-bin-hadoop24-scala_2.10.tgz&amp;quot;tar
 -xzvf 
 flink-1.1.3-bin-hadoop24-scala_2.10.tgzThis will download Flink 1.1.3, 
compatible with Hadoop 2.4.  You do not have to install Hadoop for this binary 
to work, but if you are using Hadoop, please change 24 to your appropriate 
version.Start the Flink Cluster.flink-1.1.3/bin/start-cluster.shBuilding From 
sourceIf you wish to build Flink from source, the following will be 
instructive.  Note that if you have downloaded and used the binary version this 
should be skipped.  The changing nature of build tools and versions across 
platforms makes this section somewhat precarious.  For example, Java8 and Maven 
3.0.3 are recommended for building Flink, which are not recommended for 
Zeppelin at the time of writing.  If the user wishes to attempt to build from 
source, this section will provide some reference.  If errors are encountered, 
please contact the Apache Flink community.See the Flink Installation guide for 
more detailed instructions.Return to the directory where you have been 
downloading, 
 this tutorial assumes that is $HOME. Clone Flink,  check out 
release-1.1.3-rc2, and build.cd $HOMEgit clone 
https://github.com/apache/flink.gitcd flinkgit checkout release-1.1.3-rc2mvn 
clean install -DskipTestsStart the Flink Cluster in stand-alone 
modebuild-target/bin/start-cluster.shEnsure the cluster is upIn a browser, 
navigate to http://yourip:8082 to see the Flink Web-UI.  Click on &amp;#39;Task 
Managers&amp;#39; in the left navigation bar. Ensure there is at least one Task 
Manager present.If no task managers are present, restart the Flink cluster with 
the following commands:(if 
binaries)flink-1.1.3/bin/stop-cluster.shflink-1.1.3/bin/start-cluster.sh(if 
built from 
source)build-target/bin/stop-cluster.shbuild-target/bin/start-cluster.shSpark 
1.6 ClusterDownload BinariesBuilding from source is recommended where possible; for simplicity in this tutorial we will download the Flink and Spark binaries.To download the Spark binary, use wgetwget &amp;quot;http://d3kbcqa49mib13.cloudfront.net/spark-1.6.3-bin-hadoop2.6.tgz&amp;quot;tar -xzvf
spark-1.6.3-bin-hadoop2.6.tgzmv spark-1.6.3-bin-hadoop2.6 sparkThis will 
download Spark 1.6.3, compatible with Hadoop 2.6.  You do not have to install 
Hadoop for this binary to work, but if you are using Hadoop, please change 2.6 
to your appropriate version.Building From sourceSpark is an extraordinarily 
large project, which takes considerable time to download and build. It is also 
prone to build failures for similar reasons listed in the Flink section.  If 
the user wishes to attempt to build from source, this section will provide some 
reference.  If errors are encountered, please contact the Apache Spark 
community.See the Spark Installation guide for more detailed 
instructions.Return to the directory where you have been downloading, this 
tutorial assumes that is $HOME. Clone Spark, check out branch-1.6, and 
build.Note: Recall, we&amp;#39;re only checking out 1.6 because it is the most 
recent Spark for which a
  Zeppelin profile exists at  the time of writing. You are free to check out 
other versions; just make sure you build Zeppelin against the correct version of Spark. However, if you use Spark 2.0, the word count example will need to be
changed as Spark 2.0 is not compatible with the following examples.cd 
$HOMEClone, check out, and build Spark version 1.6.x.git clone 
https://github.com/apache/spark.gitcd sparkgit checkout branch-1.6mvn clean 
package -DskipTestsStart the Spark clusterReturn to the $HOME directory.cd 
$HOMEStart the Spark cluster in stand alone mode, specifying the webui-port as 
some port other than 8080 (the webui-port of 
Zeppelin).spark/sbin/start-master.sh --webui-port 8082Note: Why --webui-port 
8082? There is a digression toward the end of this document that explains 
this.Open a browser and navigate to http://yourip:8082 to ensure the Spark 
master is running.Toward the top of the page there will be a URL: 
spark://yourhost:7077.  Note this URL, the Spark master URI; it will be needed in subsequent steps.Start the slave using the URI from the Spark
master WebUI:spark/sbin/start-slave.sh spark://yourhostname:7077Return to the 
root directory and start the Zeppelin daemon.cd 
$HOMEzeppelin/bin/zeppelin-daemon.sh startConfigure InterpretersOpen a web 
browser and go to the Zeppelin web-ui at http://yourip:8080.Now go back to the 
Zeppelin web-ui at http://yourip:8080 and this time click on anonymous at the 
top right, which will open a drop-down menu, select Interpreters to enter 
interpreter configuration.In the Spark section, click the edit button in the 
top right corner to make the property values editable (looks like a pencil).The 
only field that needs to be edited in the Spark interpreter is the master 
field. Change this value from local[*] to the URL you used to start the slave, 
mine was spark://ubuntu:7077.Click Save to update the parameters, and click OK 
when it asks you about restarting the interpreter.Now scroll down to the Flink 
section. Click the
  edit button and change the value of host from local to localhost. Click Save 
again.Reopen the examples and execute them again (i.e. click the play button at the top of the screen, or the button on the paragraph).You should be able to check the Flink and Spark webuis (at something like
http://yourip:8081, http://yourip:8082, http://yourip:8083) and see that jobs 
have been run against the clusters.Digression Sorry to be vague and use terms 
such as &amp;#39;something like&amp;#39;, but exactly what web-ui is at what 
port is going to depend on what order you started things. What is really going 
on here is you are pointing your browser at specific ports, namely 8081, 8082, 
and 8083.  Flink and Spark both want to put their web-ui on port 8080, but are
well behaved and will take the next port available. Since Zeppelin started 
first, it will get port 8080.  When Flink starts (assuming you started Flink 
first), it will try to bind to port 8080, see that it is already taken, and go to the next one available, hopefully 8081.  Spark has a webui for the master
and the slave, so when they start they will try to bind to 8080 (already taken by Zeppelin), then 8081 (already taken by Flink&amp;#39;s webui), then 8082. If everything goes smoothly and you followed the directions precisely, the webuis
should be 8081 and 8082. It is possible to specify the port you want the webui to bind to at the command line by passing the --webui-port &amp;lt;port&amp;gt; flag when you start Flink or Spark, where &amp;lt;port&amp;gt; is the port you want to see that webui on.  You can
also set the default webui port of Spark and Flink (and Zeppelin) in the 
configuration files, but this is a tutorial for novices and slightly out of 
scope.Next StepsCheck out the tutorial for more cool things you can do with 
your new toy!Join the community, ask questions and contribute! Every little bit 
helps.",
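The digression's "take the next available port" behaviour can be sketched as a tiny simulation (the next_port helper below is hypothetical, for illustration only — the real services do this binding themselves when they start):

```shell
# Simulate services claiming web-ui ports, as described in the digression:
# Zeppelin owns 8080; each later service walks up from 8080 to the first free port.
taken="8080"                       # Zeppelin started first and holds 8080
next_port() {                      # hypothetical helper, not part of any tool
    p=8080
    while printf '%s\n' $taken | grep -qx "$p"; do
        p=$((p + 1))               # port taken; try the next one
    done
    taken="$taken $p"              # record the claim
    PORT=$p
}
next_port; flink_ui=$PORT          # Flink's web-ui lands on 8081
next_port; spark_master_ui=$PORT   # Spark master web-ui lands on 8082
next_port; spark_slave_ui=$PORT    # Spark slave web-ui lands on 8083
echo "Flink=$flink_ui SparkMaster=$spark_master_ui SparkSlave=$spark_slave_ui"
```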
       "url": " /quickstart/install_with_flink_and_spark_cluster.html",
       "group": "tutorial",
       "excerpt": "Tutorial is valid for Spark 1.6.x and Flink 1.1.3"
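The word-count pipeline both notebook tests run (lowercase, split on non-word characters, count occurrences) can be sanity-checked with plain shell tools, independent of either cluster; a sketch over the first line of the lyric:

```shell
# Reproduce the tutorial's word count on one line of input:
# lowercase, split on non-alphanumeric characters, then count occurrences.
line="In the time of chimpanzees, I was a monkey"
words=$(printf '%s' "$line" | tr '[:upper:]' '[:lower:]' | tr -cs '[:alnum:]' '\n')
word_total=$(printf '%s\n' "$words" | wc -l)
printf '%s\n' "$words" | sort | uniq -c      # (count, word) pairs, one per line
echo "total words: $word_total"
```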

