Re: Yarn + Spark + Zepplin ?

Christian Tzolov Mon, 03 Aug 2015 10:16:15 -0700

ÐΞ€ρ@Ҝ (๏̯͡๏),

I've successfully run Zeppelin with Spark on YARN. I'm using Ambari and
PivotalHD30. PHD30 is ODP compliant, so you should be able to repeat the
configuration for HDP (i.e. Hortonworks).

1. Before you start with Zeppelin, make sure that your Spark/YARN environment
works from the command line (e.g. run the Pi test). If it doesn't work, check
that hdp.version is set correctly, or hardcode the stack.name and
stack.version properties as Ambari Custom yarn-site properties (that is
what I did).

2. Zeppelin should be built with the proper Spark and Hadoop versions and
with YARN support enabled. In my case I used this build command:

mvn clean package -Pspark-1.4 -Dspark.version=1.4.1 -Dhadoop.version=2.6.0 -Phadoop-2.6 -Pyarn -DskipTests -Pbuild-distr

3. Open the Spark interpreter configuration and set the 'master' property to
'yarn-client' (i.e. master=yarn-client), then press Save.

4. In conf/zeppelin-env.sh, set HADOOP_CONF_DIR. For PHD and HDP it will
look like this:
export HADOOP_CONF_DIR=/etc/hadoop/conf

5. (Optional) I restarted the Zeppelin daemon, but I don't think this is
required.

6. Make sure that the /user/<zeppelin user> folder exists in HDFS and that
the Zeppelin user has write permission on it. Otherwise you can create it
like this:
  sudo -u hdfs hdfs dfs -mkdir /user/<zeppelin user>
  sudo -u hdfs hdfs dfs -chown -R <zeppelin user>:hdfs /user/<zeppelin user>
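
The checks in the steps above can be sketched as a small shell helper. This
is only an illustration under assumptions, not something from my actual
setup: the function names, the ZEPPELIN_USER variable and its default of
"zeppelin", the presence of core-site.xml and yarn-site.xml in the conf
directory, and the location of the Spark examples jar under $SPARK_HOME/lib
are all assumptions that you may need to adjust for your install.

```shell
#!/bin/sh
# Pre-flight checks for Zeppelin on YARN (illustrative sketch).
# All function and variable names here are hypothetical helpers.

# Assumption: the OS account that runs the Zeppelin daemon.
ZEPPELIN_USER="${ZEPPELIN_USER:-zeppelin}"
# Same value as exported in conf/zeppelin-env.sh (step 4).
HADOOP_CONF_DIR="${HADOOP_CONF_DIR:-/etc/hadoop/conf}"

# Step 1: run the SparkPi example on YARN from the command line.
# Assumption: SPARK_HOME is set and the examples jar lives under lib/.
run_pi_test() {
    "$SPARK_HOME/bin/spark-submit" \
        --class org.apache.spark.examples.SparkPi \
        --master yarn-client \
        "$SPARK_HOME"/lib/spark-examples*.jar 10
}

# Step 4: succeed only if the directory holds the Hadoop/YARN client config.
check_conf_dir() {
    [ -f "$1/core-site.xml" ] && [ -f "$1/yarn-site.xml" ]
}

# Step 6: succeed only if /user/<name> exists as a directory in HDFS.
check_hdfs_home() {
    hdfs dfs -test -d "/user/$1"
}

# Run the cheap checks and report every problem found.
preflight() {
    status=0
    if ! check_conf_dir "$HADOOP_CONF_DIR"; then
        echo "missing core-site.xml or yarn-site.xml in $HADOOP_CONF_DIR" >&2
        status=1
    fi
    if ! check_hdfs_home "$ZEPPELIN_USER"; then
        echo "missing HDFS directory /user/$ZEPPELIN_USER" >&2
        status=1
    fi
    return "$status"
}
```

To use it, source the file on the Zeppelin host, call preflight, and then
call run_pi_test once the checks pass. Note that the sketch only checks that
the HDFS folder exists; it does not verify write permission.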

Good to go!

Cheers,
Christian

On 3 August 2015 at 17:50, Vadla, Karthik <karthik.va...@intel.com> wrote:

> Hi Deepak,
>
>
>
> I have documented everything here.
>
> Please check published document.
>
>
> https://software.intel.com/sites/default/files/managed/bb/bf/Apache-Zeppelin.pdf
>
>
>
> Thanks
>
> Karthik Vadla
>
>
>
> *From:* ÐΞ€ρ@Ҝ (๏̯͡๏) [mailto:deepuj...@gmail.com]
> *Sent:* Sunday, August 2, 2015 9:25 PM
> *To:* users@zeppelin.incubator.apache.org
> *Subject:* Yarn + Spark + Zepplin ?
>
>
>
> Hello,
>
> I would like to try out Zepplin and hence i got a 7 node Hadoop cluster
> with spark history server setup. I am able to run sample spark applications
> on my YARN cluster.
>
>
>
> I have no clue how to get zepplin to connect to this YARN cluster. Under
> https://zeppelin.incubator.apache.org/docs/install/install.html i see MASTER
> to point to spark master. I do not have a spark master running.
>
>
>
> How do i get Zepplin to be able to read data from YARN cluster ? Please
> share documentation.
>
>
>
> Regards,
>
> Deepak
>
>
>

-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Solution Architect,
EMEA Practice Team | Pivotal <http://pivotal.io/>
ctzo...@pivotal.io|+31610285517
