Hi,

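Yes, you are doing everything right, and this is expected behavior. The
"Application" that you see in the YARN UI is the long-running Spark
application behind Zeppelin's Spark interpreter (in yarn-client mode the
driver runs inside Zeppelin, while the ApplicationMaster and executors run on
YARN). It has a UI and a history of the executed jobs scheduled from Zeppelin,
and it is reused for every paragraph you run.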
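
A quick way to confirm this from a notebook, as a minimal sketch: it assumes
the paragraph runs under %spark and that `sc` is the SparkContext Zeppelin's
Spark interpreter creates for you, as in your own examples.

```scala
// Run inside a Zeppelin %spark paragraph.
// `sc` is assumed to be the SparkContext provided by the Spark interpreter.

// ID of the application as registered with the cluster manager; on YARN it
// should match the entry listed in the ResourceManager UI.
println(sc.applicationId)

// Master URL the context was started with (taken from MASTER in zeppelin-env.sh).
println(sc.master)
```

If you run it from two different paragraphs, you should see the same
application ID both times, which is why your second program did not create a
new entry. The entry should stay in the running state until the Spark
interpreter is restarted or Zeppelin is stopped.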


On Tue, Jun 9, 2015 at 5:57 AM, prateek arora <[email protected]>
wrote:

>
>
> Hi
>
>
>
> I am running Zeppelin in yarn-client mode.
>
> My Hadoop cluster runs remotely on CDH 5.4.0 (Spark 1.3.0), and my
> Spark cluster is YARN-based.
>
>
>
> Zeppelin installation steps:
>
> git clone https://github.com/apache/incubator-zeppelin
>
> mvn clean package -Pspark-1.3 -Dhadoop.version=2.6.0-cdh5.4.0 \
> -Phadoop-2.6 -Pyarn -DskipTests
>
>
>
> I added the lines below to conf/zeppelin-env.sh:
>
>
>
> export MASTER=yarn-client
>
> export HADOOP_CONF_DIR=/home/ubuntu/hadoop/
>
>
>
>
>
> When I run this sample program:
>
>
>
> %spark
>
> val textFile = sc.textFile("hdfs://master:8020/user/prateek/bigdata.csv",
> 1)
>
> textFile.count
>
>
>
> it shows this result:
>
> textFile: org.apache.spark.rdd.RDD[String] =
> hdfs://master:8020/user/prateek/bigdata.csv MapPartitionsRDD[1] at textFile
> at <console>:23
>
> res0: Long = 114955604
>
>
>
> Also, a Zeppelin application entry shows up in the resource manager (
> http://ip-address:8080/).
>
>
>
> I observe the following after running the sample program:
>
>
>
>    - The Zeppelin web page shows the application status as finished, but
>    the resource manager always shows the status as running.
>
>
>
>
>
> [image: Inline image 1]
>
>
>
>
>    - If I run another sample program, such as
>
> %spark
>
>
>
> val textFile =
> sc.textFile("hdfs://master:8020/user/ubuntu/Amalgam_row.csv")
>
> textFile.count
>
>
>
> then no new application entry shows up in the resource manager, and
> Zeppelin executes the program and shows the result.
>
>
>
>
> *Are the above scenarios the default behavior of Zeppelin, or am I doing
> anything wrong? Please suggest.*
>
>
>
> Regards
>
> Prateek
>
>
>
>
>
>
>



-- 
--
Kind regards,
Alexander.
