Ahyoung created ZEPPELIN-1332:
---------------------------------
Summary: Removing spark-dependencies
Key: ZEPPELIN-1332
URL: https://issues.apache.org/jira/browse/ZEPPELIN-1332
Project: Zeppelin
Issue Type: Improvement
Reporter: Ahyoung
Assignee: Ahyoung
Fix For: 0.7.0
*Why?*
The whole package of the latest Zeppelin version is over 500MB, and it keeps
growing as more interpreters are added. Compared to the Spark binary
packages (spark-2.0.0-bin-hadoop2.7.tgz is 178MB and
spark-2.0.0-bin-without-hadoop.tgz is 109MB), the Zeppelin package is quite
large. Many Spark interpreter users also run their own Spark rather than
Zeppelin's embedded one, so they don't need spark-dependencies to be included.
A first option for addressing this was suggested in
[PR#1115|https://github.com/apache/zeppelin/pull/1115] by [~jongyoul].
*New suggestion*
I know Zeppelin's embedded Spark is very useful to Zeppelin beginners, because
they don't need to download Spark or set SPARK_HOME themselves when they
want to use the Spark interpreter in Zeppelin. So instead of just removing
spark-dependencies/pom.xml, I would like to suggest downloading a Spark binary
package (maybe spark-2.0.0-bin-hadoop2.7.tgz?) from a mirror site using a shell
script. This shell script will check whether SPARK_HOME is set. If SPARK_HOME
isn't set yet, it will download the Spark binary package when the user starts
the Zeppelin daemon.
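
A minimal sketch of what such a script could look like, to make the proposal concrete. The function names, the SPARK_CACHE_DIR location, and the use of archive.apache.org as the download site are all assumptions for illustration; none of them are part of Zeppelin today, and a real script would probably pick a proper mirror.

```shell
#!/usr/bin/env bash
# Hypothetical helper: every name, path and URL below is an assumption.

SPARK_VERSION="2.0.0"
SPARK_ARCHIVE="spark-${SPARK_VERSION}-bin-hadoop2.7"
# Assumed download site; could be overridden with a closer mirror.
SPARK_MIRROR="${SPARK_MIRROR:-https://archive.apache.org/dist/spark}"
# Assumed directory where the downloaded Spark is unpacked.
SPARK_CACHE_DIR="${SPARK_CACHE_DIR:-${ZEPPELIN_HOME:-.}/local-spark}"

# Succeeds only if SPARK_HOME is unset and no earlier download exists.
spark_download_needed() {
  [ -z "${SPARK_HOME:-}" ] && [ ! -d "${SPARK_CACHE_DIR}/${SPARK_ARCHIVE}" ]
}

# Prints the URL of the Spark binary package.
spark_download_url() {
  echo "${SPARK_MIRROR}/spark-${SPARK_VERSION}/${SPARK_ARCHIVE}.tgz"
}

# Leaves a user-provided SPARK_HOME alone; otherwise downloads Spark
# once and points SPARK_HOME at the unpacked directory.
ensure_spark() {
  if [ -n "${SPARK_HOME:-}" ]; then
    return 0
  fi
  if spark_download_needed; then
    mkdir -p "${SPARK_CACHE_DIR}" || return 1
    local tarball="${SPARK_CACHE_DIR}/${SPARK_ARCHIVE}.tgz"
    curl -fsSL -o "${tarball}" "$(spark_download_url)" || return 1
    tar -xzf "${tarball}" -C "${SPARK_CACHE_DIR}" || return 1
    rm -f "${tarball}"
  fi
  export SPARK_HOME="${SPARK_CACHE_DIR}/${SPARK_ARCHIVE}"
}
```

The daemon start script could source this file and call ensure_spark before launching, so users who already set SPARK_HOME would see no change and would not trigger any download.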
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)