GitHub user AhyoungRyu reopened a pull request:
https://github.com/apache/zeppelin/pull/1339
[ZEPPELIN-1332] Remove spark-dependencies & suggest new way
### What is this PR for?
Currently, Zeppelin's embedded Spark lives under `interpreter/spark/`.
For those who **build Zeppelin from source**, this Spark is downloaded when
they build with the [build
profiles](https://github.com/apache/zeppelin#spark-interpreter). These
build profiles are useful for customizing the embedded Spark, but most
Spark users point Zeppelin at their own Spark installation rather than the
embedded one. In practice, only Spark & Zeppelin beginners rely on the
embedded Spark, and for them the many build profiles are overly complicated.
In the case of the **Zeppelin binary package**, the embedded Spark is included
by default under `interpreter/spark/`, which is a large part of why the
Zeppelin package size is so huge.
#### New suggestions
This PR changes how the embedded Spark binary is downloaded, as follows
(see the sketch after this list):
1. Run `./bin/zeppelin-daemon.sh get-spark` or `./bin/zeppelin.sh get-spark`
2. The command creates `ZEPPELIN_HOME/local-spark/`, downloads
`spark-2.0.1-bin-hadoop2.7.tgz`, and untars it there
3. This local Spark can then be used without any extra configuration (e.g.
no need to set `SPARK_HOME`)
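For context, here is a minimal sketch of what such a `get-spark` helper could look like. The actual implementation in this PR is `bin/download-spark.sh`; the mirror URL, `set -e` handling, and variable names below are illustrative assumptions, not taken from the patch.
```bash
#!/bin/bash
# Illustrative sketch only -- not the exact bin/download-spark.sh from this PR.
# Fetches a Spark binary distribution into ZEPPELIN_HOME/local-spark and untars it.

set -e

ZEPPELIN_HOME="$(cd "$(dirname "$0")/.." && pwd)"   # assumes the script lives in bin/
SPARK_ARCHIVE="spark-2.0.1-bin-hadoop2.7"           # assumed default version
LOCAL_SPARK_DIR="${ZEPPELIN_HOME}/local-spark"
# Example mirror URL; the real script may pick a different mirror.
MIRROR_URL="https://archive.apache.org/dist/spark/spark-2.0.1/${SPARK_ARCHIVE}.tgz"

if [ -d "${LOCAL_SPARK_DIR}/${SPARK_ARCHIVE}" ]; then
  echo "${SPARK_ARCHIVE} already exists under local-spark."
  exit 0
fi

mkdir -p "${LOCAL_SPARK_DIR}"
echo "Download ${SPARK_ARCHIVE}.tgz from mirror ..."
curl -fL "${MIRROR_URL}" -o "${LOCAL_SPARK_DIR}/${SPARK_ARCHIVE}.tgz"
tar -xzf "${LOCAL_SPARK_DIR}/${SPARK_ARCHIVE}.tgz" -C "${LOCAL_SPARK_DIR}"
rm -f "${LOCAL_SPARK_DIR}/${SPARK_ARCHIVE}.tgz"
echo "${SPARK_ARCHIVE} is successfully downloaded and saved under ${LOCAL_SPARK_DIR}"
```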
### What type of PR is it?
Improvement
### Todos
- [x] - trap `ctrl+c` & `ctrl+z` key interruptions while downloading Spark
(see the sketch after this list)
- [x] - test on different OSes
- [x] - update the related documentation pages again after getting feedback
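On the interruption point, a minimal, self-contained sketch of how a partial download could be cleaned up on `ctrl+c`/`ctrl+z` (the archive path variable is an assumption for illustration; the handler in the PR may differ):
```bash
# Remove a half-downloaded archive if the user interrupts the transfer.
PARTIAL_ARCHIVE="local-spark/spark-2.0.1-bin-hadoop2.7.tgz"   # assumed path

cleanup_partial_download() {
  echo "Download interrupted; removing partial archive ..."
  rm -f "${PARTIAL_ARCHIVE}"
  exit 1
}

# SIGINT is sent by ctrl+c, SIGTSTP by ctrl+z.
trap cleanup_partial_download INT TSTP
```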
### What is the Jira issue?
[ZEPPELIN-1332](https://issues.apache.org/jira/browse/ZEPPELIN-1332)
### How should this be tested?
1. `rm -r spark-dependencies`
2. Apply this patch and build with `mvn clean package -DskipTests`
3. Try `bin/zeppelin-daemon.sh get-spark` or `bin/zeppelin.sh get-spark`, then verify as sketched below
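A quick sanity check after step 3 (assuming the default Spark version shown in the screenshots below):
```bash
ls local-spark/
# Expected: spark-2.0.1-bin-hadoop2.7
bin/zeppelin-daemon.sh start   # the Spark interpreter should then work without setting SPARK_HOME
```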
### Screenshots (if appropriate)
- `./bin/zeppelin-daemon.sh get-spark`
```
$ ./bin/zeppelin-daemon.sh get-spark
Download spark-2.0.1-bin-hadoop2.7.tgz from mirror ...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  178M  100  178M    0     0  10.4M      0  0:00:17  0:00:17 --:--:-- 10.2M
spark-2.0.1-bin-hadoop2.7 is successfully downloaded and saved under /Users/ahyoungryu/Dev/zeppelin-development/zeppelin/local-spark
```
- if `ZEPPELIN_HOME/local-spark/spark-2.0.1-bin-hadoop2.7` already exists
```
$ ./bin/zeppelin-daemon.sh get-spark
spark-2.0.1-bin-hadoop2.7 already exists under local-spark.
```
### Questions:
- Do the license files need to be updated? No
- Are there breaking changes for older versions? No
- Does this need documentation? Yes, some related documents need to be updated
(e.g. README.md, spark.md, and possibly install.md)
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/AhyoungRyu/zeppelin ZEPPELIN-1332
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/zeppelin/pull/1339.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1339
----
commit cf91a45420ea3047522998238beba274db9a5fca
Author: AhyoungRyu <[email protected]>
Date: 2016-08-16T15:08:19Z
Fix typo comment in interpreter.sh
commit 6c08f5207cb2286b6072b3dcd5cc882b4dbca39b
Author: AhyoungRyu <[email protected]>
Date: 2016-08-17T01:52:06Z
Remove spark-dependencies
commit a36702f8b35d7ee0d269190fe42ac8a2ff5d5b6e
Author: AhyoungRyu <[email protected]>
Date: 2016-08-17T07:14:35Z
Add spark-2.*-bin-hadoop* to .gitignore
commit 31b04f58491502d5b4ea7c1800f7606013a8ae74
Author: AhyoungRyu <[email protected]>
Date: 2016-08-17T15:22:25Z
Add download-spark.sh file
commit fd87a09d83000c94ced1b04b4254de9b35e4ccc5
Author: AhyoungRyu <[email protected]>
Date: 2016-08-17T15:28:51Z
Remove useless comment line in common.sh
commit e0fc280de061f7ee06603d5bc9ab41b5219a749d
Author: AhyoungRyu <[email protected]>
Date: 2016-08-18T03:32:11Z
Remove zeppelin-spark-dependencies from r/pom.xml
commit bf06931b988aee4d9dfc3c173cac18a740666e36
Author: AhyoungRyu <[email protected]>
Date: 2016-08-21T05:38:55Z
Change SPARK_HOME with proper message
commit dceb74fff19eac2071eed0d661c0571eceeada54
Author: AhyoungRyu <[email protected]>
Date: 2016-09-06T08:55:20Z
Check interpreter/spark/ instead of SPARK_HOME
commit e2a078ab87deba8cdf4a99f6c3642e3d4b41f3d8
Author: AhyoungRyu <[email protected]>
Date: 2016-09-06T08:55:40Z
Refactor download-spark.sh
commit 3c792d07c6d6b55896ae5b0e3e2b0d08f70fafb1
Author: AhyoungRyu <[email protected]>
Date: 2016-09-07T07:48:15Z
Revert: remove spark-dependencies
commit 1071566f442b9cf01c7145fd9dcbb48eb343f81a
Author: AhyoungRyu <[email protected]>
Date: 2016-09-07T13:23:11Z
Remove useless ZEPPELIN_HOME
commit 0c7e1b73299634f9fc5c579c54d4cff49449f910
Author: AhyoungRyu <[email protected]>
Date: 2016-09-08T05:51:40Z
Change dir of Spark bin to 'local-spark'
commit 787cec50ce796ddae9a1302e1ce376b2f3e5c5be
Author: AhyoungRyu <[email protected]>
Date: 2016-09-08T06:07:20Z
Set timeout for travis test
commit b5fc541a96d17db513dcd7d5c1ec5671e85733f0
Author: AhyoungRyu <[email protected]>
Date: 2016-09-08T06:16:54Z
Add license header to download-spark.cmd
commit c4d39f1df4dfeed1ad8544fe75621dc1aac693da
Author: AhyoungRyu <[email protected]>
Date: 2016-09-08T11:48:43Z
Fix wrong check condition in common.sh
commit 5c631477133253506490744abb54a0582a066f6c
Author: AhyoungRyu <[email protected]>
Date: 2016-09-08T13:14:29Z
Add travis condition to download-spark.sh
commit e91e7f83da0c44d91f53ae94a9d2f7f8117b86ae
Author: AhyoungRyu <[email protected]>
Date: 2016-09-12T05:42:29Z
Remove bin/download-spark.cmd again
commit f40fd2f13071647e812ac54278b1fbff87b808e7
Author: AhyoungRyu <[email protected]>
Date: 2016-09-12T16:25:31Z
Remove spark-dependency profiles & reorganize some titles in README.md
commit 31ebd191203139ee6f6bd794375c64c4f66cd28a
Author: AhyoungRyu <[email protected]>
Date: 2016-09-12T18:30:41Z
Update spark.md to add a guide for local-spark mode
commit 803f21cbfff07deaa7cd5d5be1e423b4db4802c7
Author: AhyoungRyu <[email protected]>
Date: 2016-09-12T18:49:49Z
Remove '-Ppyspark' build options
commit d5882554562e0244cb063186630b9e952fdf1c1c
Author: AhyoungRyu <[email protected]>
Date: 2016-09-13T08:09:18Z
Remove useless creating .bak file process
commit b7a91453255cced389bc639e27dc8b2232afd19f
Author: AhyoungRyu <[email protected]>
Date: 2016-09-13T11:21:10Z
Update install.md & spark.md
commit 63f29e91c3bd22df44aae91929468ea6a9516474
Author: AhyoungRyu <[email protected]>
Date: 2016-09-14T09:35:37Z
Resolve 'sed' command issue between OSX & Linux
commit 6e329a7b832cf9e526b72aa5e3eb32ab697ebfd7
Author: AhyoungRyu <[email protected]>
Date: 2016-09-14T11:20:31Z
Trap ctrl+c during downloading Spark
commit 1205f2d67f8116353f30128b367cebe2d35fd344
Author: AhyoungRyu <[email protected]>
Date: 2016-09-14T11:26:56Z
Remove useless condition
commit ff069af7023132ea4478ad38ea1164176b753ab9
Author: AhyoungRyu <[email protected]>
Date: 2016-09-20T17:05:16Z
Make local spark mode with zero-configuration as @moon suggested
commit c818cf766a60ef432d5310a664aeded7d9a58ab3
Author: AhyoungRyu <[email protected]>
Date: 2016-09-22T06:47:05Z
Put 'autodetect HADOOP_CONF_HOME by heuristic' back code blocks
commit b2dca36e25b03ac56a9ee221c1dff2d1ed105c95
Author: AhyoungRyu <[email protected]>
Date: 2016-09-22T14:20:31Z
Modify SparkRInterpreter.java to enable SparkR without SPARK_HOME
commit 310d607564e156b85a687e37c2c1d14d00ad1348
Author: AhyoungRyu <[email protected]>
Date: 2016-09-22T17:01:40Z
Remove duplicated variable declaration
commit 1ee4325aea1f761765018e15396860e9ca2bc538
Author: AhyoungRyu <[email protected]>
Date: 2016-09-22T17:02:01Z
Update related docs again
----