I think making 'zeppelin-bin-netinst' is a great idea, and it makes more sense than voting on which interpreters should be included in zeppelin-bin-min.
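To make the 'bin/install-interpreter.sh' idea a bit more concrete, here is a rough sketch of the download step. It is not the actual script. It only assumes the standard Maven repository layout, uses Maven Central as the repository, and fetches a single jar per interpreter without resolving transitive dependencies (the real thing would presumably rely on DependencyResolver for that). The names `install_plan`, `install`, and `EXAMPLE`, and the interpreter coordinates, are made up for illustration.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Assumption: Maven Central as the repository; any repository using the
# standard Maven layout would work the same way.
MAVEN_REPO = "https://repo1.maven.org/maven2"


def artifact_url(group_id, artifact_id, version, repo=MAVEN_REPO):
    """Return the jar URL for a Maven coordinate (standard repository layout)."""
    group_path = group_id.replace(".", "/")
    return f"{repo}/{group_path}/{artifact_id}/{version}/{artifact_id}-{version}.jar"


def install_plan(interpreters, version, zeppelin_home="."):
    """Map each interpreter name to (download URL, local target path)."""
    plan = {}
    for name, (group_id, artifact_id) in interpreters.items():
        jar_name = f"{artifact_id}-{version}.jar"
        target = Path(zeppelin_home) / "interpreter" / name / jar_name
        plan[name] = (artifact_url(group_id, artifact_id, version), target)
    return plan


def install(plan):
    """Download every jar in the plan (this step needs internet access)."""
    for url, target in plan.values():
        target.parent.mkdir(parents=True, exist_ok=True)
        urlretrieve(url, str(target))


# Hypothetical coordinates, for illustration only.
EXAMPLE = {"md": ("org.apache.zeppelin", "zeppelin-markdown")}

if __name__ == "__main__":
    # Print the plan only; call install(plan) to actually download.
    for name, (url, target) in install_plan(EXAMPLE, "0.6.0").items():
        print(name, url, "->", target)
```

For the no-internet case, the same plan could be run once on a connected machine and the resulting interpreter directory copied to the production environment along with the package.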
2016-06-18 1:15 GMT+09:00 moon soo Lee <m...@apache.org>:

> In case of no internet access, how about
>
> a. download 'zeppelin-bin-netinst' and run 'bin/install-interpreter.sh',
> and then copy the package to production env.
> b. download 'zeppelin-bin-all' and copy the package to production env.
>
> ?
>
> Thanks,
> moon
>
>
> On Fri, Jun 17, 2016 at 9:10 AM Mohit Jaggi <mohitja...@gmail.com> wrote:
>
>> Many production environments have no internet access. A script like this
>> can be useful to some but it should not replace the proposed min binary.
>>
>> Sent from my iPhone
>>
>> On Jun 17, 2016, at 9:20 PM, moon soo Lee <m...@apache.org> wrote:
>>
>> Hi,
>>
>> Thanks for bringing this discussion.
>> it's great idea minimize binary package size.
>>
>> Can we set a policy to decide which interpreter goes to
>> 'zeppelin-bin-min', which is not?
>>
>> One alternative is, instead of making 'zeppelin-bin-min', we can make
>> 'zeppelin-bin-netinst'.
>> We can provide a shell script such as, 'bin/install-interpreter.sh' and
>> the script will download interpreters and their dependencies from maven
>> repository and store under /interpreter dir. By leveraging
>> DependencyResolver[1],
>> i think we can make this feature in couple of hours.
>>
>> Only spark interpreter can not be installed in simple way, while it
>> requires some python and R packages under /interpreter dir and they're not
>> available on maven repository, so it'll need special treatment, but all
>> other interpreters can be installed in the simple way.
>>
>> Then, 'zeppelin-bin-netinst' version can have minimal package size, and
>> still gives easy way to install all the interpreters.
>> Also 'bin/install-interpreter.sh' will still useful even if we have
>> dynamic interpreter loading feature [2], to build offline package.
>>
>> what do you think?
>>
>> [1]
>> https://github.com/apache/zeppelin/blob/master/zeppelin-interpreter/src/main/java/org/apache/zeppelin/dep/DependencyResolver.java
>> [2] https://issues.apache.org/jira/browse/ZEPPELIN-598
>>
>>
>> On Fri, Jun 17, 2016 at 1:02 AM mina lee <mina...@apache.org> wrote:
>>
>>> Hi all!
>>>
>>> Zeppelin just started release process. Prior to creating release
>>> candidate I want to ask users' opinion about how you want it to be packaged.
>>>
>>> For the last release(0.5.6), we have released one binary package which
>>> includes all interpreters.
>>> The concern with providing one type of binary package is that package
>>> size will be quite big(~600MB).
>>> So I am planning to provide two binary packages:
>>> - zeppelin-0.6.0-bin-all.tgz (includes all interpreters)
>>> - zeppelin-0.6.0-bin-min.tgz (includes only most used interpreters)
>>>
>>> I am thinking about putting *spark(pyspark, sparkr, sql), python, jdbc,
>>> shell, markdown, angular* in minimized package.
>>> Could you give your opinion on whether these sets are enough, or some of
>>> them are ok to be excluded?
>>>
>>> Community's opinion will be helpful to make decision not only for 0.6.0
>>> but also for 0.7.0 release since we are planning to provide only minimized
>>> package from 0.7.0 release. From the 0.7.0 version, interpreters those are
>>> not included in binary package will be able to use dynamic interpreter
>>> feature [1] which is in progress under [2].
>>>
>>> Thanks,
>>> Mina
>>>
>>> [1]
>>> http://zeppelin.apache.org/docs/0.6.0-SNAPSHOT/manual/dynamicinterpreterload.html
>>> [2] https://github.com/apache/zeppelin/pull/908
>>>
>>