I think making 'zeppelin-bin-netinst' is a great idea and makes more
sense than voting on which interpreters should be included in zeppelin-bin-min.


2016-06-18 1:15 GMT+09:00 moon soo Lee <m...@apache.org>:

> In case of no internet access, how about
>
> a. download 'zeppelin-bin-netinst' and run 'bin/install-interpreter.sh',
> and then copy the package to production env.
> b. download 'zeppelin-bin-all' and copy the package to production env.
>
> ?
>
> Thanks,
> moon
>
>
> On Fri, Jun 17, 2016 at 9:10 AM Mohit Jaggi <mohitja...@gmail.com> wrote:
>
>> Many production environments have no internet access. A script like this
>> can be useful to some, but it should not replace the proposed min binary.
>>
>> On Jun 17, 2016, at 9:20 PM, moon soo Lee <m...@apache.org> wrote:
>>
>> Hi,
>>
>> Thanks for bringing up this discussion.
>> It's a great idea to minimize the binary package size.
>>
>> Can we set a policy to decide which interpreters go into
>> 'zeppelin-bin-min' and which do not?
>>
>> One alternative is, instead of making 'zeppelin-bin-min', we can make
>> 'zeppelin-bin-netinst'.
>> We can provide a shell script such as 'bin/install-interpreter.sh', and
>> the script will download interpreters and their dependencies from a Maven
>> repository and store them under the /interpreter dir. By leveraging
>> DependencyResolver[1], I think we can build this feature in a couple of
>> hours.
>>
>> Only the Spark interpreter cannot be installed in this simple way, since it
>> requires some Python and R packages under the /interpreter dir that are not
>> available in a Maven repository, so it'll need special treatment. All
>> other interpreters can be installed in the simple way.
>>
>> Then the 'zeppelin-bin-netinst' version can have a minimal package size and
>> still give an easy way to install all the interpreters.
>> Also, 'bin/install-interpreter.sh' will still be useful for building an
>> offline package even once we have the dynamic interpreter loading feature [2].
>>
>> What do you think?
>>
>> [1]
>> https://github.com/apache/zeppelin/blob/master/zeppelin-interpreter/src/main/java/org/apache/zeppelin/dep/DependencyResolver.java
>> [2] https://issues.apache.org/jira/browse/ZEPPELIN-598
>>
>>
>> On Fri, Jun 17, 2016 at 1:02 AM mina lee <mina...@apache.org> wrote:
>>
>>> Hi all!
>>>
>>> Zeppelin just started the release process. Prior to creating a release
>>> candidate, I want to ask users' opinions about how you want it to be packaged.
>>>
>>> For the last release (0.5.6), we released one binary package which
>>> includes all interpreters.
>>> The concern with providing one type of binary package is that the package
>>> size will be quite big (~600MB).
>>> So I am planning to provide two binary packages:
>>>   - zeppelin-0.6.0-bin-all.tgz (includes all interpreters)
>>>   - zeppelin-0.6.0-bin-min.tgz (includes only most used interpreters)
>>>
>>> I am thinking about putting *spark(pyspark, sparkr, sql), python, jdbc,
>>> shell, markdown, angular* in the minimized package.
>>> Could you give your opinion on whether this set is enough, or whether
>>> some of them are OK to exclude?
>>>
>>> The community's opinion will be helpful for making the decision not only
>>> for 0.6.0 but also for the 0.7.0 release, since we are planning to provide
>>> only the minimized package from the 0.7.0 release. From version 0.7.0,
>>> interpreters that are not included in the binary package will be able to
>>> use the dynamic interpreter feature [1], which is in progress under [2].
>>>
>>> Thanks,
>>> Mina
>>>
>>> [1]
>>> http://zeppelin.apache.org/docs/0.6.0-SNAPSHOT/manual/dynamicinterpreterload.html
>>> [2] https://github.com/apache/zeppelin/pull/908
>>>
>>
