Re: HA for Zeppelin

ashish rawat Fri, 08 Apr 2016 00:10:10 -0700

Sounds great. How long have you been using glusterfs in prod, and have you
encountered any challenges? The only difficulty for me in using it would be
a lack of expertise to fix broken things, so I hope its stability isn't
something to be concerned about.


Regards,
Ashish

On Fri, Apr 8, 2016 at 12:20 PM, vincent gromakowski <
vincent.gromakow...@gmail.com> wrote:

> Use the fuse interface. The Gluster volume is directly accessible as local
> storage on all nodes, but performance is only 200 Mb/s. That is more than
> enough for notebooks. For data, prefer tachyon/alluxio on top of gluster...
> On 8 Apr 2016 6:35 AM, "ashish rawat" <dceash...@gmail.com> wrote:
>
>> Thanks Eran and Vincent.
>> Eran, I would definitely like to try it out, since it won't add to the
>> complexity of my deployment. I will look at the S3 implementation to figure
>> out how complex it would be.
>>
>> Vincent,
>> I haven't explored glusterfs at all. Would it also require writing an
>> implementation of the storage interface, or can zeppelin work with it out
>> of the box?
>>
>> Regards,
>> Ashish
>>
>> On Wed, Apr 6, 2016 at 12:53 PM, vincent gromakowski <
>> vincent.gromakow...@gmail.com> wrote:
>>
>>> For 1, marathon on mesos restarts the zeppelin daemon in case of failure.
>>> For 2, a glusterfs fuse mount allows sharing notebooks on all mesos nodes.
>>> For 3, nothing is available right now in our design, but a manual restart
>>> in the zeppelin config page is acceptable for us.
>>> On 6 Apr 2016 8:18 AM, "Eran Witkon" <eranwit...@gmail.com> wrote:
>>>
>>>> Yes this is correct.
>>>> For HA disk, if you don't have HA storage and have no access to S3, then
>>>> AFAIK you don't have any other option at the moment.
>>>> If you would like to save notebooks to elastic, then I suggest you look at
>>>> the storage interface and its implementations for git and s3 and implement
>>>> that yourself. It does sound like an interesting feature.
>>>> Best
>>>> Eran
>>>> On Wed, 6 Apr 2016 at 08:57 ashish rawat <dceash...@gmail.com> wrote:
>>>>
>>>>> Thanks Eran. So 3 seems to be something external to Zeppelin, and
>>>>> hopefully 1 only means running "zeppelin-daemon.sh start" on a slave
>>>>> machine when the master becomes inaccessible. Is that correct?
>>>>>
>>>>> My main concern still remains on the storage front. I don't really
>>>>> have high availability disks or even hdfs in my setup. I have been using
>>>>> an Elasticsearch cluster for data high availability, and was hoping that
>>>>> zeppelin could save notebooks to Elasticsearch (like kibana) or maybe a
>>>>> document store.
>>>>>
>>>>> Any idea if anything is planned in that direction? I don't want to
>>>>> fall back to 'rsync'-like options.
>>>>>
>>>>> Regards,
>>>>> Ashish
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Apr 5, 2016 at 11:17 PM, Eran Witkon <eranwit...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> For 1 you need to have both zeppelin web HA and zeppelin daemon HA.
>>>>>> For 2 I guess you can use HDFS if you implement the storage interface
>>>>>> for HDFS, but I am not sure.
>>>>>> For 3 I mean that if you connect to an external cluster, for example a
>>>>>> spark cluster, you need to make sure your spark cluster is HA. Otherwise
>>>>>> you will have zeppelin running but your notebook will fail, as no spark
>>>>>> cluster is available.
>>>>>> HTH
>>>>>> Eran
>>>>>>
>>>>>>
>>>>>> On Tue, 5 Apr 2016 at 20:20 ashish rawat <dceash...@gmail.com> wrote:
>>>>>>
>>>>>>> Thanks Eran for your reply.
>>>>>>> For 1) I am assuming that it would be similar to HA of any other web
>>>>>>> application, i.e. running multiple instances and switching to the backup
>>>>>>> server when the master is down. Is that not the case?
>>>>>>> For 2) is it also possible to save it on hdfs?
>>>>>>> Can you please explain 3? Are you referring to interpreter config?
>>>>>>> If I am using the Spark interpreter and submitting jobs to it, and the
>>>>>>> zeppelin master node goes down, then what could be the problem with the
>>>>>>> slave node pointing to the same cluster and submitting jobs?
>>>>>>>
>>>>>>> Regards,
>>>>>>> Ashish
>>>>>>>
>>>>>>> On Tue, Apr 5, 2016 at 10:08 PM, Eran Witkon <eranwit...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I would say you need to account for these things:
>>>>>>>> 1) availability of the zeppelin daemon
>>>>>>>> 2) availability of the notebook files
>>>>>>>> 3) availability of the interpreters used.
>>>>>>>>
>>>>>>>> For 1 I don't know of an out-of-the-box solution.
>>>>>>>> For 2 any HA storage will do: S3 or any HA externally mounted disk.
>>>>>>>> For 3 it is up to the interpreter and your big data HA solution.
>>>>>>>>
>>>>>>>> On Tue, 5 Apr 2016 at 19:29 ashish rawat <dceash...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Is there a suggested architecture to run Zeppelin in high
>>>>>>>>> availability mode? The only option I could find was saving notebooks
>>>>>>>>> to S3. Are there any options if one is not using AWS?
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Ashish
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>
>>
