Thanks John for your insights. For 2., one solution we have experimented with is Spark dynamic resource allocation: the executor idle timeout effectively acts as the scale-down timer, so idle notebooks give their resources back on their own. Hope that helps. A few more notes inline below.
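Roughly the settings we have been experimenting with in the spark interpreter (the values are illustrative, and note that dynamic allocation also needs the external shuffle service running on each Mesos agent):

    spark.dynamicAllocation.enabled                true
    spark.shuffle.service.enabled                  true
    spark.dynamicAllocation.minExecutors           0
    spark.dynamicAllocation.maxExecutors           10
    spark.dynamicAllocation.executorIdleTimeout    60s

Here executorIdleTimeout is the timer: any executor idle for longer than that is released back to the cluster.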
On Mon, Apr 11, 2016 at 4:24 PM, John Omernik <j...@omernik.com> wrote:

> 1. Things launch pretty fast for me, although it depends on whether the
> docker container I am running Zeppelin in is cached on the node mesos
> wants to run it on. If not, it pulls from a local docker registry, so
> worst case it takes up to a minute to get things running if the image
> isn't cached.
>
> 2. No, if the user logs out it stays running. Ideally I would want to
> set up some sort of timer that could scale down an instance if left
> unused. I have some ideas here, but haven't put them into practice yet.
> I wanted to play with Nginx to see if I could do something there (lack
> of activity causes Nginx to shut down Zeppelin, for example). On the
> spark resources side, one thing I wanted to play with is fine-grained
> scaling with mesos, to only use resources when queries are actually
> running. Lots of tools to fit the bill here, just need to identify the
> right ones.
>
> 3. DNS resolution is handled for me with mesos-dns. Each instance has
> its own id, and the dns name auto-updates in mesos-dns based on mesos
> tasks, so I always know where Zeppelin is running.
>
> On Monday, April 11, 2016, Johnny W. <jzw.ser...@gmail.com> wrote:
>
>> John & Vincent, I am interested in the per-instance-per-user approach.
>> I have some questions about this approach:
>> --
>> 1. how long will it take to launch a Zeppelin instance (and initialize
>> the SparkContext) when a user logs in?
>> 2. will the instance be destroyed when the user logs out? if not, how
>> do you deal with the resources assigned to Zeppelin/SparkContext?
>> 3. for auto failover through marathon, how do you deal with DNS
>> resolution for clients?
>>
>> Thanks!
>> J.
>>
>> On Fri, Apr 8, 2016 at 10:09 AM, John Omernik <j...@omernik.com> wrote:
>>
>>> So for us, we are doing something similar to Vincent, except that
>>> instead of Gluster we are using MapR-FS and the NFS mount. Basically,
>>> this gives us a shared filesystem running on all nodes, with strong
>>> security (filesystem ACEs for fine-grained permissions), built-in
>>> auditing, POSIX compliance, true random read/write (as opposed to
>>> HDFS), snapshots, and cluster-to-cluster replication. There are also
>>> some neat things we are doing with volumes and volume placement. That
>>> provides our storage layer. Then we have docker for actually running
>>> Zeppelin, and since it's an instance per user, that helps organize
>>> who has access to what (still hashing out the details on that).
>>> Marathon on Mesos is how we ensure that Zeppelin is actually
>>> available, and when it comes to spark, we are just submitting to
>>> Mesos, which is right there. Since everything is on one cluster, each
>>> user has a home directory (on a volume) where I store all the configs
>>> for each instance of Zeppelin, and they can also put ad-hoc data in
>>> their home directory. Spark and Apache Drill can both query anything
>>> in MapR-FS, making it a pretty powerful combination.
>>>
>>> On Fri, Apr 8, 2016 at 6:33 AM, vincent gromakowski <
>>> vincent.gromakow...@gmail.com> wrote:
>>>
>>>> Using it for 3 months without any incident.
>>>> On 8 Apr 2016 9:09 AM, "ashish rawat" <dceash...@gmail.com> wrote:
>>>>
>>>>> Sounds great. How long have you been using glusterfs in prod, and
>>>>> have you encountered any challenges? The only difficulty for me in
>>>>> using it would be a lack of expertise to fix broken things, so I
>>>>> hope its stability isn't something to be concerned about.
>>>>>
>>>>> Regards,
>>>>> Ashish
>>>>>
>>>>> On Fri, Apr 8, 2016 at 12:20 PM, vincent gromakowski <
>>>>> vincent.gromakow...@gmail.com> wrote:
>>>>>
>>>>>> Use the fuse interface. The gluster volume is directly accessible
>>>>>> as local storage on all nodes, but performance is only 200 Mb/s.
>>>>>> More than enough for notebooks. For data, prefer tachyon/alluxio
>>>>>> on top of gluster...
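If it helps anyone trying this: the fuse approach above should come down to mounting the volume on every node and then pointing zeppelin's notebook directory at the mount. Roughly, with the server name, volume and paths made up for the example:

    mount -t glusterfs gluster1.example.com:/zeppelin-notebooks /mnt/notebooks

and then in conf/zeppelin-site.xml:

    <property>
      <name>zeppelin.notebook.dir</name>
      <value>/mnt/notebooks</value>
    </property>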
>>>>>> On 8 Apr 2016 6:35 AM, "ashish rawat" <dceash...@gmail.com> wrote:
>>>>>>
>>>>>>> Thanks Eran and Vincent.
>>>>>>> Eran, I would definitely like to try it out, since it won't add
>>>>>>> to the complexity of my deployment. I'll look at the S3
>>>>>>> implementation to figure out how complex it would be.
>>>>>>>
>>>>>>> Vincent,
>>>>>>> I haven't explored glusterfs at all. Would it also require
>>>>>>> writing an implementation of the storage interface, or can
>>>>>>> zeppelin work with it out of the box?
>>>>>>>
>>>>>>> Regards,
>>>>>>> Ashish
>>>>>>>
>>>>>>> On Wed, Apr 6, 2016 at 12:53 PM, vincent gromakowski <
>>>>>>> vincent.gromakow...@gmail.com> wrote:
>>>>>>>
>>>>>>>> For 1, marathon on mesos restarts the zeppelin daemon in case of
>>>>>>>> failure.
>>>>>>>> For 2, a glusterfs fuse mount allows sharing notebooks across
>>>>>>>> all mesos nodes.
>>>>>>>> For 3, nothing is available right now in our design, but a
>>>>>>>> manual restart in the zeppelin config page is acceptable for us.
>>>>>>>> On 6 Apr 2016 8:18 AM, "Eran Witkon" <eranwit...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Yes, this is correct.
>>>>>>>>> For HA disk: if you don't have HA storage and no access to S3,
>>>>>>>>> then AFAIK you don't have any other option at the moment.
>>>>>>>>> If you'd like to save notebooks to elastic, then I suggest you
>>>>>>>>> look at the storage interface and the implementations for git
>>>>>>>>> and s3, and implement that yourself. It does sound like an
>>>>>>>>> interesting feature.
>>>>>>>>> Best,
>>>>>>>>> Eran
>>>>>>>>> On Wed, 6 Apr 2016 at 08:57 ashish rawat <dceash...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Thanks Eran. So 3 seems to be something external to Zeppelin,
>>>>>>>>>> and hopefully 1 only means running "zeppelin-daemon.sh start"
>>>>>>>>>> on a slave machine when the master becomes inaccessible. Is
>>>>>>>>>> that correct?
>>>>>>>>>>
>>>>>>>>>> My main concern still remains on the storage front, and I
>>>>>>>>>> don't really have high-availability disks or even hdfs in my
>>>>>>>>>> setup. I have been using an elasticsearch cluster for data
>>>>>>>>>> high availability, but was hoping that zeppelin could save
>>>>>>>>>> notebooks to Elasticsearch (like Kibana does) or maybe to a
>>>>>>>>>> document store.
>>>>>>>>>>
>>>>>>>>>> Any idea if anything is planned in that direction? I don't
>>>>>>>>>> want to fall back to 'rsync'-like options.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Ashish
>>>>>>>>>>
>>>>>>>>>> On Tue, Apr 5, 2016 at 11:17 PM, Eran Witkon <
>>>>>>>>>> eranwit...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> For 1, you need to have both zeppelin web HA and zeppelin
>>>>>>>>>>> daemon HA.
>>>>>>>>>>> For 2, I guess you can use HDFS if you implement the storage
>>>>>>>>>>> interface for HDFS, but I am not sure.
>>>>>>>>>>> For 3, I mean that if you connect to an external cluster, for
>>>>>>>>>>> example a spark cluster, you need to make sure your spark
>>>>>>>>>>> cluster is HA. Otherwise you will have zeppelin running, but
>>>>>>>>>>> your notebook will fail as no spark cluster is available.
>>>>>>>>>>> HTH,
>>>>>>>>>>> Eran
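On the "implement it yourself" route: going by the existing git and s3 implementations, the storage contract looks small. A skeleton might look like the sketch below. The method set is from my reading of the source and may differ by version, so verify it against org.apache.zeppelin.notebook.repo.NotebookRepo before building against it; the one-document-per-note layout is just an illustrative choice.

    package org.apache.zeppelin.notebook.repo;

    import java.io.IOException;
    import java.util.List;

    import org.apache.zeppelin.notebook.Note;
    import org.apache.zeppelin.notebook.NoteInfo;

    // Sketch of an Elasticsearch-backed notebook store for zeppelin.
    public class ElasticsearchNotebookRepo implements NotebookRepo {

      @Override
      public List<NoteInfo> list() throws IOException {
        // TODO: search the notebook index, map each hit to a NoteInfo.
        throw new IOException("not implemented");
      }

      @Override
      public Note get(String noteId) throws IOException {
        // TODO: fetch the document for noteId and deserialize its JSON
        // body into a Note.
        throw new IOException("not implemented");
      }

      @Override
      public void save(Note note) throws IOException {
        // TODO: serialize the note to JSON and index it under its id,
        // one document per note.
      }

      @Override
      public void remove(String noteId) throws IOException {
        // TODO: delete the document for noteId.
      }
    }

Wiring it in should then be a matter of pointing zeppelin.notebook.storage at the class name in zeppelin-site.xml (again, check the property name against your version's conf template).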
>>>>>>>>>>> On Tue, 5 Apr 2016 at 20:20 ashish rawat <dceash...@gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Thanks Eran for your reply.
>>>>>>>>>>>> For 1) I am assuming that it would be similar to HA of any
>>>>>>>>>>>> other web application, i.e. running multiple instances and
>>>>>>>>>>>> switching to the backup server when the master is down. Is
>>>>>>>>>>>> that not the case?
>>>>>>>>>>>> For 2) is it also possible to save them on hdfs?
>>>>>>>>>>>> Can you please explain 3? Are you referring to the
>>>>>>>>>>>> interpreter config? If I am using the Spark interpreter and
>>>>>>>>>>>> submitting jobs to it, and the zeppelin master node goes
>>>>>>>>>>>> down, then what would stop the slave node from pointing to
>>>>>>>>>>>> the same cluster and submitting jobs?
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Ashish
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Apr 5, 2016 at 10:08 PM, Eran Witkon <
>>>>>>>>>>>> eranwit...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I would say you need to account for these things:
>>>>>>>>>>>>> 1) availability of the zeppelin daemon
>>>>>>>>>>>>> 2) availability of the notebook files
>>>>>>>>>>>>> 3) availability of the interpreters used.
>>>>>>>>>>>>>
>>>>>>>>>>>>> For 1, I don't know of an out-of-the-box solution.
>>>>>>>>>>>>> For 2, any HA storage will do: s3 or any HA external
>>>>>>>>>>>>> mounted disk.
>>>>>>>>>>>>> For 3, it is up to the interpreter and your big data HA
>>>>>>>>>>>>> solution.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, 5 Apr 2016 at 19:29 ashish rawat
>>>>>>>>>>>>> <dceash...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Is there a suggested architecture for running Zeppelin in
>>>>>>>>>>>>>> high-availability mode? The only option I could find was
>>>>>>>>>>>>>> saving notebooks to S3. Are there any options if one is
>>>>>>>>>>>>>> not using AWS?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>> Ashish
>
> --
> Sent from my iThing
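And to close the loop on daemon availability, since marathon keeps coming up as the answer there: a minimal app definition is enough to get the restart-on-failure behaviour. Something along these lines, where the image name, resources and health-check path are placeholders (zeppelin listens on 8080 by default):

    {
      "id": "/zeppelin",
      "instances": 1,
      "cpus": 2,
      "mem": 8192,
      "container": {
        "type": "DOCKER",
        "docker": {
          "image": "registry.example.com/zeppelin:latest",
          "network": "BRIDGE",
          "portMappings": [
            { "containerPort": 8080, "hostPort": 0 }
          ]
        }
      },
      "healthChecks": [
        { "protocol": "HTTP", "path": "/", "portIndex": 0 }
      ]
    }

With mesos-dns running alongside, the task should then be reachable at zeppelin.marathon.mesos regardless of which node it lands on, which covers the DNS question as well.

J.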