Re: Zeppelin distributed architecture design

2018-08-24 Thread liuxun
hi,

I have submitted the first module of the Zeppelin cluster upgrade. Please help 
me review the code, thank you!
https://github.com/apache/zeppelin/pull/3156 


I updated the Atomix library section of the system design document; 
please follow the link below to read it.
https://docs.google.com/document/d/1a8QLSyR3M5AhlG1GIYuDTj6bwazeuVDKCRRBm-Qa3Bw/edit#heading=h.qbcgqhd0wwh8
 




> On Aug 11, 2018, at 10:36 AM, liuxun wrote:
> 
> hi,
> 
> After 2 weeks of development, I have finished upgrading from Copycat to the 
> Atomix library.
> The extra workload came from resolving Netty package conflicts. The 
> Atomix-based version is now running on our internal company clusters.
> 
> Atomix depends on version 4.1.27.Final of the Netty JAR.
> If you put this newer Netty JAR directly in ./zeppelin/lib or under the 
> ./zeppelin/interpreter path, it conflicts with the Netty version that Spark 
> uses, causing the spark-interpreter to fail.
> The Atomix Netty JAR therefore has to be isolated from the Netty package on 
> the classpath, in both zeppelin-server and the interpreter process, by 
> loading it through a custom classloader.
> 
> I updated the atomix algorithm library module in the system design 
> documentation, please click on the link below to browse.
> 
> Atomix Raft algorithm library
> https://docs.google.com/document/d/1a8QLSyR3M5AhlG1GIYuDTj6bwazeuVDKCRRBm-Qa3Bw/edit#heading=h.qbcgqhd0wwh8
>  
> 
> 
> I will submit the new code as a pull request; please help me merge it, thank you.
> 
> Thanks,
> Xun Liu
> 
>> On Jul 24, 2018, at 12:57 PM, liuxun <neliu...@163.com> wrote:
>> 
>> @Jongyoul Lee:
>> Thank you for your attention.
>> 
>> Indeed, as you said, the `Copycat` project has been closed and has been 
>> migrated to `https://github.com/atomix/atomix`.
>> 
>> I also considered this issue during development.
>> The main reason was that `Copycat` was enough to implement Raft at the 
>> time, and I did not think too far ahead.
>> 
>> Today, I took a look at the Atomix documentation, 
>> https://atomix.io/docs/latest/user-manual/ , 
>> which describes many features, such as broadcasting messages in the cluster 
>> and detecting cluster events.
>> From the perspective of Zeppelin's long-term development, it is better to 
>> use Atomix.
>> So I will switch the Raft library to Atomix; the change is not difficult.
>> 
>> Struggle for zeppelin!!! :-)
>> 
>> 
>>> On Jul 24, 2018, at 9:35 AM, Jongyoul Lee wrote:
>>> 
>>> First of all, thank you for your effort and contribution.
>>> 
>>> I read it carefully today, and personally, it's a very nice feature and
>>> idea.
>>> 
>>> Let's discuss it and improve more concretely. I also left comments on the
>>> doc.
>>> 
>>> And I have a simple question.
>>> 
>>> `Copycat`, which you used to implement it, has been deprecated by its
>>> owner[1] and moved under https://github.com/atomix/atomix/. I'm worried
>>> about that. Do you have any reason to use this library? It's even a
>>> SNAPSHOT version.
>>> 
>>> Regards,
>>> JL
>>> 
>>> [1]: https://github.com/atomix/copycat 
>>> 
>>> On Sat, Jul 21, 2018 at 2:07 AM, liuxun wrote:
>>> 
>>>> HI:
>>>> 
>>>> In order to more intuitively express the actual use of distributed
>>>> zeppelin clusters.
>>>> I updated this design document, starting with the 16th page of the
>>>> document, adding 2 GIF animations showing the operation record screen of
>>>> the zeppelin cluster we are using now.
>>>> https://docs.google.com/document/d/1a8QLSyR3M5AhlG1GIYuDTj6bwazeuVDKCRRBm-Qa3Bw/edit#
>>>> 
>>>> Distributed clustered zeppelin is already in use at our company, and the
>>>> recorded screens are all real.
>>>> The first recorded screens GIF shows the following
>>>> Create a cluster of three zeppelin servers
>>>> Add 234, 235, 236 to the zeppelin.cluster.addr attribute in
>>>> zeppelin-site.xml to create a cluster
>>>> Start these 3 servers at the same time
>>>> Open the web pages of these 3 servers and prepare for the notebook
>>>> operation.
>>>> 
>>>> 
>>>> The second recorded screens GIF shows the following
>>>> Create an interpreter process in the cluster
>>>> Create a notebook on host234 and execute it, This action will create an
>>>> interpreter process

Re: Zeppelin distributed architecture design

2018-08-10 Thread liuxun
hi,

After 2 weeks of development, I have finished upgrading from Copycat to the 
Atomix library.
The extra workload came from resolving Netty package conflicts. The 
Atomix-based version is now running on our internal company clusters.

Atomix depends on version 4.1.27.Final of the Netty JAR.
If you put this newer Netty JAR directly in ./zeppelin/lib or under the 
./zeppelin/interpreter path, it conflicts with the Netty version that Spark 
uses, causing the spark-interpreter to fail.
The Atomix Netty JAR therefore has to be isolated from the Netty package on 
the classpath, in both zeppelin-server and the interpreter process, by loading 
it through a custom classloader.
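
The classloader isolation described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the pull request: the class name `ChildFirstClassLoader` and the `isolatedPrefixes` parameter are made up for this example. The loader looks in its own JARs first for classes under the given package prefixes and delegates everything else to the parent, so another Netty version on the parent classpath is left untouched.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical sketch: a child-first classloader for a few isolated packages.
final class ChildFirstClassLoader extends URLClassLoader {
    // Package prefixes that must be served from this loader's own JARs.
    private final String[] isolatedPrefixes;

    ChildFirstClassLoader(URL[] jars, ClassLoader parent, String... isolatedPrefixes) {
        super(jars, parent);
        this.isolatedPrefixes = isolatedPrefixes.clone();
    }

    private boolean isIsolated(String className) {
        for (String prefix : isolatedPrefixes) {
            if (className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> loaded = findLoadedClass(name);
            if (loaded == null && isIsolated(name)) {
                try {
                    // Child-first: search this loader's JARs before asking the parent.
                    loaded = findClass(name);
                } catch (ClassNotFoundException notInOwnJars) {
                    // Not in our JARs: fall back to normal parent-first delegation.
                }
            }
            if (loaded == null) {
                return super.loadClass(name, resolve);
            }
            if (resolve) {
                resolveClass(loaded);
            }
            return loaded;
        }
    }
}
```

In such a setup, the classes that call into the isolated Netty would need to be loaded through the same loader (for example by passing both `io.netty.` and `io.atomix.` as prefixes); otherwise their Netty references would still resolve through the parent classpath.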

I updated the atomix algorithm library module in the system design 
documentation, please click on the link below to browse.

Atomix Raft algorithm library
https://docs.google.com/document/d/1a8QLSyR3M5AhlG1GIYuDTj6bwazeuVDKCRRBm-Qa3Bw/edit#heading=h.qbcgqhd0wwh8
 


I will submit the new code as a pull request; please help me merge it, thank you.

Thanks,
Xun Liu

> On Jul 24, 2018, at 12:57 PM, liuxun wrote:
> 
> @Jongyoul Lee:
> Thank you for your attention.
> 
> Indeed, as you said, the `Copycat` project has been closed and has been 
> migrated to `https://github.com/atomix/atomix`.
> 
> I also considered this issue during development.
> The main reason was that it was enough to realize Raft using `Copycat` at the 
> time, and it was not considered too long.
> 
> Today, I took a look at the documentation of atomix, 
> https://atomix.io/docs/latest/user-manual/ 
>  , 
> which has a lot of features, such as broadcasting messages in the cluster, 
> detecting cluster events... ,
> From the perspective of zeppelin's long-term development, it is better to use 
> atomix.
> So, I will switch the Raft protocol algorithm library to atomix, which is not 
> difficult to modify.
> 
> Struggle for zeppelin!!! :-)
> 
> 
>> On Jul 24, 2018, at 9:35 AM, Jongyoul Lee wrote:
>> 
>> First of all, thank you for your effort and contribution.
>> 
>> I read it carefully today, and personally, it's a very nice feature and
>> idea.
>> 
>> Let's discuss it and improve more concretely. I also left comments on the
>> doc.
>> 
>> And I have a simple question.
>> 
>> `Copycat`, which you used to implement it, is deprecated by owner[1] and
>> moved under https://github.com/atomix/atomix/. I'm afraid of it. Do you
>> have any reason to use this library? It's even SNAPSHOT version.
>> 
>> Regards,
>> JL
>> 
>> [1]: https://github.com/atomix/copycat 
>> 
>> On Sat, Jul 21, 2018 at 2:07 AM, liuxun wrote:
>> 
>>> HI:
>>> 
>>> In order to more intuitively express the actual use of distributed
>>> zeppelin clusters.
>>> I updated this design document, starting with the 16th page of the
>>> document, adding 2 GIF animations showing the operation record screen of
>>> the zeppelin cluster we are using now.
>>> https://docs.google.com/document/d/1a8QLSyR3M5AhlG1GIYuDTj6bwazeuVDKCRRBm-Qa3Bw/edit#
>>> 
>>> Distributed clustered zeppelin is already in use at our company, and the
>>> recorded screens are all real.
>>> The first recorded screens GIF shows the following
>>> Create a cluster of three zeppelin servers
>>> Add 234, 235, 236 to the zeppelin.cluster.addr attribute in
>>> zeppelin-site.xml to create a cluster
>>> Start these 3 servers at the same time
>>> Open the web pages of these 3 servers and prepare for the notebook
>>> operation.
>>> 
>>> 
>>> The second recorded screens GIF shows the following
>>> Create an interpreter process in the cluster
>>> Create a notebook on host234 and execute it, This action will create an
>>> interpreter process in the server with free resources in the cluster.
>>> You can then continue editing this notebook on host235 and execute it, You
>>> can return results immediately without waiting for the time to create an
>>> interpreter process.
>>> Again, you can continue to edit this notebook on host236. And execute it,
>>> you can return results immediately without waiting for the time to create
>>> the interpreter process
>>> The same notebook will reuse the first created interpreter process, so you
>>> can get the execution result immediately on any server.
>>> By looking at the background server process, you will find that host234,
>>> host235, and host235 use the same interpreter process for the same notebook.
>>> 
>>> Originally, I wanted to record the interpreter process exception. The
>>> cluste

Re: Zeppelin distributed architecture design

2018-07-24 Thread Jongyoul Lee
Thank you.

I fully agree with you that we need a framework to support distributed
version. IMHO, we cannot afford to develop our own. I'll dig into atomix as
well.



On Tue, Jul 24, 2018 at 1:57 PM, liuxun  wrote:

> @Jongyoul Lee:
> Thank you for your attention.
>
> Indeed, as you said, the `Copycat` project has been closed and has been
> migrated to `https://github.com/atomix/atomix`.
>
> I also considered this issue during development.
> The main reason was that it was enough to realize Raft using `Copycat` at
> the time, and it was not considered too long.
>
> Today, I took a look at the documentation of atomix,
> https://atomix.io/docs/latest/user-manual/ ,
> which has a lot of features, such as broadcasting messages in the cluster,
> detecting cluster events... ,
> From the perspective of zeppelin's long-term development, it is better to
> use atomix.
> So, I will switch the Raft protocol algorithm library to atomix, which is
> not difficult to modify.
>
> Struggle for zeppelin!!! :-)
>
>
> On Jul 24, 2018, at 9:35 AM, Jongyoul Lee wrote:
>
> First of all, thank you for your effort and contribution.
>
> I read it carefully today, and personally, it's a very nice feature and
> idea.
>
> Let's discuss it and improve more concretely. I also left comments on the
> doc.
>
> And I have a simple question.
>
> `Copycat`, which you used to implement it, is deprecated by owner[1] and
> moved under https://github.com/atomix/atomix/. I'm afraid of it. Do you
> have any reason to use this library? It's even SNAPSHOT version.
>
> Regards,
> JL
>
> [1]: https://github.com/atomix/copycat
>
> On Sat, Jul 21, 2018 at 2:07 AM, liuxun  wrote:
>
> HI:
>
> In order to more intuitively express the actual use of distributed
> zeppelin clusters.
> I updated this design document, starting with the 16th page of the
> document, adding 2 GIF animations showing the operation record screen of
> the zeppelin cluster we are using now.
> https://docs.google.com/document/d/1a8QLSyR3M5AhlG1GIYuDTj6bwazeuVDKCRRBm-Qa3Bw/edit#
>
> Distributed clustered zeppelin is already in use at our company, and the
> recorded screens are all real.
> The first recorded screens GIF shows the following
> Create a cluster of three zeppelin servers
> Add 234, 235, 236 to the zeppelin.cluster.addr attribute in
> zeppelin-site.xml to create a cluster
> Start these 3 servers at the same time
> Open the web pages of these 3 servers and prepare for the notebook
> operation.
>
>
> The second recorded screens GIF shows the following
> Create an interpreter process in the cluster
> Create a notebook on host234 and execute it, This action will create an
> interpreter process in the server with free resources in the cluster.
> You can then continue editing this notebook on host235 and execute it, You
> can return results immediately without waiting for the time to create an
> interpreter process.
> Again, you can continue to edit this notebook on host236. And execute it,
> you can return results immediately without waiting for the time to create
> the interpreter process
> The same notebook will reuse the first created interpreter process, so you
> can get the execution result immediately on any server.
> By looking at the background server process, you will find that host234,
> host235, and host235 use the same interpreter process for the same
> notebook.
>
> Originally, I wanted to record the interpreter process exception. The
> cluster re-created the screenshot of the interpreter process in the idle
> server, but I am too tired now.
> There is time to record later.
>
>
> On Jul 19, 2018, at 7:36 AM, Ruslan Dautkhanov wrote:
>
> Thank you luxun,
>
> I left a couple of comments in that google document.
>
> --
> Ruslan Dautkhanov
>
>
> On Tue, Jul 17, 2018 at 11:30 PM liuxun <neliu...@163.com> wrote:
>
> hi,Ruslan Dautkhanov
>
> Thank you very much for your question. according to your advice, I added
>
> 3 schematics to illustrate.
>
> 1. Distributed Zeppelin Deployment architecture diagram.
> 2. Distributed zeppelin Server fault tolerance diagram.
> 3. Distributed zeppelin Server & intp process fault tolerance diagram.
>
>
> The email attachment exceeded the size limit, so I reorganized the
>
> document and updated it with Google Docs.
>
> https://docs.google.com/document/d/1a8QLSyR3M5AhlG1GIYuDTj6bwazeuVDKCRRBm-Qa3Bw/edit?usp=sharing
>
>
>
> On Jul 18, 2018, at 1:03 PM, liuxun <neliu...@163.com> wrote:
>
>
> hi,Ruslan Dautkhanov
>
> Thank you very much for your question. according to your advice, I
>
> added 3 schematics to illustrate.
>
> 1. Zeppelin Cluster architecture diagram.
> 2. Distributed zeppelin Server fault tolerance diagram.
> 3. Distributed zeppelin Server & intp process fault tolerance diagram.
>
> Later, I will merge the schematic i

Re: Zeppelin distributed architecture design

2018-07-23 Thread liuxun
@Jongyoul Lee:
Thank you for your attention.

Indeed, as you said, the `Copycat` project has been closed and has been 
migrated to `https://github.com/atomix/atomix`.

I also considered this issue during development.
The main reason was that `Copycat` was enough to implement Raft at the 
time, and I did not think too far ahead.

Today, I took a look at the Atomix documentation, 
https://atomix.io/docs/latest/user-manual/ , 
which describes many features, such as broadcasting messages in the cluster 
and detecting cluster events.
From the perspective of Zeppelin's long-term development, it is better to use 
Atomix.
So I will switch the Raft library to Atomix; the change is not difficult.

Struggle for zeppelin!!! :-)


> On Jul 24, 2018, at 9:35 AM, Jongyoul Lee wrote:
> 
> First of all, thank you for your effort and contribution.
> 
> I read it carefully today, and personally, it's a very nice feature and
> idea.
> 
> Let's discuss it and improve more concretely. I also left comments on the
> doc.
> 
> And I have a simple question.
> 
> `Copycat`, which you used to implement it, is deprecated by owner[1] and
> moved under https://github.com/atomix/atomix/. I'm afraid of it. Do you
> have any reason to use this library? It's even SNAPSHOT version.
> 
> Regards,
> JL
> 
> [1]: https://github.com/atomix/copycat
> 
> On Sat, Jul 21, 2018 at 2:07 AM, liuxun  wrote:
> 
>> HI:
>> 
>> In order to more intuitively express the actual use of distributed
>> zeppelin clusters.
>> I updated this design document, starting with the 16th page of the
>> document, adding 2 GIF animations showing the operation record screen of
>> the zeppelin cluster we are using now.
>> https://docs.google.com/document/d/1a8QLSyR3M5AhlG1GIYuDTj6bwazeuVDKCRRBm-Qa3Bw/edit#
>> 
>> Distributed clustered zeppelin is already in use at our company, and the
>> recorded screens are all real.
>> The first recorded screens GIF shows the following
>> Create a cluster of three zeppelin servers
>> Add 234, 235, 236 to the zeppelin.cluster.addr attribute in
>> zeppelin-site.xml to create a cluster
>> Start these 3 servers at the same time
>> Open the web pages of these 3 servers and prepare for the notebook
>> operation.
>> 
>> 
>> The second recorded screens GIF shows the following
>> Create an interpreter process in the cluster
>> Create a notebook on host234 and execute it, This action will create an
>> interpreter process in the server with free resources in the cluster.
>> You can then continue editing this notebook on host235 and execute it, You
>> can return results immediately without waiting for the time to create an
>> interpreter process.
>> Again, you can continue to edit this notebook on host236. And execute it,
>> you can return results immediately without waiting for the time to create
>> the interpreter process
>> The same notebook will reuse the first created interpreter process, so you
>> can get the execution result immediately on any server.
>> By looking at the background server process, you will find that host234,
>> host235, and host235 use the same interpreter process for the same notebook.
>> 
>> Originally, I wanted to record the interpreter process exception. The
>> cluster re-created the screenshot of the interpreter process in the idle
>> server, but I am too tired now.
>> There is time to record later.
>> 
>> 
>>> On Jul 19, 2018, at 7:36 AM, Ruslan Dautkhanov wrote:
>>> 
>>> Thank you luxun,
>>> 
>>> I left a couple of comments in that google document.
>>> 
>>> --
>>> Ruslan Dautkhanov
>>> 
>>> 
>>> On Tue, Jul 17, 2018 at 11:30 PM liuxun <neliu...@163.com> wrote:
>>> hi,Ruslan Dautkhanov
>>> 
>>> Thank you very much for your question. according to your advice, I added
>> 3 schematics to illustrate.
>>> 1. Distributed Zeppelin Deployment architecture diagram.
>>> 2. Distributed zeppelin Server fault tolerance diagram.
>>> 3. Distributed zeppelin Server & intp process fault tolerance diagram.
>>> 
>>> 
>>> The email attachment exceeded the size limit, so I reorganized the
>> document and updated it with Google Docs.
>>> https://docs.google.com/document/d/1a8QLSyR3M5AhlG1GIYuDTj6bwazeuVDKCRRBm-Qa3Bw/edit?usp=sharing
>>> 
>>> 
>>>> On Jul 18, 2018, at 1:03 PM, liuxun <neliu...@163.com> wrote:
>>>> 
>>>> hi,Ruslan Dautkhanov
>>>> 
>>>> Thank you very much for your question. according to your advice, I
>>>> added 3 schematics to illustrate.
>>>> 1. Zeppelin Cluster architecture diagram.
>>>> 2. Distributed zeppelin Server fault tolerance diagram.
>>>> 3. Distributed zeppelin Server & intp process fault tolerance diagram.
>>>> 
>>>> Later, I will merge the schematic into the system design document.
>>>> 
> 

Re: Zeppelin distributed architecture design

2018-07-23 Thread Jongyoul Lee
First of all, thank you for your effort and contribution.

I read it carefully today, and personally, it's a very nice feature and
idea.

Let's discuss it and improve more concretely. I also left comments on the
doc.

And I have a simple question.

`Copycat`, which you used to implement it, has been deprecated by its
owner[1] and moved under https://github.com/atomix/atomix/. I'm worried about
that. Do you have any reason to use this library? It's even a SNAPSHOT version.

Regards,
JL

[1]: https://github.com/atomix/copycat

On Sat, Jul 21, 2018 at 2:07 AM, liuxun  wrote:

> HI:
>
> In order to more intuitively express the actual use of distributed
> zeppelin clusters.
> I updated this design document, starting with the 16th page of the
> document, adding 2 GIF animations showing the operation record screen of
> the zeppelin cluster we are using now.
> https://docs.google.com/document/d/1a8QLSyR3M5AhlG1GIYuDTj6bwazeuVDKCRRBm-Qa3Bw/edit#
>
> Distributed clustered zeppelin is already in use at our company, and the
> recorded screens are all real.
> The first recorded screens GIF shows the following
> Create a cluster of three zeppelin servers
> Add 234, 235, 236 to the zeppelin.cluster.addr attribute in
> zeppelin-site.xml to create a cluster
> Start these 3 servers at the same time
> Open the web pages of these 3 servers and prepare for the notebook
> operation.
>
>
> The second recorded screens GIF shows the following
> Create an interpreter process in the cluster
> Create a notebook on host234 and execute it, This action will create an
> interpreter process in the server with free resources in the cluster.
> You can then continue editing this notebook on host235 and execute it, You
> can return results immediately without waiting for the time to create an
> interpreter process.
> Again, you can continue to edit this notebook on host236. And execute it,
> you can return results immediately without waiting for the time to create
> the interpreter process
> The same notebook will reuse the first created interpreter process, so you
> can get the execution result immediately on any server.
> By looking at the background server process, you will find that host234,
> host235, and host235 use the same interpreter process for the same notebook.
>
> Originally, I wanted to record the interpreter process exception. The
> cluster re-created the screenshot of the interpreter process in the idle
> server, but I am too tired now.
> There is time to record later.
>
>
> > On Jul 19, 2018, at 7:36 AM, Ruslan Dautkhanov wrote:
> >
> > Thank you luxun,
> >
> > I left a couple of comments in that google document.
> >
> > --
> > Ruslan Dautkhanov
> >
> >
> > On Tue, Jul 17, 2018 at 11:30 PM liuxun <neliu...@163.com> wrote:
> > hi,Ruslan Dautkhanov
> >
> > Thank you very much for your question. according to your advice, I added
> 3 schematics to illustrate.
> > 1. Distributed Zeppelin Deployment architecture diagram.
> > 2. Distributed zeppelin Server fault tolerance diagram.
> > 3. Distributed zeppelin Server & intp process fault tolerance diagram.
> >
> >
> > The email attachment exceeded the size limit, so I reorganized the
> document and updated it with Google Docs.
> > https://docs.google.com/document/d/1a8QLSyR3M5AhlG1GIYuDTj6bwazeuVDKCRRBm-Qa3Bw/edit?usp=sharing
> >
> >
> >> On Jul 18, 2018, at 1:03 PM, liuxun <neliu...@163.com> wrote:
> >>
> >> hi,Ruslan Dautkhanov
> >>
> >> Thank you very much for your question. according to your advice, I
> added 3 schematics to illustrate.
> >> 1. Zeppelin Cluster architecture diagram.
> >> 2. Distributed zeppelin Server fault tolerance diagram.
> >> 3. Distributed zeppelin Server & intp process fault tolerance diagram.
> >>
> >> Later, I will merge the schematic into the system design document.
> >>
> >> 
> >>
> >>
> >> 
> >>
> >>
> >>
> >> 
> >>
> >>
> >>
> >>> On Jul 18, 2018, at 1:16 AM, Ruslan Dautkhanov <dautkha...@gmail.com> wrote:
> >>>
> >>> Nice.
> >>>
> >>> Thanks for sharing.
> >>>
> >>> Can you explain how are users routed into a particular zeppelin server
> >>> instance? I've seen nginx on top of them, but I don't think the
> document
> >>> covers details? If one zeppelin server goes down or unhealthy, is nginx
> >>> supposed to detect (if so, how?) that and reroute users to a survived
> >>> instance?
> >>>
> >>> Thanks,
> >>> Ruslan Dautkhanov
> >>>
> >>>
> >>> On Tue, Jul 17, 2018 at 2:46 AM liuxun <neliu...@163.com> wrote:
> >>>
> >>>> hi:
> >>>>
> >>>> Our company installed and deployed a lot of zeppelin for data analysis.
> >>>> The single server version of zeppelin could not meet our application
> >>>> scenarios, so we transformed zeppelin into a clustered service that
> >>>> supports distributed deployment, Have a unified entrance, high
> >>>> availability, and High server resource usage.  the email attach

Re: Zeppelin distributed architecture design

2018-07-20 Thread liuxun
HI:

To show more intuitively how the distributed Zeppelin cluster is actually used, 
I updated the design document. Starting on page 16, I added 2 GIF animations: 
screen recordings of the Zeppelin cluster we are running now.
https://docs.google.com/document/d/1a8QLSyR3M5AhlG1GIYuDTj6bwazeuVDKCRRBm-Qa3Bw/edit#
 


The distributed, clustered Zeppelin is already in use at our company, and the 
recordings show the real system.
The first GIF shows how to create a cluster of three Zeppelin servers:
1. Add servers 234, 235, and 236 to the zeppelin.cluster.addr property in 
   zeppelin-site.xml to define the cluster.
2. Start these 3 servers at the same time.
3. Open the web pages of these 3 servers, ready for notebook operations.
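
As a rough illustration of the zeppelin.cluster.addr setting mentioned above, the sketch below parses a server list and computes the Raft majority. The thread does not show the actual value format, so the comma-separated `host:port` form, the port number used in the usage note, and the names `ClusterAddrParser`, `parse`, and `quorum` are all assumptions made for this example.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the "host:port,host:port" format is an assumption.
final class ClusterAddrParser {
    private ClusterAddrParser() {}

    // Splits a comma-separated "host:port" list into unresolved socket addresses.
    static List<InetSocketAddress> parse(String value) {
        List<InetSocketAddress> nodes = new ArrayList<>();
        for (String entry : value.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate stray commas
            }
            int colon = trimmed.lastIndexOf(':');
            if (colon <= 0 || colon == trimmed.length() - 1) {
                throw new IllegalArgumentException("expected host:port, got: " + trimmed);
            }
            String host = trimmed.substring(0, colon);
            int port = Integer.parseInt(trimmed.substring(colon + 1));
            nodes.add(InetSocketAddress.createUnresolved(host, port));
        }
        return nodes;
    }

    // Raft needs a majority of servers to agree; for 3 servers that is 2.
    static int quorum(int clusterSize) {
        return clusterSize / 2 + 1;
    }
}
```

For example, `ClusterAddrParser.parse("host234:6000, host235:6000, host236:6000")` would yield three entries (the port 6000 is invented here), and `quorum(3)` is 2, so a three-server cluster like the one in the recording can keep working when one server is down.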


The second GIF shows how an interpreter process is created in the cluster:
1. Create a notebook on host234 and execute it. This creates an interpreter 
   process on a server in the cluster that has free resources.
2. Continue editing this notebook on host235 and execute it. Results come back 
   immediately, without waiting for an interpreter process to be created.
3. Likewise, continue editing this notebook on host236 and execute it. Again, 
   results come back immediately.
The same notebook reuses the interpreter process that was created first, so you 
can get the execution result immediately on any server.
Looking at the server processes in the background, you will find that host234, 
host235, and host236 use the same interpreter process for the same notebook.
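
The reuse behaviour described above can be modelled with a small sketch. This is not the cluster code itself: in the real cluster this metadata would have to be shared between servers (for example through the Raft library), while here a local map stands in for it. The names `InterpreterRegistry`, `interpreterHostFor`, and the notion of "free slots" are assumptions made for this example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: one interpreter process per notebook, shared by all servers.
final class InterpreterRegistry {
    // noteId -> host where that notebook's interpreter process runs.
    private final Map<String, String> hostByNote = new ConcurrentHashMap<>();
    // host -> remaining capacity (an invented stand-in for "free resources").
    private final Map<String, Integer> freeSlotsByHost;

    InterpreterRegistry(Map<String, Integer> freeSlotsByHost) {
        this.freeSlotsByHost = new ConcurrentHashMap<>(freeSlotsByHost);
    }

    // The first call for a note picks a host; later calls reuse that host.
    String interpreterHostFor(String noteId) {
        return hostByNote.computeIfAbsent(noteId, id -> pickHostWithMostFreeSlots());
    }

    private synchronized String pickHostWithMostFreeSlots() {
        String best = null;
        int bestFree = -1;
        for (Map.Entry<String, Integer> entry : freeSlotsByHost.entrySet()) {
            if (entry.getValue() > bestFree) {
                best = entry.getKey();
                bestFree = entry.getValue();
            }
        }
        if (best == null) {
            throw new IllegalStateException("no servers registered");
        }
        freeSlotsByHost.merge(best, -1, Integer::sum); // the new process uses one slot
        return best;
    }
}
```

Whichever server handles a later execution of the same notebook would look up the same entry, which is why no new interpreter process needs to be started.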

Originally, I also wanted to record an interpreter process failure, showing the 
cluster re-creating the interpreter process on an idle server, but I am too 
tired now.
I will record it later when there is time.


> On Jul 19, 2018, at 7:36 AM, Ruslan Dautkhanov wrote:
> 
> Thank you luxun,
> 
> I left a couple of comments in that google document. 
> 
> -- 
> Ruslan Dautkhanov
> 
> 
> On Tue, Jul 17, 2018 at 11:30 PM liuxun wrote:
> hi,Ruslan Dautkhanov
> 
> Thank you very much for your question. according to your advice, I added 3 
> schematics to illustrate.
> 1. Distributed Zeppelin Deployment architecture diagram.
> 2. Distributed zeppelin Server fault tolerance diagram.
> 3. Distributed zeppelin Server & intp process fault tolerance diagram.
> 
> 
> The email attachment exceeded the size limit, so I reorganized the document 
> and updated it with Google Docs.
> https://docs.google.com/document/d/1a8QLSyR3M5AhlG1GIYuDTj6bwazeuVDKCRRBm-Qa3Bw/edit?usp=sharing
>  
> 
> 
> 
>> On Jul 18, 2018, at 1:03 PM, liuxun <neliu...@163.com> wrote:
>> 
>> hi,Ruslan Dautkhanov
>> 
>> Thank you very much for your question. according to your advice, I added 3 
>> schematics to illustrate.
>> 1. Zeppelin Cluster architecture diagram.
>> 2. Distributed zeppelin Server fault tolerance diagram.
>> 3. Distributed zeppelin Server & intp process fault tolerance diagram.
>> 
>> Later, I will merge the schematic into the system design document.
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>>> On Jul 18, 2018, at 1:16 AM, Ruslan Dautkhanov wrote:
>>> 
>>> Nice.
>>> 
>>> Thanks for sharing.
>>> 
>>> Can you explain how are users routed into a particular zeppelin server
>>> instance? I've seen nginx on top of them, but I don't think the document
>>> covers details? If one zeppelin server goes down or unhealthy, is nginx
>>> supposed to detect (if so, how?) that and reroute users to a survived
>>> instance?
>>> 
>>> Thanks,
>>> Ruslan Dautkhanov
>>> 
>>> 
>>> On Tue, Jul 17, 2018 at 2:46 AM liuxun wrote:
>>> 
>>>> hi:
>>>> 
>>>> Our company has installed and deployed many Zeppelin instances for data
>>>> analysis. The single-server version of Zeppelin could not meet our
>>>> application scenarios, so we transformed Zeppelin into a clustered service
>>>> that supports distributed deployment, with a unified entry point, high
>>>> availability, and high server resource utilization. The email attachment
>>>> is the entire design document. I am very happy to contribute our modified
>>>> code back to the community.
>>>> 
>>>> This is the JIRA I submitted in the community:
>>>> 
>>>> https://issues.apache.org/jira/browse/ZEPPELIN-3471
>>>> 
>>>> Since the design document exceeds the mail attachment size limit, I have
>>>> to send a link to the document instead.
>>>> 
>>>> https://issues.apache.org/jira/secure/attachment/12931896/Zeppelin%20distributed%20architecture%20design.pdf
>>>> 
>>>> https://issues.apa

Re: Zeppelin distributed architecture design

2018-07-18 Thread Ruslan Dautkhanov
Thank you, liuxun,

I left a couple of comments in that google document.

-- 
Ruslan Dautkhanov


On Tue, Jul 17, 2018 at 11:30 PM liuxun  wrote:

> hi,Ruslan Dautkhanov
>
> Thank you very much for your question. according to your advice, I added 3
> schematics to illustrate.
> 1. Distributed Zeppelin Deployment architecture diagram.
> 2. Distributed zeppelin Server fault tolerance diagram.
> 3. Distributed zeppelin Server & intp process fault tolerance diagram.
>
>
> The email attachment exceeded the size limit, so I reorganized the
> document and updated it with Google Docs.
>
> https://docs.google.com/document/d/1a8QLSyR3M5AhlG1GIYuDTj6bwazeuVDKCRRBm-Qa3Bw/edit?usp=sharing
>
>
> On Jul 18, 2018, at 1:03 PM, liuxun wrote:
>
> hi,Ruslan Dautkhanov
>
> Thank you very much for your question. according to your advice, I added 3
> schematics to illustrate.
> 1. Zeppelin Cluster architecture diagram.
> 2. Distributed zeppelin Server fault tolerance diagram.
> 3. Distributed zeppelin Server & intp process fault tolerance diagram.
>
> Later, I will merge the schematic into the system design document.
>
> 
>
>
> 
>
>
>
> 
>
>
>
> On Jul 18, 2018, at 1:16 AM, Ruslan Dautkhanov wrote:
>
> Nice.
>
> Thanks for sharing.
>
> Can you explain how are users routed into a particular zeppelin server
> instance? I've seen nginx on top of them, but I don't think the document
> covers details? If one zeppelin server goes down or unhealthy, is nginx
> supposed to detect (if so, how?) that and reroute users to a survived
> instance?
>
> Thanks,
> Ruslan Dautkhanov
>
>
> On Tue, Jul 17, 2018 at 2:46 AM liuxun  wrote:
>
> hi:
>
> Our company has installed and deployed many Zeppelin instances for data
> analysis. The single-server version of Zeppelin could not meet our
> application scenarios, so we transformed Zeppelin into a clustered service
> that supports distributed deployment, with a unified entry point, high
> availability, and high server resource utilization. The email attachment is
> the entire design document. I am very happy to contribute our modified code
> back to the community.
>
>
> this is the JIRA I submitted in the community,
>
> https://issues.apache.org/jira/browse/ZEPPELIN-3471
>
>
> Since the design document size exceeds the mail attachment size limit, the
> document link address has to be sent.
>
>
> https://issues.apache.org/jira/secure/attachment/12931896/Zeppelin%20distributed%20architecture%20design.pdf
>
>
> https://issues.apache.org/jira/secure/attachment/12931895/zepplin%20Cluster%20Sequence%20Diagram.png
>
>
> liuxun
>
>
>
>


Re: Zeppelin distributed architecture design

2018-07-18 Thread vincent gromakowski
Good job! It seems very interesting.

2018-07-18 7:30 GMT+02:00 liuxun :

> hi,Ruslan Dautkhanov
>
> Thank you very much for your question. according to your advice, I added 3
> schematics to illustrate.
> 1. Distributed Zeppelin Deployment architecture diagram.
> 2. Distributed zeppelin Server fault tolerance diagram.
> 3. Distributed zeppelin Server & intp process fault tolerance diagram.
>
>
> The email attachment exceeded the size limit, so I reorganized the
> document and updated it with Google Docs.
> https://docs.google.com/document/d/1a8QLSyR3M5AhlG1GIYuDTj6bwazeuVDKCRRBm-Qa3Bw/edit?usp=sharing
>
>
> On Jul 18, 2018, at 1:03 PM, liuxun wrote:
>
> Hi Ruslan Dautkhanov,
>
> Thank you very much for your question. Following your advice, I added three
> schematics to illustrate:
> 1. Zeppelin cluster architecture diagram.
> 2. Distributed Zeppelin server fault tolerance diagram.
> 3. Distributed Zeppelin server and interpreter process fault tolerance diagram.
>
> Later, I will merge the schematics into the system design document.
>
> On July 18, 2018, at 1:16 AM, Ruslan Dautkhanov wrote:
>
> Nice.
>
> Thanks for sharing.
>
> Can you explain how users are routed to a particular Zeppelin server
> instance? I've seen nginx on top of them, but I don't think the document
> covers the details. If one Zeppelin server goes down or becomes unhealthy,
> is nginx supposed to detect that (if so, how?) and reroute users to a
> surviving instance?
>
> Thanks,
> Ruslan Dautkhanov
>
>
> On Tue, Jul 17, 2018 at 2:46 AM liuxun  wrote:
>
> Hi,
>
> Our company has deployed many Zeppelin instances for data analysis.
> The single-server version of Zeppelin could not meet our application
> scenarios, so we transformed Zeppelin into a clustered service that
> supports distributed deployment with a unified entry point, high
> availability, and high server resource utilization. The design document
> covers the entire design, and I would be very happy to contribute our
> modified code back to the community.
>
>
> This is the JIRA issue I filed with the community:
>
> https://issues.apache.org/jira/browse/ZEPPELIN-3471
>
>
> Because the design document exceeds the mail attachment size limit, I am
> sending links to it instead:
>
> https://issues.apache.org/jira/secure/attachment/12931896/Zeppelin%20distributed%20architecture%20design.pdf
>
> https://issues.apache.org/jira/secure/attachment/12931895/zepplin%20Cluster%20Sequence%20Diagram.png
>
>
> liuxun
>
>
>
>


Re: Zeppelin distributed architecture design

2018-07-17 Thread liuxun
Hi Ruslan Dautkhanov,

Thank you very much for your question. Following your advice, I added three 
schematics to illustrate:
1. Distributed Zeppelin deployment architecture diagram.
2. Distributed Zeppelin server fault tolerance diagram.
3. Distributed Zeppelin server and interpreter process fault tolerance diagram.


The email attachments exceeded the size limit, so I reorganized the document and 
updated it in Google Docs:
https://docs.google.com/document/d/1a8QLSyR3M5AhlG1GIYuDTj6bwazeuVDKCRRBm-Qa3Bw/edit?usp=sharing
 



> On July 18, 2018, at 1:03 PM, liuxun wrote:
> 
> Hi Ruslan Dautkhanov,
> 
> Thank you very much for your question. Following your advice, I added three 
> schematics to illustrate:
> 1. Zeppelin cluster architecture diagram.
> 2. Distributed Zeppelin server fault tolerance diagram.
> 3. Distributed Zeppelin server and interpreter process fault tolerance diagram.
> 
> Later, I will merge the schematics into the system design document.
> 
>> On July 18, 2018, at 1:16 AM, Ruslan Dautkhanov wrote:
>> 
>> Nice.
>> 
>> Thanks for sharing.
>> 
>> Can you explain how users are routed to a particular Zeppelin server
>> instance? I've seen nginx on top of them, but I don't think the document
>> covers the details. If one Zeppelin server goes down or becomes unhealthy,
>> is nginx supposed to detect that (if so, how?) and reroute users to a
>> surviving instance?
>> 
>> Thanks,
>> Ruslan Dautkhanov
>> 
>> 
>> On Tue, Jul 17, 2018 at 2:46 AM liuxun wrote:
>> 
>>> Hi,
>>> 
>>> Our company has deployed many Zeppelin instances for data analysis.
>>> The single-server version of Zeppelin could not meet our application
>>> scenarios, so we transformed Zeppelin into a clustered service that
>>> supports distributed deployment with a unified entry point, high
>>> availability, and high server resource utilization. The design document
>>> covers the entire design, and I would be very happy to contribute our
>>> modified code back to the community.
>>> 
>>> 
>>> This is the JIRA issue I filed with the community:
>>> 
>>> https://issues.apache.org/jira/browse/ZEPPELIN-3471
>>> 
>>> 
>>> Because the design document exceeds the mail attachment size limit, I am
>>> sending links to it instead:
>>> 
>>> https://issues.apache.org/jira/secure/attachment/12931896/Zeppelin%20distributed%20architecture%20design.pdf
>>> 
>>> https://issues.apache.org/jira/secure/attachment/12931895/zepplin%20Cluster%20Sequence%20Diagram.png
>>> 
>>> 
>>> liuxun
>>> 
> 



Re: Zeppelin distributed architecture design

2018-07-17 Thread Ruslan Dautkhanov
Nice.

Thanks for sharing.

Can you explain how users are routed to a particular Zeppelin server
instance? I've seen nginx on top of them, but I don't think the document
covers the details. If one Zeppelin server goes down or becomes unhealthy,
is nginx supposed to detect that (if so, how?) and reroute users to a
surviving instance?

Thanks,
Ruslan Dautkhanov


On Tue, Jul 17, 2018 at 2:46 AM liuxun  wrote:

> Hi,
>
> Our company has deployed many Zeppelin instances for data analysis.
> The single-server version of Zeppelin could not meet our application
> scenarios, so we transformed Zeppelin into a clustered service that
> supports distributed deployment with a unified entry point, high
> availability, and high server resource utilization. The design document
> covers the entire design, and I would be very happy to contribute our
> modified code back to the community.
>
>
> This is the JIRA issue I filed with the community:
>
> https://issues.apache.org/jira/browse/ZEPPELIN-3471
>
>
> Because the design document exceeds the mail attachment size limit, I am
> sending links to it instead:
>
> https://issues.apache.org/jira/secure/attachment/12931896/Zeppelin%20distributed%20architecture%20design.pdf
>
> https://issues.apache.org/jira/secure/attachment/12931895/zepplin%20Cluster%20Sequence%20Diagram.png
>
>
> liuxun
>


Zeppelin distributed architecture design

2018-07-17 Thread liuxun
Hi,

Our company has deployed many Zeppelin instances for data analysis. The 
single-server version of Zeppelin could not meet our application scenarios, so 
we transformed Zeppelin into a clustered service that supports distributed 
deployment with a unified entry point, high availability, and high server 
resource utilization. The design document covers the entire design, and I 
would be very happy to contribute our modified code back to the community.


This is the JIRA issue I filed with the community:

https://issues.apache.org/jira/browse/ZEPPELIN-3471 



Because the design document exceeds the mail attachment size limit, I am 
sending links to it instead:
https://issues.apache.org/jira/secure/attachment/12931896/Zeppelin%20distributed%20architecture%20design.pdf
 

https://issues.apache.org/jira/secure/attachment/12931895/zepplin%20Cluster%20Sequence%20Diagram.png
 



liuxun