Re: FW: high availability with automated disaster recovery using zookeeper

2018-07-16 Thread Scott Kidder
*From:* Scott Kidder > *Sent:* Friday, 13 July 2018 01:13 > *To:* Sofer, Tovi [ICG-IT] > *Cc:* user@flink.apache.org > *Subject:* Re: high availability with automated disaster recovery using > zookeeper > > > > I've used a multi-datacenter Consul cluster to coordinate

FW: high availability with automated disaster recovery using zookeeper

2018-07-16 Thread Sofer, Tovi
Thank you Scott, looks like a very elegant solution. How did you manage high availability in a single data center? Thanks, Tovi From: Scott Kidder Sent: Friday, 13 July 2018 01:13 To: Sofer, Tovi [ICG-IT] Cc: user@flink.apache.org Subject: Re: high availability with automated disaster recovery

Re: high availability with automated disaster recovery using zookeeper

2018-07-12 Thread Scott Kidder
I've used a multi-datacenter Consul cluster to coordinate service discovery. When a service starts up in the primary DC, it registers itself in Consul with a key that has a TTL that must be periodically renewed. If the service shuts down or terminates abruptly, the key expires and is removed
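The TTL-key pattern described above can be sketched in a few lines. This is only an in-memory illustration, not Consul's API: the class names `TtlRegistry` and `FakeClock`, the key `flink/jobmanager`, and the owner labels `dc1`/`dc2` are all hypothetical, and the step where a standby claims the key after expiry is an assumption about how such a scheme might be used, not something stated in the message.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class FakeClock:
    """Manually advanced clock so expiry can be shown without sleeping."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now

    def advance(self, seconds: float) -> None:
        self.now += seconds


class TtlRegistry:
    """In-memory stand-in for a key-value store whose keys expire unless renewed."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # key -> (owner, absolute expiry time)
        self._entries: Dict[str, Tuple[str, float]] = {}

    def _purge(self) -> None:
        """Drop every key whose TTL has elapsed."""
        now = self._clock()
        expired = [k for k, (_, expires) in self._entries.items() if expires <= now]
        for key in expired:
            del self._entries[key]

    def register(self, key: str, owner: str, ttl: float) -> bool:
        """Claim the key for owner; fails while another live owner holds it."""
        self._purge()
        entry = self._entries.get(key)
        if entry is not None and entry[0] != owner:
            return False
        self._entries[key] = (owner, self._clock() + ttl)
        return True

    def renew(self, key: str, owner: str, ttl: float) -> bool:
        """Extend the TTL; fails if the key expired or belongs to someone else."""
        self._purge()
        entry = self._entries.get(key)
        if entry is None or entry[0] != owner:
            return False
        self._entries[key] = (owner, self._clock() + ttl)
        return True

    def holder(self, key: str) -> Optional[str]:
        """Return the current live owner of the key, or None if it has expired."""
        self._purge()
        entry = self._entries.get(key)
        return entry[0] if entry else None
```

In this sketch the service in the primary DC would call `register` once and then `renew` on a timer shorter than the TTL. If it stops renewing, `holder` returns `None` after the TTL elapses, and another instance can then register under the same key.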

Re: high availability with automated disaster recovery using zookeeper

2018-07-12 Thread Till Rohrmann
is accurate, since it seems to contradict the image in link > below > > https://mesosphere.com/blog/apache-flink-on-dcos-and-apache-mesos ] > > > > *From:* Sofer, Tovi [ICG-IT] > *Sent:* Tuesday, 10 July 2018 20:04 > *To:* 'Till Rohrmann' ; user > *Cc:* Gardi, Hila [ICG-IT] &

RE: high availability with automated disaster recovery using zookeeper

2018-07-10 Thread Sofer, Tovi
: Gardi, Hila [ICG-IT] Subject: RE: high availability with automated disaster recovery using zookeeper Hi Till, group, Thank you for your response. After reading further online about Mesos: can't Mesos fulfill the requirement of running the job manager on the primary server? By using: “constraints

RE: high availability with automated disaster recovery using zookeeper

2018-07-10 Thread Sofer, Tovi
-for-disaster-recovery/ ) Is this supported by a Flink cluster on Mesos? Thanks again Tovi From: Till Rohrmann Sent: Tuesday, 10 July 2018 10:11 To: Sofer, Tovi [ICG-IT] Cc: user Subject: Re: high availability with automated disaster recovery using zookeeper Hi Tovi, that is an interesting use case

Re: high availability with automated disaster recovery using zookeeper

2018-07-10 Thread Till Rohrmann
Hi Tovi, that is an interesting use case you are describing here. I think, however, it depends mainly on the capabilities of ZooKeeper to produce the intended behavior. Flink itself relies on ZooKeeper for leader election in HA mode but does not expose any means to influence the leader election

high availability with automated disaster recovery using zookeeper

2018-07-09 Thread Sofer, Tovi
Hi all, We are now examining how to achieve high availability for Flink, and also how to support automatic recovery in a disaster scenario, when an entire DC goes down. We have DC1, where we usually want the work to be done, and DC2, which is more remote and where we want work to go only when DC1 is down. We