I have very little experience with ZK and cannot explain the differences
between ZK and Consul on my own. However, there are some comparisons:
* https://www.consul.io/intro/vs/zookeeper.html - done by Consul so may be
Regarding testing - I did basic failover scenarios on my workstation with 2
JobManagers, 2 TaskManagers and WindowJoin example app with checkpointing
and restarting turned on.
I ran the cluster for no longer than a few hours.
For now I'd like to open Flink for alternative HA backends (
On Wed, Feb 14, 2018 at 1:47 PM, Chesnay Schepler <ches...@apache.org>
> I don't know anything about Consul but the prospect of having other
> options besides ZooKeeper is very interesting. It's rather surprising how
> little you had to modify existing classes to get this to work.
> It may take a bit until someone provides proper feedback as the community
> is currently prepping 2 releases (1.4.1 and 1.5), please don't be
> discouraged by this.
> I saw that your branch was based on the 1.4 version. In 1.5 we reworked
> the distributed architecture of Flink (in an initiative commonly referred
> to as FLIP-6) which may affect your work.
> 2 things to note from my side:
> It would also be helpful if you could explain the differences between ZK
> and Consul and how they stack up in terms of guarantees, etc.
> How did you test your solution so far? (Like how long was a cluster
> running, what failure scenarios)
> On 13.02.2018 21:38, Krzysztof Białek wrote:
> I'd like to get your opinion about this idea. I found a related JIRA issue
> but it seems to be dead. To draw your attention, I am copying my comment here.
> As an experiment I've implemented Flink HA on top of Consul. The
> implementation is working fine in the "lab" but is not battle tested yet.
> The source code is available at
> https://github.com/kbialek/flink/tree/feature/consul (flink-runtime package
> Why? Generally, I'd like to keep as few moving parts as possible. We do
> not have Zookeeper running, but Consul is already in place. And in the end
> freedom of choice is a good thing.
> It would be great to see built-in Consul support in Flink someday, but if
> that is not planned, I suggest a small refactoring that makes the
> HighAvailabilityServicesFactory configurable. As far as I can see, this
> should be enough to inject any HA implementation.
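
To illustrate the kind of refactoring suggested above, here is a minimal sketch of a factory that is selected by class name from the configuration. It is not Flink's actual interface: the names HaServices, HaServicesFactory, InMemoryHaServicesFactory and the keys "ha.factory.class" and "ha.address" are all made up for this example, and a real backend (Consul, ZooKeeper) would do far more than return a description string.

```java
import java.util.HashMap;
import java.util.Map;

public class PluggableHaSketch {

    /** Hypothetical stand-in for the services an HA backend would provide. */
    public interface HaServices {
        String describe();
    }

    /** Hypothetical factory contract that each HA backend would implement. */
    public interface HaServicesFactory {
        HaServices create(Map<String, String> config);
    }

    /** Toy backend, present only to exercise the loading logic below. */
    public static class InMemoryHaServicesFactory implements HaServicesFactory {
        @Override
        public HaServices create(Map<String, String> config) {
            String address = config.getOrDefault("ha.address", "localhost");
            return () -> "in-memory HA at " + address;
        }
    }

    /**
     * Instantiates the factory named under the (assumed) key "ha.factory.class"
     * via reflection and asks it to build the HA services.
     */
    public static HaServices loadHaServices(Map<String, String> config) {
        String className = config.get("ha.factory.class");
        if (className == null) {
            throw new IllegalArgumentException("ha.factory.class is not set");
        }
        try {
            Class<? extends HaServicesFactory> factoryClass =
                    Class.forName(className).asSubclass(HaServicesFactory.class);
            HaServicesFactory factory =
                    factoryClass.getDeclaredConstructor().newInstance();
            return factory.create(config);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                    "Cannot instantiate HA factory " + className, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        config.put("ha.factory.class", InMemoryHaServicesFactory.class.getName());
        config.put("ha.address", "consul:8500");
        System.out.println(loadHaServices(config).describe());
    }
}
```

With this shape, switching the HA backend would only require putting another factory class on the classpath and changing one configuration entry, with no changes to the code that consumes the services.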