Hi Emmanuel,
yes Master HA is currently under development and only available in 0.10
snapshot. AFAIK, it is almost but not completely done yet.

Best, Fabian
On Sep 10, 2015 01:29, "Emmanuel" <ele...@msn.com> wrote:

> is this a 0.10 snapshot feature only? I'm using 0.9.1 right now
>
>
> ------------------------------
> From: ele...@msn.com
> To: user@flink.apache.org
> Subject: RE: Flink HA mode
> Date: Wed, 9 Sep 2015 16:11:38 -0700
>
> Been playing with the HA...
> I find the UIs confusing here:
> in the dashboard on one side I see 0 slots 0 taskmanagers, but a job
> running, while on the other side I see my taskmanagers and slots but no
> jobs...
> putting the UI being a proxy, it's load balanced to the JM, so I can't
> tell which is which (my bad) but how would I be able to tell which is the
> leader anyway? Do I need to query ZK to know?
>
> Looking at ZK, I actually don't see any keys for Flink... so I'm wondering
> if this is working as expected.
> The config lists the options
>
> recovery.mode: zookeeper
> ha.zookeeper.quorum: zookeeper:2181
>
> but I don't see any mention of Zookeeper in the JM logs.
>
> I start the JMs and TMs manually with
> ./jobmanager.sh start streaming
> ./taskmanager.sh start streaming
>
> so i guess there is no need for the masters file.
>
> What else am I missing here?
> Do I need to list the multiple JMs IPs in the jobmanager.rpc.address?
>
> Thanks
>
> ------------------------------
> Date: Wed, 9 Sep 2015 10:19:36 +0200
> Subject: Re: Flink HA mode
> From: trohrm...@apache.org
> To: user@flink.apache.org
>
> The only necessary information for the JobManager and TaskManager is to
> know where to find the ZooKeeper quorum to do leader election and retrieve
> the leader address from. This will be configured via the config parameter
> `ha.zookeeper.quorum`.
>
> On Wed, Sep 9, 2015 at 10:15 AM, Stephan Ewen <se...@apache.org> wrote:
>
> TL;DR is that you are right, it is only the initial list. If a JobManager
> comes back with a new IP address, it will be available.
>
> On Wed, Sep 9, 2015 at 8:35 AM, Ufuk Celebi <u...@apache.org> wrote:
>
>
> > On 09 Sep 2015, at 04:48, Emmanuel <ele...@msn.com> wrote:
> >
> > my questions is: how critical is the bootstrap ip list in masters?
>
> Hey Emmanuel,
>
> good questions. I read over the docs for this again [1] and you are right
> that we should make this clearer.
>
> The “masters" file is only relevant for the start/stop cluster scripts
> (Flink standalone mode).
>
> If you specify hosts in the “masters" file the start-cluster scripts will
> use these hosts to start job managers. After that all coordination happens
> via ZooKeeper via a leader election and retrieval service. All job managers
> elect a single leader and task managers and clients (submitting programs)
> retrieve this leader via ZooKeeper. If a job manager fails and becomes
> available again, it will publish itself via this mechanism (if it becomes
> leader at some point again). There was a recent PR [2] which introduced
> this. You can read over the very good PR description for more info for now.
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-master/setup/jobmanager_high_availability.html
>
> [2] https://github.com/apache/flink/pull/1016
>
>
> > does this get updated or does it have to be updated by some other
> service?
>
> If you start a new cluster on GCE with different hosts and use Flink’s
> standalone mode you have to set this up again. This is the same for the
> “slaves” file.
>
>
> Does this answer your question? If anything is unclear, please post here.
> :)
>
> – Ufuk
>
>
>
>

Reply via email to