Dear Devs,
I have run into a problem with the HMaster.
I currently run multiple HMasters in a cluster. If the ZK connection of one
of the backup HMasters expires, that backup HMaster goes down immediately
without trying to recover the ZK connection.
I found the code below in HMaster.abortNow(). The
fail.fast.expired.active.master setting is only consulted for the active
HMaster. Should the backup HMasters also try to recover when their ZK
connection expires? Please advise. Thanks.
    if (!this.isActiveMaster || this.stopped) {
      return true;
    }
    boolean failFast = conf.getBoolean("fail.fast.expired.active.master", false);
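
To make the question concrete, here is a minimal standalone sketch of how I read that logic. It is not the HBase source: the class AbortDecisionSketch, the Outcome enum, and the onSessionExpired helper are hypothetical names used only for illustration, and I am assuming that abortNow() returning true means the master aborts without attempting to recover the ZK session.

```java
public class AbortDecisionSketch {
    /** Hypothetical outcomes when a master's ZK session expires. */
    enum Outcome { ABORT, TRY_RECOVER }

    // Hypothetical helper, not HBase code: models the check quoted above.
    // Assumption: "return true" in abortNow() means "abort without recovery".
    static Outcome onSessionExpired(boolean isActiveMaster, boolean stopped, boolean failFast) {
        // A backup (or already stopped) master returns early,
        // so the fail-fast setting is never read for it.
        if (!isActiveMaster || stopped) {
            return Outcome.ABORT;
        }
        // Only the active master reaches the fail-fast setting.
        return failFast ? Outcome.ABORT : Outcome.TRY_RECOVER;
    }

    public static void main(String[] args) {
        // Backup master: aborts regardless of the setting.
        System.out.println(onSessionExpired(false, false, false)); // prints ABORT
        // Active master with fail-fast enabled: aborts.
        System.out.println(onSessionExpired(true, false, true));   // prints ABORT
        // Active master with fail-fast disabled (the default): may try to recover.
        System.out.println(onSessionExpired(true, false, false));  // prints TRY_RECOVER
    }
}
```

If that reading is right, a backup HMaster never reaches the recovery path, whatever the setting is.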
Regards,
Jingcheng