Github user HeartSaVioR commented on a diff in the pull request:
https://github.com/apache/storm/pull/2433#discussion_r152703257
--- Diff: storm-client/src/jvm/org/apache/storm/daemon/worker/Worker.java
---
@@ -382,6 +374,33 @@ public void establishLogSettingCallback() {
workerState.stormClusterState.topologyLogConfig(topologyId,
this::checkLogConfigChanged);
}
+ /**
+ * Send a heartbeat to local supervisor first to check if supervisor
is ok for heartbeating.
+ */
+ private void heartbeatToMasterIfLocalbeatFail(LSWorkerHeartbeat
lsWorkerHeartbeat) {
+ if (ConfigUtils.isLocalMode(this.conf)) {
+ return;
+ }
+ //in distributed mode, send heartbeat directly to master if local
supervisor goes down
+ SupervisorWorkerHeartbeat workerHeartbeat = new
SupervisorWorkerHeartbeat(lsWorkerHeartbeat.get_topology_id(),
+ lsWorkerHeartbeat.get_executors(),
lsWorkerHeartbeat.get_time_secs());
+ try{
+ SupervisorClient client =
SupervisorClient.getConfiguredClient(conf, Utils.hostname());
+
client.getClient().sendSupervisorWorkerHeartbeat(workerHeartbeat);
+ client.close();
+ } catch (Throwable tr1) {
+ //if any error/exception thrown, report directly to nimbus.
+ LOG.debug("Exception when send heartbeat to local supervisor",
tr1.getMessage());
--- End diff --
We could use `warn`, given that supervisor unavailability is also kind of
critical thing in operation perspective.
---