GitHub user agresch commented on a diff in the pull request:
https://github.com/apache/storm/pull/2732#discussion_r197790334
--- Diff: storm-server/src/main/java/org/apache/storm/daemon/nimbus/Nimbus.java ---
@@ -2672,7 +2680,12 @@ private ClusterSummary getClusterInfoImpl() throws Exception {
             summary.set_assigned_memoffheap(resources.getAssignedMemOffHeap());
             summary.set_assigned_cpu(resources.getAssignedCpu());
         }
-        summary.set_replication_count(getBlobReplicationCount(ConfigUtils.masterStormCodeKey(topoId)));
+        try {
+            summary.set_replication_count(getBlobReplicationCount(ConfigUtils.masterStormCodeKey(topoId)));
+        } catch (KeyNotFoundException e) {
+            // This could fail if a blob gets deleted by mistake. Don't crash nimbus.
+            LOG.error("Unable to find blob entry", e);
--- End diff --
@HeartSaVioR - I made this change first, in isolation, while the blob
could still be deleted. It prevents nimbus from crashing, but a nimbus
restart will still have issues. With the other change in place, this catch
may no longer be necessary, but I would rather be defensive. It should not
be hit unless there are further race conditions that the other check misses.
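
For illustration only, the defensive pattern can be sketched outside of Nimbus as below. Everything here is a hypothetical stand-in rather than Storm's code: the `ReplicationCountSketch` class, its nested `KeyNotFoundException`, the `Map`-backed blob store, the key names, and the `fallback` parameter are all assumptions. The actual diff simply logs the error and leaves the replication count unset.

```java
import java.util.HashMap;
import java.util.Map;

class ReplicationCountSketch {
    /** Hypothetical stand-in for the exception thrown when a blob key is missing. */
    static class KeyNotFoundException extends Exception {
        KeyNotFoundException(String key) {
            super("No blob for key: " + key);
        }
    }

    /** Hypothetical lookup: throws if the key is absent, like a blob store would. */
    static int getBlobReplicationCount(Map<String, Integer> blobStore, String key)
            throws KeyNotFoundException {
        Integer count = blobStore.get(key);
        if (count == null) {
            throw new KeyNotFoundException(key);
        }
        return count;
    }

    /** Defensive wrapper: log and fall back instead of letting the caller crash. */
    static int replicationCountOrDefault(Map<String, Integer> blobStore, String key, int fallback) {
        try {
            return getBlobReplicationCount(blobStore, key);
        } catch (KeyNotFoundException e) {
            // Assumption: returning a fallback is acceptable for a summary view.
            System.err.println("Unable to find blob entry: " + e.getMessage());
            return fallback;
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> blobStore = new HashMap<>();
        blobStore.put("topo-1-stormcode.ser", 3); // made-up key for the example

        System.out.println(replicationCountOrDefault(blobStore, "topo-1-stormcode.ser", 0));
        System.out.println(replicationCountOrDefault(blobStore, "topo-2-stormcode.ser", 0));
    }
}
```

The point of the wrapper is that one missing blob degrades a single summary field instead of failing the whole cluster-info call. It does not address the restart problem mentioned above.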
---