Hi,

We're running a v2.2.0 cluster with two nimbus hosts and recently noticed
storm-nimbus on the leader is effectively in a restart loop.

When I look at nimbus.log on that host, it is full of log entries related to
old versions of topologies we're running. I am seeing two types of
exceptions:

1. get blob meta exception:

For *topology-A*, for example, we're currently on topology-A-25:

2021-10-25 13:39:51.064 o.a.s.d.n.Nimbus pool-29-thread-62 [WARN] Exception when getting heartbeat timeout.
2021-10-25 13:39:51.075 o.a.s.d.n.Nimbus pool-29-thread-16 [WARN] get blob meta exception.
org.apache.storm.utils.WrappedKeyNotFoundException: topology-A-5-1633368551-stormjar.jar

For *topology-B*, we're on topology-B-24:

2021-10-25 13:38:51.106 o.a.s.d.n.Nimbus pool-29-thread-21 [WARN] get blob meta exception.
org.apache.storm.utils.WrappedKeyNotFoundException: topology-B-11-1632770137-stormcode.ser

2. Send HB exception:

2021-10-25 13:39:51.745 o.a.s.d.n.Nimbus pool-29-thread-36 [WARN] Exception when getting heartbeat timeout.
2021-10-25 13:39:51.760 o.a.s.d.n.Nimbus pool-29-thread-37 [WARN] Send HB exception. (topology id='topology-A-10-1632769783')
org.apache.storm.utils.WrappedNotAliveException: topology-A-10-1632769783

This seems isolated to two old versions of topology-A (topology-A-5 and
topology-A-10) and one old version of topology-B (topology-B-11).
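
One way to check that across the whole log, rather than by eye, is to pull
the distinct topology IDs out of the exception lines. The following is only
a minimal Python sketch: the function name stale_topology_ids is
hypothetical, and the patterns assume that blob keys are the topology ID
followed by -stormjar.jar, -stormcode.ser or -stormconf.ser, and that
topology IDs have the form name-counter-timestamp, as in the excerpts above.

```python
import re
from collections import defaultdict

# Captures the blob key or topology ID that follows either exception type.
EXC_RE = re.compile(r"Wrapped(?:KeyNotFound|NotAlive)Exception:\s+(\S+)")
# Assumed blob key suffixes appended to a topology ID.
BLOB_SUFFIX_RE = re.compile(r"-(?:stormjar\.jar|stormcode\.ser|stormconf\.ser)$")
# Assumed topology ID layout: <name>-<counter>-<submit timestamp>.
TOPO_ID_RE = re.compile(r"^(?P<name>.+)-(?P<counter>\d+)-(?P<ts>\d+)$")


def stale_topology_ids(log_text):
    """Group the topology IDs named in the exceptions by topology name."""
    found = defaultdict(set)
    for match in EXC_RE.finditer(log_text):
        # Strip the blob suffix, if any, to get back to the topology ID.
        topo_id = BLOB_SUFFIX_RE.sub("", match.group(1))
        parsed = TOPO_ID_RE.match(topo_id)
        if parsed:
            found[parsed.group("name")].add(topo_id)
    # Sort for stable output (plain string order, not numeric).
    return {name: sorted(ids) for name, ids in found.items()}
```

Calling stale_topology_ids(open("nimbus.log").read()), with the path adjusted
to wherever the log lives, returns a dict mapping each topology name to the
IDs that appear in these exceptions, which can then be compared against the
versions that are actually deployed.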

I'm not seeing references to these topology versions in ZooKeeper. Does
anyone know how to safely clear out this old state? If not, any suggestions
on how to debug this? Further, is this related to any known bug?

Thanks,
Andrew
