As a follow-up, you can disregard what I said about nimbus crashing, but
I'm still interested in fixing these noisy errors in the logs.

@Rui thanks. I did check ZK and did not see any references to the old
versions in there, at least.
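
For anyone who wants to repeat that check, here is a rough sketch of how
it could be done. It assumes you paste in the child names printed by `ls`
in zkCli for the znodes Rui listed. The function names are my own, and the
two blob-key suffixes are only the ones that show up in the log lines
quoted below, so there may be others.

```python
# Blob-key suffixes seen in the nimbus.log excerpts in this thread.
# Assumption: other suffixes may exist; add them here if you see them.
BLOB_SUFFIXES = ("-stormjar.jar", "-stormcode.ser")


def topology_id_from_blob_key(key):
    """Strip a known blob suffix so the key reduces to a topology id."""
    for suffix in BLOB_SUFFIXES:
        if key.endswith(suffix):
            return key[: -len(suffix)]
    return key


def find_stale_refs(znode_children, stale_ids):
    """Return the child names that refer to one of the stale topology ids."""
    stale = set(stale_ids)
    return sorted(
        child for child in znode_children
        if topology_id_from_blob_key(child) in stale
    )


if __name__ == "__main__":
    # Stale ids taken from the warnings quoted in this thread.
    stale_ids = ["topology-A-10-1632769783", "topology-B-11-1632770137"]
    # Replace with the names listed under storms, assignments and blobstore.
    children = ["topology-B-11-1632770137-stormcode.ser"]
    print(find_stale_refs(children, stale_ids))
```

An empty result for all three znodes would match what I saw, which is why
I suspect the leftover state lives somewhere other than ZK.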

On Mon, Oct 25, 2021 at 11:31 AM Rui Abreu <[email protected]> wrote:

> Hi Andrew,
>
> Not sure how much this helps, but in version 1.x, state was on the
> following znodes:
>
> /$storm-znode/storms
> /$storm-znode/assignments
> /$storm-znode/blobstore
>
>
> Deleting all references (with rmr or deleteall, depending on the ZooKeeper
> version), followed by a rolling restart of Nimbus, should suffice.
>
> On Mon, Oct 25, 2021, 18:49 Andrew Neilson <[email protected]> wrote:
>
>> Hi,
>>
>> We're running a v2.2.0 cluster with two nimbus hosts and recently noticed
>> storm-nimbus on the leader is effectively in a restart loop.
>>
>> When I look at nimbus.log on that host, it is full of log entries related
>> to old versions of topologies we're running. I am seeing two types of
>> exceptions:
>>
>> 1. get blob meta exception:
>>
>> For *topology-A*, for example, we're currently on topology-A-25:
>>
>> 2021-10-25 13:39:51.064 o.a.s.d.n.Nimbus pool-29-thread-62 [WARN]
>> Exception when getting heartbeat timeout.
>> 2021-10-25 13:39:51.075 o.a.s.d.n.Nimbus pool-29-thread-16 [WARN] get
>> blob meta exception.
>> org.apache.storm.utils.WrappedKeyNotFoundException:
>> topology-A-5-1633368551-stormjar.jar
>>
>> For *topology-B*, we're on topology-B-24:
>>
>> 2021-10-25 13:38:51.106 o.a.s.d.n.Nimbus pool-29-thread-21 [WARN] get
>> blob meta exception.
>> org.apache.storm.utils.WrappedKeyNotFoundException:
>> topology-B-11-1632770137-stormcode.ser
>>
>> 2. Send HB exception:
>>
>> 2021-10-25 13:39:51.745 o.a.s.d.n.Nimbus pool-29-thread-36 [WARN]
>> Exception when getting heartbeat timeout.
>> 2021-10-25 13:39:51.760 o.a.s.d.n.Nimbus pool-29-thread-37 [WARN] Send HB
>> exception. (topology id='topology-A-10-1632769783')
>> org.apache.storm.utils.WrappedNotAliveException: topology-A-10-1632769783
>>
>> This seems isolated to two versions of "topology-A" and one version of
>> "topology-B".
>>
>> I'm not seeing references to these topology versions in Zookeeper. Does
>> anyone know how to safely clear out this old state? If not, any suggestions
>> on how to debug this? Further, is this related to any known bug?
>>
>> Thanks,
>> Andrew
>>
>
