Hi Mitchell,

Does the UI show this topology as fully scheduled? Are you using the 
ResourceAwareScheduler? If so, it’s possible that the scheduler could not 
find enough resources to schedule this topology. 
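
For reference, these are roughly the settings I would double-check if you are 
on the ResourceAwareScheduler. The numbers below are only placeholders; use 
whatever matches your machines and topologies:

    # storm.yaml on each supervisor: what the node advertises to the scheduler
    supervisor.memory.capacity.mb: 4096.0
    supervisor.cpu.capacity: 400.0

    # topology config: what each component requests (summed per worker)
    topology.component.resources.onheap.memory.mb: 128.0
    topology.component.cpu.pcore.percent: 10.0
    topology.worker.max.heap.size.mb: 768.0

If the total requested by all 16 topologies exceeds what the supervisors 
advertise, the scheduler can leave the last topologies unassigned even though 
the submission itself succeeds.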

Also, after starting the topology, you could log in to ZooKeeper and check 
whether there is an assignment belonging to this topology. You can also read 
its content to get a better idea, but that takes a bit of code since the data 
is serialized. 
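
If it helps, here is a rough sketch (untested) of how you could check for the 
assignment znode from plain Java, using the topology id from your log. It 
assumes the default storm.zookeeper.root of /storm; adjust the connect string 
and path for your cluster:

    import org.apache.zookeeper.ZooKeeper;
    import org.apache.zookeeper.data.Stat;

    public class CheckAssignment {
        public static void main(String[] args) throws Exception {
            // Connect string is a placeholder -- point it at your ensemble.
            ZooKeeper zk = new ZooKeeper("zkhost:2181", 30000, event -> {});
            // Assignments live under <storm root>/assignments/<topology id>.
            String path = "/storm/assignments/WingmanTopology4246-251-1557583643";
            Stat stat = zk.exists(path, false);
            if (stat == null) {
                System.out.println("No assignment znode: nimbus never assigned this topology");
            } else {
                System.out.println("Assignment exists, " + stat.getDataLength()
                        + " bytes of serialized data");
            }
            zk.close();
        }
    }

You can also just run "ls /storm/assignments" from bin/zkCli.sh to see whether 
the topology id appears there at all; only reading the node's content requires 
deserializing it in code.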

It’s really hard to tell the root cause without more information. Could you 
provide the related nimbus.log and supervisor.log files so I can take a look?

Best,
Ethan

> On May 17, 2019, at 11:38 AM, Mitchell Rathbun (BLOOMBERG/ 731 LEX) 
> <mrathb...@bloomberg.net> wrote:
> 
> We have a topology that was never started, even though nimbus received the 
> start command. Supervisor never received a command to start this topology, so 
> the issue wasn't in our topology code. In the logs, I see:
> 
> 2019-05-11 10:07:28,087 INFO  nimbus [pool-14-thread-16] Activating 
> WingmanTopology4246: WingmanTopology4246-251-1557583643
> 
> There were a bunch of topologies started around the same time, and most of 
> them had the following message occur next:
> 
> [timer] Setting new assignment for topology id <Topology 
> Name>:................
> 
> However, we did not see this logged for the topology that wasn't started. 
> When the cluster was stopped, we saw:
> 
> 2019-05-11 10:36:04,447 INFO  nimbus [pool-14-thread-4] Delaying event 
> :remove for 5 secs for WingmanTopology4246-251-1557583643
> 2019-05-11 10:36:04,457 INFO  nimbus [pool-14-thread-4] Adding topo to 
> history log: WingmanTopology4246-251-1557583643
> 
> 
> What could have caused this? There were 16 topologies submitted to be run in 
> total, and our storm.yaml file allocates more than enough slots under 
> supervisor.slots.ports.
