Hi, I'm seeing weird behavior in my topologies and was hoping for some advice on how to troubleshoot the issue.
This behavior occurs throughout my topology, but it is easiest to explain using a single bolt. This bolt has 20 executors. When I submit the topology, the executors are split evenly between 2 hosts. The executors on one host seem stable, but the Uptime of the executors on the other host never grows beyond roughly 10 minutes; they are constantly being re-prepared. I don't know what this is a symptom of or how to diagnose it.

All of those Executors have the same Uptime, so I assume this indicates that their Worker is dying. Any advice on how to troubleshoot this? Is there a way to tap into the Worker lifecycle so I can confirm that it is dying every few minutes? Is there an explanation for why a Worker would die so consistently, and any suggestions on how to approach this?

Also, any input on how "bad" this is? My topology still processes tuples, but I assume this constant re-creation of Executors has a significant performance impact.

Thanks,
Abe
