Sometimes when we run a three-node topology and a worker fails and comes back up, the entire topology becomes sluggish and messages are constantly marked as failed. After changing the logging, we determined that the topology is in fact fully processing the messages, but they are never passed back to the acker to be acked. I've searched for solutions (other than "don't let the worker fail") but haven't found anything yet.
- Trouble with Acking After a Worker Fails (Poling, Raymond)
- Re: Trouble with Acking After a Worker Fails (Phil Burress)
