I'm using TcpDiscovery with the nodes running at Amazon. So far the networking inside the Amazon cloud has been rock solid. That said, over the weekend I had two compute tasks fail because of an empty projection. It looks like the compute nodes all disconnected.

It's unclear why exactly the nodes disconnected.

I'm thinking of increasing the number of missed heartbeats allowed, but I'd also like to increase the amount of logging so that I have more information to debug similar errors in the future.

I know that Ignite logging can be verbose - does anyone have suggestions for how I'd want to configure the logging in order to capture clues to diagnose the random disconnects but also not have a mountain of useless logs?

Thanks!
Ryan
