daniverltd commented on issue #11271: URL: https://github.com/apache/ignite/issues/11271#issuecomment-2020240927
A node can be made to run out of file descriptors on Linux, either accidentally or maliciously, in any of the following ways: by creating one or more caches whose number of partitions exceeds the number of remaining file descriptors (native persistence has to be on); by repeatedly opening socket connections to the server (no SSL certificate required) without ever closing them; or by using a commercial piece of software such as 3DNS that periodically polls the Ignite discovery port to check that the port is alive - this seems to cause Ignite to leak open files.

Any of the above will ultimately cause a message send to another node to fail because the sender can't open a new socket connection, yet the sender then waits indefinitely for a reply that will never arrive because the original message was never sent. This then triggers the handshake timeout, then the blocked system thread detection, and then the restart handler, which fails. Killing the node with a SIGTERM causes it to log "shutdown hook invoked" and then nothing else, and it never exits; only a SIGKILL will break the deadlock.
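For anyone trying to reproduce the second scenario (connections opened and never closed), a minimal sketch of the client side might look like the following. The class name `FdExhaustionSketch` and the method `openAndHold` are hypothetical and not part of Ignite. For safety, `main` targets a throwaway local listener rather than a real node; an actual reproduction would presumably point at the node's address and open more connections than its remaining file descriptor limit allows.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

public class FdExhaustionSketch {
    /**
     * Hypothetical helper: opens up to {@code count} plain TCP connections
     * and returns them still open. Each held socket keeps one file
     * descriptor in use on the client, and one on the server once accepted.
     */
    static List<Socket> openAndHold(String host, int port, int count) {
        List<Socket> held = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            Socket s = new Socket();
            try {
                // 1 s connect timeout so a full backlog does not hang the loop.
                s.connect(new InetSocketAddress(host, port), 1000);
                held.add(s); // deliberately not closed
            } catch (IOException e) {
                // Connection refused, timed out, or out of descriptors: stop here.
                try { s.close(); } catch (IOException ignored) { }
                break;
            }
        }
        return held;
    }

    public static void main(String[] args) throws Exception {
        // Assumption: a local listener stands in for the server under test.
        try (ServerSocket server =
                 new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            List<Socket> held = openAndHold("127.0.0.1", server.getLocalPort(), 10);
            System.out.println("holding " + held.size() + " open connections");
            for (Socket s : held) {
                s.close(); // clean up; a real reproduction would skip this
            }
        }
    }
}
```

While such a loop runs against a real node, the node's descriptor usage can be watched on Linux by counting the entries under `/proc/<pid>/fd`.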
