>>I'm almost certain the problem is that the server node cannot open a
connection to the client node (and while it tries, it will reject connection
attempts from the client node)

The default idle connection timeout of the TCP communication SPI is 6 minutes.
So I assume that after this timeout the connection is closed, and it is
probably re-established later on a request from the client.
I can imagine this issue happening while the client and server are both
trying to re-establish the connection, so your explanation makes sense.

However, my concern still remains. The server has plenty of timeouts in its
communication SPI. Why doesn't the server throw that client out of the
cluster and let the client fail gracefully? This incessant pinging of the
server by the client is a problem in production environments.

Currently, the only way out for me seems to be to set
 >>>    communicationSpi.setIdleConnectionTimeout(Long.MAX_VALUE);
Testing is currently in progress, and it seems to have held up for 22 hours so far.
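
For context, here is a minimal sketch of how that setting can be wired into the node configuration. It assumes a programmatic (Java) setup; the class name IdleTimeoutConfig and the method buildConfig are made up for illustration, and the rest of my configuration is left out.

```java
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi;

// Hypothetical helper class, for illustration only.
public class IdleTimeoutConfig {
    /** Builds a configuration whose communication SPI uses a very large idle connection timeout. */
    static IgniteConfiguration buildConfig() {
        TcpCommunicationSpi communicationSpi = new TcpCommunicationSpi();

        // The timeout is given in milliseconds. Long.MAX_VALUE is meant to keep
        // idle connections from ever being closed by the idle check.
        communicationSpi.setIdleConnectionTimeout(Long.MAX_VALUE);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setCommunicationSpi(communicationSpi);
        return cfg;
    }
}
```

The resulting configuration would then be passed to Ignition.start(cfg) on the node where the workaround is being tested.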

Do you see any issues with setting idleConnectionTimeout that high?


Sent from: http://apache-ignite-users.70518.x6.nabble.com/
