Hello Pavel,

I was able to reproduce this issue, and I've attached the DEBUG logs and thread dumps for the three nodes as you suggested.

Archive.zip <http://apache-ignite-users.70518.x6.nabble.com/file/t1346/Archive.zip>

This time, there is no "no route to host" exception between the server and client nodes. Node2 and node3 log "Unable to await partitions release latch within timeout: ClientLatch" shortly after the cluster starts; node1 doesn't show any explicit errors. The cluster begins to freeze about 20 minutes after data ingestion starts.

The attached pictures show the running/parked time slices of the data streaming threads on each of the three nodes. You can see that node3 freezes first, then node2 freezes. As a result, the client can only write to node1, and this triggered a lot of rebalancing.

node1.png <http://apache-ignite-users.70518.x6.nabble.com/file/t1346/node1.png>
node2.png <http://apache-ignite-users.70518.x6.nabble.com/file/t1346/node2.png>
node3.png <http://apache-ignite-users.70518.x6.nabble.com/file/t1346/node3.png>

At the time of writing this post, the data ingestion, which usually takes 5 minutes, has still not finished after 1.1 hours.

-- Sent from: http://apache-ignite-users.70518.x6.nabble.com/
