Ibrahim, could you please also share the cache configuration that is used for dynamic creation?
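For reference, a minimal sketch of what I mean by dynamic creation, assuming the Java thick API; the cache name and tuning values below are illustrative, not taken from your setup:

    import org.apache.ignite.Ignite;
    import org.apache.ignite.IgniteCache;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.cache.CacheAtomicityMode;
    import org.apache.ignite.cache.CacheMode;
    import org.apache.ignite.cache.affinity.rendezvous.RendezvousAffinityFunction;
    import org.apache.ignite.configuration.CacheConfiguration;

    public class DynamicCacheCreation {
        public static void main(String[] args) {
            // Assumes a node started with your usual discovery settings.
            Ignite ignite = Ignition.start();

            // These are the settings we'd like to see for the new caches:
            // mode, atomicity, backups, affinity function, partition count.
            CacheConfiguration<String, Object> cfg =
                new CacheConfiguration<>("some_dynamic_cache"); // illustrative name

            cfg.setCacheMode(CacheMode.PARTITIONED);
            cfg.setAtomicityMode(CacheAtomicityMode.ATOMIC);
            cfg.setBackups(1);
            cfg.setAffinity(new RendezvousAffinityFunction(false, 1024));

            // Dynamic creation: this is what triggers the partition map
            // exchange that logged the "release latch" warning.
            IgniteCache<String, Object> cache = ignite.getOrCreateCache(cfg);
        }
    }

In particular, whether the caches share a cache group and how many partitions the affinity function uses would be useful to know. There is also a partition-size check sketched below the quoted thread.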
Thu, Oct 10, 2019 at 19:09, Pavel Kovalenko <jokse...@gmail.com>:

> Hi Ibrahim,
>
> I see that one node didn't send an acknowledgment during cache creation:
>
> [2019-09-27T15:00:17,727][WARN ][exchange-worker-#219][GridDhtPartitionsExchangeFuture]
> Unable to await partitions release latch within timeout: ServerLatch
> [permits=1, pendingAcks=[*3561ac09-6752-4e2e-8279-d975c268d045*],
> super=CompletableLatch [id=exchange, topVer=AffinityTopologyVersion
> [topVer=92, minorTopVer=2]]]
>
> Do you have any logs from the node with id
> "3561ac09-6752-4e2e-8279-d975c268d045"? You can find this node by grepping
> for "locNodeId=3561ac09-6752-4e2e-8279-d975c268d045", as in this line:
>
> [2019-09-27T15:24:03,532][INFO ][main][TcpDiscoverySpi] Successfully bound
> to TCP port [port=47500, localHost=0.0.0.0/0.0.0.0,
> *locNodeId=70b49e00-5b9f-4459-9055-a05ce358be10*]
>
> Wed, Oct 9, 2019 at 17:34, ihalilaltun <ibrahim.al...@segmentify.com>:
>
>> Hi there Igniters,
>>
>> We had a very strange cluster behaviour while creating new caches on the
>> fly. Just after the caches were created, we started getting the following
>> warnings from all cluster nodes, including the coordinator node:
>>
>> [2019-09-27T15:00:17,727][WARN ][exchange-worker-#219][GridDhtPartitionsExchangeFuture]
>> Unable to await partitions release latch within timeout: ServerLatch
>> [permits=1, pendingAcks=[3561ac09-6752-4e2e-8279-d975c268d045],
>> super=CompletableLatch [id=exchange, topVer=AffinityTopologyVersion
>> [topVer=92, minorTopVer=2]]]
>>
>> After a while, all client nodes seemed to disconnect from the cluster,
>> with no logs on the clients' side.
>>
>> The coordinator node has many logs like:
>>
>> [2019-09-27T15:00:03,124][WARN ][sys-#337823][GridDhtPartitionsExchangeFuture]
>> Partition states validation has failed for group:
>> acc_1306acd07be78000_userPriceDrop. Partitions cache sizes are
>> inconsistent for
>> Part 129: [9497f1c4-13bd-4f90-bbf7-be7371cea22f=757 1486cd47-7d40-400c-8e36-b66947865602=2427 ]
>> Part 138: [1486cd47-7d40-400c-8e36-b66947865602=2463 f9cf594b-24f2-4a91-8d84-298c97eb0f98=736 ]
>> Part 156: [b7782803-10da-45d8-b042-b5b4a880eb07=672 9f0c2155-50a4-4147-b444-5cc002cf6f5d=2414 ]
>> Part 284: [b7782803-10da-45d8-b042-b5b4a880eb07=690 1486cd47-7d40-400c-8e36-b66947865602=1539 ]
>> Part 308: [1486cd47-7d40-400c-8e36-b66947865602=2401 7750e2f1-7102-4da2-9a9d-ea202f73905a=706 ]
>> Part 362: [1486cd47-7d40-400c-8e36-b66947865602=2387 7750e2f1-7102-4da2-9a9d-ea202f73905a=697 ]
>> Part 434: [53c253e1-ccbe-4af1-a3d6-178523023c8b=681 1486cd47-7d40-400c-8e36-b66947865602=1541 ]
>> Part 499: [1486cd47-7d40-400c-8e36-b66947865602=2505 7750e2f1-7102-4da2-9a9d-ea202f73905a=699 ]
>> Part 622: [1486cd47-7d40-400c-8e36-b66947865602=2436 e97a0f3f-3175-49f7-a476-54eddd59d493=662 ]
>> Part 662: [b7782803-10da-45d8-b042-b5b4a880eb07=686 1486cd47-7d40-400c-8e36-b66947865602=2445 ]
>> Part 699: [1486cd47-7d40-400c-8e36-b66947865602=2427 f9cf594b-24f2-4a91-8d84-298c97eb0f98=646 ]
>> Part 827: [62a05754-3f3a-4dc8-b0fa-53c0a0a0da63=703 1486cd47-7d40-400c-8e36-b66947865602=1549 ]
>> Part 923: [1486cd47-7d40-400c-8e36-b66947865602=2434 a9e9eaba-d227-4687-8c6c-7ed522e6c342=706 ]
>> Part 967: [62a05754-3f3a-4dc8-b0fa-53c0a0a0da63=673 1486cd47-7d40-400c-8e36-b66947865602=1595 ]
>> Part 976: [33301384-3293-417f-b94a-ed36ebc82583=666 1486cd47-7d40-400c-8e36-b66947865602=2384 ]
>>
>> The coordinator's log and one of the cluster nodes' logs are attached.
>>
>> coordinator_log.gz
>> <http://apache-ignite-users.70518.x6.nabble.com/file/t2515/coordinator_log.gz>
>> cluster_node_log.gz
>> <http://apache-ignite-users.70518.x6.nabble.com/file/t2515/cluster_node_log.gz>
>>
>> Any help/comment is appreciated.
>>
>> Thanks.
>>
>> -----
>> İbrahim Halil Altun
>> Senior Software Engineer @ Segmentify
>> --
>> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
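Also, regarding the "Partitions cache sizes are inconsistent" warnings above: here is a rough sketch of how you could compare the copies of one of the reported partitions across nodes, assuming the Java thick API and that the cache name matches the group name from the warning; everything else is illustrative:

    import java.util.Collection;

    import org.apache.ignite.Ignite;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.cache.CachePeekMode;
    import org.apache.ignite.lang.IgniteCallable;

    public class PartitionSizeCheck {
        public static void main(String[] args) {
            Ignite ignite = Ignition.start();

            // Cache/group name and partition taken from the validation warning.
            String cacheName = "acc_1306acd07be78000_userPriceDrop";
            int part = 129;

            // Ask every node that holds the cache how many entries its local
            // copy of that partition has (primary or backup).
            Collection<String> sizes =
                ignite.compute(ignite.cluster().forCacheNodes(cacheName))
                    .broadcast((IgniteCallable<String>) () -> {
                        Ignite local = Ignition.localIgnite();
                        long size = local.cache(cacheName)
                            .localSizeLong(part, CachePeekMode.ALL);
                        return local.cluster().localNode().id()
                            + " -> part " + part + ": " + size;
                    });

            sizes.forEach(System.out::println);
        }
    }

If copies of the same partition report very different sizes on live nodes, that matches what the exchange-time validator printed.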