[ https://issues.apache.org/jira/browse/IGNITE-10912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexei Scherbakov updated IGNITE-10912:
---------------------------------------
    Description:
With a large topology and a large number of caches/groups, the node join message can exceed 30 MB because of the amount of discovery data it carries.
This adds overhead to ring traversal and slows down the "node join" PME.

Possible solutions:
# Introduce a DiscoveryDataMessage for transferring discovery-related data that does not increment the topology version. The actual join starts only after all nodes have the corresponding discovery data. Discovery data should probably be stored off-heap (or even on disk) to avoid heap usage bursts when multiple nodes join.
# Add compression to discovery data.

The same problem applies to CacheAffinityChangeMessage (PME after late affinity) and to the dynamic cache start message (when starting many caches).

  was:
With a large topology and a large number of caches/groups, the node join message can exceed 30 MB because of the amount of discovery data it carries.
This adds overhead to ring traversal and slows down the "node join" PME.

Possible solutions:
# Introduce a pre-join message with discovery data that does not increment the topology version. The actual join starts only after all nodes have the corresponding discovery data. Discovery data should probably be stored off-heap (or even on disk) to avoid heap usage bursts when multiple nodes join.
# Add compression to discovery data.

The same problem applies to CacheAffinityChangeMessage (PME after late affinity) and to the dynamic cache start message (when starting many caches).
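
The first proposed solution (distribute discovery data first, join afterwards) could be sketched roughly as below. This is only an illustration of the ordering constraint, not Ignite code: the class `PreJoinTracker` and its methods are hypothetical names, and the sketch assumes each ring node acknowledges the discovery data individually.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

/**
 * Hypothetical sketch (not part of Ignite): tracks which ring nodes have
 * received a joining node's discovery data, and allows the topology version
 * to be incremented only after every node has acknowledged it.
 */
final class PreJoinTracker {
    /** Nodes that have not yet acknowledged the discovery data. */
    private final Set<UUID> pendingAcks;

    /** Topology version; unchanged while discovery data is being distributed. */
    private long topologyVersion;

    PreJoinTracker(Set<UUID> ringNodes, long currentTopologyVersion) {
        this.pendingAcks = new HashSet<>(ringNodes);
        this.topologyVersion = currentTopologyVersion;
    }

    /**
     * Records that the given node has stored the discovery data.
     * Does not touch the topology version.
     *
     * @return true once all ring nodes have acknowledged.
     */
    boolean onDiscoveryDataAck(UUID nodeId) {
        pendingAcks.remove(nodeId);
        return pendingAcks.isEmpty();
    }

    /**
     * Performs the actual join. This is the only step that increments
     * the topology version.
     */
    long completeJoin() {
        if (!pendingAcks.isEmpty())
            throw new IllegalStateException(
                "Discovery data not yet delivered to " + pendingAcks.size() + " node(s)");
        return ++topologyVersion;
    }

    long topologyVersion() {
        return topologyVersion;
    }
}
```

In this sketch the large payload travels while the topology version stays fixed, so the join message that triggers PME can stay small.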
> Huge discovery messages slow down node joining/dynamic cache start and corresponding PME
> ----------------------------------------------------------------------------------------
>
>                 Key: IGNITE-10912
>                 URL: https://issues.apache.org/jira/browse/IGNITE-10912
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Alexei Scherbakov
>            Priority: Major
>             Fix For: 2.8
>
>
> With a large topology and a large number of caches/groups, the node join message can
> exceed 30 MB because of the amount of discovery data it carries.
> This adds overhead to ring traversal and slows down the "node join" PME.
> Possible solutions:
> # Introduce a DiscoveryDataMessage for transferring discovery-related data
> that does not increment the topology version. The actual join starts only after
> all nodes have the corresponding discovery data. Discovery data should probably
> be stored off-heap (or even on disk) to avoid heap usage bursts when
> multiple nodes join.
> # Add compression to discovery data.
> The same problem applies to CacheAffinityChangeMessage (PME after late affinity) and
> to the dynamic cache start message (when starting many caches).
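
The second proposal (compressing discovery data) might look something like the following minimal sketch. It assumes the discovery data is already serialized to a byte array and uses the JDK's Deflate codec from `java.util.zip`; the issue does not name a codec, and the class and method names here are hypothetical rather than Ignite's.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical helper (not part of Ignite): Deflate round trip for a serialized payload. */
final class DiscoveryDataCompression {
    private DiscoveryDataCompression() {
        // Static helpers only.
    }

    /** Compresses the serialized discovery data, favoring speed over ratio. */
    static byte[] compress(byte[] raw) {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        try {
            deflater.setInput(raw);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream(Math.max(32, raw.length / 4));
            byte[] buf = new byte[8192];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // Release native zlib memory promptly.
        }
    }

    /** Restores the original bytes; rejects truncated or corrupt input. */
    static byte[] decompress(byte[] packed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(packed);
            ByteArrayOutputStream out = new ByteArrayOutputStream(Math.max(32, packed.length * 4));
            byte[] buf = new byte[8192];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                // No progress and no more input means the stream was cut short.
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary()))
                    throw new IllegalArgumentException("Truncated or unsupported compressed data");
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("Corrupt compressed data", e);
        } finally {
            inflater.end();
        }
    }
}
```

A sender would call `compress` on the serialized payload before placing it in the message, and each receiver would call `decompress` before unmarshalling. How much this helps depends on how repetitive the real discovery data is, which the issue does not quantify.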