[
https://issues.apache.org/jira/browse/IGNITE-10912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexei Scherbakov updated IGNITE-10912:
---------------------------------------
Description:
With a large topology and a large number of caches/groups, the node join message
can reach a size > 30M due to the large amount of transferred discovery data.
This adds overhead on ring traversal and slows down "node join" PME.
Possible solution:
# Introduce DiscoveryDataMessage for transferring discovery-related data, which
doesn't increment the topology version. Once all nodes have the corresponding
discovery data, start the actual join. Discovery data should probably be stored
off-heap (or even on disk) to avoid heap usage bursts when multiple nodes
join.
# Add compression to discovery data.
The same problem applies to CacheAffinityChangeMessage (PME after late affinity)
and the dynamic cache start message (when starting many caches).
was:
With a large topology and a large number of caches/groups, the node join message
can reach a size > 30M due to the large amount of transferred discovery data.
This adds overhead on ring traversal and slows down "node join" PME.
Possible solution:
# Introduce a pre-join message with discovery data, which doesn't increment the
topology version. Once all nodes have the corresponding discovery data, start
the actual join. Discovery data should probably be stored off-heap (or even on
disk) to avoid heap usage bursts when multiple nodes join.
# Add compression to discovery data.
The same problem applies to CacheAffinityChangeMessage (PME after late affinity)
and the dynamic cache start message (when starting many caches).
> Huge discovery messages slow down node joining/dynamic cache start and
> corresponding PME
> ----------------------------------------------------------------------------------------
>
> Key: IGNITE-10912
> URL: https://issues.apache.org/jira/browse/IGNITE-10912
> Project: Ignite
> Issue Type: Improvement
> Reporter: Alexei Scherbakov
> Priority: Major
> Fix For: 2.8
>
>
> With a large topology and a large number of caches/groups, the node join
> message can reach a size > 30M due to the large amount of transferred
> discovery data.
> This adds overhead on ring traversal and slows down "node join" PME.
> Possible solution:
> # Introduce DiscoveryDataMessage for transferring discovery-related data,
> which doesn't increment the topology version. Once all nodes have the
> corresponding discovery data, start the actual join. Discovery data should
> probably be stored off-heap (or even on disk) to avoid heap usage bursts when
> multiple nodes join.
> # Add compression to discovery data.
> The same problem applies to CacheAffinityChangeMessage (PME after late
> affinity) and the dynamic cache start message (when starting many caches).
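
The second proposal (compressing discovery data) could look roughly like the sketch below. It is only an illustration using the JDK's java.util.zip classes: the class name DiscoveryDataCodec and both method names are hypothetical and are not part of Ignite, and the assumption is that discovery data is already available as a marshalled byte array before it is attached to the message.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical helper, not an Ignite class: deflates marshalled discovery data. */
final class DiscoveryDataCodec {
    private DiscoveryDataCodec() {
        // Static helpers only.
    }

    /** Compresses already-marshalled discovery data before it is put on the ring. */
    static byte[] compress(byte[] data) {
        // BEST_SPEED trades compression ratio for lower CPU cost on the sending node.
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        try {
            deflater.setInput(data);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream(Math.max(32, data.length / 4));
            byte[] buf = new byte[8192];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        }
        finally {
            deflater.end(); // Release native zlib memory.
        }
    }

    /** Restores the original bytes on the receiving node. */
    static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream(Math.max(32, compressed.length * 4));
            byte[] buf = new byte[8192];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                // No progress and nothing left to read means the input was cut short.
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary()))
                    throw new IllegalArgumentException("Truncated or invalid compressed data");
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        }
        catch (DataFormatException e) {
            throw new IllegalArgumentException("Corrupted compressed data", e);
        }
        finally {
            inflater.end();
        }
    }
}
```

Compression reduces the bytes sent around the ring but adds CPU work on every node that has to unpack the data, so the compression level (BEST_SPEED here) is a trade-off that would need measuring on a real topology.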