[
https://issues.apache.org/jira/browse/IGNITE-8657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506233#comment-16506233
]
Sergey Chugunov edited comment on IGNITE-8657 at 6/8/18 4:31 PM:
-----------------------------------------------------------------
[~agoncharuk],
Good catch, thanks for spotting this!
I reviewed the code and found that the assertion was caused by a rather unusual
property, *forceServerMode*.
The problem is that ClusterNode#isClient returns false for such a client,
whereas the SinglePartitionMessage sent from this client returns true.
I'm not sure whether we need to force such clients to reconnect, so I changed
the implementation in such a way that we don't force them to reconnect. After that
the test started passing on TC, so I think this logic works.
What do you think?
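
For illustration, the mismatch described above can be sketched in plain Java. This is not the actual patch: the class, method, and parameter names are hypothetical, and the rule "force a reconnect only when the node itself reports being a client" is an assumption drawn from the comment, not taken from the Ignite code base.

```java
// Hypothetical sketch; none of these names come from the Ignite code base.
final class ReconnectDecisionSketch {
    /**
     * @param nodeIsClient     what a ClusterNode#isClient-style check reports for the sender
     * @param messageIsClient  what the message sent by that node reports
     * @param exchangeInfoLost whether the server no longer holds the sender's exchange information
     * @return true if the server should ask the sender to reconnect
     */
    static boolean shouldForceReconnect(boolean nodeIsClient, boolean messageIsClient, boolean exchangeInfoLost) {
        // A forceServerMode client is the case nodeIsClient == false, messageIsClient == true.
        // Assumption: such a node is not forced to reconnect, so the node-level flag must be true.
        return exchangeInfoLost && messageIsClient && nodeIsClient;
    }

    public static void main(String[] args) {
        // forceServerMode client: the two flags disagree.
        System.out.println(shouldForceReconnect(false, true, true));
        // Regular client whose exchange information was lost.
        System.out.println(shouldForceReconnect(true, true, true));
    }
}
```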
> Simultaneous start of bunch of client nodes may lead to some clients hangs
> --------------------------------------------------------------------------
>
> Key: IGNITE-8657
> URL: https://issues.apache.org/jira/browse/IGNITE-8657
> Project: Ignite
> Issue Type: Bug
> Affects Versions: 2.5
> Reporter: Sergey Chugunov
> Assignee: Sergey Chugunov
> Priority: Major
> Fix For: 2.6
>
>
> h3. Description
> PartitionExchangeManager uses a system property,
> *IGNITE_EXCHANGE_HISTORY_SIZE*, to limit the maximum number of exchange objects and
> optimize memory consumption.
> The default value of the property is 1000, but in scenarios with many caches and
> partitions it is reasonable to set the exchange history size to a smaller value,
> around a few dozen.
> If a user then starts more client nodes at once than the history size, some
> clients may hang because their exchange information was preempted and is no
> longer available.
> h3. Workarounds
> Two workarounds are possible:
> * Do not start more clients at once than the history size.
> * Restart the hanging client node.
> h3. Solution
> Forcing a client node to reconnect when the server detects that it has lost the
> client's exchange information prevents client nodes from hanging.
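
The scenario from the description can be reproduced with a toy model. The sketch below does not use any Ignite classes. It assumes that the exchange history behaves like a bounded, insertion-ordered map that drops its oldest entry when full, and that each starting client adds one entry. All class and method names are hypothetical, and the property is read as a plain JVM system property with the default of 1000 mentioned above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only; the real PartitionExchangeManager is considerably more involved.
final class ExchangeHistorySketch {
    /** Bounded, insertion-ordered history: the oldest entry is dropped once the limit is exceeded. */
    static final class BoundedHistory extends LinkedHashMap<Integer, String> {
        private final int maxSize;

        BoundedHistory(int maxSize) {
            this.maxSize = maxSize;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<Integer, String> eldest) {
            return size() > maxSize;
        }
    }

    /** Returns the ids of clients whose exchange entry was dropped and which would need to reconnect. */
    static List<Integer> clientsToReconnect(int historySize, int clientCount) {
        BoundedHistory history = new BoundedHistory(historySize);

        // Every client start adds one exchange entry (an assumption of this model).
        for (int id = 0; id < clientCount; id++)
            history.put(id, "exchange-for-client-" + id);

        List<Integer> lost = new ArrayList<>();
        for (int id = 0; id < clientCount; id++) {
            if (!history.containsKey(id))
                lost.add(id);
        }
        return lost;
    }

    public static void main(String[] args) {
        // Run with -DIGNITE_EXCHANGE_HISTORY_SIZE=20 to model a reduced history size.
        int historySize = Integer.getInteger("IGNITE_EXCHANGE_HISTORY_SIZE", 1000);
        System.out.println("Clients to reconnect: " + clientsToReconnect(historySize, 30));
    }
}
```

In this model, the clients that started earliest are the ones whose entries are dropped. Without the fix they would hang; with it, the server would ask them to reconnect.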