[ 
https://issues.apache.org/jira/browse/IGNITE-7648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451986#comment-16451986
 ] 

Alexey Goncharuk commented on IGNITE-7648:
------------------------------------------

[~ascherbakov], a few minor comments:
1) Since you've added exponential backoff before reconnect, please add the 
backoff timeout to the logging output.
2) Please cap the maximum delay at some reasonable value. As far as I can 
understand, the sleep may become too long if the number of attempts is large.
3) I see a bunch of IgniteCachePutRetryTransactionalSelfTest failures on TC. 
Please make sure the failures are not related to your test; maybe it makes 
sense to trigger a couple more runs of the failover suite.
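
To illustrate points 1) and 2), here is a minimal sketch of a capped exponential backoff whose delay is included in the log line. It is not the code from the patch: the class name ReconnectBackoff, the methods delayMs and describe, and the sample values (200 ms initial delay, 5000 ms cap) are all hypothetical and chosen only for illustration.

```java
// Hypothetical helper, not part of Ignite: capped exponential backoff.
final class ReconnectBackoff {
    private ReconnectBackoff() {
        // Static helpers only.
    }

    /**
     * Delay before the given reconnect attempt (1-based):
     * initialDelayMs * 2^(attempt - 1), never exceeding maxDelayMs.
     */
    static long delayMs(int attempt, long initialDelayMs, long maxDelayMs) {
        if (attempt < 1 || initialDelayMs <= 0 || maxDelayMs < initialDelayMs)
            throw new IllegalArgumentException("Invalid backoff parameters");

        long delay = initialDelayMs;

        // Stop doubling once the cap is reached, so a large attempt count
        // neither overflows nor produces an excessively long sleep.
        for (int i = 1; i < attempt && delay < maxDelayMs; i++)
            delay = delay > maxDelayMs / 2 ? maxDelayMs : delay * 2;

        return delay;
    }

    /** Log message that reports the backoff timeout together with the attempt number. */
    static String describe(int attempt, long delayMs) {
        return "Reconnect attempt " + attempt + " failed, will retry after backoff [timeout=" + delayMs + "ms]";
    }
}
```

With a 200 ms initial delay and a 5000 ms cap, the delays grow as 200, 400, 800, 1600, 3200 and then stay at 5000 for every later attempt, so the sleep stays bounded no matter how many attempts are made.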

[~ilyak], can you also take a look at the change?

> Revert IGNITE_ENABLE_FORCIBLE_NODE_KILL system property.
> --------------------------------------------------------
>
>                 Key: IGNITE-7648
>                 URL: https://issues.apache.org/jira/browse/IGNITE-7648
>             Project: Ignite
>          Issue Type: Improvement
>    Affects Versions: 2.3
>            Reporter: Alexei Scherbakov
>            Assignee: Alexei Scherbakov
>            Priority: Major
>             Fix For: 2.6
>
>
> IGNITE_ENABLE_FORCIBLE_NODE_KILL system property was introduced in 
> IGNITE-5718 as a way to prevent unnecessary node drops in case of short 
> network problems.
> I suppose it's the wrong decision to fix it in such a way.
> We faced some issues in our production due to the lack of automatic kicking 
> of ill-behaving nodes (for example, hanging due to long GC pauses) until we 
> realised the necessity of changing the default behavior via the property.
> The right solution is to kick nodes only if the failure threshold is reached. 
> Such behavior should always be enabled.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
