[ 
https://issues.apache.org/jira/browse/IGNITE-20955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rahul Mohan updated IGNITE-20955:
---------------------------------
    Language: Java

> Issue in Reconnect Throttling 
> ------------------------------
>
>                 Key: IGNITE-20955
>                 URL: https://issues.apache.org/jira/browse/IGNITE-20955
>             Project: Ignite
>          Issue Type: Bug
>          Components: thin client
>    Affects Versions: 2.14, 2.15
>            Reporter: Rahul Mohan
>            Priority: Major
>
> Encountered an issue where the reconnect throttling exception below is thrown 
> even when there are no connection issues to the Ignite server.
> {code:java}
> org.apache.ignite.client.ClientConnectionException: Reconnect is not allowed due to applied throttling
>         at org.apache.ignite.internal.client.thin.ReliableChannel$ClientChannelHolder.getOrCreateChannel(ReliableChannel.java:959) ~[ignite-core-2.14.0.jar!/:2.14.0]
>         at org.apache.ignite.internal.client.thin.ReliableChannel$ClientChannelHolder.getOrCreateChannel(ReliableChannel.java:942) ~[ignite-core-2.14.0.jar!/:2.14.0]
>         at org.apache.ignite.internal.client.thin.ReliableChannel$ClientChannelHolder.access$200(ReliableChannel.java:891) ~[ignite-core-2.14.0.jar!/:2.14.0]
>         at org.apache.ignite.internal.client.thin.ReliableChannel.applyOnDefaultChannel(ReliableChannel.java:795) ~[ignite-core-2.14.0.jar!/:2.14.0]
>         at org.apache.ignite.internal.client.thin.ReliableChannel.applyOnNodeChannelWithFallback(ReliableChannel.java:848) ~[ignite-core-2.14.0.jar!/:2.14.0]
>         at org.apache.ignite.internal.client.thin.ReliableChannel.affinityService(ReliableChannel.java:315) ~[ignite-core-2.14.0.jar!/:2.14.0]
>         at org.apache.ignite.internal.client.thin.TcpClientCache.cacheSingleKeyOperation(TcpClientCache.java:1084) ~[ignite-core-2.14.0.jar!/:2.14.0]
> {code}
> Disabling reconnect throttling avoids this issue and cache accesses work as 
> expected, which further confirms that there are no underlying connection 
> issues.
> On digging into the thin client code, I noticed that 
> applyReconnectionThrottling() doesn't actually check the connection status of 
> the channel; it only tracks the timestamps of the last n connections/operations 
> (where n is reconnectThrottlingRetries).
> {code:java}
> // org.apache.ignite.internal.client.thin.ReliableChannel.java
> private ClientChannel getOrCreateChannel(boolean ignoreThrottling)
>     throws ClientConnectionException, ClientAuthenticationException, ClientProtocolError {
>     if (ch == null && !close) {
>         synchronized (this) {
>             if (close)
>                 return null;
>
>             if (ch != null)
>                 return ch;
>
>             if (!ignoreThrottling && applyReconnectionThrottling())
>                 throw new ClientConnectionException("Reconnect is not allowed due to applied throttling");
> {code}
>  
> {code:java}
> // org.apache.ignite.internal.client.thin.ReliableChannel.java
> private boolean applyReconnectionThrottling() {
>     if (reconnectRetries == null)
>         return false;
>
>     long ts = System.currentTimeMillis();
>
>     for (int i = 0; i < reconnectRetries.length; i++) {
>         if (ts - reconnectRetries[i] >= chCfg.getReconnectThrottlingPeriod()) {
>             reconnectRetries[i] = ts;
>
>             return false;
>         }
>     }
>
>     return true;
> }
> {code}
> Assuming my understanding is correct, it's possible that, depending on the 
> timing of cache access operations within reconnectThrottlingPeriod, requests 
> are mistakenly throttled even when there are no prior connection/timeout 
> issues.
> Example scenario, using the default throttling period and retries, with mock 
> timestamps for conciseness:
> reconnectThrottlingPeriod = 30_000L;
> reconnectThrottlingRetries = 3;
> ts=1000: reconnectRetries[0] = 1000 -> returns false, no throttling exception
> ts=1005: reconnectRetries[1] = 1005 -> returns false, no throttling exception
> ts=1020: reconnectRetries[2] = 1020 -> returns false, no throttling exception
> ts=1025: returns true, reconnect throttling exception thrown and 
> operation aborted
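
The scenario above can be reproduced outside Ignite with a small standalone model of the quoted loop. This is only a sketch, not Ignite code: the class name ThrottlingSketch, the explicit ts parameter (standing in for System.currentTimeMillis()), and the base timestamp are assumptions made for illustration. The mock timestamps are offset by a realistic epoch value so that the initially zeroed slots count as expired, as they would with a real clock.

```java
/** Standalone model of the quoted throttling check; hypothetical, not Ignite code. */
final class ThrottlingSketch {
    /** Timestamps of the most recent recorded attempts, one slot per allowed retry. */
    private final long[] reconnectRetries;

    /** Throttling window in milliseconds. */
    private final long throttlingPeriod;

    ThrottlingSketch(int retries, long periodMs) {
        this.reconnectRetries = new long[retries];
        this.throttlingPeriod = periodMs;
    }

    /**
     * Same loop as the quoted method, with the clock passed in as a parameter.
     * Returns true when the call should be throttled.
     */
    boolean applyReconnectionThrottling(long ts) {
        for (int i = 0; i < reconnectRetries.length; i++) {
            // A slot is free once its timestamp is at least one period old.
            if (ts - reconnectRetries[i] >= throttlingPeriod) {
                reconnectRetries[i] = ts;
                return false;
            }
        }
        // Every slot was filled within the current period.
        return true;
    }

    public static void main(String[] args) {
        // Defaults quoted in the report: 3 retries per 30_000 ms.
        ThrottlingSketch holder = new ThrottlingSketch(3, 30_000L);

        // Assumed base value so timestamps resemble real epoch milliseconds.
        long base = 1_700_000_000_000L;
        long[] offsets = {1000, 1005, 1020, 1025};

        for (long off : offsets) {
            boolean throttled = holder.applyReconnectionThrottling(base + off);
            System.out.println("ts=" + off + " throttled=" + throttled);
        }
    }
}
```

In this model the first three calls fill the three slots and the fourth call, 25 ms after the first, is throttled. Nothing in the loop records whether an earlier attempt failed, so the result depends only on how many calls reach the check within one period.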



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
