[ https://issues.apache.org/jira/browse/IGNITE-13016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vladimir Steshin updated IGNITE-13016:
--------------------------------------
    Description:
Backward node connection checking looks weird. What we might improve:

1) Address checking could be done in parallel, not sequentially.
{code:java}
for (InetSocketAddress addr : nodeAddrs) {
    // Connection refused may be got if node doesn't listen
    // (or blocked by firewall, but anyway assume it is dead).
    if (!isConnectionRefused(addr)) {
        liveAddr = addr;
        break;
    }
}
{code}
2) Any IOException should be considered a failed connection, not only connection refused:
{code:java}
catch (ConnectException e) {
    return true;
}
catch (IOException e) {
    return false;
}
{code}
3) The timeout for connection checking should not be a hardcoded constant:
{code:java}
sock.connect(addr, 100);
{code}
4) The decision to check the connection should rely on the configured exchange timeout, not on the ping interval:
{code:java}
// We got message from previous in less than double connection check interval.
boolean ok = rcvdTime + U.millisToNanos(connCheckInterval) * 2 >= now;
{code}

  was:
Backward node connection checking looks wierd. What might be improved are:

1) Addresses checking could be done in parrallel, not sequentially.
{code:java}
for (InetSocketAddress addr : nodeAddrs) {
    // Connection refused may be got if node doesn't listen
    // (or blocked by firewall, but anyway assume it is dead).
    if (!isConnectionRefused(addr)) {
        liveAddr = addr;
        break;
    }
}
{code}
2) Any io-exception should be considered as failed connection, not only connection-refused:
{code:java}
catch (ConnectException e) {
    return true;
}
catch (IOException e) {
    return false;
}
{code}
3) Timeout on connection checking should not be constand or hardcoced:
{code:java}
sock.connect(addr, 100);
{code}
4) Decision to check connection should rely on configured exchange timeout, no on the ping interval
{code:java}
// We got message from previous in less than double connection check interval.
boolean ok = rcvdTime + U.millisToNanos(connCheckInterval) * 2 >= now;
{code}


> Fix backward checking of failed node.
> -------------------------------------
>
>                 Key: IGNITE-13016
>                 URL: https://issues.apache.org/jira/browse/IGNITE-13016
>             Project: Ignite
>          Issue Type: Sub-task
>            Reporter: Vladimir Steshin
>            Assignee: Vladimir Steshin
>            Priority: Major
>              Labels: iep-45
>             Fix For: 2.9
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Backward node connection checking looks weird. What we might improve:
> 1) Address checking could be done in parallel, not sequentially.
> {code:java}
> for (InetSocketAddress addr : nodeAddrs) {
>     // Connection refused may be got if node doesn't listen
>     // (or blocked by firewall, but anyway assume it is dead).
>     if (!isConnectionRefused(addr)) {
>         liveAddr = addr;
>         break;
>     }
> }
> {code}
> 2) Any IOException should be considered a failed connection, not only connection refused:
> {code:java}
> catch (ConnectException e) {
>     return true;
> }
> catch (IOException e) {
>     return false;
> }
> {code}
> 3) The timeout for connection checking should not be a hardcoded constant:
> {code:java}
> sock.connect(addr, 100);
> {code}
> 4) The decision to check the connection should rely on the configured exchange timeout, not on the ping interval:
> {code:java}
> // We got message from previous in less than double connection check interval.
> boolean ok = rcvdTime + U.millisToNanos(connCheckInterval) * 2 >= now;
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
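
One possible shape for the four proposed changes, as a rough sketch only and not the actual Ignite patch. The class `BackwardConnectionCheck`, its method names, and the parameters `timeoutMs` and `exchangeTimeoutMs` are all hypothetical; how the real timeout values would be derived from the discovery configuration is left open here.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not part of Ignite) illustrating items 1-4 of the description. */
final class BackwardConnectionCheck {
    /**
     * Items 2 and 3: any IOException (refused, timed out, unreachable, unresolved host)
     * counts as a failed connection, and the timeout is passed in by the caller.
     * The timeout must be positive: zero means "wait forever" for Socket.connect.
     */
    static boolean isAlive(InetSocketAddress addr, int timeoutMs) {
        try (Socket sock = new Socket()) {
            sock.connect(addr, timeoutMs);

            return true;
        }
        catch (IOException e) {
            return false;
        }
    }

    /**
     * Item 1: probes all addresses in parallel.
     *
     * @return The first address that accepted a connection, or null if none did.
     */
    static InetSocketAddress findLiveAddress(Collection<InetSocketAddress> addrs, int timeoutMs)
        throws InterruptedException {
        if (addrs.isEmpty())
            return null;

        ExecutorService exec = Executors.newFixedThreadPool(addrs.size());

        try {
            List<Callable<InetSocketAddress>> probes = new ArrayList<>();

            for (InetSocketAddress addr : addrs) {
                probes.add(() -> {
                    if (isAlive(addr, timeoutMs))
                        return addr;

                    // A failed probe must throw so that invokeAny keeps waiting for the others.
                    throw new IOException("No connection to " + addr);
                });
            }

            // Returns the result of the first probe that completes without an exception.
            return exec.invokeAny(probes);
        }
        catch (ExecutionException e) {
            // Every probe failed.
            return null;
        }
        finally {
            exec.shutdownNow();
        }
    }

    /**
     * Item 4: the previous node is treated as alive if its last message arrived within
     * the exchange timeout (a hypothetical parameter), instead of twice the ping interval.
     */
    static boolean receivedRecently(long rcvdTimeNanos, long nowNanos, long exchangeTimeoutMs) {
        return nowNanos - rcvdTimeNanos <= TimeUnit.MILLISECONDS.toNanos(exchangeTimeoutMs);
    }
}
```

With something like this, the sequential loop from item 1 could become `liveAddr = BackwardConnectionCheck.findLiveAddress(nodeAddrs, timeoutMs)`. The overall wait is then bounded by roughly one probe timeout rather than the sum over all addresses, at the cost of a short-lived thread pool per check.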