[ 
https://issues.apache.org/jira/browse/IGNITE-28459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Petrov updated IGNITE-28459:
------------------------------------
    Description: 
Steps that lead to the problem:

1. The thin client establishes a connection to the server.
2. The server accepts the connection and enqueues the socket channel 
registration in the selector (see 
GridNioServer.AbstractNioClientWorker#offer(), 
GridNioServer.NioOperation#REGISTER)
3. The server node stops the NIO server, which stops processing of the queue.
4. The thin client's socket channel remains in the queue, unclosed and 
unprocessed.
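
The sequence above can be modeled with plain java.nio channels. The sketch 
below is not GridNioServer code: the Worker class and its method names are 
hypothetical stand-ins for a NIO worker that holds a queue of pending 
registrations. It only illustrates that an accepted channel left in such a 
queue stays open unless the queue is drained on stop.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class PendingRegistrationSketch {
    /** Hypothetical stand-in for a NIO worker with a queue of pending registrations. */
    static final class Worker {
        private final Queue<SocketChannel> pending = new ConcurrentLinkedQueue<>();

        /** A real worker loop would check this flag and stop polling the queue. */
        private volatile boolean stopped;

        /** Step 2: enqueue an accepted channel for later registration. */
        void offer(SocketChannel ch) {
            pending.add(ch);
        }

        /** Steps 3-4: stop without touching the queue, so queued channels stay open. */
        void stopWithoutDrain() {
            stopped = true;
        }

        /** One possible remedy (an assumption, not the actual fix): close queued channels. */
        void stopAndDrain() {
            stopped = true;

            for (SocketChannel ch; (ch = pending.poll()) != null; ) {
                try {
                    ch.close();
                }
                catch (IOException ignored) {
                    // Best-effort close during shutdown.
                }
            }
        }
    }

    /** Returns whether the accepted channel is still open after the worker stops. */
    static boolean acceptedChannelOpenAfterStop(boolean drain) throws IOException {
        try (ServerSocketChannel srv = ServerSocketChannel.open()) {
            srv.bind(new InetSocketAddress("127.0.0.1", 0));

            // Step 1: the client connects; the server accepts the connection.
            try (SocketChannel client = SocketChannel.open(srv.getLocalAddress())) {
                SocketChannel accepted = srv.accept();

                Worker worker = new Worker();
                worker.offer(accepted);

                if (drain)
                    worker.stopAndDrain();
                else
                    worker.stopWithoutDrain();

                boolean open = accepted.isOpen();

                accepted.close(); // Clean up so the sketch itself does not leak.

                return open;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("open after stop without drain: " + acceptedChannelOpenAfterStop(false));
        System.out.println("open after stop with drain: " + acceptedChannelOpenAfterStop(true));
    }
}
```

In the non-draining case the peer never observes a close, so a client that 
blocks on a read has nothing to wake it up.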

This can be a major problem when:
1. The server node stops but the Java process keeps running.
2. The thin client's default handshake timeout is zero.
In that case the thin client hangs indefinitely.
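
The effect of the timeout setting can be illustrated with plain java.net 
sockets. This is only a sketch under stated assumptions: it does not use the 
Ignite thin client API, the class and method names are hypothetical, and the 
server side is modeled as a listening socket that never accepts or answers. 
With a positive read timeout the blocked read fails with 
SocketTimeoutException; with a timeout of 0 the same read would block forever, 
so the sketch does not call it that way.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class HandshakeTimeoutSketch {
    /**
     * Connects to a listener that never replies and waits for one byte.
     *
     * @param timeoutMs Read timeout in milliseconds; must be positive here,
     *      because 0 means "wait indefinitely" for java.net.Socket.
     * @return true if the read was interrupted by the timeout.
     */
    static boolean readTimesOut(int timeoutMs) throws IOException {
        // The listener is never accepted from; the TCP connection still
        // completes and then sits unprocessed in the backlog.
        try (ServerSocket srv = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(srv.getInetAddress(), srv.getLocalPort())) {
            client.setSoTimeout(timeoutMs);

            try {
                // Stands in for waiting on a handshake response.
                client.getInputStream().read();

                return false;
            }
            catch (SocketTimeoutException e) {
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("timed out: " + readTimesOut(200));
    }
}
```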

The problem can be reproduced locally by applying the attached patch and 
running the org.apache.ignite.client.ReliabilityTest#testServiceProxyFailover 
test.

After IGNITE-28442, the issue on the thin client side is mitigated by 
automatically setting a connection handshake timeout. However, the root cause 
of the problem remains. 

  was:
Steps that lead to mentioned problem:

1. The thin client establishes a connection to the server.
2. The server accepts the connection and enqueues the socket channel 
registration in the selector (see 
GridNioServer.AbstractNioClientWorker#offer(), 
GridNioServer.NioOperation#CONNECT, GridNioServer.NioOperation#REGISTER)
3. The server stops the NIO server, which results in stopping queue processing
4. The thin client's socket channel remains in the queue, unclosed and 
unprocessed

This can be a major problem for cases when
1. server node stops but Java process keeps running
2. thin client default handshake timeout is equal to zero
This causes the thin client to hang indefinitely

The problem can be locally reproduced by applying attached patch and running 
org.apache.ignite.client.ReliabilityTest#testServiceProxyFailover test.

After IGNITE-28442, the issue on the thin client side is resolved by 
automatically setting a connection handshake timeout. However, the root cause 
of the problem remains. 


> Ignite NIO server may not close client socket channels if connection was 
> accepted while node is stopping
> --------------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-28459
>                 URL: https://issues.apache.org/jira/browse/IGNITE-28459
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Mikhail Petrov
>            Assignee: Mikhail Petrov
>            Priority: Major
>              Labels: ise
>             Fix For: 2.19
>
>         Attachments: reproducer.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Steps that lead to the problem:
> 1. The thin client establishes a connection to the server.
> 2. The server accepts the connection and enqueues the socket channel 
> registration in the selector (see 
> GridNioServer.AbstractNioClientWorker#offer(), 
> GridNioServer.NioOperation#REGISTER)
> 3. The server node stops the NIO server, which stops processing of the queue.
> 4. The thin client's socket channel remains in the queue, unclosed and 
> unprocessed.
> This can be a major problem when:
> 1. The server node stops but the Java process keeps running.
> 2. The thin client's default handshake timeout is zero.
> In that case the thin client hangs indefinitely.
> The problem can be reproduced locally by applying the attached patch and 
> running the org.apache.ignite.client.ReliabilityTest#testServiceProxyFailover 
> test.
> After IGNITE-28442, the issue on the thin client side is mitigated by 
> automatically setting a connection handshake timeout. However, the root cause 
> of the problem remains. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
