[ https://issues.apache.org/jira/browse/IGNITE-28459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mikhail Petrov updated IGNITE-28459:
------------------------------------
Description:
Steps that lead to the problem:
1. The thin client establishes a connection to the server.
2. The server accepts the connection and enqueues the socket channel
registration in the selector (see
GridNioServer.AbstractNioClientWorker#offer(),
GridNioServer.NioOperation#CONNECT, GridNioServer.NioOperation#REGISTER).
3. The server stops the NIO server, which stops queue processing.
4. The thin client's socket channel remains in the queue, unclosed and
unprocessed.
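The steps above can be illustrated with a minimal sketch. It is not the GridNioServer code: the class PendingChannelDrainSketch and its methods offer() and stopAndDrain() are hypothetical names, and closing the leftover channels on stop is only one possible way to handle them, shown here as an assumption rather than as the actual fix.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical stand-in for a NIO worker; not the GridNioServer implementation. */
class PendingChannelDrainSketch {
    /** Accepted channels that are still waiting to be registered in a selector. */
    private final Queue<SocketChannel> pending = new ConcurrentLinkedQueue<>();

    /** Step 2: the acceptor enqueues a registration request for the channel. */
    void offer(SocketChannel ch) {
        pending.add(ch);
    }

    /** Steps 3-4 with cleanup: on stop, close every channel that was never processed. */
    int stopAndDrain() {
        int closed = 0;
        SocketChannel ch;

        while ((ch = pending.poll()) != null) {
            try {
                ch.close();
                closed++;
            }
            catch (IOException ignored) {
                // Best effort: keep closing the remaining channels.
            }
        }

        return closed;
    }

    /** Returns what the client reads after the worker stops (-1 means end of stream). */
    static int demo() throws IOException {
        PendingChannelDrainSketch worker = new PendingChannelDrainSketch();

        try (ServerSocketChannel srv = ServerSocketChannel.open()) {
            srv.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));

            try (SocketChannel client = SocketChannel.open(srv.getLocalAddress())) {
                worker.offer(srv.accept()); // Accepted and queued, never registered.
                worker.stopAndDrain();      // Without this call, the read below would block.

                return client.read(ByteBuffer.allocate(1));
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("client read returned " + demo());
    }
}
```

In this sketch the client observes end of stream once the queued channel is closed, instead of waiting on a connection that nobody will ever service.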
This can be a major problem in cases when:
1. The server node stops but the Java process keeps running.
2. The thin client's default handshake timeout is zero.
This causes the thin client to hang indefinitely.
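The effect of a zero timeout can be sketched with plain java.net sockets. This is not the Ignite thin client API: HandshakeTimeoutSketch and readTimesOut() are hypothetical names, and the "server" here is just a listening socket that never services the connection, which approximates a channel left unprocessed in the queue.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

/** Hypothetical illustration of a client waiting for a reply that never comes. */
class HandshakeTimeoutSketch {
    /** Returns true if the read gave up after timeoutMs milliseconds. */
    static boolean readTimesOut(int timeoutMs) throws IOException {
        // The server only listens: the connection completes in the OS backlog
        // and is never read from, written to, or closed while the client waits.
        try (ServerSocket srv = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(srv.getInetAddress(), srv.getLocalPort())) {
            // A value of 0 means "wait forever", which is the hang described above.
            client.setSoTimeout(timeoutMs);

            try {
                client.getInputStream().read();

                return false;
            }
            catch (SocketTimeoutException e) {
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("timed out: " + readTimesOut(200));
    }
}
```

Calling readTimesOut(0) would block for as long as the listening socket stays open, so the sketch only exercises a non-zero timeout.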
The problem can be reproduced locally by applying the attached patch and
running the org.apache.ignite.client.ReliabilityTest#testServiceProxyFailover
test.
After IGNITE-28442, the issue on the thin client side is resolved by
automatically setting a connection handshake timeout. However, the root cause
of the problem remains.
> Ignite NIO server may not close client socket channels if connection was
> accepted while node is stopping
> --------------------------------------------------------------------------------------------------------
>
> Key: IGNITE-28459
> URL: https://issues.apache.org/jira/browse/IGNITE-28459
> Project: Ignite
> Issue Type: Improvement
> Reporter: Mikhail Petrov
> Assignee: Mikhail Petrov
> Priority: Major
> Labels: ise
> Fix For: 2.19
>
> Attachments: reproducer.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h