[
https://issues.apache.org/jira/browse/IGNITE-23384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17890439#comment-17890439
]
Pavel Tupitsyn commented on IGNITE-23384:
-----------------------------------------
IGNITE-23257 seems to improve the situation - I've added
*ClientOp.STREAMER_BATCH_SEND* to *isPartitionOperation*.
> Java thin: heartbeat timeout under load
> ---------------------------------------
>
> Key: IGNITE-23384
> URL: https://issues.apache.org/jira/browse/IGNITE-23384
> Project: Ignite
> Issue Type: Bug
> Reporter: Pavel Tupitsyn
> Assignee: Pavel Tupitsyn
> Priority: Major
> Labels: ignite-3, important
> Fix For: 3.0
>
>
> When running YCSB throughput benchmarks against a cluster with 1 server node,
> I noticed that the client often fails with a heartbeat timeout:
> {code}
> 2024-09-25 16:19:06:345 [PAYLOAD] 214 sec: 2410491 operations; 0 current ops/sec; est completion in 13 seconds
> 2024-09-25 16:19:07:345 [PAYLOAD] 215 sec: 2410491 operations; 0 current ops/sec; est completion in 13 seconds
> 2024-09-25 16:19:07:345 [PAYLOAD] 215 sec: 2410491 operations; 0 current ops/sec; est completion in 13 seconds
> Sep 25, 2024 4:19:07 PM org.apache.ignite.internal.logger.IgniteLogger logInternal
> WARNING: Heartbeat timeout, closing the channel [remoteAddress=192.168.210.33:10800]
> Sep 25, 2024 4:19:07 PM org.apache.ignite.internal.logger.IgniteLogger logInternal
> INFO: The timeout worker was interrupted, probably the worker is stopping.
> 2024-09-25 16:19:08:345 [PAYLOAD] 216 sec: 2410491 operations; 0 current ops/sec; est completion in 13 seconds
> 2024-09-25 16:19:08:345 [PAYLOAD] 216 sec: 2410491 operations; 0 current ops/sec; est completion in 13 seconds
> 2024-09-25 16:19:09:345 [PAYLOAD] 217 sec: 2410491 operations; 0 current ops/sec; est completion in 13 seconds
> ...
> Sep 25, 2024 4:19:12 PM org.apache.ignite.internal.logger.IgniteLogger warn
> WARNING: Failed to establish connection to 192.168.210.33:10800: org.apache.ignite.client.IgniteClientConnectionException: IGN-CLIENT-1 TraceId:e8797794-e6f9-495d-bdd5-bf5639b8878e Handshake timeout [endpoint=192.168.210.33:10800]
> java.util.concurrent.CompletionException: org.apache.ignite.client.IgniteClientConnectionException: IGN-CLIENT-1 TraceId:e8797794-e6f9-495d-bdd5-bf5639b8878e Handshake timeout [endpoint=192.168.210.33:10800]
> at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:314)
> at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:319)
> at java.base/java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:932)
> at java.base/java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:907)
> at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
> at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
> at org.apache.ignite.internal.future.timeout.TimeoutWorker.body(TimeoutWorker.java:96)
> at org.apache.ignite.internal.util.worker.IgniteWorker.run(IgniteWorker.java:108)
> at java.base/java.lang.Thread.run(Thread.java:829)
> Caused by: org.apache.ignite.client.IgniteClientConnectionException: IGN-CLIENT-1 TraceId:e8797794-e6f9-495d-bdd5-bf5639b8878e Handshake timeout [endpoint=192.168.210.33:10800]
> at org.apache.ignite.internal.client.TcpClientChannel.lambda$handshakeAsync$7(TcpClientChannel.java:601)
> at java.base/java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:930)
> ... 6 more
> Caused by: java.util.concurrent.TimeoutException
> ... 3 more
> {code}
> * We don't need heartbeats under load (they are only useful when idle)
> * If a heartbeat request was sent and timed out, but other responses arrived
> meanwhile, we can ignore the timeout
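
The two points above could be sketched roughly as follows. This is only an illustration of the idea, not the actual change made in the Ignite client: the class HeartbeatMonitor, its method names, and the injected nanosecond clock are all hypothetical and do not exist in the Ignite code base.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Hypothetical helper for illustration only; not part of the Ignite client. */
final class HeartbeatMonitor {
    private final LongSupplier nanoClock;   // assumed monotonic clock, e.g. System::nanoTime
    private final long intervalNanos;
    private final long timeoutNanos;

    /** Time of the most recent response of any kind (heartbeat or regular operation). */
    private volatile long lastResponseNanos;

    HeartbeatMonitor(LongSupplier nanoClock, long intervalMillis, long timeoutMillis) {
        this.nanoClock = nanoClock;
        this.intervalNanos = TimeUnit.MILLISECONDS.toNanos(intervalMillis);
        this.timeoutNanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        this.lastResponseNanos = nanoClock.getAsLong();
    }

    /** Call for every response read from the channel, not just heartbeat responses. */
    void onResponseReceived() {
        lastResponseNanos = nanoClock.getAsLong();
    }

    /** Under load responses keep arriving, so a heartbeat is only sent after a full idle interval. */
    boolean shouldSendHeartbeat() {
        return nanoClock.getAsLong() - lastResponseNanos >= intervalNanos;
    }

    /**
     * Returns true only if the heartbeat sent at heartbeatSentNanos has exceeded the timeout
     * AND no other response has arrived since it was sent.
     */
    boolean isHeartbeatTimedOut(long heartbeatSentNanos) {
        long now = nanoClock.getAsLong();

        if (now - heartbeatSentNanos < timeoutNanos) {
            return false; // Still within the allowed wait.
        }

        // Any response at or after the send time shows the server is alive,
        // so the heartbeat timeout can be ignored.
        return lastResponseNanos - heartbeatSentNanos < 0;
    }
}
```

With such a check, a busy channel never sends heartbeats at all, and a heartbeat response that is merely delayed behind other traffic does not close the connection.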