Hello everyone,
I am coming to the user mailing list with a fairly basic issue, which
can probably be answered easily, but it has had me stuck for quite a
while already.
I have a fairly basic Ignite application with a map/reduce task. The
caller of the task uses the thin client and the async API to dispatch
requests. Everything works as expected until the Ignite cluster is
restarted. What I found is that the thin client is not able to renew
its connection if the member IP addresses have changed.
Strangely enough, the client marks the channel as closed but does not
attempt to re-resolve the addresses; it just tries to connect to the
same address over and over. Over the last day I tested several
configurations (throttling, heartbeat, timeout), but none of them
helped.
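One possible stopgap for the stale-address problem, as a minimal sketch
rather than a confirmed fix: drop the client after a failed call and
build a new one, on the assumption that a freshly started client queries
the address finder again. `RecreatingClient` and its methods are
hypothetical names, not part of Ignite; the sketch is written against
plain `AutoCloseable` so it does not depend on any Ignite API.

```java
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical helper: rebuilds the underlying client after a failed call. */
final class RecreatingClient<C extends AutoCloseable> {
    private final Supplier<C> factory;
    private C client;

    RecreatingClient(Supplier<C> factory) {
        this.factory = factory;
    }

    /** Runs the action; on a runtime failure, discards the client and retries once. */
    synchronized <R> R call(Function<C, R> action) {
        try {
            return action.apply(current());
        } catch (RuntimeException firstFailure) {
            // Assumption: a new client performs address discovery from scratch.
            discard();
            return action.apply(current());
        }
    }

    /** Lazily creates the client on first use or after a discard. */
    private C current() {
        if (client == null) {
            client = factory.get();
        }
        return client;
    }

    /** Closes the current client, ignoring errors from an already broken connection. */
    private void discard() {
        C old = client;
        client = null;
        if (old != null) {
            try {
                old.close();
            } catch (Exception ignored) {
                // The connection is already considered dead.
            }
        }
    }
}
```

With Ignite, the factory would presumably be
`() -> Ignition.startClient(cfg)`, since `IgniteClient` is
`AutoCloseable`. Note that the sketch retries on any `RuntimeException`
and only sees failures thrown synchronously; with the async API the
failure arrives through the returned future, so the same
discard-and-retry step would have to live in the future's callback.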
Additionally, I found that when cluster member pods are rolled out
(i.e. through a scale-down), the reconnection attempt hangs for a
fairly long time. It turns out that the connection timeout is set to a
fairly large value:
https://github.com/apache/ignite/blob/ignite-2.13/modules/core/src/main/java/org/apache/ignite/internal/client/thin/io/gridnioserver/GridNioClientConnectionMultiplexer.java#L125
My client configuration (still using Ignite 2.13) is below:
    ClientConfiguration cfg = new ClientConfiguration();
    // Reconnect throttling switched off (period and retries both 0).
    cfg.setReconnectThrottlingPeriod(0);
    cfg.setReconnectThrottlingRetries(0);
    cfg.setHeartbeatEnabled(true);
    cfg.setTimeout(500);
    cfg.setPartitionAwarenessEnabled(false);
    // Member addresses are discovered through the Kubernetes service.
    KubernetesConnectionConfiguration k8cfg = new KubernetesConnectionConfiguration();
    k8cfg.setServiceName(kubernetesServiceName);
    cfg.setAddressesFinder(new ThinClientKubernetesAddressFinder(k8cfg));
The client call is just:
    igniteClient.compute()
        .executeAsync2("sampleTask", taskArgument)
        .toCompletableFuture()
        .whenComplete((r, e) -> ...)
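To keep a caller from waiting through a long reconnection attempt, the
returned stage could be given its own deadline. The following is a
minimal sketch using only the JDK (Java 9 or newer for `orTimeout`);
`BoundedCall` and `withDeadline` are hypothetical names, and the
deadline only fails the caller's future, it does not cancel whatever
the client is doing underneath.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: puts a caller-side deadline on an async result. */
final class BoundedCall {
    private BoundedCall() {
    }

    /**
     * Returns a new future that mirrors the given stage, but fails with
     * TimeoutException if the stage has not completed within timeoutMillis.
     * The original stage is left untouched.
     */
    static <T> CompletableFuture<T> withDeadline(CompletionStage<T> stage, long timeoutMillis) {
        CompletableFuture<T> bounded = new CompletableFuture<>();
        // Forward the outcome of the original stage to the bounded copy.
        stage.whenComplete((result, error) -> {
            if (error != null) {
                bounded.completeExceptionally(error);
            } else {
                bounded.complete(result);
            }
        });
        return bounded.orTimeout(timeoutMillis, TimeUnit.MILLISECONDS);
    }
}
```

In the call above this would wrap the result of `executeAsync2(...)`
before `whenComplete`, so that the callback receives a
`TimeoutException` instead of staying silent while the connection
attempt is still pending.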
I wanted to confirm the expected behavior here; maybe I am missing
something obvious or misunderstanding the documentation.
Kind regards,
Łukasz