piby180 opened a new issue, #18902:
URL: https://github.com/apache/pinot/issues/18902
We regularly experience a brief window of query failures immediately after a
Pinot server pod transitions to a healthy state following a restart. Notably,
there is no observed disruption during the initial termination phase or while
the pod is in a 0/1 Running state; the instability occurs specifically as the
service becomes ready to handle traffic again.
### Error Details
The failures manifest as QueryExecutionError exceptions. The root cause in
the trace is a gRPC UNAVAILABLE status caused by a connection timeout after
30000 ms, which indicates that the query dispatcher is unable to establish a
connection to the newly restarted pod.
Full Java Stack Trace:
```
QueryExecutionError: Error dispatching query: 667717545000000055 to server: pinot-server-8.pinot-server-headless.pinot.svc.cluster.local@{8421,8442}
    at org.apache.pinot.query.service.dispatch.QueryDispatcher.processResults(QueryDispatcher.java:689)
    at org.apache.pinot.query.service.dispatch.QueryDispatcher.execute(QueryDispatcher.java:643)
    at org.apache.pinot.query.service.dispatch.QueryDispatcher.submit(QueryDispatcher.java:584)
    at org.apache.pinot.query.service.dispatch.QueryDispatcher.submitAndReduce(QueryDispatcher.java:219)
Caused by: io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
    at io.grpc.Status.asRuntimeException(Status.java:532)
    at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:581)
    at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:566)
    at io.grpc.internal.ClientCallImpl.access$100(ClientCallImpl.java:72)
    at io.grpc.internal.ClientCallImpl$ClientCallListenerImpl.onClose(ClientCallImpl.java:547)
    at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:566)
    at io.grpc.internal.ClientCallImpl.access$100(ClientCallImpl.java:72)
    at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:739)
    at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:720)
    at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
    at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:840)
Caused by: java.io.IOException: connection timed out after 30000 ms: pinot-server-8.pinot-server-headless.pinot.svc.cluster.local/100.64.62.48:8421
    at io.grpc.netty.shaded.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe$2.run(AbstractEpollChannel.java:615)
    at io.grpc.netty.shaded.io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
    at io.grpc.netty.shaded.io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:160)
    at io.grpc.netty.shaded.io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
    at io.grpc.netty.shaded.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
    at io.grpc.netty.shaded.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
    at io.grpc.netty.shaded.io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:384)
    at io.grpc.netty.shaded.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
    at io.grpc.netty.shaded.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at java.base/java.lang.Thread.run(Thread.java:840)
```
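To narrow down whether the port in the trace is reachable at the TCP level while the failures occur, a small probe like the one below could be run from a broker pod during a server restart. This is only a minimal sketch and not part of Pinot: the class name `PortProbe`, the 2-second timeout, and the 30-attempt loop are hypothetical choices, and the default host and port are copied from the stack trace above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical diagnostic helper, not part of Pinot.
public class PortProbe {

    /** Returns true if a plain TCP connection to host:port succeeds within timeoutMillis. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers timeouts, refused connections, and unresolved hostnames.
            return false;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Defaults are taken from the stack trace; override via arguments.
        String host = args.length > 0
                ? args[0]
                : "pinot-server-8.pinot-server-headless.pinot.svc.cluster.local";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 8421;

        // Probe once per second and log how long each attempt took.
        for (int attempt = 1; attempt <= 30; attempt++) {
            long start = System.nanoTime();
            boolean reachable = canConnect(host, port, 2000);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.printf("attempt=%d reachable=%b elapsedMs=%d%n",
                    attempt, reachable, elapsedMs);
            Thread.sleep(1000);
        }
    }
}
```

If the probe reports the port as reachable while queries still fail with the 30000 ms timeout, that would suggest the dispatcher is holding on to a stale connection or address rather than the server not listening yet. This is a hypothesis to test, not a confirmed diagnosis.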
### Environment Context
- Pinot version: 1.5.0
- EKS: 1.34