[
https://issues.apache.org/jira/browse/IMPALA-12326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17749364#comment-17749364
]
Riza Suminto commented on IMPALA-12326:
---------------------------------------
Adding a WaitForLocalServer call within StatestoreSubscriber::Start(), right after the heartbeat server is started, seems appropriate:
{code:java}
diff --git a/be/src/statestore/statestore-subscriber.cc b/be/src/statestore/statestore-subscriber.cc
index f0f9f3c..2b4bcb1 100644
--- a/be/src/statestore/statestore-subscriber.cc
+++ b/be/src/statestore/statestore-subscriber.cc
@@ -238,6 +238,7 @@ Status StatestoreSubscriber::Start() {
  RETURN_IF_ERROR(builder.Build(&server));
  heartbeat_server_.reset(server);
  RETURN_IF_ERROR(heartbeat_server_->Start());
+ RETURN_IF_ERROR(WaitForLocalServer(*heartbeat_server_, 10, 1000));
  // Specify the port which the heartbeat server is listening on.
  heartbeat_address_.port = heartbeat_server_->port();
{code}
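For readers unfamiliar with the pattern, the idea behind the added line is a bounded retry loop: probe the local server, and if it is not accepting connections yet, sleep and try again until the attempts run out. The snippet below is only a standalone sketch of that pattern, not Impala's implementation. WaitUntilReady and its is_ready callback are hypothetical names, and reading the 10 and 1000 in the patch as a retry count and a retry interval in milliseconds is an assumption.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper, not part of Impala: calls `is_ready` up to
// `num_retries` times and sleeps `retry_interval_ms` between failed attempts.
// Returns true as soon as a probe succeeds, false if every attempt fails.
bool WaitUntilReady(const std::function<bool()>& is_ready, int num_retries,
                    int retry_interval_ms) {
  for (int attempt = 0; attempt < num_retries; ++attempt) {
    if (is_ready()) return true;
    // No point in sleeping after the final failed attempt.
    if (attempt + 1 < num_retries) {
      std::this_thread::sleep_for(
          std::chrono::milliseconds(retry_interval_ms));
    }
  }
  return false;
}
```

In this sketch the probe would be something like a connection attempt to the heartbeat server's port, and a false return would be turned into an error status so that Start() fails instead of registering with the statestore too early.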
> Impala daemons should only subscribe to statestore once rpc services are ready
> ------------------------------------------------------------------------------
>
> Key: IMPALA-12326
> URL: https://issues.apache.org/jira/browse/IMPALA-12326
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Reporter: Abhishek Rawat
> Priority: Major
>
> The Impala daemons start the statestore subscriber service before the
> krpc/rpc services are ready:
> [https://github.com/apache/impala/blob/branch-4.2.0/be/src/service/impala-server.cc#L2934]
> As a result, there is a small window where statestore could try to connect
> with Impala daemons, but the rpc service isn't ready and so statestore logs
> get filled with thrift timeout errors:
> {code:java}
> RPC Error: Client for 10.80.205.184:23000 hit an unexpected exception: No
> more data to read., type: N6apache6thrift9transport19TTransportExceptionE,
> rpc: N6impala18THeartbe
> I0731 19:43:09.058470 79 client-cache.cc:174] Broken Connection, destroy
> client for 10.80.205.184:23000
> I0731 19:43:09.076826 83 client-cache.h:362] RPC Error: Client for
> 10.80.192.41:23000 hit an unexpected exception: No more data to read., type:
> N6apache6thrift9transport19TTransportExceptionE, rpc: N6impala18THeartbea
> {code}
> It makes sense for the statestore subscriber on Impala daemons to start only
> once the rpc/krpc service has started successfully.