[
https://issues.apache.org/jira/browse/IMPALA-10811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Qifan Chen resolved IMPALA-10811.
---------------------------------
Target Version: Impala 4.1.0
Resolution: Fixed
> RPC to submit query getting stuck for AWS NLB forever.
> ------------------------------------------------------
>
> Key: IMPALA-10811
> URL: https://issues.apache.org/jira/browse/IMPALA-10811
> Project: IMPALA
> Issue Type: Bug
> Reporter: Amogh Margoor
> Assignee: Qifan Chen
> Priority: Major
> Attachments: profile+(13).txt
>
>
> The initial RPC that submits a query and fetches the query handle can take a
> long time to return, because planning and submission may execute Catalog
> operations such as Rename or Alter Table Recover Partitions, which can be
> slow on tables with many partitions
> ([https://github.com/apache/impala/blob/1231208da7104c832c13f272d1e5b8f554d29337/be/src/exec/catalog-op-executor.cc#L92]).
> Attached is the profile of one such DDL query (with a few fields hidden).
> These RPCs are:
> 1. Beeswax:
> [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-beeswax-server.cc#L57]
> 2. HS2:
> [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-hs2-server.cc#L462]
>
> One side effect of such a long-running RPC is that clients such as
> impala-shell that connect through an AWS NLB can get stuck forever. The
> reason is that the NLB tracks connections and closes them after 350s of idle
> time, and this timeout cannot be configured. After closing the connection,
> however, it doesn't send a TCP RST to the client. Only when the client tries
> to send data does the NLB reply with a TCP RST to indicate that the
> connection is no longer alive. Documentation is here:
> [https://docs.aws.amazon.com/elasticloadbalancing/latest/network/network-load-balancers.html#connection-idle-timeout].
> Because impala-shell only waits for the RPC to return and sends nothing in
> the meantime, it never receives the RST and gets stuck indefinitely.
> We may therefore need to evaluate techniques for the RPCs to return the
> query handle after
> # Creating Driver:
> [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-server.cc#L1150]
> # Register Query:
> [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-server.cc#L1168]
> and execute the later parts of the RPC asynchronously in a different thread,
> without blocking the RPC. That way clients can get the query handle right
> away and poll it for state and results.
>
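
The approach proposed in the description (register the query, return the
handle, run the slow part in another thread, let the client poll) can be
sketched roughly as below. This is only a toy illustration in Python, not
Impala's actual code: QueryServer, submit, get_state and wait_for_query are
hypothetical names made up for this sketch, and the state strings are
assumptions as well.

```python
import threading
import time
import uuid


class QueryServer:
    """Toy stand-in for the server side; hypothetical, not Impala's classes."""

    def __init__(self):
        self._lock = threading.Lock()
        self._states = {}  # query handle -> "RUNNING" | "FINISHED" | "ERROR"

    def submit(self, slow_operation):
        # Cheap part of the RPC: create per-query state and register it.
        handle = str(uuid.uuid4())
        with self._lock:
            self._states[handle] = "RUNNING"
        # Potentially slow part (e.g. a catalog operation) runs in a separate
        # thread so the submitting RPC is not blocked on it.
        worker = threading.Thread(
            target=self._run, args=(handle, slow_operation), daemon=True)
        worker.start()
        return handle  # handed back to the client immediately

    def _run(self, handle, slow_operation):
        try:
            slow_operation()
            final_state = "FINISHED"
        except Exception:
            final_state = "ERROR"
        with self._lock:
            self._states[handle] = final_state

    def get_state(self, handle):
        with self._lock:
            return self._states[handle]


def wait_for_query(server, handle, poll_interval_s=0.05, timeout_s=5.0):
    """Client side: poll with short requests instead of one long blocking call."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        state = server.get_state(handle)
        if state != "RUNNING":
            return state
        time.sleep(poll_interval_s)
    raise TimeoutError("query %s still running after %.1fs" % (handle, timeout_s))
```

With this shape, every poll is a short request that sends data on the
connection, so the client is no longer waiting silently on a single RPC for
longer than the NLB's 350s limit.
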
--
This message was sent by Atlassian Jira
(v8.3.4#803005)