[jira] [Updated] (IMPALA-12014) Output a warning message on failed KeepAlive RPC for a Kudu scanner

Alexey Serbin (Jira) Tue, 21 Mar 2023 13:33:05 -0700


     [ 
https://issues.apache.org/jira/browse/IMPALA-12014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Alexey Serbin updated IMPALA-12014:
-----------------------------------
    Description: 
With [IMPALA-3292|https://issues.apache.org/jira/browse/IMPALA-3292], the code 
was modified to ignore failed KeepAlive RPCs for Kudu scanners because one of 
the follow-up KeepAlive RPCs usually succeeds within the TTL of an idle Kudu 
scanner.

However, if a Kudu tablet server stays busy for a long time, all the 
consecutive KeepAlive requests might fail as well (e.g., if the RPC queue stays 
full for the whole scanner TTL interval).  If that happens, the corresponding 
Impala query fails with an error message in the Impala query profile like the 
one below:

{noformat}
...
    Query Type: QUERY
    Query State: EXCEPTION
    Impala Query State: ERROR
    Query Status: Unable to advance iterator for node with id '1' for Kudu 
table 'mega_table': Not found: Scanner 4235dd17eb444a36a945f003c23dcf81 not 
found (it may have expired)
...
{noformat}
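
To make the failure mode concrete, here is a minimal model of the timing. It is 
a sketch only, not Kudu or Impala code: the function name, the fixed KeepAlive 
interval, and the numbers in the usage note are assumptions chosen for 
illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: the server-side scanner expires once no request has
// reached it for more than ttl_ms. The last successful contact is at t=0,
// and the client sends one KeepAlive every interval_ms after that.
// keepalive_ok[i] says whether the i-th KeepAlive got through.
// Returns true if the scanner still exists when the client resumes
// fetching rows at resume_ms.
bool ScannerSurvives(const std::vector<bool>& keepalive_ok, int interval_ms,
                     int ttl_ms, int resume_ms) {
  int last_contact_ms = 0;
  for (std::size_t i = 0; i < keepalive_ok.size(); ++i) {
    const int now_ms = static_cast<int>(i + 1) * interval_ms;
    if (now_ms > resume_ms) break;
    // The scanner has already expired; a late KeepAlive cannot revive it.
    if (now_ms - last_contact_ms > ttl_ms) return false;
    // Only a successful KeepAlive resets the TTL countdown.
    if (keepalive_ok[i]) last_contact_ms = now_ms;
  }
  return resume_ms - last_contact_ms <= ttl_ms;
}
```

With an assumed 15-second interval and 60-second TTL, a single successful 
KeepAlive out of four keeps the scanner alive when the client resumes at 75 
seconds, while four failures in a row let it expire, which is the case that 
produces the "Scanner ... not found" error shown above.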

Without a warning message logged by impalad or other clues, such a situation is 
hard to troubleshoot for people who are not familiar with Kudu specifics.

It would be great to at least log a warning message about failed attempts to 
send KeepAlive RPCs for Kudu scanners.  As of now, the message is logged with 
the {{VLOG(1)}} facility, but verbose logging isn't usually enabled for 
impalad.
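
As a rough illustration of the proposed behavior, the sketch below reports a 
failed KeepAlive attempt unconditionally instead of only at a verbose level. It 
is not the actual Impala change: {{Status}} is a hypothetical stand-in for 
{{kudu::Status}}, {{ReportKeepAlive}} and the message wording are made up, and 
a plain output stream stands in for glog so the example compiles on its own.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for kudu::Status, so the sketch is self-contained.
struct Status {
  bool ok;
  std::string message;
};

// Reports the outcome of one KeepAlive attempt. A failure is always written
// to warning_log; in impalad this would be a glog LOG(WARNING) statement
// rather than VLOG(1). The failure is still ignored by the caller, as with
// IMPALA-3292. Returns true if a warning was written.
bool ReportKeepAlive(const Status& status, const std::string& table_name,
                     std::ostream& warning_log) {
  if (status.ok) return false;
  warning_log << "KeepAlive RPC failed for a Kudu scanner on table '"
              << table_name << "': " << status.message
              << "; the scanner may expire if later attempts fail as well\n";
  return true;
}
```

Keeping the call non-fatal preserves the current retry behavior; only the 
visibility of the failure changes.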

  was:
With [IMPALA-3292|https://issues.apache.org/jira/browse/IMPALA-3292], the code 
has been modified to ignore failed KeepAlive RPCs for Kudu scanners because one 
of the follow-up KeepAlive RPCs usually succeeds within TTL for an idle Kudu 
scanner.

However, if a Kudu tablet server stays busy for a long time, all the 
consecutive KeepAlive requests might fail as well (e.g., if the RPC queue stays 
full for the whole scanner TTL interval).  If that happens, the corresponding 
Impala query fails with an error message in the Impala query profile like the 
one below:

{noformat}
...
    Query Type: QUERY
    Query State: EXCEPTION
    Impala Query State: ERROR
    Query Status: Unable to advance iterator for node with id '1' for Kudu 
table 'mega_table': Not found: Scanner 4235dd17eb444a36a945f003c23dcf81 not 
found (it may have expired)
...
{noformat}

However, without a warning message logged by impalad or other clues, such a 
situation is hard to troubleshoot for people who are not familiar with Kudu 
specifics.

It would be great to at least log a warning message about failed attempts to 
send KeepAlive RPCs for Kudu scanners.  As of now, the message is logged with 
the {{VLOG(1)}} facility, but verbose logging isn't usually enabled for 
impalad.


> Output a warning message on failed KeepAlive RPC for a Kudu scanner
> -------------------------------------------------------------------
>
>                 Key: IMPALA-12014
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12014
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend, be
>    Affects Versions: Impala 2.7.0, Impala 2.8.0, Impala 2.7.1, Impala 2.9.0, 
> Impala 2.10.0, Impala 2.11.0, Impala 3.0, Impala 2.12.0, Impala 3.1.0, Impala 
> 3.2.0, Impala 4.0.0, Impala 3.3.0, Impala 3.4.0, Impala 3.4.1, Impala 4.1.0, 
> Impala 4.0.1, Impala 4.2.0, Impala 4.1.1
>            Reporter: Alexey Serbin
>            Assignee: Alexey Serbin
>            Priority: Minor
>              Labels: supportability, troubleshooting
>
> With [IMPALA-3292|https://issues.apache.org/jira/browse/IMPALA-3292], the 
> code was modified to ignore failed KeepAlive RPCs for Kudu scanners because 
> one of the follow-up KeepAlive RPCs usually succeeds within the TTL of an 
> idle Kudu scanner.
> However, if a Kudu tablet server stays busy for a long time, all the 
> consecutive KeepAlive requests might fail as well (e.g., if the RPC queue 
> stays full for the whole scanner TTL interval).  If that happens, the 
> corresponding Impala query fails with an error message in the Impala query 
> profile like the one below:
> {noformat}
> ...
>     Query Type: QUERY
>     Query State: EXCEPTION
>     Impala Query State: ERROR
>     Query Status: Unable to advance iterator for node with id '1' for Kudu 
> table 'mega_table': Not found: Scanner 4235dd17eb444a36a945f003c23dcf81 not 
> found (it may have expired)
> ...
> {noformat}
> Without a warning message logged by impalad or other clues, such a situation 
> is hard to troubleshoot for people who are not familiar with Kudu specifics.
> It would be great to at least log a warning message about failed attempts to 
> send KeepAlive RPCs for Kudu scanners.  As of now, the message is logged with 
> the {{VLOG(1)}} facility, but verbose logging isn't usually enabled for 
> impalad.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
