[
https://issues.apache.org/jira/browse/IMPALA-12014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexey Serbin updated IMPALA-12014:
-----------------------------------
Description:
With [IMPALA-3292|https://issues.apache.org/jira/browse/IMPALA-3292], the code
has been modified to ignore failed KeepAlive RPCs for Kudu scanners because one
of the follow-up KeepAlive RPCs usually succeeds within the TTL of an idle Kudu
scanner.
However, if a Kudu tablet server stays busy for a long time, all consecutive
KeepAlive requests might fail as well (e.g., if the RPC queue stays full for
the whole scanner TTL interval). In that case, the corresponding Impala query
fails with an error message in the Impala query profile like the one below:
{noformat}
...
Query Type: QUERY
Query State: EXCEPTION
Impala Query State: ERROR
Query Status: Unable to advance iterator for node with id '1' for Kudu
table 'mega_table': Not found: Scanner 4235dd17eb444a36a945f003c23dcf81 not
found (it may have expired)
...
{noformat}
Without a warning message logged by impalad or other clues, such a situation is
hard to troubleshoot for people who are not familiar with Kudu specifics.
It would be great to at least log a warning message about failed attempts to
send KeepAlive RPCs for Kudu scanners. As of now, the message is logged with
the {{VLOG(1)}} facility, but verbose logging is usually not enabled for
impalad.
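A minimal sketch of the requested behavior, not the actual Impala code: the
failed KeepAlive is still ignored (as after IMPALA-3292), but a warning line is
written so that repeated failures become visible. {{FakeStatus}},
{{KeepAliveTracker}} and {{HandleKeepAliveResult}} are hypothetical names made
up for illustration; in impalad the line would presumably go through glog's
{{LOG(WARNING)}} instead of {{VLOG(1)}}, with the status coming from the Kudu
client.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for a Kudu-style status object (not kudu::Status).
struct FakeStatus {
  bool ok;
  std::string message;
};

// Hypothetical per-scanner bookkeeping: how many KeepAlive RPCs failed in a row.
struct KeepAliveTracker {
  int consecutive_failures = 0;
};

// Handles the result of one KeepAlive attempt. A failure is not propagated to
// the caller as a query error; it is only reported. Returns the warning text
// that was written to 'log', or an empty string if the RPC succeeded.
std::string HandleKeepAliveResult(const FakeStatus& status,
                                  const std::string& scanner_id,
                                  KeepAliveTracker* tracker,
                                  std::ostream& log) {
  if (status.ok) {
    // A successful KeepAlive resets the streak of failures.
    tracker->consecutive_failures = 0;
    return "";
  }
  ++tracker->consecutive_failures;
  std::ostringstream msg;
  msg << "WARNING: KeepAlive RPC failed for Kudu scanner " << scanner_id
      << " (consecutive failures: " << tracker->consecutive_failures
      << "): " << status.message;
  // Stand-in for LOG(WARNING) in a glog-based process.
  log << msg.str() << '\n';
  return msg.str();
}
```

Including the count of consecutive failures in the message is an assumption of
this sketch; it would let an operator correlate a later "Scanner ... not found
(it may have expired)" error with a streak of failed KeepAlive attempts.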
> Output a warning message on failed KeepAlive RPC for a Kudu scanner
> -------------------------------------------------------------------
>
> Key: IMPALA-12014
> URL: https://issues.apache.org/jira/browse/IMPALA-12014
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend, be
> Affects Versions: Impala 2.7.0, Impala 2.8.0, Impala 2.7.1, Impala 2.9.0,
> Impala 2.10.0, Impala 2.11.0, Impala 3.0, Impala 2.12.0, Impala 3.1.0, Impala
> 3.2.0, Impala 4.0.0, Impala 3.3.0, Impala 3.4.0, Impala 3.4.1, Impala 4.1.0,
> Impala 4.0.1, Impala 4.2.0, Impala 4.1.1
> Reporter: Alexey Serbin
> Assignee: Alexey Serbin
> Priority: Minor
> Labels: supportability, troubleshooting