[
https://issues.apache.org/jira/browse/IMPALA-14271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Joe McDonnell updated IMPALA-14271:
-----------------------------------
Description:
In benchmarking with the cost-based placement, some fragments get stuck
waiting for runtime filters for a couple of queries. In particular, Q5 for
TPC-H and Q61 for TPC-DS get stuck. They time out of the wait after 5 seconds,
which is a significant performance regression.
The runtime filters that they are waiting for are remote, so they pass through
the coordinator. The problem is that if the query returns results quickly, the
coordinator will transition from EXECUTING to RETURNED_RESULTS. The code that
is processing the runtime filter on the coordinator in
Coordinator::UpdateFilter() will bail out if the query is no longer executing:
{noformat}
if (!IsExecuting()) {
  LOG(INFO) << "Filter update received for non-executing query with id: "
            << PrintId(query_id());
  return;
}
{noformat}
The fragment that is waiting on the runtime filter never receives it, so it
blocks until it hits the runtime filter wait time or the query is cancelled,
and neither happens until seconds later.
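For context, the consumer side of a runtime filter behaves like a timed wait:
the fragment blocks until the filter arrives or the wait time elapses. A
minimal standalone sketch of that pattern (illustrative C++ only; FilterSlot
and the 5-second wait are made up for the example, not Impala's actual
implementation):
{noformat}
#include <chrono>
#include <condition_variable>
#include <iostream>
#include <mutex>

// Illustrative stand-in for a fragment waiting on a remote runtime filter.
class FilterSlot {
 public:
  // Publisher side: called when the coordinator forwards the filter.
  void Publish() {
    std::lock_guard<std::mutex> l(lock_);
    arrived_ = true;
    cv_.notify_all();
  }

  // Consumer side: block until the filter arrives or the wait time elapses.
  // Returns true if the filter arrived in time.
  bool WaitFor(std::chrono::milliseconds wait_time) {
    std::unique_lock<std::mutex> l(lock_);
    return cv_.wait_for(l, wait_time, [this] { return arrived_; });
  }

 private:
  std::mutex lock_;
  std::condition_variable cv_;
  bool arrived_ = false;
};

int main() {
  FilterSlot slot;
  // If the coordinator drops the update (the bug above), Publish() is never
  // called and the fragment stalls for the full wait time before proceeding.
  bool arrived = slot.WaitFor(std::chrono::milliseconds(5000));
  std::cout << (arrived ? "filter arrived" : "timed out, proceeding unfiltered")
            << std::endl;
  return 0;
}
{noformat}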
One solution for this is to reapply the core logic from IMPALA-6984 (i.e.
reverting IMPALA-10047). That immediately sends a cancel when the query
transitions to RETURNED_RESULTS.
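A rough model of that fix (the class and method names below are invented for
the sketch; the real change would live in the coordinator's exec-state
transition handling):
{noformat}
#include <iostream>

// Illustrative model only; names do not match Impala's actual classes.
enum class ExecState { EXECUTING, RETURNED_RESULTS, CANCELLED, ERROR };

class CoordinatorModel {
 public:
  void TransitionTo(ExecState next) {
    state_ = next;
    // IMPALA-6984-style behavior: once every result row has been returned,
    // cancel the backends immediately instead of letting fragments sit in
    // their runtime-filter waits until the timeout fires.
    if (state_ == ExecState::RETURNED_RESULTS) CancelBackends();
  }

 private:
  void CancelBackends() { std::cout << "cancelling backend fragments\n"; }

  ExecState state_ = ExecState::EXECUTING;
};

int main() {
  CoordinatorModel coord;
  // The cancel unblocks any fragment still waiting on a remote filter.
  coord.TransitionTo(ExecState::RETURNED_RESULTS);
  return 0;
}
{noformat}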
This should only be a problem if all fragment instances hit the tuple cache. If
one fragment instance does not, then the query won't transition to
RETURNED_RESULTS before the runtime filter is processed, because the fragment
instance still needs the hash table.
was:
In benchmarking with the cost-based placement, some fragments get stuck
waiting for runtime filters for a couple of queries. In particular, Q5 for
TPC-H and Q61 for TPC-DS get stuck. They time out of the wait after 5 seconds,
which is a significant performance regression.
Consider a plan like this, where we are caching above a string of hash joins.
With mt_dop, each hash join has a separate fragment on the build side:
{noformat}
Fragment 1:
  Cache location
  Hash join 3 <--- broadcast: build side fragment 2
  Hash join 2 <--- broadcast: build side fragment 3
  Hash join 1 <--- broadcast: build side fragment 4
  Probe scan node
{noformat}
An example problematic runtime filter goes from hash join 3 to hash join 2's
build side fragment 3. With a cache hit above everything, the runtime filter
never gets generated, so build side fragment 3 is waiting for a filter that
never comes.
We need some approach to handle this to avoid the performance issue.
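One conceivable way to detect the hazard at plan time (purely a sketch of the
idea, not a concrete proposal from this issue; the types and the
StrandedFilters helper are hypothetical, and Impala's planner is Java rather
than C++) is to walk the filter routing information and flag any filter whose
producer sits under the cache location but whose consumer lives in a
different fragment:
{noformat}
#include <iostream>
#include <vector>

// Hypothetical filter-routing record, for illustration only.
struct FilterEdge {
  int filter_id;
  int producer_fragment;  // fragment with the join that builds the filter
  int consumer_fragment;  // fragment whose scan waits on the filter
};

// Returns filters that would never be generated on a cache hit in
// 'cached_fragment' yet are awaited by some other fragment.
std::vector<int> StrandedFilters(const std::vector<FilterEdge>& edges,
                                 int cached_fragment) {
  std::vector<int> stranded;
  for (const FilterEdge& e : edges) {
    if (e.producer_fragment == cached_fragment &&
        e.consumer_fragment != cached_fragment) {
      stranded.push_back(e.filter_id);
    }
  }
  return stranded;
}

int main() {
  // Mirrors the plan above: hash join 3 lives in fragment 1 and sends a
  // filter to hash join 2's build side in fragment 3.
  std::vector<FilterEdge> edges = {
      {/*filter_id=*/7, /*producer=*/1, /*consumer=*/3}};
  for (int id : StrandedFilters(edges, /*cached_fragment=*/1)) {
    std::cout << "filter " << id << " would be stranded by a cache hit\n";
  }
  return 0;
}
{noformat}
A planner that finds such an edge could, for example, move the cache location
below the producing join or disable the affected filter, though the issue
leaves the actual approach open.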
> Tuple caching needs to handle runtime filters destined for other fragments
> --------------------------------------------------------------------------
>
> Key: IMPALA-14271
> URL: https://issues.apache.org/jira/browse/IMPALA-14271
> Project: IMPALA
> Issue Type: Task
> Components: Backend, Frontend
> Affects Versions: Impala 5.0.0
> Reporter: Joe McDonnell
> Priority: Major
>