[ https://issues.apache.org/jira/browse/IMPALA-13185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887032#comment-17887032 ]
ASF subversion and git services commented on IMPALA-13185:
----------------------------------------------------------
Commit d81c4db5e1e99865cf3a9a96cbc48af6436dc0c0 in impala's branch
refs/heads/master from Michael Smith
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=d81c4db5e ]
IMPALA-13185: Include runtime filter source in key

Incorporates the build-side PlanNode of a runtime filter in the tuple
cache key to avoid re-using intermediate results that were generated
using a runtime filter on the same target but with different selection
criteria (build-side conjuncts).
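
To illustrate the idea above, here is a minimal sketch of a cache key that
folds in the build-side source of each runtime filter. The PlanNode class,
its fields, and the cache_key function are hypothetical stand-ins for
illustration only; they are not Impala's actual planner classes or key
format.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanNode:
    """Hypothetical stand-in for a planner node (not Impala's class)."""
    name: str
    conjuncts: List[str] = field(default_factory=list)
    children: List["PlanNode"] = field(default_factory=list)
    # Build-side nodes that produce the runtime filters applied at this node.
    filter_sources: List["PlanNode"] = field(default_factory=list)


def cache_key(node: PlanNode) -> str:
    """Hash the node, its children, and the sources of its runtime filters."""
    h = hashlib.sha256()
    h.update(node.name.encode())
    for conjunct in node.conjuncts:
        h.update(b"|conjunct:" + conjunct.encode())
    for child in node.children:
        h.update(b"|child:" + cache_key(child).encode())
    # Without this loop, two probe-side scans that differ only in the
    # build-side conjuncts behind their runtime filter would share a key.
    for source in node.filter_sources:
        h.update(b"|rf-source:" + cache_key(source).encode())
    return h.hexdigest()


# Same probe-side scan, same filter target, different build-side conjuncts.
build_eu = PlanNode("SCAN dim", conjuncts=["dim.region = 'EU'"])
build_us = PlanNode("SCAN dim", conjuncts=["dim.region = 'US'"])
probe_eu = PlanNode("SCAN fact", filter_sources=[build_eu])
probe_us = PlanNode("SCAN fact", filter_sources=[build_us])
```

In this sketch probe_eu and probe_us get different keys because the
conjuncts on the build side differ, even though the probe-side scan itself
is identical.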

We currently don't support caching ExchangeNode, but a common scenario
is a runtime filter produced by a HashJoin, with an Exchange on the
build side. Looks through the first ExchangeNode when determining the
cache key and eligibility of the build-side source of a runtime
filter.
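
One way to read "looks through the first ExchangeNode" is sketched below:
if the node directly under the join's build side is an exchange, use its
child as the filter source; otherwise use the node itself. The Node class,
the "EXCHANGE" kind string, and filter_build_source are assumptions made
for this sketch, not the names used in the actual change.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical plan node with a kind tag and child nodes."""
    kind: str
    children: List["Node"] = field(default_factory=list)


def filter_build_source(build_child: Node) -> Node:
    """Return the node to treat as the runtime filter's build-side source.

    Skips at most one exchange directly below the join's build side; any
    further exchanges are returned as-is.
    """
    if build_child.kind == "EXCHANGE" and build_child.children:
        return build_child.children[0]
    return build_child


scan = Node("SCAN")
exchange = Node("EXCHANGE", [scan])
double_exchange = Node("EXCHANGE", [exchange])
```

Here filter_build_source(exchange) yields the scan, while
filter_build_source(double_exchange) yields the inner exchange, since only
the first exchange is skipped in this sketch.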

Testing shows that all tests in test_tuple_cache_tpc_queries now pass,
except those that hit "TupleCacheNode does not enforce limits itself and
cannot have a limit set."

Adds planner tests covering some scenarios where runtime filters are
expected to match or differ, and custom cluster tests for multi-node
testing.

Change-Id: I0077964be5acdb588d76251a6a39e57a0f42bb5a
Reviewed-on: http://gerrit.cloudera.org:8080/21729
Tested-by: Impala Public Jenkins
Reviewed-by: Joe McDonnell
> Tuple cache keys need to incorporate runtime filter information
> ---------------------------------------------------------------
>
> Key: IMPALA-13185
> URL: https://issues.apache.org/jira/browse/IMPALA-13185
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Affects Versions: Impala 4.5.0
> Reporter: Joe McDonnell
> Assignee: Michael Smith
> Priority: Major
> Fix For: Impala 4.5.0
>
>
> If a runtime filter affects the results of a fragment, then the tuple cache
> key needs to incorporate information about how that runtime filter was
> generated. This needs to include information about the base tables that
> feed the runtime filter.
>
> For example, suppose there is a join. The build side of the join produces a
> runtime filter that gets delivered to the probe side of the join. The tuple
> cache key for the probe side of the join will need to include a
> representation of the runtime filter. If the table on the build side of the
> join changes, the tuple cache key for the probe side needs to change due to
> the possible difference in the runtime filter.
>
> This can also affect eligibility. In theory, the build side of a join could
> be constructed from a source with a limit specified, and this can result in
> non-determinism. If the build of the runtime filter is not deterministic,
> the consumer of the runtime filter is not deterministic either and can't
> participate in tuple caching.
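
The eligibility point in the description can be sketched as a recursive
check: a node is cacheable only if neither it, its children, nor the
sources of its runtime filters are non-deterministic. The Node class and
eligible_for_caching are hypothetical names for this sketch, and treating
any limit as non-deterministic is a simplifying assumption rather than
Impala's actual rule.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical plan node with an optional row limit."""
    name: str
    limit: Optional[int] = None
    children: List["Node"] = field(default_factory=list)
    # Build-side nodes that produce the runtime filters applied at this node.
    filter_sources: List["Node"] = field(default_factory=list)


def eligible_for_caching(node: Node) -> bool:
    """Return False if the node or anything it depends on has a limit."""
    if node.limit is not None:
        # Assumption for this sketch: a limit may return a different subset
        # of rows on each run, so its output is not deterministic.
        return False
    dependencies = node.children + node.filter_sources
    return all(eligible_for_caching(dep) for dep in dependencies)


limited_build = Node("SCAN dim", limit=10)
plain_build = Node("SCAN dim")
probe_with_limited_filter = Node("SCAN fact", filter_sources=[limited_build])
probe_with_plain_filter = Node("SCAN fact", filter_sources=[plain_build])
```

In this sketch the probe-side scan fed by limited_build is rejected even
though the scan itself has no limit, because the non-determinism propagates
through the runtime filter.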