[ https://issues.apache.org/jira/browse/IMPALA-13437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18021270#comment-18021270 ]
ASF subversion and git services commented on IMPALA-13437:
----------------------------------------------------------

Commit 3181fe18006e392e0ce3f2f48fe285569ccfd148 in impala's branch refs/heads/master from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=3181fe180 ]

IMPALA-13437 (part 1): Compute processing cost before TupleCachePlanner

This is a preparatory change for cost-based placement for
TupleCacheNodes. It reorders planning so that the processing cost
and filtered cardinality are calculated before running the
TupleCachePlanner. This computes the processing cost when
enable_tuple_cache=true. It also displays the cost information
in the explain plan output when enable_tuple_cache=true. This does
not impact the adjustment of fragment parallelism, which continues
to be controlled by the compute_processing_cost option.

This uses the processing cost to calculate a cumulative processing
cost in the TupleCacheInfo. This is all of the processing cost below
this point including other fragments. This is an indicator of how
much processing a cache hit could avoid. This does not accumulate the
cost when merging the TupleCacheInfo due to a runtime filter, as that
cost is not actually being avoided. This also computes the estimated
serialized size for the TupleCacheNode based on the filtered
cardinality and the row size.
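
The cumulative cost and the serialized size estimate described above can be sketched roughly as follows. This is an illustration in Python, not Impala's Java frontend code: the PlanNode class, its field names, and both functions are hypothetical, and the size estimate assumes a plain product of filtered cardinality and average row size, which the commit message does not spell out.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanNode:
    """Hypothetical stand-in for a plan node; not Impala's PlanNode class."""
    name: str
    processing_cost: int        # cost of this node alone
    filtered_cardinality: int   # estimated rows after filters are applied
    avg_row_size: float         # estimated bytes per row
    # Inputs below this node; may live in other fragments.
    children: List["PlanNode"] = field(default_factory=list)
    # Nodes merged in only because they produce a runtime filter for this node.
    runtime_filter_sources: List["PlanNode"] = field(default_factory=list)


def cumulative_processing_cost(node: PlanNode) -> int:
    """Sum the cost of this node and everything below it.

    Runtime filter sources are deliberately skipped: a cache hit at this
    node would not avoid their cost, so it is not accumulated.
    """
    return node.processing_cost + sum(
        cumulative_processing_cost(child) for child in node.children
    )


def estimated_serialized_size(node: PlanNode) -> int:
    """Assumed estimate: filtered cardinality times average row size, in bytes."""
    return int(node.filtered_cardinality * node.avg_row_size)


# Example: an aggregation over a scan that receives a runtime filter
# produced from a separate build-side scan.
build_scan = PlanNode("build_scan", 40, 500, 8.0)
probe_scan = PlanNode("probe_scan", 100, 1000, 16.0,
                      runtime_filter_sources=[build_scan])
agg = PlanNode("agg", 20, 10, 24.0, children=[probe_scan])

print(cumulative_processing_cost(agg))        # 20 + 100; build_scan is excluded
print(estimated_serialized_size(probe_scan))  # 1000 rows * 16.0 bytes
```

A high cumulative cost combined with a small estimated size would mark a location where a cache hit avoids a lot of work for little storage, which is the kind of trade-off the cost-based placement in this issue is meant to weigh.
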

Testing:
 - Ran a core job

Change-Id: If78f5d002b0e079eef1eece612f0d4fefde545c7
Reviewed-on: http://gerrit.cloudera.org:8080/23164
Reviewed-by: Yida Wu <wydbaggio...@gmail.com>
Reviewed-by: Michael Smith <michael.sm...@cloudera.com>
Tested-by: Michael Smith <michael.sm...@cloudera.com>


> Improve heuristics for placing the tuple cache nodes
> ----------------------------------------------------
>
>                 Key: IMPALA-13437
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13437
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>    Affects Versions: Impala 4.5.0
>            Reporter: Joe McDonnell
>            Assignee: Joe McDonnell
>            Priority: Major
>
> Improve placement of tuple cache nodes by considering:
> # Selectivity
> # Result Size
> # Operator cost
> # Data change frequency (maybe followup)
> # Etc
> This should avoid caching large results that don't have a major performance
> improvement.