Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/23074 )

Change subject: IMPALA-13898: Incorporate partition information into tuple cache keys
......................................................................


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/23074/6/be/src/exec/tuple-cache-node.cc
File be/src/exec/tuple-cache-node.cc:

http://gerrit.cloudera.org:8080/#/c/23074/6/be/src/exec/tuple-cache-node.cc@499
PS6, Line 499: for (const ExecNode* exec_node : scan_nodes) {
> This seems like it could be large enough we might want to construct a map f
Good point, done


http://gerrit.cloudera.org:8080/#/c/23074/6/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/23074/6/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@2850
PS6, Line 2850: for (TScanRangeLocationList origLocList: orig.concrete_ranges) {
> I don't have a good intuition for how large this list can get. But this cod
It can get quite large for tables with a lot of files. We could incorporate this hash into the metadata that the catalog maintains. We'd compute it once when the table is modified and then just grab it. I'd need to think about the exact implications of that.



--
To view, visit http://gerrit.cloudera.org:8080/23074
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a7109fcf8a30bf915bb566f7d642f8037793a8c
Gerrit-Change-Number: 23074
Gerrit-PatchSet: 6
Gerrit-Owner: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Kurt Deschler <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Yida Wu <[email protected]>
Gerrit-Comment-Date: Tue, 08 Jul 2025 00:27:24 +0000
Gerrit-HasComments: Yes
