Hello Andrew Sherman, Riza Suminto, Impala Public Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/20523
to look at the new patch set (#5).
Change subject: IMPALA-12477: Make Iceberg planFiles() use multiple threads
......................................................................
IMPALA-12477: Make Iceberg planFiles() use multiple threads
Impala is not using Iceberg’s planFiles() API in a performant way:
https://github.com/apache/impala/blob/2d3289027c2ffdd245d13b60e6fa3f9b3e7bf833/fe/src/main/java/org/apache/impala/catalog/iceberg/GroupedContentFiles.java#L46
Instead of a for-loop, we should use forEach(), as Hive does:
https://github.com/apache/hive/blob/071b721d8d73cc4d5d2d9469d7953bdc75ff615f/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/mapreduce/IcebergInputFormat.java#L222
planFiles() returns an org.apache.iceberg.util.ParallelIterable object.
Its forEach() method spreads the work across multiple threads. This
not only improves table loading times, but also speeds up queries that
use planFiles(), e.g. queries that push down predicates to Iceberg and
time-travel queries.
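
To illustrate the difference between the two iteration styles, here is a
minimal sketch. It does not use the Iceberg API: a plain java.lang.Iterable
stands in for the object returned by planFiles(), and the class and method
names (PlanFilesSketch, collectWithLoop, collectWithForEach) are hypothetical.
The sketch assumes that the forEach() callback may be invoked from worker
threads, so it collects into a thread-safe queue rather than a plain list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical illustration only; not the code changed by this patch.
class PlanFilesSketch {
  // Old style: an enhanced for-loop pulls items one by one through the
  // iterator on the calling thread.
  static <T> List<T> collectWithLoop(Iterable<T> source) {
    List<T> result = new ArrayList<>();
    for (T item : source) {
      result.add(item);
    }
    return result;
  }

  // New style: hand a callback to forEach(). Assumption: the Iterable may
  // run the callback on several threads, so the sink must be thread-safe.
  static <T> List<T> collectWithForEach(Iterable<T> source) {
    Queue<T> sink = new ConcurrentLinkedQueue<>();
    source.forEach(sink::add);
    return new ArrayList<>(sink);
  }
}
```

With an ordinary single-threaded Iterable both methods return the same
elements in the same order. If the callback does run on multiple threads,
the order of the collected elements should not be relied upon.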
Change-Id: I00db941dd5ac9917cd91d990fccf37e5bcfddbfc
---
M fe/src/main/java/org/apache/impala/catalog/iceberg/GroupedContentFiles.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
2 files changed, 44 insertions(+), 29 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/20523/5
--
To view, visit http://gerrit.cloudera.org:8080/20523
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I00db941dd5ac9917cd91d990fccf37e5bcfddbfc
Gerrit-Change-Number: 20523
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: Andrew Sherman <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>