Csaba Ringhofer has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/23958 )

Change subject: IMPALA-14734: Optimize sorting file descriptors during planning
......................................................................

IMPALA-14734: Optimize sorting file descriptors during planning

IcebergScanNode sorts the file descriptors by path (IMPALA-12765).
This can dominate planning time if there are many files.

This change speeds up the sort by avoiding the extraction of Java
Strings from the FlatBuffer, which involves UTF-8 decoding. It also
changes a few similar functions to avoid duplicate decoding.
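
The general idea of ordering paths without materializing Java Strings
can be sketched as below. This is only an illustration under stated
assumptions, not the code from this change: Utf8PathSortSketch,
compareUtf8 and sortPaths are hypothetical names, and plain ByteBuffers
stand in for the flatbuffer-backed path data. Unsigned byte order of
UTF-8 matches Unicode code point order, which can differ from
String.compareTo for characters outside the Basic Multilingual Plane,
so a real implementation would need to decide whether that matters.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: sort UTF-8 encoded paths without decoding them to Strings.
final class Utf8PathSortSketch {
  // Compares the remaining bytes of two buffers as unsigned values.
  // Uses absolute gets, so the buffers' positions are left untouched.
  static int compareUtf8(ByteBuffer a, ByteBuffer b) {
    int aPos = a.position();
    int bPos = b.position();
    int n = Math.min(a.remaining(), b.remaining());
    for (int i = 0; i < n; i++) {
      int cmp = Integer.compare(a.get(aPos + i) & 0xFF, b.get(bPos + i) & 0xFF);
      if (cmp != 0) return cmp;
    }
    // All shared bytes are equal: the shorter sequence sorts first.
    return Integer.compare(a.remaining(), b.remaining());
  }

  // Encodes the inputs once (standing in for bytes that are already UTF-8
  // in a serialized buffer), sorts on raw bytes, and decodes only the result.
  static List<String> sortPaths(List<String> paths) {
    List<ByteBuffer> encoded = new ArrayList<>();
    for (String p : paths) {
      encoded.add(ByteBuffer.wrap(p.getBytes(StandardCharsets.UTF_8)));
    }
    encoded.sort(Utf8PathSortSketch::compareUtf8);
    List<String> sorted = new ArrayList<>();
    for (ByteBuffer buf : encoded) {
      sorted.add(StandardCharsets.UTF_8.decode(buf.duplicate()).toString());
    }
    return sorted;
  }

  public static void main(String[] args) {
    // Example paths are made up for illustration.
    System.out.println(sortPaths(Arrays.asList(
        "s3://b/data/f2.parquet", "s3://b/data/f10.parquet", "s3://b/data/f1.parquet")));
  }
}
```

In this sketch the comparator performs no decoding at all during the
sort; decoding happens once per path at the end, only to display the
result.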

For a table with ~1 million files:
explain select * from bigice limit 1;
before: ~12s
after: ~6.5s

Change-Id: Icb914eb4de7bdadeb876f7dd101e8737b9527b6f
Reviewed-on: http://gerrit.cloudera.org:8080/23958
Reviewed-by: Csaba Ringhofer <[email protected]>
Tested-by: Csaba Ringhofer <[email protected]>
---
M fe/src/main/java/org/apache/impala/catalog/FileDescriptor.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
2 files changed, 23 insertions(+), 8 deletions(-)

Approvals:
  Csaba Ringhofer: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/23958

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Icb914eb4de7bdadeb876f7dd101e8737b9527b6f
Gerrit-Change-Number: 23958
Gerrit-PatchSet: 3
Gerrit-Owner: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Peter Rozsa <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
