Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18531 )

Change subject: IMPALA-10453: Support file pruning via runtime filters on 
Iceberg
......................................................................


Patch Set 10:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/18531/10/be/src/exec/file-metadata-utils.h
File be/src/exec/file-metadata-utils.h:

http://gerrit.cloudera.org:8080/#/c/18531/10/be/src/exec/file-metadata-utils.h@42
PS10, Line 42:   void Open(RuntimeState* state, const HdfsFileDesc* file_desc);
I think that the name "Open" is a bit misleading as this function can be called 
more than once in the lifetime of a  FileMetadataUtils (without calling reset 
or close).

I have 3 ideas to improve this:
1. add a comment about Open being called more than once
2. change the name to something like SetFile()
3. change the usage of this class to be be instantiated where we call Open() 
now, so that the constructor can get this params.


http://gerrit.cloudera.org:8080/#/c/18531/9/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
File fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java:

http://gerrit.cloudera.org:8080/#/c/18531/9/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@116
PS9, Line 116:     // Increase the per instance memory estimate with the tuples 
created for runtime file
             :     // filtering.
I am a bit confused about why do have to increase the mem estimate - do we need 
to hold the tuples for all files in memory at the same time?  Don't we have a 
tuple per-partition, not per-file? Also, this was added in a previous commit, 
right?


http://gerrit.cloudera.org:8080/#/c/18531/9/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@120
PS9, Line 120: avgRowSize_
Can you add a Precondition that this is not negative?

Note that we are overestimating the memory here, as avgRowSize also contains 
strings, and those won't be created for non-partition columns.


http://gerrit.cloudera.org:8080/#/c/18531/9/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@123
PS9, Line 123: setMinMemReservationBytes
Shouldn't we increase min reservation similarly to the estimate?



--
To view, visit http://gerrit.cloudera.org:8080/18531
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7762e1238bdf236b85d2728881a402a2bb41f36a
Gerrit-Change-Number: 18531
Gerrit-PatchSet: 10
Gerrit-Owner: Tamas Mate <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Gergely Fürnstáhl <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Tamas Mate <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Mon, 01 Aug 2022 09:45:13 +0000
Gerrit-HasComments: Yes

Reply via email to