Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20434 )

Change subject: IMPALA-12408: Optimize HdfsScanNode.computeScanRangeLocations()
......................................................................


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/20434/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20434/3//COMMIT_MSG@30
PS3, Line 30: changes how we count the number of partitions
            : per file system
> Does it only change the "implementation" of how we count them or does it al
It only changes the implementation, added mention of this.


http://gerrit.cloudera.org:8080/#/c/20434/3/fe/src/main/java/org/apache/impala/catalog/FeFsTable.java
File fe/src/main/java/org/apache/impala/catalog/FeFsTable.java:

http://gerrit.cloudera.org:8080/#/c/20434/3/fe/src/main/java/org/apache/impala/catalog/FeFsTable.java@550
PS3, Line 550:         if (sampleFileIdxs == null) {
> Could use 'computeIfAbsent()'.
Done


http://gerrit.cloudera.org:8080/#/c/20434/3/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/20434/3/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@a1155
PS3, Line 1155:
> Can 'computeScanRangeLocations()' be called multiple times on the same 'Hdf
I don't know if we would rerun this on any occasion, but I see that other 
counters are also rest, so I added a reset for this too.

Also moved + changed the counter logic to be similar to other counters like 
totalFilesPerFs_


http://gerrit.cloudera.org:8080/#/c/20434/3/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1341
PS3, Line 1341:    * the scan ranges can be (may be ignored if the file is not 
splittable).
> Could mention 'partitionLocation', for example that it must correspond to '
Done


http://gerrit.cloudera.org:8080/#/c/20434/3/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1370
PS3, Line 1370:    * whether the file has a missing disk id and the maximum 
scan range (in bytes) found.
> Could mention 'partitionLocation' and 'fsType', for example that they must 
Done



--
To view, visit http://gerrit.cloudera.org:8080/20434
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icf3e9c169d65c15df6a6762cc68fbb477fe64a7c
Gerrit-Change-Number: 20434
Gerrit-PatchSet: 3
Gerrit-Owner: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Comment-Date: Wed, 30 Aug 2023 14:04:54 +0000
Gerrit-HasComments: Yes

Reply via email to