Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21959 )

Change subject: IMPALA-13370: Read Puffin stats from metadata.json property if 
available
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21959/1/fe/src/main/java/org/apache/impala/catalog/PuffinStatsLoader.java
File fe/src/main/java/org/apache/impala/catalog/PuffinStatsLoader.java:

http://gerrit.cloudera.org:8080/#/c/21959/1/fe/src/main/java/org/apache/impala/catalog/PuffinStatsLoader.java@83
PS1, Line 83:     for (StatisticsFile statsFile : statsFiles) {
            :       loader.loadStatsFromFile(statsFile);
            :     }
> The logic looks good to me but I think it could be made much simpler by hav
What if there are two StatisticsFile object in the metadata, with the same 
stats, and the first one doesn't have NDV in properties, but the second one 
has? (e.g. two different engines created Puffin files for the same snapshot)

IIUC with the original logic we read the statistic from the blob, even though 
we could retrieve the NDV from the metadata. We could also have a test for this.



--
To view, visit http://gerrit.cloudera.org:8080/21959
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e92056ce97c4849742db6309562af3b575f647b
Gerrit-Change-Number: 21959
Gerrit-PatchSet: 1
Gerrit-Owner: Daniel Becker <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Peter Rozsa <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Tue, 19 Nov 2024 17:00:30 +0000
Gerrit-HasComments: Yes

Reply via email to