Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21959 )

Change subject: IMPALA-13370: Read Puffin stats from metadata.json property if 
available
......................................................................


Patch Set 6:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/21959/4/fe/src/main/java/org/apache/impala/catalog/PuffinStatsLoader.java
File fe/src/main/java/org/apache/impala/catalog/PuffinStatsLoader.java:

http://gerrit.cloudera.org:8080/#/c/21959/4/fe/src/main/java/org/apache/impala/catalog/PuffinStatsLoader.java@62
PS4, Line 62: public final boolean isFromMetadataJson
> What is the relevance of having this field? Can we drop it?
In the new patch set it's needed to determine whether we need to log a warning 
for duplicate stats when determining which blobs to read from a Puffin file.


http://gerrit.cloudera.org:8080/#/c/21959/4/fe/src/main/java/org/apache/impala/catalog/PuffinStatsLoader.java@83
PS4, Line 83:     for (StatisticsFile statsFile : statsFiles) {
> Why do we maintain this map? Isn't it enough later to check that fieldId is
We need a way to decide whether we need to open the Puffin file at all. For 
that, we want to know that it contains field ids for which there is no NDV 
property in the metadata. But we can go through the blobs from the 
metadata.json file again to create that list. Doing that in the new patch.


http://gerrit.cloudera.org:8080/#/c/21959/4/fe/src/main/java/org/apache/impala/catalog/PuffinStatsLoader.java@85
PS4, Line 85:
> Same here, I don't think we need to maintain this list.
Done


http://gerrit.cloudera.org:8080/#/c/21959/4/fe/src/main/java/org/apache/impala/catalog/PuffinStatsLoader.java@197
PS4, Line 197:
> nit: indentation is off by 2
Done


http://gerrit.cloudera.org:8080/#/c/21959/4/fe/src/main/java/org/apache/impala/catalog/PuffinStatsLoader.java@221
PS4, Line 221:
> nit: needs +2 spaces
Done


http://gerrit.cloudera.org:8080/#/c/21959/4/fe/src/main/java/org/apache/impala/catalog/PuffinStatsLoader.java@268
PS4, Line 268: ie
> nit: indentation is off by 2
Done


http://gerrit.cloudera.org:8080/#/c/21959/4/java/puffin-data-generator/src/main/java/org/apache/impala/puffindatagenerator/PuffinDataGenerator.java
File 
java/puffin-data-generator/src/main/java/org/apache/impala/puffindatagenerator/PuffinDataGenerator.java:

http://gerrit.cloudera.org:8080/#/c/21959/4/java/puffin-data-generator/src/main/java/org/apache/impala/puffindatagenerator/PuffinDataGenerator.java@438
PS4, Line 438:
> nit: redundant spaces
Done



--
To view, visit http://gerrit.cloudera.org:8080/21959
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e92056ce97c4849742db6309562af3b575f647b
Gerrit-Change-Number: 21959
Gerrit-PatchSet: 6
Gerrit-Owner: Daniel Becker <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Peter Rozsa <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Thu, 21 Nov 2024 13:06:42 +0000
Gerrit-HasComments: Yes

Reply via email to