Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3909: Populate min/max statistics in Parquet writer
......................................................................


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5611/9/be/src/exec/parquet-column-stats.h
File be/src/exec/parquet-column-stats.h:

Line 127:       // statistics behavior from any implicit behavior of the types?
> If Parquet's and Impala's ordering were "roughly the same", then we would n
My concern is that min()/max() use overloaded operators, e.g. like the ones in 
timestamp-value.h - so any changes or additions get automatically picked up 
here via an implicit mechanism.

It seems like it would make it easy for someone to accidentally break our 
Parquet implementation by changing a seemingly-unrelated bit of code. I think 
the static check above helps a bit but my instinct is to be as explicit as 
possible.

Another way I was thinking about it is that the conceptual distinction between 
Parquet's ordering and ours is important and the code should reflect that in 
some way.

We'll have to see what happens with parquet-format. I agree that it makes 
things tricky if the ordering diverges at all.


-- 
To view, visit http://gerrit.cloudera.org:8080/5611
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8368ee58daa50c07a3b8ef65be70203eb941f619
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Michael Brown <[email protected]>
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Zoltan Ivanfi <[email protected]>
Gerrit-HasComments: Yes

Reply via email to