Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/19793 )

Change subject: IMPALA-6433: Add read support for PageHeaderV2
......................................................................


Patch Set 11:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/19793/11/be/src/exec/parquet/parquet-column-chunk-reader.h
File be/src/exec/parquet/parquet-column-chunk-reader.h:

http://gerrit.cloudera.org:8080/#/c/19793/11/be/src/exec/parquet/parquet-column-chunk-reader.h@123
PS11, Line 123: till
"Till" could be understood as "until but not including", but here we do read the
page header. Maybe omitting "till" would be clearer?


http://gerrit.cloudera.org:8080/#/c/19793/8/be/src/exec/parquet/parquet-column-chunk-reader.cc
File be/src/exec/parquet/parquet-column-chunk-reader.cc:

http://gerrit.cloudera.org:8080/#/c/19793/8/be/src/exec/parquet/parquet-column-chunk-reader.cc@338
PS8, Line 338:     // https://github.com/apache/parquet-format/blob/2a481fe1aad64ff770e21734533bb7ef5a057dac/src/main/thrift/parquet.thrift#L578
> This error message is a bit misleading, I would think we tried  to read 'co
What do you think about this?


http://gerrit.cloudera.org:8080/#/c/19793/11/be/src/exec/parquet/parquet-column-chunk-reader.cc
File be/src/exec/parquet/parquet-column-chunk-reader.cc:

http://gerrit.cloudera.org:8080/#/c/19793/11/be/src/exec/parquet/parquet-column-chunk-reader.cc@259
PS11, Line 259: num_bytes
It would be easier to understand if we called it 'num_rep_level_bytes' and the 
variable on L271 'num_def_level_bytes'.


http://gerrit.cloudera.org:8080/#/c/19793/11/be/src/exec/parquet/parquet-column-chunk-reader.cc@318
PS11, Line 318: write likes
Nit: writes like.


http://gerrit.cloudera.org:8080/#/c/19793/8/be/src/exec/parquet/parquet-column-readers.cc
File be/src/exec/parquet/parquet-column-readers.cc:

http://gerrit.cloudera.org:8080/#/c/19793/8/be/src/exec/parquet/parquet-column-readers.cc@1145
PS8, Line 1145: Status BaseScalarColumnReader::ReadDataPage() {
> I thought about this too but wouldn't to it in this patch. The difference o
Ok.


http://gerrit.cloudera.org:8080/#/c/19793/11/be/src/exec/parquet/parquet-level-decoder.h
File be/src/exec/parquet/parquet-level-decoder.h:

http://gerrit.cloudera.org:8080/#/c/19793/11/be/src/exec/parquet/parquet-level-decoder.h@57
PS11, Line 57: bytes
Nit: "number of bytes" would be clearer.


http://gerrit.cloudera.org:8080/#/c/19793/8/be/src/exec/parquet/parquet-level-decoder.h
File be/src/exec/parquet/parquet-level-decoder.h:

http://gerrit.cloudera.org:8080/#/c/19793/8/be/src/exec/parquet/parquet-level-decoder.h@56
PS8, Line 56:
> done - int32_t vs int is used a bit chaotically in Impala and I think that
Yeah, but let's do it the right way when we can.


http://gerrit.cloudera.org:8080/#/c/19793/11/testdata/datasets/functional/functional_schema_template.sql
File testdata/datasets/functional/functional_schema_template.sql:

http://gerrit.cloudera.org:8080/#/c/19793/11/testdata/datasets/functional/functional_schema_template.sql@4099
PS11, Line 4099: alltypesagg_parquet_v2_uncompressed
Do we still need 'alltypesagg_parquet_v2' on L4074?



--
To view, visit http://gerrit.cloudera.org:8080/19793
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I282962a6e4611e2b662c04a81592af83ecaf08ca
Gerrit-Change-Number: 19793
Gerrit-PatchSet: 11
Gerrit-Owner: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Comment-Date: Thu, 04 May 2023 09:51:49 +0000
Gerrit-HasComments: Yes
