In the context of my PR trying to encode the consensus that records can't
span page boundaries[1], Antoine brought up the excellent point[2] that the
format[3] seems to use the terms "records" and "rows" to refer to the same
concept.

I agree it would clarify the spec to use the same terminology throughout.
Given that there are several fields named `num_rows`, I propose changing
parquet.thrift to use the term "row" throughout.

I can make another PR to do so if this seems like a good idea.

Andrew
(P.S. The PR[1] is still waiting on more review and merging :pray:)

[1] https://github.com/apache/parquet-format/pull/244
[2] https://github.com/apache/parquet-format/pull/244#discussion_r1617320495
[3] https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift
