In the V2 data page header, we have:

* num_values
* num_rows
* num_nulls

In the V1 data page header, by contrast, we only have "num_values".

For a page representing a list column, e.g. [[0, 1], None, [2, None, 3]], how should each of these numbers be written in v1 and v2?

My current understanding from the docs is that, for the example above, we should write:

v2:
* num_values: 6
* num_rows: 3
* num_nulls: 2

v1:
* num_values: 6

But I am not sure this is correct. For example, pyarrow==4.0.0 writes:

v2:
* num_values: 6
* num_nulls: 1
* num_rows: 6

v1:
* num_values: 6

Is there any reference for this? Are the extra numbers in v2 necessary to read a page? My understanding is that (compressed_size, uncompressed_size, num_values) is enough to read everything.

Best,
Jorge
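
P.S. For concreteness, the counting behind the first set of numbers above (6, 3, 2) can be sketched in Python as follows. This is only a sketch of that interpretation, not a statement of what the spec requires: the function name count_page_fields is hypothetical and is not part of pyarrow or any Parquet library, and empty lists are deliberately left out because the question does not cover them.

```python
from typing import Dict, Optional, Sequence

# One row of a nullable list<nullable int> column.
Row = Optional[Sequence[Optional[int]]]


def count_page_fields(rows: Sequence[Row]) -> Dict[str, int]:
    """Count header fields under the interpretation in the question (an assumption)."""
    num_values = 0  # one slot per entry, nulls included
    num_nulls = 0   # null lists plus null elements
    for row in rows:
        if row is None:
            # Assumption: a null list occupies one slot and counts as one null.
            num_values += 1
            num_nulls += 1
        elif len(row) == 0:
            # Empty lists are outside the scope of this sketch.
            raise ValueError("empty lists are not handled in this sketch")
        else:
            num_values += len(row)
            num_nulls += sum(1 for item in row if item is None)
    return {
        "num_values": num_values,
        "num_rows": len(rows),  # one row per top-level list
        "num_nulls": num_nulls,
    }


print(count_page_fields([[0, 1], None, [2, None, 3]]))
# {'num_values': 6, 'num_rows': 3, 'num_nulls': 2}
```

Under these assumptions the sketch gives num_values=6, num_rows=3 and num_nulls=2, which differs from the num_nulls=1 and num_rows=6 that pyarrow==4.0.0 writes.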
