rok commented on code in PR #241: URL: https://github.com/apache/parquet-format/pull/241#discussion_r1626825231
##########
LogicalTypes.md:
##########
@@ -255,6 +255,16 @@ The primitive type is a 2-byte fixed length binary.
 The sort order for `FLOAT16` is signed (with special handling of NANs and signed zeros); it uses the same [logic](https://github.com/apache/parquet-format#sort-order) as `FLOAT` and `DOUBLE`.
 
+### FIXED_SIZE_LIST
+
+The `FIXED_SIZE_LIST` annotation represents a fixed-size list of elements
+of a primitive data type. It must annotate a `binary` primitive type.

Review Comment:

> Could you please provide a concrete example on how the list is structured? What about their definition & repetition levels? Intuitively, I thought not limit it to binary type. For example, it would be possible to support something like int[N] or double[N] and even multi-dimensional list like int[M][N].

I would represent the fixed sized list as a non-nested `FIXED_LEN_BYTE_ARRAY` + `type` + `num_values`.
Multidimensional lists/arrays bring much more complexity that I'm not sure makes sense to store as a logical type (see [FixedShapeTensor in Arrow](https://arrow.apache.org/docs/format/CanonicalExtensions.html#fixed-shape-tensor)).
Also see https://github.com/apache/parquet-format/pull/241#issuecomment-2148750273.

> Perhaps use `byte_array` in this PR (see #251).

Done.
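
To make the `FIXED_LEN_BYTE_ARRAY` + `type` + `num_values` idea concrete, here is a minimal Python sketch of how one list value could be laid out as a single fixed-length byte string. It is an illustration only, not the encoding proposed in the PR: the helper names `pack_fixed_size_list` and `unpack_fixed_size_list` are hypothetical, and the little-endian, tightly packed layout is an assumption.

```python
import struct

# Hypothetical helpers: the names and the byte layout are assumptions made
# for illustration; the PR does not define them.

def pack_fixed_size_list(values, fmt_char="f"):
    """Pack numbers into one little-endian byte string.

    fmt_char is a struct format character standing in for the element
    `type` ("f" = 4-byte float, "d" = 8-byte float, "i" = 4-byte int).
    """
    return struct.pack(f"<{len(values)}{fmt_char}", *values)


def unpack_fixed_size_list(data, num_values, fmt_char="f"):
    """Decode a byte string produced by pack_fixed_size_list."""
    expected_len = num_values * struct.calcsize(f"<{fmt_char}")
    if len(data) != expected_len:
        raise ValueError(f"expected {expected_len} bytes, got {len(data)}")
    return list(struct.unpack(f"<{num_values}{fmt_char}", data))


# A float[3] value occupies 3 * 4 = 12 bytes, so the column would be a
# 12-byte FIXED_LEN_BYTE_ARRAY with type = float and num_values = 3.
vec = [1.0, 2.5, -3.0]
blob = pack_fixed_size_list(vec)
decoded = unpack_fixed_size_list(blob, num_values=3)
```

Because every value has the same byte length, the column stays non-nested: a reader only needs the element type and `num_values` to decode it, and no repetition levels are required for the list itself.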
