rok commented on code in PR #241:
URL: https://github.com/apache/parquet-format/pull/241#discussion_r1626825231


##########
LogicalTypes.md:
##########
@@ -255,6 +255,16 @@ The primitive type is a 2-byte fixed length binary.
 
 The sort order for `FLOAT16` is signed (with special handling of NANs and 
signed zeros); it uses the same 
[logic](https://github.com/apache/parquet-format#sort-order) as `FLOAT` and 
`DOUBLE`.
 
+### FIXED_SIZE_LIST
+
+The `FIXED_SIZE_LIST` annotation represents a fixed-size list of elements
+of a primitive data type. It must annotate a `binary` primitive type.

Review Comment:
   > Could you please provide a concrete example on how the list is structured? 
What about their definition & repetition levels? Intuitively, I thought not 
limit it to binary type. For example, it would be possible to support something 
like int[N] or double[N] and even multi-dimensional list like int[M][N].
   
   I would represent the fixed-size list as a non-nested 
`FIXED_LEN_BYTE_ARRAY` + `type` + `num_values`. Multidimensional lists/arrays 
add enough complexity that I'm not sure they make sense to store as a logical 
type (see [FixedShapeTensor in 
Arrow](https://arrow.apache.org/docs/format/CanonicalExtensions.html#fixed-shape-tensor)).
 Also see 
https://github.com/apache/parquet-format/pull/241#issuecomment-2148750273.
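   
To make the non-nested `FIXED_LEN_BYTE_ARRAY` + `type` + `num_values` idea concrete, here is a minimal Python sketch of how one list value could be packed into a single fixed-length byte string. It is not the layout specified in this PR: the little-endian byte order, the `ELEMENT_FORMATS` table, and the helper names `encode_fixed_size_list` and `decode_fixed_size_list` are all assumptions made for illustration.

```python
import struct

# Hypothetical mapping from element type name to a struct format code.
# This table is an assumption for illustration, not part of the PR.
ELEMENT_FORMATS = {"INT32": "i", "INT64": "q", "FLOAT": "f", "DOUBLE": "d"}


def _layout(element_type, num_values):
    # "<" selects little-endian with standard sizes (assumed byte order).
    return "<" + str(num_values) + ELEMENT_FORMATS[element_type]


def encode_fixed_size_list(values, element_type, num_values):
    """Pack exactly num_values elements into one fixed-length byte string."""
    if len(values) != num_values:
        raise ValueError("expected %d values, got %d" % (num_values, len(values)))
    return struct.pack(_layout(element_type, num_values), *values)


def decode_fixed_size_list(buf, element_type, num_values):
    """Unpack a fixed-length byte string back into a Python list."""
    fmt = _layout(element_type, num_values)
    if len(buf) != struct.calcsize(fmt):
        raise ValueError("buffer length does not match type and num_values")
    return list(struct.unpack(fmt, buf))


encoded = encode_fixed_size_list([1.0, 2.0, 3.0], "FLOAT", 3)
decoded = decode_fixed_size_list(encoded, "FLOAT", 3)
```

Under these assumptions every value occupies `num_values * sizeof(type)` bytes, so there is no nesting and no repetition levels are needed for the elements themselves.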
   
   > Perhaps use `byte_array` in this PR (see #251).
   
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

