mapleFU commented on code in PR #466:
URL: https://github.com/apache/parquet-format/pull/466#discussion_r1854389700
##########
LogicalTypes.md:
##########
@@ -684,44 +702,68 @@ optional group my_list (LIST) {
}
```
-Some existing data does not include the inner element layer. For
-backward-compatibility, the type of elements in `LIST`-annotated structures
-should always be determined by the following rules:
+##### 2-level structure
+
+Some existing data does not include the inner element layer, resulting in a
+`LIST` that annotates a 2-level structure. Unlike the 3-level structure, the
+repetition of a 2-level structure can be `optional`, `required`, or `repeated`.
Review Comment:
"the repetition of a 2-level structure can be": does this refer to the outer
`<list-repetition>`? Should we say explicitly that it is the outermost level?
##########
LogicalTypes.md:
##########
@@ -684,44 +702,68 @@ optional group my_list (LIST) {
}
```
-Some existing data does not include the inner element layer. For
-backward-compatibility, the type of elements in `LIST`-annotated structures
-should always be determined by the following rules:
+##### 2-level structure
+
+Some existing data does not include the inner element layer, resulting in a
+`LIST` that annotates a 2-level structure. Unlike the 3-level structure, the
+repetition of a 2-level structure can be `optional`, `required`, or `repeated`.
+When it is `repeated`, the `LIST`-annotated 2-level structure can only serve as
+an element within another `LIST`-annotated 2-level structure.
+
+```
+<list-repetition> group <name> (LIST) {
+ repeated <element-type> <element-name>;
+}
+```
+
+For backward-compatibility, the type of elements in `LIST`-annotated structures
Review Comment:
So should a reader follow this order?
1. Try to parse as a 3-level structure
2. Try to parse as a legacy 2-level structure
3. Try to parse as a legacy 1-level structure
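To make the question concrete, here is a rough Python sketch of how I read the numbered backward-compatibility rules quoted in these hunks. It only reports which rule matches a given repeated field; it is not taken from any Parquet implementation. The `Field` class and the `matching_rule` function are hypothetical names invented for illustration, and the sketch assumes every group has at least one child.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Field:
    """Hypothetical, minimal model of a schema field (illustration only)."""
    name: str
    repetition: str  # "required", "optional", or "repeated"
    children: Optional[List["Field"]] = None  # None means a primitive field


def matching_rule(list_name: str, repeated: Field) -> int:
    """Return the number of the first rule that matches the repeated field
    found directly inside a LIST-annotated group named `list_name`."""
    # Rule 1: the repeated field is not a group.
    if repeated.children is None:
        return 1
    # Rule 2: the repeated field is a group with multiple fields.
    if len(repeated.children) > 1:
        return 2
    # Rule 3: a group with one field whose own repetition is `repeated`.
    if len(repeated.children) == 1 and repeated.children[0].repetition == "repeated":
        return 3
    # Rule 4: a group with one field, named `array` or `<list_name>_tuple`.
    if len(repeated.children) == 1 and repeated.name in ("array", list_name + "_tuple"):
        return 4
    # Rule 5: otherwise.
    return 5


# Example: `repeated int32 element;` inside `my_list (LIST)`.
primitive = Field("element", "repeated")
# Example: `repeated group list { optional int32 element; }`.
wrapper = Field("list", "repeated", [Field("element", "optional")])
rules = (matching_rule("my_list", primitive), matching_rule("my_list", wrapper))
```

If this reading is right, rules 1 to 4 are the cases where the repeated field itself is the element, and only the fall-through case treats the repeated group as a wrapper layer. It would help if the text stated that order explicitly.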
##########
LogicalTypes.md:
##########
@@ -684,44 +702,68 @@ optional group my_list (LIST) {
}
```
-Some existing data does not include the inner element layer. For
-backward-compatibility, the type of elements in `LIST`-annotated structures
-should always be determined by the following rules:
+##### 2-level structure
+
+Some existing data does not include the inner element layer, resulting in a
+`LIST` that annotates a 2-level structure. Unlike the 3-level structure, the
+repetition of a 2-level structure can be `optional`, `required`, or `repeated`.
+When it is `repeated`, the `LIST`-annotated 2-level structure can only serve as
+an element within another `LIST`-annotated 2-level structure.
+
+```
+<list-repetition> group <name> (LIST) {
+ repeated <element-type> <element-name>;
+}
+```
+
+For backward-compatibility, the type of elements in `LIST`-annotated structures
+should always be determined by the following rules if they cannot be determined
+as 3-level structures:
1. If the repeated field is not a group, then its type is the element type and
elements are required.
2. If the repeated field is a group with multiple fields, then its type is the
element type and elements are required.
-3. If the repeated field is a group with one field and is named either `array`
+3. If the repeated field is a group with one field and the repetition of that
+ field is `repeated`, then its type is the element type and elements are
Review Comment:
I found that saying the "repeated field" is `repeated` is ambiguous here...
##########
LogicalTypes.md:
##########
@@ -609,9 +609,20 @@ that is neither contained by a `LIST`- or `MAP`-annotated group nor annotated
by `LIST` or `MAP` should be interpreted as a required list of required
elements where the element type is the type of the field.
-Implementations should use either `LIST` and `MAP` annotations _or_ unannotated
-repeated fields, but not both. When using the annotations, no unannotated
-repeated types are allowed.
+```
+// List<Integer> (non-null list, non-null elements)
+repeated int32 num;
+
+// List<Tuple<Integer, String>> (non-null list, non-null elements)
+repeated group my_list {
+ required int32 num;
+ optional binary str (STRING);
+}
+```
+
+For all fields in the schema, implementations should use either `LIST` and
Review Comment:
👍 I think this is necessary; I hope implementations will follow this rule...
##########
LogicalTypes.md:
##########
@@ -684,49 +697,76 @@ optional group my_list (LIST) {
}
```
-Some existing data does not include the inner element layer. For
-backward-compatibility, the type of elements in `LIST`-annotated structures
-should always be determined by the following rules:
+##### 2-level structure
+
+Some existing data does not include the inner element layer, meaning that `LIST`
+annotates a 2-level structure. In contrast to 3-level structure, the repetition
+of 2-level structure can be `optional`, `required`, or `repeated`.
+
+```
+<list-repetition> group <name> (LIST) {
+ repeated <element-type> <element-name>;
+}
+```
+
+For backward-compatibility, the type of elements in `LIST`-annotated 2-level
+structures should always be determined by the following rules:
1. If the repeated field is not a group, then its type is the element type and
elements are required.
2. If the repeated field is a group with multiple fields, then its type is the
element type and elements are required.
-3. If the repeated field is a group with one field and is named either `array`
+3. If the repeated field is a group with a `repeated` field, then the repeated
+ field is the element type because the type cannot be a 3-level list.
+4. If the repeated field is a group with one field and is named either `array`
or uses the `LIST`-annotated group's name with `_tuple` appended then the
repeated type is the element type and elements are required.
-4. Otherwise, the repeated field's type is the element type with the repeated
+5. Otherwise, the repeated field's type is the element type with the repeated
field's repetition.
Examples that can be interpreted using these rules:
```
-// List<Integer> (nullable list, non-null elements)
+// Rule 1: List<Integer> (nullable list, non-null elements)
optional group my_list (LIST) {
repeated int32 element;
}
-// List<Tuple<String, Integer>> (nullable list, non-null elements)
+// Rule 2: List<Tuple<String, Integer>> (nullable list, non-null elements)
optional group my_list (LIST) {
repeated group element {
required binary str (STRING);
required int32 num;
Review Comment:
There is a similar example below, with an `optional` `str` field after `num`.
##########
LogicalTypes.md:
##########
@@ -609,9 +609,20 @@ that is neither contained by a `LIST`- or `MAP`-annotated group nor annotated
by `LIST` or `MAP` should be interpreted as a required list of required
elements where the element type is the type of the field.
-Implementations should use either `LIST` and `MAP` annotations _or_ unannotated
-repeated fields, but not both. When using the annotations, no unannotated
-repeated types are allowed.
+```
+// List<Integer> (non-null list, non-null elements)
+repeated int32 num;
+
+// List<Tuple<Integer, String>> (non-null list, non-null elements)
Review Comment:
nit: for `List<Tuple<Integer, String>>`, is there a way to show in the type
annotation that the `String` is nullable?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]