Zouxxyy commented on code in PR #7073:
URL: https://github.com/apache/paimon/pull/7073#discussion_r2702100114
##########
paimon-common/src/main/java/org/apache/paimon/data/variant/VariantExtraction.java:
##########
@@ -18,28 +18,36 @@
package org.apache.paimon.data.variant;
+import org.apache.paimon.annotation.Experimental;
import org.apache.paimon.types.DataField;
import java.io.Serializable;
+import java.util.Arrays;
import java.util.List;
+import java.util.Objects;
-/** Variant access information for a variant column. */
-public class VariantAccessInfo implements Serializable {
+/** Variant extraction information that describes fields extraction from a variant column. */
+@Experimental
+public class VariantExtraction implements Serializable {
private static final long serialVersionUID = 1L;
- // The name of the variant column.
- private final String columnName;
+ /**
+ * Returns the path to the variant column. For top-level variant columns, this is a single
+ * element array containing the column name. For nested variant columns within structs, this is
+ * an array representing the path (e.g., ["structCol", "innerStruct", "variantCol"]).
+ */
+ private final String[] columnName;
Review Comment:
Perhaps I can follow Spark's approach and design it this way: leverage the
description field in Paimon's DataField. The description value could be
encoded similarly, as follows:
```
__VARIANT_METADATA:<path>;<failOnError>;<timeZoneId>
e.g.
__VARIANT_METADATA:$.a.b;true;UTC
```
I'm fine with using either `,` or `;` as the delimiter, since both are safe
(they won't be interpreted as part of a column name, because the column names
only appear inside the `path`).
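To make the encoding concrete, here is a minimal Java sketch of how such a
description string might be written and read back. The class
`VariantMetadataCodec`, its nested `Parsed` type, and the method names are
hypothetical and not part of this PR. It assumes `failOnError` is a boolean
and `timeZoneId` never contains `;`.

```java
import java.util.Optional;

/** Hypothetical helper, for illustration only; not part of Paimon or this PR. */
final class VariantMetadataCodec {
    static final String PREFIX = "__VARIANT_METADATA:";
    static final char SEP = ';';

    /** Decoded form of one description value. */
    static final class Parsed {
        final String path;
        final boolean failOnError;
        final String timeZoneId;

        Parsed(String path, boolean failOnError, String timeZoneId) {
            this.path = path;
            this.failOnError = failOnError;
            this.timeZoneId = timeZoneId;
        }
    }

    /** Builds "__VARIANT_METADATA:<path>;<failOnError>;<timeZoneId>". */
    static String encode(String path, boolean failOnError, String timeZoneId) {
        return PREFIX + path + SEP + failOnError + SEP + timeZoneId;
    }

    /** Returns empty if the description is null, lacks the prefix, or has too few parts. */
    static Optional<Parsed> decode(String description) {
        if (description == null || !description.startsWith(PREFIX)) {
            return Optional.empty();
        }
        String body = description.substring(PREFIX.length());
        // Split from the right: the last two parts are fixed-shape, so a
        // delimiter character inside <path> does not break parsing.
        int tzSep = body.lastIndexOf(SEP);
        int flagSep = tzSep < 0 ? -1 : body.lastIndexOf(SEP, tzSep - 1);
        if (flagSep < 0) {
            return Optional.empty();
        }
        String path = body.substring(0, flagSep);
        boolean failOnError = Boolean.parseBoolean(body.substring(flagSep + 1, tzSep));
        String timeZoneId = body.substring(tzSep + 1);
        return Optional.of(new Parsed(path, failOnError, timeZoneId));
    }
}
```

Splitting from the right is a design choice of this sketch: it keeps the
parser tolerant even if a path ever contains the delimiter, so the choice
between `,` and `;` matters less.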
And introduce a method like this:
```scala
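import scala.collection.JavaConverters._
import org.apache.paimon.types.RowType

// True if the row is non-empty and every field's (nullable) description has the prefix.
def isVariantRow(s: RowType): Boolean =
  s.getFields.size > 0 &&
    s.getFields.asScala.forall(f => Option(f.description).exists(_.startsWith("__VARIANT_METADATA")))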
```