arina-ielchiieva commented on a change in pull request #2018: DRILL-7633: Fixes
for union and repeated list accessors
URL: https://github.com/apache/drill/pull/2018#discussion_r391479369
##########
File path:
exec/vector/src/main/java/org/apache/drill/exec/record/metadata/VariantColumnMetadata.java
##########
@@ -96,11 +134,13 @@ public StructureType structureType() {
public boolean isVariant() { return true; }
@Override
- public boolean isArray() { return type() == MinorType.LIST; }
+ public boolean isArray() {
+ return super.isArray() || type() == MinorType.LIST;
+ }
@Override
public ColumnMetadata cloneEmpty() {
- return new VariantColumnMetadata(name, type, variantSchema.cloneEmpty());
+ return new VariantColumnMetadata(name, type, mode, new VariantSchema());
Review comment:
Well, it turned out the `typeString` is also used during column metadata
serialization / deserialization. Type is serialized in schematic type to
consume less space, plus it's also stored in the same way in Metastore (see
org.apache.drill.exec.record.metadata.AbstractColumnMetadata#createColumnMetadata).
So currently ser / de for `VariantColumnMetadata` will fail. Since you are
planning to use it in JSON, we need to ensure this class can be properly
serialized and deserialized. To ensure this we need to do two things:
1. update `typeString` (I propose we do this in this PR).
2. update schema parser (I can do it in separate PR).
Back to `typeString`:
1. If `VariantColumnMetadata` is based on `UNION`, we can describe it as
`UNION` without types but we'll lose all types defined in `VariantSchema` or we
can describe it as UNION<INT, VARCHAR> and preserve types. The question is if
we need types during ser / de?
2. Similar case when `VariantColumnMetadata` is based on `LIST`, we can
describe it as ARRAY<UNION> without types or with types ARRAY<UNION<INT,
VARCHAR>>.
Since you know more about unions, I'll leave to you the final decision on
how schematically union types should be represented, afterwards I'll adjust the
schema parser.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services