arina-ielchiieva commented on a change in pull request #2018: DRILL-7633: Fixes for union and repeated list accessors URL: https://github.com/apache/drill/pull/2018#discussion_r391479369
########## File path: exec/vector/src/main/java/org/apache/drill/exec/record/metadata/VariantColumnMetadata.java ########## @@ -96,11 +134,13 @@ public StructureType structureType() { public boolean isVariant() { return true; } @Override - public boolean isArray() { return type() == MinorType.LIST; } + public boolean isArray() { + return super.isArray() || type() == MinorType.LIST; + } @Override public ColumnMetadata cloneEmpty() { - return new VariantColumnMetadata(name, type, variantSchema.cloneEmpty()); + return new VariantColumnMetadata(name, type, mode, new VariantSchema()); Review comment: Well, it turned out the `typeString` is also used during column metadata serialization / deserialization. Type is serialized in schematic type to consume less space, plus it's also stored in the same way in Metastore (see org.apache.drill.exec.record.metadata.AbstractColumnMetadata#createColumnMetadata). So currently ser / de for `VariantColumnMetadata` will fail. Since you are planning to use it in JSON, we need to ensure this class can be properly serialized and deserialized. To ensure this we need to do two things: 1. update `typeString` (I propose we do this in this PR). 2. update schema parser (I can do it in separate PR). Back to `typeString`: 1. If `VariantColumnMetadata` is based on `UNION`, we can describe it as `UNION` without types but we'll lose all types defined in `VariantSchema` or we can describe it as UNION<INT, VARCHAR> and preserve types. The question is if we need types during ser / de? 2. Similar case when `VariantColumnMetadata` is based on `LIST`, we can describe it as ARRAY<UNION> without types or with types ARRAY<UNION<INT, VARCHAR>>. Since you know more about unions, I'll leave to you the final decision on how schematically union types should be represented, afterwards I'll adjust the schema parser. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services