[GitHub] [drill] cgivre commented on a diff in pull request #2539: DRILL-8216: Use EVF-based JSON reader for Values operator

GitBox Wed, 11 May 2022 16:33:27 -0700


cgivre commented on code in PR #2539:
URL: https://github.com/apache/drill/pull/2539#discussion_r870823130



##########
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/scan/v3/schema/SchemaUtils.java:
##########
@@ -234,4 +241,80 @@ static void copyProperties(TupleMetadata source,
       dest.setProperty(ScanProjectionParser.PROJECTION_TYPE_PROP, value);
     }
   }
+
+  /**
+   * Converts specified {@code RelDataType relDataType} into {@link ColumnMetadata}.
+   * For the case when specified relDataType is struct, map with recursively converted children
+   * will be created.
+   *
+   * @param name        field name
+   * @param relDataType field type
+   * @return {@link ColumnMetadata} which corresponds to specified {@code RelDataType relDataType}
+   */
+  public static ColumnMetadata getColumnMetadata(String name, RelDataType relDataType) {
+    switch (relDataType.getSqlTypeName()) {
+      case ARRAY:
+        return getArrayMetadata(name, relDataType);
+      case MAP:
+      case OTHER:
+        throw new UnsupportedOperationException(String.format("Unsupported data type: %s", relDataType.getSqlTypeName()));
+      default:
+        if (relDataType.isStruct()) {
+          return getStructMetadata(name, relDataType);
+        } else {
+          return new PrimitiveColumnMetadata(
+            MaterializedField.create(name,
+              TypeInferenceUtils.getDrillMajorTypeFromCalciteType(relDataType)));
+        }
+    }
+  }
+
+  /**
+   * Returns {@link ColumnMetadata} instance which corresponds to specified array {@code RelDataType relDataType}.
+   *
+   * @param name        name of the field
+   * @param relDataType the source of type information to construct the schema
+   * @return {@link ColumnMetadata} instance
+   */
+  private static ColumnMetadata getArrayMetadata(String name, RelDataType relDataType) {
+    RelDataType componentType = relDataType.getComponentType();
+    ColumnMetadata childColumnMetadata = getColumnMetadata(name, componentType);
+    switch (componentType.getSqlTypeName()) {
+      case ARRAY:
+        // for the case when nested type is array, it should be placed into repeated list
+        return MetadataUtils.newRepeatedList(name, childColumnMetadata);
+      case MAP:
+      case OTHER:
+        throw new UnsupportedOperationException(String.format("Unsupported data type: %s", relDataType.getSqlTypeName()));
+      default:
+        if (componentType.isStruct()) {
+          // for the case when nested type is struct, it should be placed into repeated map
+          return MetadataUtils.newMapArray(name, childColumnMetadata.tupleSchema());
+        } else {
+          // otherwise creates column metadata with repeated data mode
+          return new PrimitiveColumnMetadata(
+            MaterializedField.create(name,
+              Types.overrideMode(
+                TypeInferenceUtils.getDrillMajorTypeFromCalciteType(componentType),
+                TypeProtos.DataMode.REPEATED)));
+        }
+    }
+  }
+
+  /**
+   * Returns {@link MapColumnMetadata} column metadata created based on specified {@code RelDataType relDataType} with
+   * converted to {@link ColumnMetadata} {@code relDataType}'s children.
+   *
+   * @param name        name of the field
+   * @param relDataType {@link RelDataType} the source of the children for resulting schema
+   * @return {@link MapColumnMetadata} column metadata
+   */
+  private static MapColumnMetadata getStructMetadata(String name, RelDataType relDataType) {
+    TupleMetadata mapSchema = new TupleSchema();
+    relDataType.getFieldList().stream()
+      .map(field -> getColumnMetadata(field.getName(), field.getType()))

Review Comment:
   What would happen here if you have a nested map? For example, something like this:
   ```
   {
     "field1": {
       "nested_field1": "something",
       "nested_field2": {
         "really_nested": "something",
         "real_nested2": "something_else"
       }
     }
   }
   Or are nested schemata not supported for the `VALUES()` operator?
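
For reference, here is a minimal sketch of how the recursion in the quoted diff would treat the nested example, assuming it behaves as its Javadoc describes. It does not use Drill or Calcite: `Type`, `Primitive`, `Struct`, and `describe` are hypothetical stand-ins for `RelDataType`, `getColumnMetadata`, and `getStructMetadata`, and the `VARCHAR` type names are illustrative. It shows only the shape of the recursion, not whether `VALUES()` actually supports nested schemata.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class NestedSchemaSketch {

  // Hypothetical stand-in for RelDataType: a primitive or a struct of named fields.
  interface Type {}

  static final class Primitive implements Type {
    final String name;
    Primitive(String name) { this.name = name; }
  }

  static final class Struct implements Type {
    final Map<String, Type> fields = new LinkedHashMap<>();

    Struct field(String name, Type type) {
      fields.put(name, type);
      return this;
    }
  }

  // Same shape as getColumnMetadata / getStructMetadata in the diff: a struct
  // recurses into each child, so the code itself puts no limit on nesting depth.
  static String describe(String name, Type type) {
    if (type instanceof Struct) {
      String children = ((Struct) type).fields.entrySet().stream()
          .map(e -> describe(e.getKey(), e.getValue()))
          .collect(Collectors.joining(", "));
      return name + " MAP<" + children + ">";
    }
    return name + " " + ((Primitive) type).name;
  }

  public static void main(String[] args) {
    // The nested example from the review comment, with assumed VARCHAR leaves.
    Type nested = new Struct()
        .field("nested_field1", new Primitive("VARCHAR"))
        .field("nested_field2", new Struct()
            .field("really_nested", new Primitive("VARCHAR"))
            .field("real_nested2", new Primitive("VARCHAR")));
    System.out.println(describe("field1", nested));
  }
}
```

If the real code follows the same pattern, `nested_field2` would become a map column nested inside the `field1` map column, provided the planner presents both levels as struct types.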



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
