wgtmac commented on code in PR #1078:
URL: https://github.com/apache/parquet-mr/pull/1078#discussion_r1174720814


##########
parquet-avro/src/main/java/org/apache/parquet/avro/AvroWriteSupport.java:
##########
@@ -400,9 +400,18 @@ private Binary fromAvroString(Object value) {
     return Binary.fromCharSequence(value.toString());
   }
 
-  private static GenericData getDataModel(Configuration conf) {
+  private static GenericData getDataModel(Configuration conf, Schema schema) {
+    if (conf.get(AVRO_DATA_SUPPLIER) == null) {
+      final GenericData modelForSchema = 
AvroRecordConverter.getModelForSchema(schema);

Review Comment:
   What if `schema == null`?



##########
parquet-avro/src/main/java/org/apache/parquet/avro/AvroRecordConverter.java:
##########
@@ -169,6 +172,46 @@ public void add(Object value) {
     }
   }
 
+  /**
+   * Returns the specific data model for a given SpecificRecord schema by 
reflecting the underlying
+   * Avro class's `MODEL$` field, or Null if the class is not on the classpath 
or reflection fails.
+   */
+  static SpecificData getModelForSchema(Schema schema) {
+    final Class<?> clazz;
+
+    if (schema != null && (schema.getType() == Schema.Type.RECORD || 
schema.getType() == Schema.Type.UNION)) {
+      clazz = SpecificData.get().getClass(schema);
+    } else {
+      return null;
+    }
+
+    final SpecificData model;
+    try {
+      final Field modelField = clazz.getDeclaredField("MODEL$");
+      modelField.setAccessible(true);
+
+      model = (SpecificData) modelField.get(null);

Review Comment:
   Should it return if model is set successfully?



##########
parquet-avro/src/main/java/org/apache/parquet/avro/AvroReadSupport.java:
##########
@@ -149,12 +149,22 @@ private static <T> RecordMaterializer<T> 
newCompatMaterializer(
         parquetSchema, avroSchema, model);
   }
 
-  private GenericData getDataModel(Configuration conf) {
+  private GenericData getDataModel(Configuration conf, Schema schema) {
     if (model != null) {
       return model;
     }
+
+    if (conf.get(AVRO_DATA_SUPPLIER) == null) {
+      final GenericData modelForSchema = 
AvroRecordConverter.getModelForSchema(schema);
+
+      if (modelForSchema != null) {
+        return modelForSchema;
+      }
+    }
+
     Class<? extends AvroDataSupplier> suppClass = conf.getClass(
-        AVRO_DATA_SUPPLIER, SpecificDataSupplier.class, 
AvroDataSupplier.class);
-    return ReflectionUtils.newInstance(suppClass, conf).get();
+      AVRO_DATA_SUPPLIER, SpecificDataSupplier.class, AvroDataSupplier.class);
+
+      return ReflectionUtils.newInstance(suppClass, conf).get();

Review Comment:
   Could you please revert these format changes?



##########
parquet-avro/src/main/java/org/apache/parquet/avro/AvroRecordConverter.java:
##########
@@ -169,6 +172,46 @@ public void add(Object value) {
     }
   }
 
+  /**
+   * Returns the specific data model for a given SpecificRecord schema by 
reflecting the underlying
+   * Avro class's `MODEL$` field, or Null if the class is not on the classpath 
or reflection fails.
+   */
+  static SpecificData getModelForSchema(Schema schema) {

Review Comment:
   @gszadovszky Could you help review this? I am not that familiar with Avro.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to