shangxinli commented on code in PR #957:
URL: https://github.com/apache/parquet-mr/pull/957#discussion_r870479919
##########
parquet-avro/src/main/java/org/apache/parquet/avro/AvroRecordConverter.java:
##########
@@ -866,6 +866,20 @@ static boolean isElementType(Type repeatedType, Schema elementSchema) {
} else if (elementSchema != null && elementSchema.getType() == Schema.Type.RECORD) {
Schema schemaFromRepeated = CONVERTER.convert(repeatedType.asGroupType());
+
+ // Fix for PARQUET-2069
+ // ParquetMR breaks compatibility with itself by including a JSON
+ // representation of a schema that names a record "list", when
+ // it should be named "array" to match with the rest of the metadata.
+ // Inserting this code allows Avro to detect that the "array" and "list"
+ // types are compatible. Since this alias is being added to something
+ // that is the result of parsing JSON, we can't add the alias at the
+ // time of construction. Therefore we have to do it here where the data
+ // structures have been unwrapped to the point where we have the
+ // incompatible structure and can add the necessary alias.
+ if (elementSchema.getName().equals("list")) elementSchema.addAlias("array", "");
Review Comment:
Follow the above standard like line 866.
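
For context on what the quoted change is doing, here is a minimal sketch of alias-based name matching in plain Java. It does not use Avro: `NamedRecord` and `namesCompatible` are hypothetical stand-ins, and the matching rule (names are equal, or the reader lists the writer's name as an alias) is an assumption that simplifies how Avro resolves named types. It only illustrates why adding the "array" alias to a record named "list" lets the two names line up.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class AliasSketch {

    // Hypothetical stand-in for a named record schema. This is NOT Avro's Schema class.
    static final class NamedRecord {
        final String name;
        final Set<String> aliases = new LinkedHashSet<>();

        NamedRecord(String name) {
            this.name = name;
        }

        void addAlias(String alias) {
            aliases.add(alias);
        }
    }

    // Assumed rule: the reader accepts the writer if the names are equal,
    // or if the reader declares the writer's name as one of its aliases.
    static boolean namesCompatible(NamedRecord reader, NamedRecord writer) {
        return reader.name.equals(writer.name) || reader.aliases.contains(writer.name);
    }

    public static void main(String[] args) {
        // "list" is the record name found in the stored JSON schema;
        // "array" is the name the rest of the metadata uses.
        NamedRecord elementSchema = new NamedRecord("list");
        NamedRecord schemaFromRepeated = new NamedRecord("array");

        // Without an alias the two names do not match.
        System.out.println(namesCompatible(elementSchema, schemaFromRepeated)); // prints false

        // Mirror of the quoted change: add the alias only when the name is "list".
        if (elementSchema.name.equals("list")) {
            elementSchema.addAlias("array");
        }

        System.out.println(namesCompatible(elementSchema, schemaFromRepeated)); // prints true
    }
}
```

In this model the alias is attached to the schema that came from parsed JSON, after construction, which is the same constraint the code comment in the diff describes.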