danny0405 commented on code in PR #9743:
URL: https://github.com/apache/hudi/pull/9743#discussion_r1357686036
##########
hudi-common/src/main/java/org/apache/hudi/avro/AvroSchemaCompatibility.java:
##########
@@ -372,7 +372,8 @@ private SchemaCompatibilityResult
calculateCompatibility(final Schema reader, fi
return (writer.getType() == Type.STRING) ? result :
result.mergedWith(typeMismatch(reader, writer, locations));
}
case STRING: {
- return (writer.getType() == Type.BYTES) ? result :
result.mergedWith(typeMismatch(reader, writer, locations));
+ return ((writer.getType() == Type.BYTES) || (writer.getType() ==
Type.INT) || (writer.getType() == Type.LONG)
Review Comment:
Maybe we can have a utilities to decide whether the given type belongs to a
NUMERIC type.
##########
hudi-common/src/main/java/org/apache/hudi/avro/AvroSchemaCompatibility.java:
##########
@@ -372,7 +372,8 @@ private SchemaCompatibilityResult
calculateCompatibility(final Schema reader, fi
return (writer.getType() == Type.STRING) ? result :
result.mergedWith(typeMismatch(reader, writer, locations));
}
case STRING: {
- return (writer.getType() == Type.BYTES) ? result :
result.mergedWith(typeMismatch(reader, writer, locations));
+ return ((writer.getType() == Type.BYTES) || (writer.getType() ==
Type.INT) || (writer.getType() == Type.LONG)
Review Comment:
Maybe we can have a utility to decide whether the type is a numeric data
type.
##########
hudi-common/src/main/java/org/apache/hudi/common/config/HoodieCommonConfig.java:
##########
@@ -71,6 +71,14 @@ public class HoodieCommonConfig extends HoodieConfig {
+ " operation will fail schema compatibility check. Set this option
to true will make the newly added "
+ " column nullable to successfully complete the write operation.");
+ public static final ConfigProperty<String> ADD_NULL_FOR_DELETED_COLUMNS =
ConfigProperty
Review Comment:
+1 for this to be default behavior or just fail the write operation.
##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java:
##########
@@ -1131,6 +1196,25 @@ private static Schema getActualSchemaFromUnion(Schema
schema, Object data) {
return actualSchema;
}
+ private static Schema getActualSchemaFromUnion(Schema schema) {
+ Schema actualSchema;
+ if (schema.getType() != UNION) {
+ return schema;
+ }
+ if (schema.getTypes().size() == 2
+ && schema.getTypes().get(0).getType() == Schema.Type.NULL) {
+ actualSchema = schema.getTypes().get(1);
+ } else if (schema.getTypes().size() == 2
+ && schema.getTypes().get(1).getType() == Schema.Type.NULL) {
+ actualSchema = schema.getTypes().get(0);
+ } else if (schema.getTypes().size() == 1) {
+ actualSchema = schema.getTypes().get(0);
Review Comment:
+1
##########
hudi-common/src/main/java/org/apache/hudi/common/table/log/block/HoodieAvroDataBlock.java:
##########
@@ -206,6 +213,9 @@ public IndexedRecord next() {
IndexedRecord record = this.reader.read(null, decoder);
this.dis.skipBytes(recordLength);
this.readRecords++;
+ if (this.promotedSchema.isPresent()) {
+ return HoodieAvroUtils.rewriteRecordWithNewSchema(record,
this.promotedSchema.get());
Review Comment:
Is it possible we read the record with the `promotedSchema` directly so that
there is no need to rewrite each record ?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]