rdblue commented on a change in pull request #1177:
URL: https://github.com/apache/iceberg/pull/1177#discussion_r464525224
##########
File path: core/src/main/java/org/apache/iceberg/SchemaUpdate.java
##########
@@ -308,6 +314,162 @@ public UpdateSchema moveAfter(String name, String
afterName) {
return this;
}
+ @Override
+ public UpdateSchema upsertSchema(Schema newSchema) {
+ TypeUtil.visit(newSchema, new ApplyUpdates(this, schema));
+ return this;
+ }
+
+ private static class ApplyUpdates extends TypeUtil.SchemaVisitor<Void> {
+ private static final Joiner DOT = Joiner.on(".");
+ private final Deque<String> fieldNames = Lists.newLinkedList();
+ private NestedField currentField = null;
+
+ private final Schema baseSchema;
+ private final UpdateSchema api;
+ private final Map<String, Integer> indexByName;
+
+ private ApplyUpdates(UpdateSchema api, Schema baseSchema) {
+ this.api = api;
+ this.baseSchema = baseSchema;
+ this.indexByName = TypeUtil.indexByName(baseSchema.asStruct());
+ }
+
+ @Override
+ public void beforeListElement(NestedField elementField) {
+ beforeField(elementField);
+ }
+
+ @Override
+ public void afterListElement(NestedField elementField) {
+ afterField(elementField);
+ }
+
+ @Override
+ public void beforeMapKey(Types.NestedField keyField) {
+ beforeField(keyField);
+ }
+
+ @Override
+ public void afterMapKey(Types.NestedField keyField) {
+ afterField(keyField);
+ }
+
+ @Override
+ public void beforeMapValue(Types.NestedField valueField) {
+ beforeField(valueField);
+ }
+
+ @Override
+ public void afterMapValue(Types.NestedField valueField) {
+ afterField(valueField);
+ }
+
+ @Override
+ public void beforeField(NestedField field) {
+ fieldNames.push(field.name()); // we don't expect `element` to show up -
it breaks
+ currentField = field;
+ }
+
+ @Override
+ public void afterField(NestedField field) {
+ fieldNames.pop();
+ }
+
+ @Override
+ public Void field(NestedField field, Void fieldResult) {
+ return super.field(field, fieldResult);
+ }
+
+ @Override
+ public Void list(ListType list, Void elementResult) {
+ String fullName = DOT.join(fieldNames.descendingIterator());
+ Types.NestedField field = baseSchema.findField(fullName);
+ if (field == null) {
+ addColumn(fieldNames.peekFirst(), Types.ListType.ofOptional(0,
list.elementType()), ancestors());
+ } else if (!field.type().isListType()) {
+ throw new IllegalArgumentException(
Review comment:
This visitor includes quite a bit of logic for traversing two schemas at
once, by keeping track of the current name and looking up the equivalent in the
existing/base schema. This check is also primarily to validate structure.
I think it would be good to separate the logic to traverse schemas into a
different visitor class, like we've done with the other "partner" visitors.
That way, the union visitor would just be handed two types at the same place in
a schema and can decide whether the observed type needs to be added, used to
update, or if no action is needed.
In addition, we should be able to construct that visitor so that it keeps
track of the place in the existing schema, so that we don't need to keep track
of ancestors.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]