justinmclean opened a new issue, #10220:
URL: https://github.com/apache/gravitino/issues/10220
### What would you like to be improved?
When updating a table, TableColumnMetaService.updateColumnPOsFromTableDiff
only calls updateSchemaIdByTableId(...) when there are no column diffs. If
namespace changes and columns are also added/updated/deleted in the same
update, that branch is skipped. This can leave active non-deleted column rows
with stale old schema_id, creating inconsistent column metadata for the same
table.
### How should we improve?
In updateColumnPOsFromTableDiff, handle namespace migration independently
from column diff emptiness. If the namespace changed, always align existing
active column rows to the new schema_id (either by always calling
updateSchemaIdByTableId(...) before inserting diff rows, or by ensuring all
surviving rows for that table/version get a new schema_id). Keep the current
diff-insert logic for UPDATE/DELETE records, and add a unit test that updates
the namespace and columns together and verifies that no active non-deleted
column remains under the old schema_id.
Here's a unit test to help:
```
@TestTemplate
public void
testUpdateTableWithSchemaAndColumnChangesUpdatesAllColumnSchemaIds()
throws IOException {
String catalogName = "catalog1";
String oldSchemaName = "schema1";
String newSchemaName = "schema2";
createParentEntities(METALAKE_NAME, catalogName, oldSchemaName,
AUDIT_INFO);
createAndInsertSchema(METALAKE_NAME, catalogName, newSchemaName);
ColumnEntity existingColumn =
ColumnEntity.builder()
.withId(RandomIdGenerator.INSTANCE.nextId())
.withName("column1")
.withPosition(0)
.withComment("comment1")
.withDataType(Types.IntegerType.get())
.withNullable(true)
.withAutoIncrement(false)
.withDefaultValue(Literals.integerLiteral(1))
.withAuditInfo(AUDIT_INFO)
.build();
TableEntity createdTable =
TableEntity.builder()
.withId(RandomIdGenerator.INSTANCE.nextId())
.withName("table_move_schema")
.withNamespace(Namespace.of(METALAKE_NAME, catalogName,
oldSchemaName))
.withColumns(Lists.newArrayList(existingColumn))
.withAuditInfo(AUDIT_INFO)
.build();
TableMetaService.getInstance().insertTable(createdTable, false);
ColumnEntity newColumn =
ColumnEntity.builder()
.withId(RandomIdGenerator.INSTANCE.nextId())
.withName("column2")
.withPosition(1)
.withComment("comment2")
.withDataType(Types.StringType.get())
.withNullable(true)
.withAutoIncrement(false)
.withDefaultValue(Literals.stringLiteral("v2"))
.withAuditInfo(AUDIT_INFO)
.build();
TableEntity movedAndUpdatedTable =
TableEntity.builder()
.withId(createdTable.id())
.withName(createdTable.name())
.withNamespace(Namespace.of(METALAKE_NAME, catalogName,
newSchemaName))
.withColumns(Lists.newArrayList(existingColumn, newColumn))
.withAuditInfo(AUDIT_INFO)
.build();
TableMetaService.getInstance()
.updateTable(createdTable.nameIdentifier(), old ->
movedAndUpdatedTable);
long newSchemaId =
EntityIdService.getEntityId(
NameIdentifier.of(METALAKE_NAME, catalogName, newSchemaName),
Entity.EntityType.SCHEMA);
ColumnPO existingColumnPO =
TableColumnMetaService.getInstance().getColumnPOById(existingColumn.id());
Assertions.assertEquals(
newSchemaId,
existingColumnPO.getSchemaId(),
"Existing column schema id should be updated to new schema id");
}
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]