TEOTEO520 commented on code in PR #7789:
URL: https://github.com/apache/gravitino/pull/7789#discussion_r2284619222


##########
core/src/main/java/org/apache/gravitino/catalog/TableOperationDispatcher.java:
##########
@@ -624,12 +624,32 @@ private Pair<Boolean, List<ColumnEntity>> updateColumnsIfNecessary(
             ? Collections.emptyMap()
             : IntStream.range(0, tableFromCatalog.columns().length)
                 .mapToObj(i -> Pair.of(i, tableFromCatalog.columns()[i]))
-                .collect(Collectors.toMap(p -> p.getRight().name(), Function.identity()));
+                .collect(
+                    Collectors.toMap(
+                        p -> p.getRight().name(),
+                        Function.identity(),
+                        (existing, replacement) -> {
+                          LOG.warn(
+                              "Duplicate column name '{}' found at position {} and {}, using the first occurrence",
+                              existing.getRight().name(),
+                              existing.getLeft(),
+                              replacement.getLeft());
+                          return existing;

Review Comment:
   Requiring the table metadata to be completely fixed is not always the best 
approach, for the following reasons:
   
   1. This situation arises in HMS (Hive Metastore) when, because of how 
certain tables were created, a field appears in both table.getSd().getCols() 
and table.getPartitionKeys(). Strictly speaking, this is not a case of 
"duplicate fields".
   
   2. Tables in this state work normally: engines such as Spark can query 
them, and SQL commands such as SHOW CREATE TABLE and DESC FORMATTED return 
their metadata. As a metadata proxy service, Gravitino should display the 
table's metadata rather than report an error.
   
   3. In our production environment, dozens of online tables are in this 
state and are widely used. We would prefer that Gravitino tolerate this 
situation rather than report an error.
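
   To illustrate the behavior argued for above, here is a minimal, standalone sketch of the "keep the first occurrence" merge. It does not use Gravitino or Hive classes: the class name, the method name, and the two plain string lists (standing in for the names from table.getSd().getCols() and table.getPartitionKeys()) are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical helper, not part of Gravitino.
class DuplicateColumnIndex {

  // Maps each column name to the position of its first occurrence in the
  // combined list (data columns followed by partition keys).
  static Map<String, Integer> firstPositionByName(
      List<String> dataColumns, List<String> partitionKeys) {
    List<String> all = new ArrayList<>(dataColumns);
    all.addAll(partitionKeys);
    return IntStream.range(0, all.size())
        .boxed()
        .collect(
            Collectors.toMap(
                i -> all.get(i),
                i -> i,
                // Without a merge function, Collectors.toMap throws
                // IllegalStateException when two entries share a key.
                (existing, replacement) -> existing));
  }

  public static void main(String[] args) {
    // Assumed layout: "dt" is both a data column and a partition key.
    List<String> dataColumns = Arrays.asList("id", "name", "dt");
    List<String> partitionKeys = Arrays.asList("dt");
    System.out.println(firstPositionByName(dataColumns, partitionKeys));
  }
}
```

   With the merge function in place, the repeated name resolves to its first position and the table can still be listed, instead of the collector failing on the duplicate key.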


