Re: [PR] [#7788] fix(core): fix duplicate column when load table [gravitino]

via GitHub Tue, 19 Aug 2025 02:19:42 -0700


mchades commented on code in PR #7789:
URL: https://github.com/apache/gravitino/pull/7789#discussion_r2284657032



##########
core/src/main/java/org/apache/gravitino/catalog/TableOperationDispatcher.java:
##########
@@ -624,12 +624,32 @@ private Pair<Boolean, List<ColumnEntity>> updateColumnsIfNecessary(
             ? Collections.emptyMap()
             : IntStream.range(0, tableFromCatalog.columns().length)
                 .mapToObj(i -> Pair.of(i, tableFromCatalog.columns()[i]))
-                .collect(Collectors.toMap(p -> p.getRight().name(), Function.identity()));
+                .collect(
+                    Collectors.toMap(
+                        p -> p.getRight().name(),
+                        Function.identity(),
+                        (existing, replacement) -> {
+                          LOG.warn(
+                              "Duplicate column name '{}' found at position {} and {}, using the first occurrence",
+                              existing.getRight().name(),
+                              existing.getLeft(),
+                              replacement.getLeft());
+                          return existing;

Review Comment:
   > It's not always the best practice to completely fix the table metadata
   
   The point is not to fix the table metadata, but to correct the logic of converting a HiveTable to a GravitinoTable in order to prevent column duplication.
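
   For reference, here is a minimal, self-contained sketch of the merge-function behaviour the hunk above relies on. It is not Gravitino code: the class `FirstOccurrenceIndex` and the method `indexByName` are hypothetical names chosen for illustration, and plain strings stand in for the column objects. Without a merge function, `Collectors.toMap` throws `IllegalStateException` on a duplicate key; with one, the collision is resolved by whatever the function returns, here the earlier entry.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical stand-in for the dispatcher logic; not part of Gravitino.
class FirstOccurrenceIndex {

  // Maps each column name to the position of its first occurrence.
  static Map<String, Integer> indexByName(List<String> columnNames) {
    return IntStream.range(0, columnNames.size())
        .boxed()
        .collect(
            Collectors.toMap(
                i -> columnNames.get(i),
                i -> i,
                // Invoked only when two positions share a name; keep the earlier one.
                (existing, replacement) -> existing,
                // LinkedHashMap keeps insertion order so the result is easy to read.
                LinkedHashMap::new));
  }

  public static void main(String[] args) {
    // "id" appears at positions 0 and 2; only position 0 is retained.
    System.out.println(indexByName(List.of("id", "name", "id")));
  }
}
```

   This only shows how the collision is tolerated when building the map. Whether the duplicate should instead be prevented during the HiveTable to GravitinoTable conversion is the question raised in this comment.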



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
