github-actions[bot] commented on code in PR #64071:
URL: https://github.com/apache/doris/pull/64071#discussion_r3465871456


##########
fe/fe-core/src/main/java/org/apache/doris/nereids/load/NereidsLoadScanProvider.java:
##########
@@ -195,9 +196,21 @@ private void fillContextExprMap(List<NereidsImportColumnDesc> columnDescList, Ne
         // If user does not specify the file field names, generate it by using base schema of table.
         // So that the following process can be unified
         boolean specifyFileFieldNames = copiedColumnExprs.stream().anyMatch(p -> p.isColumn());
-        if (!specifyFileFieldNames) {
+        boolean fillMissing = isFillMissingColumns(fileGroup);
+        if (!specifyFileFieldNames || fillMissing) {
+            // Only dedup against already-present columns for the fill_missing_columns path so that
+            // the existing !specifyFileFieldNames behavior stays byte-for-byte identical.
+            Set<String> existingColumns = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);

Review Comment:
   This de-dup should not treat every existing descriptor as proof that the matching file slot is available. With `fill_missing_columns=true`, a same-name mapping such as `COLUMNS(k1 = k1)` starts `copiedColumnExprs` with `NereidsImportColumnDesc("k1", UnboundSlot("k1"))`. Adding that target name to `existingColumns` makes the base-schema loop skip the plain `k1` descriptor, and the later scan-slot loop creates file slots only for descriptors whose `expr == null`, so no file slot for `k1` is ever created.
   
   The resulting reduced plan is:
   
   ```text
   LogicalLoadProject(k1 := UnboundSlot(k1))
     LogicalProject(scanSlots without k1)
       LogicalOneRowRelation(scanSlots without k1)
   ```
   
   So the mapping still references the input `k1`, but its child no longer outputs that slot. The old `!specifyFileFieldNames` path added the base-schema descriptor and so kept this source slot available. Please de-dup only against true file-field descriptors and constant mappings, or otherwise add the base scan descriptor whenever a mapping expression can still reference the same source column.
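
   To make the failure mode concrete, here is a minimal, self-contained sketch. It does not use the Doris classes: `Desc`, `fillMissing`, `unresolvedRefs`, and the `selfRefAware` flag are hypothetical stand-ins, and a mapping expression is modeled only as the set of source column names it reads. It compares a de-dup that blocks on every descriptor name with one that lets a self-referencing mapping keep its base-schema file field.

   ```java
   import java.util.ArrayList;
   import java.util.Arrays;
   import java.util.List;
   import java.util.Set;
   import java.util.TreeSet;

   public class FillMissingDedupSketch {
       static Set<String> newNameSet() {
           return new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
       }

       /** Hypothetical stand-in for an import column descriptor; refs == null marks a plain file field. */
       static final class Desc {
           final String name;
           final Set<String> refs; // source columns read by the mapping expression

           private Desc(String name, Set<String> refs) {
               this.name = name;
               this.refs = refs;
           }

           static Desc fileField(String name) {
               return new Desc(name, null);
           }

           static Desc mapping(String name, String... sourceColumns) {
               Set<String> refs = newNameSet();
               refs.addAll(Arrays.asList(sourceColumns));
               return new Desc(name, refs);
           }

           boolean isFileField() {
               return refs == null;
           }
       }

       /** Appends a plain file-field descriptor for each base-schema column that is not de-duplicated away. */
       static List<Desc> fillMissing(List<Desc> descs, List<String> baseSchema, boolean selfRefAware) {
           Set<String> existing = newNameSet();
           for (Desc d : descs) {
               boolean readsOwnSource = !d.isFileField() && d.refs.contains(d.name);
               // Assumed rule: a mapping that reads its own source column must not hide the file field.
               if (!(selfRefAware && readsOwnSource)) {
                   existing.add(d.name);
               }
           }
           List<Desc> result = new ArrayList<>(descs);
           for (String col : baseSchema) {
               if (!existing.contains(col)) {
                   result.add(Desc.fileField(col));
               }
           }
           return result;
       }

       /** Source columns that some mapping reads but no file-field descriptor (scan slot) provides. */
       static Set<String> unresolvedRefs(List<Desc> descs) {
           Set<String> scanSlots = newNameSet();
           for (Desc d : descs) {
               if (d.isFileField()) {
                   scanSlots.add(d.name);
               }
           }
           Set<String> missing = newNameSet();
           for (Desc d : descs) {
               if (!d.isFileField()) {
                   for (String ref : d.refs) {
                       if (!scanSlots.contains(ref)) {
                           missing.add(ref);
                       }
                   }
               }
           }
           return missing;
       }

       public static void main(String[] args) {
           List<Desc> descs = List.of(Desc.mapping("k1", "k1")); // models COLUMNS(k1 = k1)
           List<String> baseSchema = List.of("k1", "k2");
           System.out.println(unresolvedRefs(fillMissing(descs, baseSchema, false))); // [k1]
           System.out.println(unresolvedRefs(fillMissing(descs, baseSchema, true)));  // []
       }
   }
   ```

   In this model the name-only de-dup leaves `k1` unresolved, matching the reduced plan above, while the self-reference-aware variant keeps a `k1` file field alongside the mapping. Whether the real fix should key on self-references or on a broader "expression references any source column" check is left to the patch author.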



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

