morrySnow commented on code in PR #64413:
URL: https://github.com/apache/doris/pull/64413#discussion_r3489799465


##########
fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java:
##########
@@ -879,6 +883,8 @@ private PlanFragment 
computePhysicalOlapScan(PhysicalOlapScan olapScan, PlanTran
         // generate real output tuple
         TupleDescriptor tupleDescriptor = generateTupleDesc(slots, olapTable, 
context);
         List<SlotDescriptor> slotDescriptors = tupleDescriptor.getSlots();
+        Map<ExprId, SlotDescriptor> exprIdToSlotDescriptor = 
slotDescriptors.stream()
+                .collect(Collectors.toMap(s -> context.findExprId(s.getId()), 
s -> s));

Review Comment:
   为什么要生成这个临时的map? 后面使用 exprIdToSlotDescriptor 的地方,不能直接使用 `context.findExprId` 
吗?



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java:
##########
@@ -1001,6 +1018,75 @@ private PlanFragment 
computePhysicalOlapScan(PhysicalOlapScan olapScan, PlanTran
         return planFragment;
     }
 
+    private StorageAlignedScanSlots 
computeStorageAlignedScanSlots(PhysicalOlapScan olapScan) {
+        if (!shouldAlignScanSlotsToStorageSchema(olapScan)) {
+            return new StorageAlignedScanSlots(olapScan.getOutput(), false);
+        }
+
+        List<Slot> outputSlots = olapScan.getOutput();
+        Map<Integer, Slot> slotByColumnUniqueId = new HashMap<>();
+        Map<String, Slot> slotByColumnName = new HashMap<>();
+        for (Slot slot : outputSlots) {
+            Optional<Column> originalColumn = ((SlotReference) 
slot).getOriginalColumn();
+            if (originalColumn.isPresent()) {
+                Column column = originalColumn.get();
+                if (column.getUniqueId() == 
Column.COLUMN_UNIQUE_ID_INIT_VALUE) {
+                    slotByColumnName.put(column.getName(), slot);
+                } else {
+                    slotByColumnUniqueId.put(column.getUniqueId(), slot);
+                }
+            }
+        }
+
+        List<Slot> storageSlots = new ArrayList<>();
+        Set<ExprId> storageExprIds = new HashSet<>();
+        long selectedIndexId = olapScan.getSelectedIndexId() == -1
+                ? olapScan.getTable().getBaseIndexId()
+                : olapScan.getSelectedIndexId();
+        for (Column column : 
olapScan.getTable().getSchemaByIndexId(selectedIndexId, true)) {
+            if (!column.isKey()) {
+                break;
+            }
+            Slot slot = column.getUniqueId() == 
Column.COLUMN_UNIQUE_ID_INIT_VALUE
+                    ? slotByColumnName.get(column.getName())
+                    : slotByColumnUniqueId.get(column.getUniqueId());
+            slot = Objects.requireNonNull(slot, "missing scan slot for storage 
key column " + column.getName());

Review Comment:
   这里如果只是为了判空,没必要重新赋值。用precondition或者直接if判断



##########
fe/fe-core/src/main/java/org/apache/doris/planner/OlapScanNode.java:
##########
@@ -187,6 +187,7 @@ public class OlapScanNode extends ScanNode {
 
     private SortInfo sortInfo = null;
     private Set<Integer> outputColumnUniqueIds = new HashSet<>();
+    private Set<Integer> extraKeyColumnSlotIds = new HashSet<>();

Review Comment:
   把这个加入到 explain 中。对它增加explain的回归测试



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java:
##########
@@ -866,11 +867,14 @@ private PlanFragment 
getPlanFragmentForPhysicalFileScan(PhysicalFileScan fileSca
 
     @Override
     public PlanFragment visitPhysicalOlapScan(PhysicalOlapScan olapScan, 
PlanTranslatorContext context) {
-        return computePhysicalOlapScan(olapScan, context);
+        return computePhysicalOlapScan(olapScan, context, true);
     }
 
-    private PlanFragment computePhysicalOlapScan(PhysicalOlapScan olapScan, 
PlanTranslatorContext context) {
-        List<Slot> slots = olapScan.getOutput();
+    private PlanFragment computePhysicalOlapScan(PhysicalOlapScan olapScan, 
PlanTranslatorContext context,

Review Comment:
   scan中的所有修改,应该都是没有必要的



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java:
##########
@@ -1001,6 +1018,75 @@ private PlanFragment 
computePhysicalOlapScan(PhysicalOlapScan olapScan, PlanTran
         return planFragment;
     }
 
+    private StorageAlignedScanSlots 
computeStorageAlignedScanSlots(PhysicalOlapScan olapScan) {

Review Comment:
   这个函数存在的意义没有看懂。从设计原则上讲。输入的olapScan的output中没有重复 
exprid。且输出顺序和表的schema一致。那这个函数永远都会返回`StorageAlignedScanSlots(olapScan.getOutput(),
 false);`



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java:
##########
@@ -1001,6 +1018,75 @@ private PlanFragment 
computePhysicalOlapScan(PhysicalOlapScan olapScan, PlanTran
         return planFragment;
     }
 
+    private StorageAlignedScanSlots 
computeStorageAlignedScanSlots(PhysicalOlapScan olapScan) {
+        if (!shouldAlignScanSlotsToStorageSchema(olapScan)) {
+            return new StorageAlignedScanSlots(olapScan.getOutput(), false);
+        }
+
+        List<Slot> outputSlots = olapScan.getOutput();
+        Map<Integer, Slot> slotByColumnUniqueId = new HashMap<>();
+        Map<String, Slot> slotByColumnName = new HashMap<>();
+        for (Slot slot : outputSlots) {
+            Optional<Column> originalColumn = ((SlotReference) 
slot).getOriginalColumn();
+            if (originalColumn.isPresent()) {
+                Column column = originalColumn.get();
+                if (column.getUniqueId() == 
Column.COLUMN_UNIQUE_ID_INIT_VALUE) {
+                    slotByColumnName.put(column.getName(), slot);
+                } else {
+                    slotByColumnUniqueId.put(column.getUniqueId(), slot);
+                }
+            }
+        }
+
+        List<Slot> storageSlots = new ArrayList<>();
+        Set<ExprId> storageExprIds = new HashSet<>();
+        long selectedIndexId = olapScan.getSelectedIndexId() == -1
+                ? olapScan.getTable().getBaseIndexId()
+                : olapScan.getSelectedIndexId();
+        for (Column column : 
olapScan.getTable().getSchemaByIndexId(selectedIndexId, true)) {

Review Comment:
   这里看起来一次遍历就能搞定



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java:
##########
@@ -980,6 +989,14 @@ private PlanFragment 
computePhysicalOlapScan(PhysicalOlapScan olapScan, PlanTran
         context.addScanNode(olapScanNode, olapScan);
 
         translateRuntimeFilter(olapScan, olapScanNode, context);
+        if (projectStorageAlignedScanOutput && 
storageAlignedScanSlots.needsOutputProjection) {
+            List<Expr> projectionExprs = outputSlots.stream()
+                    .map(slot -> context.findSlotRef(slot.getExprId()))
+                    .collect(Collectors.toList());
+            TupleDescriptor projectionTuple = generateTupleDesc(outputSlots, 
olapTable, context);
+            olapScanNode.setProjectList(projectionExprs);
+            olapScanNode.setOutputTupleDesc(projectionTuple);

Review Comment:
   在这里设置project会导致 filter不能生成在 scan 上。但是如 `computeStorageAlignedScanSlots` 
上的评论所说。由于想不到什么情况下 storageAlignedScanSlots.needsOutputProjection 会是 
true。所以代码应该是没有走到这里。



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java:
##########
@@ -1001,6 +1018,75 @@ private PlanFragment 
computePhysicalOlapScan(PhysicalOlapScan olapScan, PlanTran
         return planFragment;
     }
 
+    private StorageAlignedScanSlots 
computeStorageAlignedScanSlots(PhysicalOlapScan olapScan) {
+        if (!shouldAlignScanSlotsToStorageSchema(olapScan)) {
+            return new StorageAlignedScanSlots(olapScan.getOutput(), false);
+        }
+
+        List<Slot> outputSlots = olapScan.getOutput();
+        Map<Integer, Slot> slotByColumnUniqueId = new HashMap<>();
+        Map<String, Slot> slotByColumnName = new HashMap<>();
+        for (Slot slot : outputSlots) {
+            Optional<Column> originalColumn = ((SlotReference) 
slot).getOriginalColumn();
+            if (originalColumn.isPresent()) {
+                Column column = originalColumn.get();
+                if (column.getUniqueId() == 
Column.COLUMN_UNIQUE_ID_INIT_VALUE) {
+                    slotByColumnName.put(column.getName(), slot);
+                } else {
+                    slotByColumnUniqueId.put(column.getUniqueId(), slot);
+                }
+            }
+        }
+
+        List<Slot> storageSlots = new ArrayList<>();
+        Set<ExprId> storageExprIds = new HashSet<>();
+        long selectedIndexId = olapScan.getSelectedIndexId() == -1
+                ? olapScan.getTable().getBaseIndexId()
+                : olapScan.getSelectedIndexId();
+        for (Column column : 
olapScan.getTable().getSchemaByIndexId(selectedIndexId, true)) {
+            if (!column.isKey()) {
+                break;
+            }
+            Slot slot = column.getUniqueId() == 
Column.COLUMN_UNIQUE_ID_INIT_VALUE
+                    ? slotByColumnName.get(column.getName())
+                    : slotByColumnUniqueId.get(column.getUniqueId());
+            slot = Objects.requireNonNull(slot, "missing scan slot for storage 
key column " + column.getName());
+            if (storageExprIds.add(slot.getExprId())) {

Review Comment:
   scan输出的exprid不可能重复。这里的检查意义是什么?



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java:
##########
@@ -2870,6 +2971,15 @@ public PlanFragment 
visitPhysicalLazyMaterializeOlapScan(PhysicalLazyMaterialize
 
         translateRuntimeFilter(lazyScan, olapScanNode, context);
 
+        if (!olapScanNode.getExtraKeyColumnSlotIds().isEmpty()) {
+            List<Expr> projectionExprs = lazyScan.getOutput().stream()
+                    .map(slot -> context.findSlotRef(slot.getExprId()))
+                    .collect(Collectors.toList());
+            TupleDescriptor projectionTuple = 
generateTupleDesc(lazyScan.getOutput(), lazyScan.getTable(), context);
+            olapScanNode.setProjectList(projectionExprs);
+            olapScanNode.setOutputTupleDesc(projectionTuple);

Review Comment:
   这个应该也没必要加



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java:
##########
@@ -2851,14 +2951,15 @@ public PlanFragment 
visitPhysicalLazyMaterializeTVFScan(PhysicalLazyMaterializeT
     @Override
     public PlanFragment 
visitPhysicalLazyMaterializeOlapScan(PhysicalLazyMaterializeOlapScan lazyScan,
             PlanTranslatorContext context) {
-        PlanFragment planFragment = 
computePhysicalOlapScan(lazyScan.getScan(), context);
+        PlanFragment planFragment = 
computePhysicalOlapScan(lazyScan.getScan(), context, false);
         OlapScanNode olapScanNode = (OlapScanNode) planFragment.getPlanRoot();
         // set lazy materialized context
         olapScanNode.setIsTopnLazyMaterialize(true);
         
olapScanNode.setGlobalRowIdColumn(lazyScan.getRowId().getOriginalColumn().get());
         Set<SlotId> scanIds = 
lazyScan.getOutput().stream().map(NamedExpression::getExprId)
                 
.map(context::findSlotRef).filter(Objects::nonNull).map(SlotRef::getSlotId)
                 .collect(Collectors.toSet());
+        preserveExtraStorageKeySlots(olapScanNode, scanIds);
 
         olapScanNode.getTupleDesc().getSlots().removeIf(slot -> 
!scanIds.contains(slot.getId()));

Review Comment:
   这里都把scan上的slot移除了。getExtraKeyColumnSlotIds里面的ids应该是在 olapscan上的不存在了把。



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to