github-actions[bot] commented on code in PR #61603:
URL: https://github.com/apache/doris/pull/61603#discussion_r2984426702


##########
fe/fe-core/src/main/java/org/apache/doris/datasource/hive/source/HiveScanNode.java:
##########
@@ -170,7 +172,13 @@ protected List<HivePartition> getPartitions() throws AnalysisException {
             resPartitions.add(dummyPartition);
         }
         if (ConnectContext.get().getExecutor() != null) {
-            ConnectContext.get().getExecutor().getSummaryProfile().setGetPartitionsFinishTime();
+            SummaryProfile sp = ConnectContext.get().getExecutor().getSummaryProfile();
+            if (sp != null) {
+                QueryTrace trace = sp.getQueryTrace();
+                if (trace != null) {
+                    trace.recordDuration("Get Partitions Time", 0);
+                }

Review Comment:
   **Bug: Always records 0ms duration.**
   
   The old code recorded a timestamp via `setGetPartitionsFinishTime()` and computed the duration as `getPartitionsFinishTime - getSplitsStartTime` in `updateExecutionSummaryProfile()`. The new code passes a hardcoded `0` to `recordDuration()`, so this metric will always show `0ms`.
   
   You need to capture `System.currentTimeMillis()` before the timed work and compute the delta:
   ```java
   long partitionsStart = System.currentTimeMillis();
   // ... partition fetching logic ...
   trace.recordDuration("Get Partitions Time", System.currentTimeMillis() - partitionsStart);
   ```
   
   The same correct pattern is already used in `FileQueryScanNode.recordDurationToTrace()` and `HiveTableSink.setPartitionValues()`.



##########
fe/fe-core/src/main/java/org/apache/doris/datasource/hive/source/HiveScanNode.java:
##########
@@ -189,7 +197,13 @@ public List<Split> getSplits(int numBackends) throws UserException {
             List<Split> allFiles = Lists.newArrayList();
             getFileSplitByPartitions(cache, prunedPartitions, allFiles, bindBrokerName, numBackends, false);
             if (ConnectContext.get().getExecutor() != null) {
-                ConnectContext.get().getExecutor().getSummaryProfile().setGetPartitionFilesFinishTime();
+                SummaryProfile sp = ConnectContext.get().getExecutor().getSummaryProfile();
+                if (sp != null) {
+                    QueryTrace trace = sp.getQueryTrace();
+                    if (trace != null) {
+                        trace.recordDuration("Get Partition Files Time", 0);
+                    }

Review Comment:
   **Bug: Same zero-duration issue as above.**
   
   `recordDuration("Get Partition Files Time", 0)` always records 0ms. The old code computed this as `getPartitionFilesFinishTime - getPartitionsFinishTime`. You need to capture the start time before `getFileSplitByPartitions()` and compute the delta.
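   
   For example (a sketch only; `filesStart` is a new local variable, the other names follow the diff above):
   ```java
   long filesStart = System.currentTimeMillis();
   getFileSplitByPartitions(cache, prunedPartitions, allFiles, bindBrokerName, numBackends, false);
   // ... null checks on sp / trace as in the diff ...
   trace.recordDuration("Get Partition Files Time", System.currentTimeMillis() - filesStart);
   ```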



##########
fe/fe-core/src/main/java/org/apache/doris/planner/HiveTableSink.java:
##########
@@ -235,8 +235,15 @@ private void setPartitionValues(THiveTableSink tSink) throws AnalysisException {
 
         tSink.setPartitions(partitions);
 
-        if (ConnectContext.get().getExecutor() != null) {
-            ConnectContext.get().getExecutor().getSummaryProfile().setSinkGetPartitionsFinishTime();
+        if (ConnectContext.get() != null && ConnectContext.get().getExecutor() != null) {
+            SummaryProfile sp = ConnectContext.get().getExecutor().getSummaryProfile();
+            if (sp != null) {
+                QueryTrace trace = sp.getQueryTrace();
+                if (trace != null) {
+                    trace.recordDuration("Sink Get Partitions Time",
+                            System.currentTimeMillis() - startTime);

Review Comment:
   **Bug: Span name mismatch.**
   
   The recorded name is `"Sink Get Partitions Time"` but the pre-registered constant is `SummaryProfile.SINK_SET_PARTITION_VALUES_TIME = "Sink Set Partition Values Time"`. This causes the pre-registered key to always show "N/A" while a separate entry is appended.
   
   Please use the constant `SummaryProfile.SINK_SET_PARTITION_VALUES_TIME` here, or update the constant to match.
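   
   A sketch of the suggested change (assuming `startTime` is the timestamp already captured in `setPartitionValues()`, as in the diff):
   ```java
   trace.recordDuration(SummaryProfile.SINK_SET_PARTITION_VALUES_TIME,
           System.currentTimeMillis() - startTime);
   ```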



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/expression/rules/FoldConstantRuleOnBE.java:
##########
@@ -337,7 +337,14 @@ private static Map<String, Expression> evalOnBE(Map<String, Map<String, TExpr>>
             }
             PConstantExprResult result = future.get(5, TimeUnit.SECONDS);
             if (context.getExecutor() != null && context.getSessionVariable().enableProfile()) {
-                context.getExecutor().getSummaryProfile().sumBeFoldTime(TimeUtils.getStartTimeMs() - beFoldStartTime);
+                org.apache.doris.common.profile.QueryTrace trace = null;
+                if (context.getExecutor().getSummaryProfile() != null) {
+                    trace = context.getExecutor().getSummaryProfile().getQueryTrace();
+                }
+                if (trace != null) {
+                    trace.recordDuration("Nereids BE Fold Constant Time",
+                            TimeUtils.getStartTimeMs() - beFoldStartTime);

Review Comment:
   **Bug: Span name mismatch.**
   
   The span name here is `"Nereids BE Fold Constant Time"`, but the pre-registered constant is `SummaryProfile.NEREIDS_BE_FOLD_CONST_TIME = "Nereids Fold Const By BE Time"` (line 126 of SummaryProfile.java).
   
   This causes the pre-registered key `"Nereids Fold Const By BE Time"` to always show "N/A" in the profile, while a *separate* entry `"Nereids BE Fold Constant Time"` is appended at the end with the actual value.
   
   Please use the constant `SummaryProfile.NEREIDS_BE_FOLD_CONST_TIME` here (and in the getter `getNereidsBeFoldConstTimeMs()`).
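   
   Sketch of the suggested change (same surrounding code as the diff, only the span name replaced by the constant):
   ```java
   trace.recordDuration(SummaryProfile.NEREIDS_BE_FOLD_CONST_TIME,
           TimeUtils.getStartTimeMs() - beFoldStartTime);
   ```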



##########
fe/fe-core/src/main/java/org/apache/doris/common/profile/QueryTrace.java:
##########
@@ -0,0 +1,359 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package org.apache.doris.common.profile;
+
+import org.apache.doris.thrift.TUnit;
+
+import com.google.common.base.Preconditions;
+
+import java.util.ArrayDeque;
+import java.util.Deque;
+import java.util.LinkedHashMap;
+import java.util.concurrent.locks.ReentrantReadWriteLock;
+
+/**
+ * A per-query tracing container that dynamically collects timing spans, counters and text attributes
+ * during FE-side query execution. Replaces the hardcoded field-based approach in SummaryProfile.
+ *
+ * <h3>Core Concepts:</h3>
+ * <ul>
+ *   <li><b>Span</b> – A timed phase, created via {@link #startSpan(String)} with try-with-resources.
+ *       Multiple starts of the same name accumulate durations.</li>
+ *   <li><b>Counter</b> – A numeric metric (count, bytes, rate), set or accumulated.</li>
+ *   <li><b>Text</b> – A string attribute (version, workload group name, etc.).</li>
+ * </ul>
+ *
+ * <h3>ThreadLocal Scope:</h3>
+ * <p>A ThreadLocal Span stack automatically tracks the current active span.
+ * When {@link #startSpan(String)} is called, the stack-top span is used as the parent.
+ * This enables cross-class span nesting without parameter passing.</p>
+ *
+ * <h3>Display Order and Indentation:</h3>
+ * <ul>
+ *   <li>Display order = insertion order (first-seen order, maintained by LinkedHashMap)</li>
+ *   <li>Indentation = parent chain depth (computed at registration time)</li>
+ *   <li>Indentation = parent chain depth (computed at registration time)</li>
+ * </ul>
+ *
+ * <p>Usage:
+ * <pre>{@code
+ * // Timed span
+ * try (ProfileSpan span = trace.startSpan("Plan Time")) {
+ *     try (ProfileSpan child = trace.startSpan("Nereids Analysis Time")) {
+ *         // automatically a child of "Plan Time"
+ *     }
+ * }
+ *
+ * // Counter
+ * trace.setCounter("Fragment Compressed Size", size, TUnit.BYTES);
+ * trace.addCounter("Fragment RPC Count", 1);
+ *
+ * // Text
+ * trace.setText("Doris Version", version);
+ * }</pre>
+ */
+public class QueryTrace {
+
+    /**
+     * Type of a registered metric entry.
+     */
+    enum MetricType {
+        SPAN,       // Timed span (AutoCloseable, durations accumulate)
+        COUNTER,    // Numeric counter / gauge
+        TEXT        // String attribute
+    }
+
+    /**
+     * Internal data holder for a single registered metric.
+     */
+    static class MetricEntry {
+        final String name;
+        final String parentName;   // null for root-level entries
+        final MetricType type;
+        final TUnit unit;
+        final int indentLevel;     // computed from parent chain depth
+
+        // Numeric value storage (for SPAN durations and COUNTER values).
+        // Use volatile for thread safety on simple set/get.
+        // For accumulated spans, addAndGet provides atomicity.
+        volatile long numericValue;
+        // Text value storage

Review Comment:
   **Nit: Misleading comment.**
   
   The comment on lines 91-92 says _"For accumulated spans, addAndGet provides atomicity"_, but the actual implementation uses `synchronized(entry)` blocks (not `AtomicLong.addAndGet()`). Please update the comment to match the implementation, e.g.:
   ```
   // For accumulated spans/counters, synchronized(this) on the entry provides atomicity.
   ```



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/NereidsPlanner.java:
##########
@@ -473,95 +468,92 @@ protected void preMaterializedViewRewrite() {
         if (!cascadesContext.getStatementContext().isNeedPreMvRewrite()) {
             return;
         }
-        if (LOG.isDebugEnabled()) {
-            LOG.debug("Start pre rewrite plan by mv");
-        }
-        List<Plan> tmpPlansForMvRewrite = cascadesContext.getStatementContext().getTmpPlanForMvRewrite();
-        Plan originalPlan = cascadesContext.getRewritePlan();
-        List<Plan> plansWhichContainMv = new ArrayList<>();
-        // because tmpPlansForMvRewrite only one, so timeout is cumulative which is ok
-        for (Plan planForRewrite : tmpPlansForMvRewrite) {
-            SessionVariable sessionVariable = cascadesContext.getConnectContext()
-                    .getSessionVariable();
-            int timeoutSecond = sessionVariable.nereidsTimeoutSecond;
-            boolean enableTimeout = sessionVariable.enableNereidsTimeout;
-            try {
-                // set mv rewrite timeout
-                sessionVariable.nereidsTimeoutSecond = PreMaterializedViewRewriter.convertMillisToCeilingSeconds(
-                                sessionVariable.materializedViewRewriteDurationThresholdMs);
-                sessionVariable.enableNereidsTimeout = true;
-                // pre rewrite
-                Plan rewrittenPlan = MaterializedViewUtils.rewriteByRules(cascadesContext,
-                        PreMaterializedViewRewriter::rewrite, planForRewrite, planForRewrite, true);
-                Plan ruleOptimizedPlan = MaterializedViewUtils.rewriteByRules(cascadesContext,
-                        childOptContext -> {
-                            Rewriter.getWholeTreeRewriterWithoutCostBasedJobs(childOptContext).execute();
-                            return childOptContext.getRewritePlan();
-                        }, rewrittenPlan, planForRewrite, false);
-                if (ruleOptimizedPlan == null) {
-                    continue;
-                }
-                // after rbo, maybe the plan changed a lot, so we need to normalize it with original plan
-                Plan normalizedPlan = MaterializedViewUtils.normalizeSinkExpressions(
-                        ruleOptimizedPlan, originalPlan);
-                if (normalizedPlan != null) {
-                    plansWhichContainMv.add(normalizedPlan);
-                }
-            } catch (Exception e) {
-                LOG.error("pre mv rewrite in rbo rewrite fail, query id is {}",
-                        cascadesContext.getConnectContext().getQueryIdentifier(), e);
+        try (ProfileSpan span = startSpan("Nereids Pre Rewrite By MV Time")) {
+            if (LOG.isDebugEnabled()) {

Review Comment:
   **Bug: Span name mismatch (case difference).**
   
   The span name here is `"Nereids Pre Rewrite By MV Time"` (uppercase "MV"), but the pre-registered constant is `SummaryProfile.NEREIDS_PRE_REWRITE_BY_MV_TIME = "Nereids Pre Rewrite By Mv Time"` (lowercase 'v').
   
   This causes the same duplicate-entry problem: the pre-registered key shows "N/A" while a second entry with the actual value is appended.
   
   Please use the constant `SummaryProfile.NEREIDS_PRE_REWRITE_BY_MV_TIME` here (and update the getter `getPrettyNereidsPreRewriteByMvTime()` accordingly).
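   
   Sketch of the suggested change (assuming `SummaryProfile` is imported or otherwise visible here):
   ```java
   try (ProfileSpan span = startSpan(SummaryProfile.NEREIDS_PRE_REWRITE_BY_MV_TIME)) {
       // ... existing pre-rewrite logic ...
   }
   ```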



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

