Copilot commented on code in PR #61691:
URL: https://github.com/apache/doris/pull/61691#discussion_r2985429568


##########
be/src/vec/exec/scan/scanner_scheduler.cpp:
##########
@@ -153,28 +153,36 @@ void 
ScannerScheduler::_scanner_scan(std::shared_ptr<ScannerContext> ctx,
         Thread::set_thread_nice_value();
     }
 #endif
+
+    // we set and get counter according below order, to make sure the counter 
is updated before get_block, and the time of get_block is recorded in the 
counter.
+    // 1. update_wait_worker_timer to make sure the time of waiting for worker 
thread is recorded in the timer
+    // 2. start_scan_cpu_timer to make sure the cpu timer include the time of 
open and get_block, which is the real cpu time of scanner
+    // 3. update_scan_cpu_timer when defer, to make sure the cpu timer include 
the time of open and get_block, which is the real cpu time of scanner
+    // 4. start_wait_worker_timer when defer, to make sure the time of waiting 
for worker thread is recorded in the timer
+
     MonotonicStopWatch max_run_time_watch;
     max_run_time_watch.start();
     scanner->update_wait_worker_timer();
     scanner->start_scan_cpu_timer();
 
-    // Counter update need prepare successfully, or it maybe core. For 
example, olap scanner
-    // will open tablet reader during prepare, if not prepare successfully, 
tablet reader == nullptr.
-    bool need_update_profile = scanner->has_prepared();
+    bool need_update_profile = true;
     auto update_scanner_profile = [&]() {
         if (need_update_profile) {
             scanner->update_scan_cpu_timer();
             scanner->update_realtime_counters();
             need_update_profile = false;
         }
     };
-    Defer defer_scanner([&] {
-        // WorkloadGroup Policy will check cputime realtime, so that should 
update the counter
-        // as soon as possible, could not update it on close.
-        update_scanner_profile();
-    });
+
     Status status = Status::OK();
     bool eos = false;
+    Defer defer_scanner([&] {
+        if (status.ok() && !eos) {
+            // if status is not ok, it means the scanner is failed, and the 
counter may be not updated correctly, so no need to update counter again. if 
eos is true, it means the scanner is finished successfully, and the counter is 
updated correctly, so no need to update counter again.

Review Comment:
   `update_scanner_profile()` (which calls `update_scan_cpu_timer()` and 
`update_realtime_counters()`) is no longer executed for the non-eos path. Since 
`start_scan_cpu_timer()` resets `_cpu_watch` at the beginning of each 
`_scanner_scan` invocation, skipping `update_scan_cpu_timer()` at the end of a 
slice will drop CPU time for all partial scans and also stop realtime counters 
from being refreshed for workload-group policies. Consider calling 
`update_scanner_profile()` in the `Defer` cleanup (before 
`start_wait_worker_timer()`), and keep the existing `eos` fast-path if needed.
   ```suggestion
               // if status is not ok, it means the scanner is failed, and the 
counter may be not updated correctly, so no need to update counter again. if 
eos is true, it means the scanner is finished successfully, and the counter is 
updated correctly, so no need to update counter again.
               update_scanner_profile();
   ```



##########
be/src/vec/exec/scan/scanner_scheduler.cpp:
##########
@@ -153,28 +153,36 @@ void 
ScannerScheduler::_scanner_scan(std::shared_ptr<ScannerContext> ctx,
         Thread::set_thread_nice_value();
     }
 #endif
+
+    // we set and get counter according below order, to make sure the counter 
is updated before get_block, and the time of get_block is recorded in the 
counter.
+    // 1. update_wait_worker_timer to make sure the time of waiting for worker 
thread is recorded in the timer
+    // 2. start_scan_cpu_timer to make sure the cpu timer include the time of 
open and get_block, which is the real cpu time of scanner
+    // 3. update_scan_cpu_timer when defer, to make sure the cpu timer include 
the time of open and get_block, which is the real cpu time of scanner
+    // 4. start_wait_worker_timer when defer, to make sure the time of waiting 
for worker thread is recorded in the timer
+
     MonotonicStopWatch max_run_time_watch;
     max_run_time_watch.start();
     scanner->update_wait_worker_timer();
     scanner->start_scan_cpu_timer();
 
-    // Counter update need prepare successfully, or it maybe core. For 
example, olap scanner
-    // will open tablet reader during prepare, if not prepare successfully, 
tablet reader == nullptr.
-    bool need_update_profile = scanner->has_prepared();
+    bool need_update_profile = true;
     auto update_scanner_profile = [&]() {
         if (need_update_profile) {
             scanner->update_scan_cpu_timer();
             scanner->update_realtime_counters();
             need_update_profile = false;
         }

Review Comment:
   Changing `need_update_profile` from `scanner->has_prepared()` to 
unconditional `true` means `update_scanner_profile()` may call 
`update_realtime_counters()` even when `prepare()` fails and scanner internals 
are uninitialized (the old comment referenced a potential crash). If the intent 
is to always update CPU time, consider keeping a guard for 
`update_realtime_counters()` (e.g., `if (scanner->has_prepared())`) and/or 
making all `update_realtime_counters()` implementations defensively handle the 
unprepared state.



##########
be/src/vec/exec/scan/olap_scanner.cpp:
##########
@@ -642,6 +642,11 @@ Status OlapScanner::close(RuntimeState* state) {
 }
 
 void OlapScanner::update_realtime_counters() {
+    if (!_has_prepared) {
+        // Counter update need prepare successfully, or it maybe core. For 
example, olap scanner
+        // will open tablet reader during prepare, if not prepare 
successfully, tablet reader == nullptr.

Review Comment:
   Typo/wording: "it maybe core" is unclear (likely meant "it may core dump" / 
"it may crash"). Consider rephrasing to a clearer description of the failure 
mode (e.g., counters depend on tablet reader being initialized).
   ```suggestion
           // Counter update requires successful prepare; otherwise it may 
crash. For example, OlapScanner
           // opens the tablet reader during prepare, so if prepare fails, 
tablet reader == nullptr.
   ```



##########
be/src/vec/exec/scan/scanner_scheduler.cpp:
##########
@@ -153,28 +153,36 @@ void 
ScannerScheduler::_scanner_scan(std::shared_ptr<ScannerContext> ctx,
         Thread::set_thread_nice_value();
     }
 #endif
+
+    // we set and get counter according below order, to make sure the counter 
is updated before get_block, and the time of get_block is recorded in the 
counter.
+    // 1. update_wait_worker_timer to make sure the time of waiting for worker 
thread is recorded in the timer
+    // 2. start_scan_cpu_timer to make sure the cpu timer include the time of 
open and get_block, which is the real cpu time of scanner
+    // 3. update_scan_cpu_timer when defer, to make sure the cpu timer include 
the time of open and get_block, which is the real cpu time of scanner
+    // 4. start_wait_worker_timer when defer, to make sure the time of waiting 
for worker thread is recorded in the timer
+

Review Comment:
   The new comment block claims "update_scan_cpu_timer when defer" / 
"start_wait_worker_timer when defer", but the current `Defer` only starts the 
wait-worker timer and does not update the CPU timer (and only does so 
conditionally). Please align the comments with the actual control flow, or 
reintroduce the deferred counter update so the documentation matches behavior.
   ```suggestion
       // We set and get counters according to the order below, to make sure 
the counters are updated
       // before get_block, and the time of get_block is recorded correctly:
       // 1. update_wait_worker_timer to make sure the time of waiting for 
worker thread is recorded in the timer.
       // 2. start_scan_cpu_timer to make sure the CPU timer includes the time 
of open and get_block, which is the real CPU time of the scanner.
       // 3. update_scan_cpu_timer via update_scanner_profile before leaving 
the scanner, so the CPU timer reflects the work done.
       // 4. start_wait_worker_timer in the deferred lambda (when status.ok() 
&& !eos) to record the time of waiting for worker thread after this run.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to