Copilot commented on code in PR #60615:
URL: https://github.com/apache/doris/pull/60615#discussion_r2781167398
##########
be/src/vec/exec/scan/scanner_scheduler.cpp:
##########
@@ -156,16 +156,19 @@ void
ScannerScheduler::_scanner_scan(std::shared_ptr<ScannerContext> ctx,
max_run_time_watch.start();
scanner->update_wait_worker_timer();
scanner->start_scan_cpu_timer();
+
+ auto update_scanner_profile = [&]() {
+ if (scanner->has_prepared()) {
+ // Counter update need prepare successfully, or it maybe core. For
example, olap scanner
+ // will open tablet reader during prepare, if not prepare
successfully, tablet reader == nullptr.
+ scanner->update_scan_cpu_timer();
+ scanner->update_realtime_counters();
+ }
+ };
Defer defer_scanner(
[&] { // WorkloadGroup Policy will check cputime realtime, so that
should update the counter
// as soon as possible, could not update it on close.
- if (scanner->has_prepared()) {
- // Counter update need prepare successfully, or it maybe
core. For example, olap scanner
- // will open tablet reader during prepare, if not prepare
successfully, tablet reader == nullptr.
- scanner->update_scan_cpu_timer();
- scanner->update_realtime_counters();
- scanner->start_wait_worker_timer();
- }
+ update_scanner_profile();
});
Review Comment:
`update_scanner_profile()` is invoked in the `eos` block before
`mark_to_need_to_close()`, but it’s also invoked unconditionally in
`defer_scanner`. Since `Scanner::update_scan_cpu_timer()` adds `elapsed_time()`
without resetting the watch, this will double-count CPU time (and potentially
other realtime counters) on EOS/error paths. Consider guarding the deferred
call (e.g., only run it when `!eos`), or otherwise ensure the profile update is
performed exactly once per scan invocation before
`_collect_profile_before_close()` runs.
##########
be/src/vec/exec/scan/scanner_scheduler.cpp:
##########
@@ -325,6 +328,9 @@ void
ScannerScheduler::_scanner_scan(std::shared_ptr<ScannerContext> ctx,
}
if (eos) {
+ // If eos, scanner will call _collect_profile_before_close to update
profile,
+ // so we need update_scanner_profile here
Review Comment:
The new comment in the `eos` block is slightly misleading:
`_collect_profile_before_close()` is called by `mark_to_need_to_close()` only
when the scanner is opened (`_is_open`). Consider rewording to describe the
actual dependency (CPU/realtime counters must be updated before calling
`mark_to_need_to_close()` so the collected profile includes the final scan).
```suggestion
// If eos, we must update CPU and realtime counters before calling
mark_to_need_to_close().
// Then, when the scanner is closed (and
_collect_profile_before_close() is invoked if the
// scanner is still open), the collected profile will include the
final scan.
```
##########
be/src/vec/exec/scan/scanner_scheduler.cpp:
##########
@@ -156,16 +156,19 @@ void
ScannerScheduler::_scanner_scan(std::shared_ptr<ScannerContext> ctx,
max_run_time_watch.start();
scanner->update_wait_worker_timer();
scanner->start_scan_cpu_timer();
+
+ auto update_scanner_profile = [&]() {
+ if (scanner->has_prepared()) {
+ // Counter update need prepare successfully, or it maybe core. For
example, olap scanner
+ // will open tablet reader during prepare, if not prepare
successfully, tablet reader == nullptr.
Review Comment:
Typo/grammar in the newly moved comment: “or it maybe core” is unclear
(likely meant “may crash/core dump”). Please correct wording to avoid confusion.
```suggestion
// Counter updates require successful preparation; otherwise,
the process may crash (core dump).
// For example, the OLAP scanner opens the tablet reader during
prepare; if prepare fails, the tablet reader is nullptr.
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]