Re: [PR] [Bug](profile) fix wrong count of scan wait worker timer [doris]
BiteThet merged PR #61064: URL: https://github.com/apache/doris/pull/61064 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] [Bug](profile) fix wrong count of scan wait worker timer [doris]
github-actions[bot] commented on code in PR #61064:
URL: https://github.com/apache/doris/pull/61064#discussion_r2928738745
##
be/src/exec/scan/scanner_scheduler.cpp:
##
@@ -152,28 +152,36 @@ void
ScannerScheduler::_scanner_scan(std::shared_ptr ctx,
Thread::set_thread_nice_value();
}
#endif
+
+// we set and get counter according below order, to make sure the counter
is updated before get_block, and the time of get_block is recorded in the
counter.
+// 1. update_wait_worker_timer to make sure the time of waiting for worker
thread is recorded in the timer
+// 2. start_scan_cpu_timer to make sure the cpu timer include the time of
open and get_block, which is the real cpu time of scanner
+// 3. update_scan_cpu_timer when defer, to make sure the cpu timer include
the time of open and get_block, which is the real cpu time of scanner
+// 4. start_wait_worker_timer when defer, to make sure the time of waiting
for worker thread is recorded in the timer
+
MonotonicStopWatch max_run_time_watch;
max_run_time_watch.start();
scanner->update_wait_worker_timer();
scanner->start_scan_cpu_timer();
-// Counter update need prepare successfully, or it maybe core. For
example, olap scanner
-// will open tablet reader during prepare, if not prepare successfully,
tablet reader == nullptr.
-bool need_update_profile = scanner->has_prepared();
+bool need_update_profile = true;
auto update_scanner_profile = [&]() {
if (need_update_profile) {
scanner->update_scan_cpu_timer();
scanner->update_realtime_counters();
need_update_profile = false;
}
};
-Defer defer_scanner([&] {
-// WorkloadGroup Policy will check cputime realtime, so that should
update the counter
-// as soon as possible, could not update it on close.
-update_scanner_profile();
-});
+
Status status = Status::OK();
bool eos = false;
+Defer defer_scanner([&] {
+if (status.ok() && !eos) {
+// if status is not ok, it means the scanner is failed, and the
counter may be not updated correctly, so no need to update counter again. if
eos is true, it means the scanner is finished successfully, and the counter is
updated correctly, so no need to update counter again.
+scanner->start_wait_worker_timer();
Review Comment:
**Minor: `start_wait_worker_timer()` here is redundant**
`start_wait_worker_timer()` resets and restarts the internal `_watch`.
However, `ScannerScheduler::submit()` (line 70 of this file) also calls
`start_wait_worker_timer()` unconditionally when the scan task is re-submitted
to the thread pool. Since `submit()` runs after the current `_scanner_scan`
returns and `push_back_scan_task` is processed, the timer started here is
immediately overwritten.
The actual wait time is measured from the `submit()` call (line 70) to the
next `update_wait_worker_timer()` call (line 164) when a worker picks it up.
##
be/src/exec/scan/scanner_scheduler.cpp:
##
@@ -152,28 +152,36 @@ void
ScannerScheduler::_scanner_scan(std::shared_ptr ctx,
Thread::set_thread_nice_value();
}
#endif
+
+// we set and get counter according below order, to make sure the counter
is updated before get_block, and the time of get_block is recorded in the
counter.
+// 1. update_wait_worker_timer to make sure the time of waiting for worker
thread is recorded in the timer
+// 2. start_scan_cpu_timer to make sure the cpu timer include the time of
open and get_block, which is the real cpu time of scanner
+// 3. update_scan_cpu_timer when defer, to make sure the cpu timer include
the time of open and get_block, which is the real cpu time of scanner
+// 4. start_wait_worker_timer when defer, to make sure the time of waiting
for worker thread is recorded in the timer
+
MonotonicStopWatch max_run_time_watch;
max_run_time_watch.start();
scanner->update_wait_worker_timer();
scanner->start_scan_cpu_timer();
-// Counter update need prepare successfully, or it maybe core. For
example, olap scanner
-// will open tablet reader during prepare, if not prepare successfully,
tablet reader == nullptr.
-bool need_update_profile = scanner->has_prepared();
+bool need_update_profile = true;
auto update_scanner_profile = [&]() {
if (need_update_profile) {
scanner->update_scan_cpu_timer();
scanner->update_realtime_counters();
need_update_profile = false;
}
};
-Defer defer_scanner([&] {
-// WorkloadGroup Policy will check cputime realtime, so that should
update the counter
-// as soon as possible, could not update it on close.
-update_scanner_profile();
-});
+
Status status = Status::OK();
bool eos = false;
+Defer defer_scanner([&] {
+if (status.ok() &&
Re: [PR] [Bug](profile) fix wrong count of scan wait worker timer [doris]
github-actions[bot] commented on PR #61064: URL: https://github.com/apache/doris/pull/61064#issuecomment-4052090850 PR approved by at least one committer and no changes requested. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] [Bug](profile) fix wrong count of scan wait worker timer [doris]
zclllyybb commented on PR #61064: URL: https://github.com/apache/doris/pull/61064#issuecomment-4052089651 /review -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] [Bug](profile) fix wrong count of scan wait worker timer [doris]
hello-stephen commented on PR #61064: URL: https://github.com/apache/doris/pull/61064#issuecomment-4036674264 # BE Regression && UT Coverage Report Increment line coverage `77.78% (7/9)` :tada: [Increment coverage report](http://coverage.selectdb-in.cc/coverage/61064_0f0bd1ea97922075a282b88d832581a36699397b_merge/increment_report/index.html) [Complete coverage report](http://coverage.selectdb-in.cc/coverage/61064_0f0bd1ea97922075a282b88d832581a36699397b_merge/report/index.html) | Category | Coverage | |---|| | Function Coverage | 71.58% (26203/36608) | | Line Coverage | 54.29% (274165/505006) | | Region Coverage | 51.47% (227611/442247) | | Branch Coverage | 52.94% (97975/185066) | -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] [Bug](profile) fix wrong count of scan wait worker timer [doris]
BiteThet commented on PR #61064: URL: https://github.com/apache/doris/pull/61064#issuecomment-4035903813 run buildall -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] [Bug](profile) fix wrong count of scan wait worker timer [doris]
BiteThet commented on PR #61064: URL: https://github.com/apache/doris/pull/61064#issuecomment-4035854250 run buildall -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] [Bug](profile) fix wrong count of scan wait worker timer [doris]
BiteThet commented on PR #61064: URL: https://github.com/apache/doris/pull/61064#issuecomment-4021821692 run buildall -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] [Bug](profile) fix wrong count of scan wait worker timer [doris]
BiteThet commented on PR #61064: URL: https://github.com/apache/doris/pull/61064#issuecomment-4021661916 run buildall -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] [Bug](profile) fix wrong count of scan wait worker timer [doris]
hello-stephen commented on PR #61064: URL: https://github.com/apache/doris/pull/61064#issuecomment-4004570623 # BE Regression && UT Coverage Report Increment line coverage `100.00% (5/5)` :tada: [Increment coverage report](http://coverage.selectdb-in.cc/coverage/61064_e7998f773783b1f62d9fdf08bd3070d604d9093e_merge/increment_report/index.html) [Complete coverage report](http://coverage.selectdb-in.cc/coverage/61064_e7998f773783b1f62d9fdf08bd3070d604d9093e_merge/report/index.html) | Category | Coverage | |---|| | Function Coverage | 71.52% (26179/36603) | | Line Coverage | 54.31% (274527/505439) | | Region Coverage | 51.69% (228600/442241) | | Branch Coverage | 53.02% (98202/185226) | -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] [Bug](profile) fix wrong count of scan wait worker timer [doris]
github-actions[bot] commented on PR #61064: URL: https://github.com/apache/doris/pull/61064#issuecomment-4003002626 PR approved by at least one committer and no changes requested. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] [Bug](profile) fix wrong count of scan wait worker timer [doris]
github-actions[bot] commented on PR #61064: URL: https://github.com/apache/doris/pull/61064#issuecomment-4003002838 PR approved by anyone and no changes requested. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] [Bug](profile) fix wrong count of scan wait worker timer [doris]
Copilot commented on code in PR #61064:
URL: https://github.com/apache/doris/pull/61064#discussion_r2888111323
##
be/src/vec/exec/scan/scanner_scheduler.cpp:
##
@@ -167,13 +174,18 @@ void
ScannerScheduler::_scanner_scan(std::shared_ptr ctx,
need_update_profile = false;
}
};
+Status status = Status::OK();
+bool eos = false;
+
Defer defer_scanner([&] {
// WorkloadGroup Policy will check cputime realtime, so that should
update the counter
// as soon as possible, could not update it on close.
update_scanner_profile();
+if (status.ok() && !eos) {
+// if status is not ok, it means the scanner is failed, and the
counter may be not updated correctly, so no need to update counter again. if
eos is true, it means the scanner is finished successfully, and the counter is
updated correctly, so no need to update counter again.
Review Comment:
The inline comment on this line states "if eos is true, it means the scanner
is finished successfully," but this is inaccurate. The `eos` flag is also set
to `true` in error/cancellation paths (e.g., when `scanner->prepare()` or
`scanner->open()` fails, or when `ctx->done()` is true). It would be more
accurate to say "if eos is true, the scanner will not be re-queued (either
because it completed scanning, encountered an error, or the context was
cancelled), so no need to start the wait timer again."
```suggestion
// If status is not ok, the scanner has failed and its counters
may not be updated
// correctly, so there is no need to restart the wait-worker
timer. If eos is true,
// the scanner will not be re-queued (either because it
completed scanning,
// encountered an error, or the context was cancelled), so there
is no need to
// restart the wait-worker timer either.
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] [Bug](profile) fix wrong count of scan wait worker timer [doris]
BiteThet commented on PR #61064: URL: https://github.com/apache/doris/pull/61064#issuecomment-4002779493 run buildall -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] [Bug](profile) fix wrong count of scan wait worker timer [doris]
hello-stephen commented on PR #61064: URL: https://github.com/apache/doris/pull/61064#issuecomment-4002773782 Thank you for your contribution to Apache Doris. Don't know what should be done next? See [How to process your PR](https://cwiki.apache.org/confluence/display/DORIS/How+to+process+your+PR). Please clearly describe your PR: 1. What problem was fixed (it's best to include specific error reporting information). How it was fixed. 2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be. 3. What features were added. Why was this function added? 4. Which code was refactored and why was this part of the code refactored? 5. Which functions were optimized and what is the difference before and after the optimization? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
