jeffkbkim commented on code in PR #15835:
URL: https://github.com/apache/kafka/pull/15835#discussion_r1588068337
##########
group-coordinator/src/main/java/org/apache/kafka/coordinator/group/runtime/MultiThreadedEventProcessor.java:
##########
@@ -126,9 +126,16 @@ private class EventProcessorThread extends Thread {
private void handleEvents() {
while (!shuttingDown) {
- recordPollStartTime(time.milliseconds());
+ // We use a single meter for aggregate idle percentage for the
thread pool.
+ // Since meter is calculated as total_recorded_value /
time_window and
+ // time_window is independent of the number of threads, each
recorded idle
+ // time should be discounted by # threads.
+
+ long idleStartTimeMs = time.milliseconds();
CoordinatorEvent event = accumulator.take();
Review Comment:
Yes, I think the main issue with this implementation is that we can have the
thread idle ratio go over 1.0 when the recorded idle time includes idle time
from the previous window. This is exacerbated when the take() call can block
forever, so the idle ratio is unbounded.
If we do have a timeout on take(), the idle time overflow is limited to that
value which is how it is implemented in
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/KafkaRequestHandler.scala#L114
Overall, we will get more accurate idle ratio measurements with a timeout.
wdyt?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]