keith-turner commented on issue #4876: URL: https://github.com/apache/accumulo/issues/4876#issuecomment-2346989662
The situation that prompted me to open this issue was that many tservers were hitting lots of errors, and those errors were being forwarded to the monitor. The monitor would eventually die with an OOME. Some jstacks of the monitor, taken prior to the OOME, showed close to 200 jetty threads stuck trying to get the lock for [this method](https://github.com/apache/accumulo/blob/38952c922648014e74e4e7a5704bfa256f390faa/server/monitor/src/main/java/org/apache/accumulo/monitor/util/logging/RecentLogs.java#L66). The hypothesis was that the tservers continued to send messages, and those were placed on a jetty thread pool queue until the monitor ran out of memory. Following that hypothesis, I opened this issue to improve the concurrency. I then opened #4877 to explore rate limiting, on the assumption that solving this issue would raise the upper bound on what the monitor could process per second, but there would still be an upper bound, so it may still be nice to gracefully handle exceeding it.
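To make the "gracefully handle exceeding the upper bound" idea concrete, here is a minimal sketch. It is not the monitor's code and not Jetty's thread pool; it uses a plain `java.util.concurrent.ThreadPoolExecutor` as a stand-in, and the class name `BoundedLogIntake` and its methods are hypothetical. It only illustrates the general technique: give the work queue a fixed capacity and reject new work when it is full, instead of letting queued requests accumulate until memory runs out.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; not part of Accumulo or Jetty.
public class BoundedLogIntake {
  private final ThreadPoolExecutor pool;
  private final AtomicInteger rejected = new AtomicInteger();

  public BoundedLogIntake(int threads, int queueCapacity) {
    // A fixed-size pool backed by a bounded queue. Once all threads are
    // busy and the queue is full, AbortPolicy makes execute() throw
    // rather than buffering more work in memory.
    pool = new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
        new ArrayBlockingQueue<>(queueCapacity), new ThreadPoolExecutor.AbortPolicy());
  }

  // Returns true if the handler was accepted, false if it was dropped
  // because the intake is saturated.
  public boolean submit(Runnable handler) {
    try {
      pool.execute(handler);
      return true;
    } catch (RejectedExecutionException e) {
      rejected.incrementAndGet();
      return false;
    }
  }

  // Number of handlers dropped so far; useful for surfacing overload.
  public int rejectedCount() {
    return rejected.get();
  }

  public void shutdown() {
    pool.shutdownNow();
  }
}
```

With this shape, a caller that gets `false` back could respond with an error or simply drop the log event, so memory use stays bounded by the number of threads plus the queue capacity even when senders outpace the handlers.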
