albertogpz commented on code in PR #7747:
URL: https://github.com/apache/geode/pull/7747#discussion_r896655954
##########
geode-core/src/main/java/org/apache/geode/internal/monitoring/executor/AbstractExecutor.java:
##########
@@ -51,9 +56,25 @@ protected AbstractExecutor(String groupName, long threadID) {
public void handleExpiry(long stuckTime, Map<Long, ThreadInfo>
threadInfoMap) {
incNumIterationsStuck();
+ sendAlertForThreadStuckForLong(stuckTime, threadInfoMap);
logger.warn(createThreadReport(stuckTime, threadInfoMap));
}
+ private void sendAlertForThreadStuckForLong(long stuckTime, Map<Long,
ThreadInfo> threadInfoMap) {
+ if (maxThreadStuckTime <= 0) {
+ return;
+ }
+ if (threadInfoMap.get(threadID) == null) {
+ return;
+ }
+ if (stuckForGood) {
+ return;
+ }
+ if (stuckTime > maxThreadStuckTime) {
+ stuckForGood = true;
+ logger.fatal(createThreadReport(stuckTime, threadInfoMap));
Review Comment:
@kirklund The problem I see with using `error` in this case is that, as you
need to change the `alertLevel` in the `DistributedMXBean` to a lower level,
the number of notifications received will be much higher, something that you do
not necessarily want, for example because more resources will be used.
There are other alerts in Geode issued when a fatal error is sent that do
not necessarily mean a catastrophic failure, for example if the
`ack-severe-threshold` is surpassed and that feature is enabled.
Given that this is also a configurable feature, whoever makes use of it
should know what to expect and how to act (according to the documentation) when
the alert is received so I still think it makes sense to use `severe` to make
sure the alert is always received if the feature is enabled.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]