NewNimo commented on issue #3904: URL: https://github.com/apache/hertzbeat/issues/3904#issuecomment-3644662432
@Duansg Thank you for your response!

About question 1: I understand that the current grouping rule is that alerts belong to the same group only if all configured group-by labels (e.g., alertname, severity) have identical values. However, I'm encountering the following issue in practice:

1. An alert fires and enters the group.
2. After group_wait (30s), the first notification is sent, and subsequent notifications are scheduled every group_interval (500s).
3. 200 seconds after the initial firing, the alert resolves (status changes to resolved).
4. At this point, the resolved notification is not delivered. It appears to be suppressed because the group is still within its waiting/sending cycle.

My core questions are:

- Is the resolved notification also governed by group_wait and group_interval? If yes, is there a mechanism to ensure resolved alerts are notified immediately, even if they occur before the next group_interval?
- Since grouping is based solely on label values (like alertname, severity) and does not consider alert status (firing vs. resolved), does this mean firing and resolved alerts are treated as part of the same group? If so, shouldn't status (or an equivalent field) be considered in grouping, or should resolved notifications be handled separately?
- Is there a configuration option to force immediate delivery of resolved notifications, bypassing the group wait/interval logic?

Regarding my second question: I'm not asking whether I can customize grouping labels. What I'm looking for is official documentation or semantic definitions for commonly seen labels like job, service, instance, etc., especially when they appear in default configurations or examples. For example:

- Does job refer to the Prometheus scrape job name?
- Does service represent a microservice name, and if so, how is it populated (e.g., from annotations, relabeling, or application metrics)?

Since these labels are frequently used in grouping recommendations, it would be very helpful to have clear documentation explaining their intended meaning and source. Could you please point me to any existing docs, or consider adding such explanations if they're currently missing?

Looking forward to your guidance. Thanks in advance.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
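
To make the timing in question 1 concrete, here is a minimal sketch of the scenario. It assumes a simplified model in which the group key is built only from the group-by labels (status is ignored) and a resolved alert waits for the group's next scheduled flush. Whether HertzBeat actually behaves this way is exactly what the questions above ask; this is not its implementation, and `group_key`, `flush_times`, and `delivery_time` are hypothetical helper names.

```python
def group_key(labels, group_by):
    """Build a group key from the configured group-by labels only.

    Alert status (firing/resolved) is deliberately NOT part of the key;
    that is an assumption taken from the behaviour described above.
    """
    return tuple((name, labels.get(name, "")) for name in group_by)


def flush_times(first_event, group_wait, group_interval, horizon):
    """Times (in seconds) at which the group would notify, up to `horizon`."""
    times = []
    t = first_event + group_wait
    while t <= horizon:
        times.append(t)
        t += group_interval
    return times


def delivery_time(event_time, flushes):
    """First scheduled flush at or after the event, or None if there is none."""
    for t in flushes:
        if t >= event_time:
            return t
    return None


group_by = ["alertname", "severity"]
labels = {"alertname": "HighCPU", "severity": "critical"}  # illustrative values

# The same labels give the same group, regardless of alert status.
firing_key = group_key(labels, group_by)
resolved_key = group_key(labels, group_by)

# Alert fires at t=0 with group_wait=30s and group_interval=500s.
flushes = flush_times(first_event=0, group_wait=30, group_interval=500, horizon=1200)

# The alert resolves at t=200s; under this model it waits for the next flush.
resolved_sent_at = delivery_time(200, flushes)
print(flushes, resolved_sent_at)
```

Under these assumptions the resolved notification would go out at the second flush, 330 seconds after the alert actually resolved, rather than being dropped. An "immediate resolved" option would amount to bypassing `delivery_time` for resolved events.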
