[ 
https://issues.apache.org/jira/browse/ARTEMIS-4579?focusedWorklogId=901106&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-901106
 ]

ASF GitHub Bot logged work on ARTEMIS-4579:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 23/Jan/24 04:28
            Start Date: 23/Jan/24 04:28
    Worklog Time Spent: 10m 
      Work Description: jbertram commented on PR #4752:
URL: 
https://github.com/apache/activemq-artemis/pull/4752#issuecomment-1905272950

   For a while I've actually discouraged folks from using the "first message" 
metrics. [I discussed this on the ActiveMQ users 
list](https://lists.apache.org/thread/d79n3kbb28k2v4pm7y0kywb5xpvrmpf4) not 
long ago:
   
   > The `getFirstMessageAge` operation is actually fairly "heavy" and not 
generally recommended. Furthermore, the age of the first message isn't 
meaningful in and of itself in this scenario because if the `consumerCount` is 
0 then by definition no messages can be stuck. A robust stuck-message detection 
mechanism must, at the very least, verify that `consumerCount` > 0. Also, 
instead of using the age of the first message I recommend inspecting 
`messagesAcknowledged` over time. For example, if the `consumerCount` > 0 and 
`messagesAcknowledged` remains unchanged for 60 seconds then messages (or more 
likely *consumers*) may be stuck. If you're using Prometheus then I believe you 
can use a [range vector 
selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#range-vector-selectors)
 for this kind of operation.
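   
   The check described in that quote can be sketched roughly as follows. This is 
only an illustration, not code from the broker or the PR: the 
`StuckQueueDetector` class and its `observe` method are hypothetical names, and 
the caller is assumed to sample `consumerCount` and `messagesAcknowledged` from 
the broker by whatever means it already has (metrics plugin, JMX, etc.).

```python
from typing import Optional


class StuckQueueDetector:
    """Hypothetical helper: flags a queue as possibly stuck when it has
    consumers but messagesAcknowledged has not changed for a whole window."""

    def __init__(self, window_seconds: float = 60.0) -> None:
        self.window_seconds = window_seconds
        self._last_acked: Optional[int] = None
        self._last_change_at: Optional[float] = None

    def observe(self, now: float, consumer_count: int,
                messages_acknowledged: int) -> bool:
        """Record one sample taken at time `now` (in seconds) and return
        True if the queue looks stuck."""
        # With no consumers nothing can be "stuck"; forget the history so the
        # window restarts once consumers are seen again.
        if consumer_count <= 0:
            self._last_acked = None
            self._last_change_at = None
            return False

        # First sample with consumers, or acknowledgements made progress:
        # restart the window from this sample.
        if self._last_acked is None or messages_acknowledged != self._last_acked:
            self._last_acked = messages_acknowledged
            self._last_change_at = now
            return False

        # Consumers are attached but no acknowledgement for the whole window.
        return now - self._last_change_at >= self.window_seconds


if __name__ == "__main__":
    detector = StuckQueueDetector(window_seconds=60)
    # (time in seconds, consumerCount, messagesAcknowledged) samples
    for sample in [(0, 1, 10), (30, 1, 10), (60, 1, 10), (90, 1, 11)]:
        print(sample, detector.observe(*sample))
```

   Note that an idle, empty queue with attached consumers would also match 
this condition, so in practice it may be worth additionally requiring that the 
queue actually holds messages before alerting.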
   
   At this point I'm against adding "first message" metrics for scheduled 
messages because it will be a relatively "heavy" operation due to the 
`synchronized` block. A lot of JMX monitoring tools will simply poll queue 
MBeans which means this new management method may be invoked *a lot*, 
especially on a broker with lots of queues. Over the last few years we've seen 
an increasing number of deployments with many thousands of queues. This is one 
reason we implemented (and generally recommend using) [pluggable 
metrics](https://activemq.apache.org/components/artemis/documentation/latest/metrics.html#metrics)
 which should provide a lighter footprint than JMX and open the door for easier 
integration with tools that specialize in graphing and alerting (e.g. 
Prometheus & Grafana).
   
   Would it be possible for you to use existing metrics to solve your problem 
rather than implementing this new management method?




Issue Time Tracking
-------------------

    Worklog Id:     (was: 901106)
    Time Spent: 1h 10m  (was: 1h)

> Add the *FirstMessage* API for scheduled messages
> -------------------------------------------------
>
>                 Key: ARTEMIS-4579
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-4579
>             Project: ActiveMQ Artemis
>          Issue Type: Improvement
>          Components: API
>    Affects Versions: 2.31.2
>            Reporter: Jan Šmucr
>            Priority: Major
>             Fix For: 2.32.0
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Alerting on issues with messages not being received properly for a period of 
> time is not an easy task. We use the {{getFirstMessageAge()}} command to 
> trigger alerts in Zabbix, and it works as long as there are no consumers.
> But this approach fails when there are consumers repeatedly failing to 
> receive a message. That message keeps getting scheduled for redelivery over 
> and over, and even though there is still an old message in the queue to be 
> reported, it's no longer visible via the {{getFirstMessage*()}} API.
> The goal here is to add a set of functions working with messages scheduled 
> for delivery:
> {noformat}
> getFirstScheduledMessageAsJSON()
> getFirstScheduledMessageTimestamp()
> getFirstScheduledMessageAge()
> {noformat}
> It may not be the most efficient approach, but it's quite a convenient one, 
> especially when monitoring a wide set of queues, each with its own set of 
> alerts.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
