Hi Brian,

thank you very much for the code snippet; it was just what I needed. I was trying to translate it in my head from SQL to PromQL, but something was not right. Thanks again, you have been very helpful.

You were right when you said "But it's pretty ugly...", but the IT department informed me that outside that time period there may be maintenance procedures that would inevitably trigger the alert, so it's OK. I looked in Grafana, and you can silence alerts there, but not as a routine; as I told you, I have to intervene in the query. But that doesn't bother me.
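For reference, if the same condition were written as a Prometheus alerting rule rather than a Grafana-managed one, it could look roughly like the sketch below. It combines the database_status selector from the original question with the (hour() + minute()/60) window suggested in the reply. The group name, alert name and label are hypothetical placeholders, and the "or vector(0)" from the original question is left out because, as explained further down the thread, it would make the alert fire all the time.

```yaml
# Sketch of a Prometheus rule file; names marked "hypothetical" are not from the thread.
groups:
  - name: database-alerts            # hypothetical group name
    rules:
      - alert: DatabaseNotOnline     # hypothetical alert name
        expr: |
          count by (exported_instance, counter_instance) (
            database_status{job="aaaa", exported_instance="myserver", status!="ONLINE"}
          )
          and on () ((hour() + minute() / 60) >= 6.5 < 19)
        labels:
          severity: warning          # hypothetical label, usable for routing
```

hour() and minute() work in UTC, so 6.5 and 19 correspond to 08:30 and 21:00 local time only while the local offset is UTC+2; when the offset changes with daylight saving time, the window shifts by an hour.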
I want to ask you another question about Alertmanager; if you prefer, I can open another thread.

I have been working on a Docker stack app for about 8 months, and only now that I am nearing the end am I turning to alerts. Initially I had used Alertmanager, but Grafana has very similar alert management, which I would say is even more advanced in some respects. I have read a dozen articles and posts on the web, but it is still not clear to me when it is preferable to use Alertmanager over Grafana.

From what I understand, Alertmanager is a single hub for managing alerts coming from multiple Prometheus instances, including ones on other networks, but maybe that is just my impression as a non-expert.

Thanks again and have a nice day.
ALEN

On Sunday, 26 June 2022 at 09:49:59 UTC+2, Brian Candler wrote:

> I see; so this is just to work around the limited functionality of Grafana
> alerting.
>
> Then I guess you can just modify the rule you already have, to use
> (hour() + minute()/60).
>
> e.g. I tested this briefly:
> (node_filesystem_avail_bytes < 10000000) and on () (hour() + minute()/60) >= 6.5 < 19
>
> But it's pretty ugly. For a long-running problem, the alert will be
> "resolved" at 19:00 and then re-activate at 06:30 the next day.
>
> If you have a lot of this to do, then you could find out if Grafana can be
> plugged into an external system like OpsGenie or PagerDuty (I have no idea
> if it can; there is a separate discussion group for Grafana). Or consider
> moving to Alertmanager.
>
> On Sunday, 26 June 2022 at 00:10:40 UTC+1 [email protected] wrote:
>
>> Hi Brian, and thank you very much for your detailed answer... which I
>> have read very carefully several times.
>>
>> Maybe I forgot a detail in my question, that is: I'm using Grafana!
>> Your points about muting notifications are clear to me
>> and absolutely correct.
>> These are not related to the alerts in Grafana, unfortunately, but to the
>> contact points where the recipients of the messages are defined.
>>
>> So, to simplify: in this particular case it would be easier to fix it
>> directly in the PromQL code.
>> I would simply like to know how I can also include the 30 minutes, so that
>> the start becomes 8:30 AM instead of 8:00 AM. I don't know if the right
>> syntax exists in PromQL.
>>
>> Thanks again and have a nice day.
>> ALEN
>>
>> On Saturday, 25 June 2022 at 11:21:51 UTC+2, Brian Candler wrote:
>>
>>> Firstly, given that you have put "or vector(0)", I think you may
>>> misunderstand how alerting works in Prometheus.
>>>
>>> PromQL expressions return vectors - a set of 0 or more values. In an
>>> alerting expression, the alert is treated as firing if the vector is
>>> non-empty - i.e. it contains 1 or more values, regardless of what those
>>> values actually are. Therefore, the expression vector(0) gives an alert
>>> which fires all of the time, which isn't very useful.
>>>
>>> Next, PromQL comparison operators are filters, not booleans. Suppose
>>> you have the following metrics in your database:
>>>
>>> node_disk_space{instance="a"} 100
>>> node_disk_space{instance="b"} 200
>>> node_disk_space{instance="c"} 300
>>>
>>> The PromQL expression "node_disk_space > 150" returns a vector of 2
>>> values:
>>>
>>> node_disk_space{instance="b"} 200
>>> node_disk_space{instance="c"} 300
>>>
>>> That is, the expression "node_disk_space" returns a vector of all
>>> metrics with that metric name, and "node_disk_space > 150" filters it down
>>> to just those metrics whose value is over 150. It does not return a "true"
>>> or "false" value (or values).
>>>
>>> Similarly, "and/or/unless" don't work like booleans either.
>>> The expression "node_disk_space > 150 or vector(0)" will return the
>>> following:
>>>
>>> node_disk_space{instance="b"} 200
>>> node_disk_space{instance="c"} 300
>>> {} 0
>>>
>>> In this case you get a vector of 3 values. The explanation of how "or"
>>> works is here:
>>>
>>> https://prometheus.io/docs/prometheus/latest/querying/operators/#logical-set-binary-operators
>>>
>>> It's another vector operator, which matches the label sets of the LHS
>>> and RHS.
>>>
>>> Now, let me go back to your original problem about time periods. I
>>> think you're approaching this the wrong way.
>>>
>>> I believe the business rule amounts to this: "I only want to receive
>>> alerts on this condition if the time falls between 8:30am and 9pm". It's
>>> not that the problem doesn't happen outside business hours; it's that the
>>> problem isn't important enough to send a notification outside of business
>>> hours.
>>>
>>> Therefore, the right way to handle this is with time periods within
>>> alertmanager, to control when the alerts are sent - not within the PromQL
>>> expression which determines whether there is a problem or not.
>>>
>>> The way you do this is with time intervals in alertmanager routing
>>> trees. See:
>>> https://prometheus.io/docs/alerting/latest/configuration/#route
>>> https://prometheus.io/docs/alerting/latest/configuration/#time_interval
>>>
>>> Not only is this far easier to implement than attempting to do it in
>>> PromQL, it's also more flexible - for example you can have the same alert
>>> (from the same PromQL alerting rule) sent to different groups depending on
>>> the time of day.
>>>
>>> Note that you can add labels to your alert in the alerting rule to
>>> categorise the alert, and you can match on those labels in your alert
>>> routing tree. This gives you further flexibility to categorise your alerts
>>> in whatever way is useful to you.
>>>
>>> On Friday, 24 June 2022 at 23:20:42 UTC+1 [email protected] wrote:
>>>
>>>> Hi,
>>>> I'm trying to write this simple expression for Prometheus,
>>>> but I don't understand how I can also include minutes within a valid
>>>> range of hours.
>>>>
>>>> The alert should fire only between *08:30 AM and 09:00 PM*.
>>>>
>>>> The hours below are shifted for my time zone (Italy, UTC+2):
>>>>
>>>> (count by (exported_instance, counter_instance)
>>>> (database_status{job="aaaa", exported_instance="myserver",
>>>> status!="ONLINE"})
>>>> and on() hour() >= 6 <= 19
>>>> *......... miss minute .......*
>>>> ) or vector(0)
>>>>
>>>> Thanks Alen
>>>>
>>>

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/4f20dab8-6d21-4ec7-a538-8f7416ed6834n%40googlegroups.com.
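
As a footnote to the thread: the Alertmanager approach recommended above (time intervals attached to a route) might be sketched as follows. This is only an illustration under assumptions. The receiver names, the interval name and the alert name DatabaseNotOnline are hypothetical, the receivers have no notification settings filled in, and it assumes an Alertmanager release that supports active_time_intervals on routes. Times are written in UTC, so 06:30 to 19:00 corresponds to 08:30 to 21:00 at UTC+2.

```yaml
# Sketch of an alertmanager.yml fragment; all names are hypothetical.
route:
  receiver: default-receiver
  routes:
    - receiver: dba-team
      matchers:
        - 'alertname = "DatabaseNotOnline"'
      # Notifications on this route are only sent inside the named interval.
      active_time_intervals:
        - business-hours

receivers:
  - name: default-receiver
  - name: dba-team

time_intervals:
  - name: business-hours
    time_intervals:
      - times:
          - start_time: "06:30"   # UTC
            end_time: "19:00"     # UTC
```

With this arrangement the alerting rule itself needs no hour() or minute() clause: the alert keeps firing for as long as the problem exists, and only the notifications are limited to the time window.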

