Hello,
I have just opened https://github.com/prometheus/compliance/pull/81 for the
solution to problem 1. It also modifies the specification to adjust the
expectation on the alert payload sent from the alert-generator.
Thanks,
Ganesh
On Monday, 4 April 2022 at 16:27:58 UTC+5:30 Ganesh Vernekar wrote:
> Hello,
>
> I am getting started on "*Solution to Problem 1:*". I will share the PR
> here once it's ready for review.
>
> Thanks,
> Ganesh
>
>
> On Wednesday, 30 March 2022 at 13:38:01 UTC+5:30 Ganesh Vernekar wrote:
>
>> Hello everyone,
>>
>> I have come across hosted metrics providers (including Grafana Cloud, which
>> I am personally looking at) that face issues in being compliant with alert
>> delivery. Here are the main problems I have identified:
>>
>> *Problem 1:* They have some kind of embedded alert routing mechanism
>> ("alertmanager") which you cannot bypass to send directly to external
>> alertmanagers. Allowing external alertmanagers is not trivial and also has
>> security implications. *So the alert payload that it sends might not
>> exactly match what the test suite is expecting.*
>>
>> *Problem 2:* Related to the above, the alert routing mechanism *can add
>> unwanted delays in sending alerts or reflecting the changes in the alert
>> annotations.*
>>
>> Here is what I am proposing for the above problems:
>>
>> *Solution to Problem 1:* Allow cloud providers to have custom unmarshal
>> logic for the alert payload embedded in the test suite. For example,
>> currently we require the alert payload to be directly unmarshalled into
>> []notifier.Alert here
>> <https://github.com/prometheus/compliance/blob/c7c726de89973d77cb491faa1b32cfddf7dcde8a/alert_generator/server.go#L91>.
>>
>> Instead, we could allow providers to embed custom unmarshalling logic, as
>> long as that logic ultimately returns *[]notifier.Alert* for the test suite
>> to verify.
>>
>> *Solution to Problem 2:* This is in the hands of the cloud providers: they
>> need to forward all alerts to the end receiver with minimal delay. Those
>> who use the upstream Prometheus Alertmanager need support from upstream,
>> since Alertmanager does not allow forwarding of all the alerts that it
>> receives. I have requested that feature here
>> <https://github.com/prometheus/alertmanager/issues/2868>.
>>
>> Please let me know if you have better suggestions, objections, or a +1 for
>> the proposed solutions.
>>
>> Thanks,
>> Ganesh
>>
>
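The custom unmarshalling idea from "Solution to Problem 1" could look roughly
like the Go sketch below. It is only an illustration under stated assumptions,
not the code in the PR: Alert is a local stand-in for notifier.Alert so the
example compiles on its own, and UnmarshalFunc, DefaultUnmarshal and
EnvelopeUnmarshal are hypothetical names. The envelope shape (alerts nested
under an "alerts" key) is likewise assumed for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Alert is a local stand-in for notifier.Alert (assumption: simplified fields)
// so that this sketch does not depend on the Prometheus module.
type Alert struct {
	Labels       map[string]string `json:"labels"`
	Annotations  map[string]string `json:"annotations"`
	StartsAt     time.Time         `json:"startsAt,omitempty"`
	EndsAt       time.Time         `json:"endsAt,omitempty"`
	GeneratorURL string            `json:"generatorURL,omitempty"`
}

// UnmarshalFunc is a hypothetical hook: it turns a raw request body into the
// alerts that the test suite would then verify.
type UnmarshalFunc func(body []byte) ([]Alert, error)

// DefaultUnmarshal mirrors the current expectation: the body is a JSON array
// that decodes directly into a slice of alerts.
func DefaultUnmarshal(body []byte) ([]Alert, error) {
	var alerts []Alert
	if err := json.Unmarshal(body, &alerts); err != nil {
		return nil, fmt.Errorf("decode alert array: %w", err)
	}
	return alerts, nil
}

// EnvelopeUnmarshal is an example of provider-specific logic for a payload
// that wraps the alerts in an object under an "alerts" key (assumed shape).
func EnvelopeUnmarshal(body []byte) ([]Alert, error) {
	var env struct {
		Alerts []Alert `json:"alerts"`
	}
	if err := json.Unmarshal(body, &env); err != nil {
		return nil, fmt.Errorf("decode envelope: %w", err)
	}
	return env.Alerts, nil
}

func main() {
	body := []byte(`{"receiver":"compliance","alerts":[{"labels":{"alertname":"HighLatency"},"annotations":{"summary":"p99 above 1s"},"startsAt":"2022-03-30T08:00:00Z"}]}`)

	// The test suite would select the hook per provider; here it is hard-coded.
	var unmarshal UnmarshalFunc = EnvelopeUnmarshal
	alerts, err := unmarshal(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(alerts), alerts[0].Labels["alertname"]) // prints "1 HighLatency"
}
```

Whatever shape the provider's payload has, the hook returns the same slice
type, so the verification code after it would not need to change.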
--
To view this discussion on the web visit
https://groups.google.com/d/msgid/prometheus-developers/5d4ce8a5-3cbf-4c34-ad23-033c4e2f498cn%40googlegroups.com.