To contextualize this a bit more, this is one of the changes discussed in the 
Alerts Review proposal 
<https://docs.google.com/document/d/1PQKabMx9qoAKQS6qlHJDs2z2B_Bum_KqLYRaZ1pzXGc/edit>.
We are still seeking feedback on the proposals, so if you haven't 
read or responded yet, this is a great time!

Thanks to Ben for his help moving this forward.

Best,

Brian King
SRE, Data Platform/Search Platform
Wikimedia Foundation
IRC: inflatador


> On Feb 9, 2024, at 9:52 AM, Ben Tullis <[email protected]> wrote:
> 
> Hello,
> 
> This is just a quick message to let you know that we made some changes today 
> to the monitoring configuration of many of the Data Platform Engineering 
> servers. This may affect you if you participate in Ops Week 
> <https://wikitech.wikimedia.org/wiki/Data_Engineering/Ops_week> for Data 
> Engineering and friends.
> 
> By default, all notification alerts from Icinga and Prometheus will now go to 
> [email protected] 
> <https://groups.google.com/a/wikimedia.org/g/data-platform-alerts> instead of 
> [email protected] 
> <https://lists.wikimedia.org/hyperkitty/list/[email protected]/>.
> We are working to try to make sure that we can route any alert emails (and 
> IRC pings) to the most appropriate team, principally so that we don't 
> overload the person who is on Ops Week with a lot of messages that would be 
> more appropriately routed to Data Platform SREs.
> 
> Any scheduled tasks related to data pipelines and services critical for data 
> processing are still going to be sent to the 
> [email protected] 
> <https://lists.wikimedia.org/hyperkitty/list/[email protected]/>
> list, so that's Airflow jobs, Refine tasks, Gobblin, Sqoop, Varnishkafka, 
> Eventlogging, etc.
> 
> We haven't made any changes to the monitoring/notification settings of the 
> Search and Query Services servers (Elasticsearch/WDQS/WCQS etc) nor have we 
> made any changes to the Dumps servers. This mainly affects the analytics 
> systems <https://wikitech.wikimedia.org/wiki/Analytics/Systems> and the rest 
> of the Data Engineering team's infrastructure.
> 
> Please do let us know if you have any queries or concerns about this change, 
> or if anything doesn't look right to you.
> 
> You can reach out on Slack at #data-engineering-collab or #data-platform-sre 
> or on IRC at #wikimedia-analytics or #wikimedia-data-platform or to 
> [email protected] 
> <mailto:[email protected]> by email.
> 
> Kind regards,
> Ben
> 
> -- 
>       Ben Tullis (he/him) 
> Senior Site Reliability Engineer 
> Wikimedia Foundation <https://wikimediafoundation.org/>
_______________________________________________
Analytics mailing list -- [email protected]
To unsubscribe send an email to [email protected]
