[Analytics] Upcoming reboot of stat100[56] and analytics1003 (Hive, Oozie) for kernel security upgrades

2018-03-06 Thread Luca Toscano
Hi everybody, tomorrow EU morning (Wed Mar 7th) I'd need to reboot stat100[56] and analytics1003 for kernel security updates. Hive and Oozie (Analytics Hadoop cluster) will not be available for a (hopefully) brief period of time. Please let me know if there is any important work that you are doing

[Analytics] Eventlogging mysql consumers temporarily stopped due to maintenance

2018-03-06 Thread Luca Toscano
Hi everybody, today, while performing maintenance on the Eventlogging Master database, we ended up in https://phabricator.wikimedia.org/T188991 (TL;DR: two hours of data inserted to the slave database and not the master one). We are working to find a feasible solution to avoid losing data and get

Re: [Analytics] Eventlogging mysql consumers temporarily stopped due to maintenance

2018-03-06 Thread Luca Toscano
Update: data should now be recovered and everything back on track. Latest data might take a bit of time to catch up since we have just restarted the replication script on the analytics-slave. All the details about the outage in https://phabricator.wikimedia.org/T188991 Thanks! Luca 2018-03-06