This has been completed. FWIW, it looks like auth takes a hit from the moment 
RDS stops, starts recovering about 3 minutes later, and is back to full 
capacity after about 10 minutes.

C


_-=^=-_

Chris Kolosiwsky


----- Original Message -----
From: "Chris Kolosiwsky" <[email protected]>
To: [email protected], [email protected]
Sent: Thursday, September 25, 2014 11:11:12 AM
Subject: fxa RDS Failover *today* at 3 PM Central Time (1 PM Pacific)

Folks:

AWS has informed us that our production RDS instances will be rebooted on 
Friday and Saturday this week. They've given us a 6-hour window in which this 
can occur. This is being done to address a (rumored) security bug in Xen.

Our prod RDS instance is configured with a master and a hot standby slave. 
Rebooting our master RDS instance will force a failover to the hot slave 
instance. Sadly, the slave is also scheduled to be rebooted.

I would like to pre-fail to the slave instance today at 3 PM Central Time to 
avoid the "uncontrolled cable-company-scheduling" reboot of the master. If AWS 
keeps to their proposed schedule, I will be able to fail back to the master 
post-AWS-reboot tomorrow and we will be done.

I expect that we will see a spike in 500s as well as some increased load on 
the token server during this time. The failover itself should take anywhere 
from 60 to 120 seconds. During that time, we will be without a backup server. I 
will be taking a db snapshot prior to the failover event.
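
For anyone who wants to script the same sequence (snapshot first, then a 
forced failover), here is a minimal sketch. It is not the tooling used for this 
event. It assumes `rds` is a boto3 RDS client and that the instance is a 
Multi-AZ deployment; the function name and both identifiers are hypothetical.

```python
import time


def prefail_to_standby(rds, instance_id, snapshot_id):
    """Snapshot an RDS instance, then force a failover to its standby.

    `rds` is assumed to be a boto3 RDS client (boto3.client("rds")).
    `instance_id` and `snapshot_id` are placeholders chosen by the caller.
    Returns the seconds spent between requesting the reboot and the
    instance reporting "available" again.
    """
    # 1. Take a manual snapshot and wait for it before touching the instance.
    rds.create_db_snapshot(
        DBSnapshotIdentifier=snapshot_id,
        DBInstanceIdentifier=instance_id,
    )
    rds.get_waiter("db_snapshot_available").wait(
        DBSnapshotIdentifier=snapshot_id,
    )

    # 2. Reboot with ForceFailover=True, which asks RDS to promote the
    #    standby (only meaningful for Multi-AZ instances).
    started = time.monotonic()
    rds.reboot_db_instance(
        DBInstanceIdentifier=instance_id,
        ForceFailover=True,
    )

    # 3. Wait until the instance reports "available". The reported status
    #    can lag the reboot request, so treat the timing as a rough signal.
    rds.get_waiter("db_instance_available").wait(
        DBInstanceIdentifier=instance_id,
    )
    return time.monotonic() - started
```

Usage would look something like 
`prefail_to_standby(boto3.client("rds"), "my-prod-db", "pre-failover-snap")`, 
with both names replaced by real identifiers.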

Please feel free to ping me on IRC with any questions.

C


_-=^=-_

Chris Kolosiwsky


_______________________________________________
Dev-fxacct mailing list
[email protected]
https://mail.mozilla.org/listinfo/dev-fxacct
