This is done and everything should be back to normal. Let me know if
you encounter irregularities!
We will probably do one more fire drill in a week or two to verify that
we have a switch-over process that's faster than today's was. I'll send
a warning in advance if that happens.
-Andrew
On 8/19/15 3:52 PM, Andrew Bogott wrote:
Reminder -- this will start in ten minutes. Labs networks may stutter
or be temporarily unavailable during this work.
-Andrew
On 8/13/15 4:59 PM, Andrew Bogott wrote:
Next Wednesday Chase and I are going to have a go at updating our
labs network node. There may be intermittent interruptions to network
traffic, both between labs instances and between labs and the
outside world.
No action should be required on the part of labs users unless you
have jobs that will time out and die due to network failures. In any
case, I will send an 'all clear' message at the end of the upgrade
with details about what, if any, downtime ensued.
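For jobs that might die on a brief network failure, one option is to
wrap the network-dependent step in a retry loop. The following is only a
minimal sketch, not a recommended or official approach: the function
name run_with_retries, its parameters, and the default of five attempts
are all hypothetical, and it assumes the job raises an OSError subclass
(ConnectionError, TimeoutError, socket.timeout) when the network drops.

```python
import time


def run_with_retries(job, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call job() until it succeeds or the attempts are used up.

    Hypothetical helper: waits base_delay, then 2x, 4x, ... seconds
    between attempts (exponential backoff).
    """
    for attempt in range(attempts):
        try:
            return job()
        except OSError:
            # ConnectionError, TimeoutError and socket.timeout are all
            # subclasses of OSError in Python 3.
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the error
            sleep(base_delay * 2 ** attempt)
```

Something like run_with_retries(lambda: fetch_data(url)) would then ride
out short interruptions, where fetch_data stands in for whatever your
job actually does. Pick attempts and base_delay so that the total wait
covers the kind of outage you want to tolerate.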
== technical background ==
Labs is currently running with a single nova-network node,
labnet1001. It's proved fairly reliable, but labnet1001 is running
an old OS (Ubuntu Precise) and is a single point of failure.
During the maintenance window we will bring up a new nova-network
node on labnet1002, running Ubuntu Trusty, and then switch existing
labs traffic to the new node. It may be possible to do this with
minimal network interruption, but there are a few minor unknowns in
our plan. In any case, this migration will serve as a
proof-of-concept for possible future emergency failovers.
Presuming all goes well and things land stably on labnet1002,
labnet1001 will be upgraded and maintained as a hot spare.
-Andrew