After many concerted Ops hours, virt1009 appears to be back up and
running, along with the instances it hosts. That means that all
instances that were up yesterday should now be running again.
If you find that your instance did /not/ come back up, feel free to
respond to this email, email me off-list, or contact me on IRC tomorrow.
Unfortunately, this email does not entirely conclude the 'reboot
everything' phase, because the NFS servers are still in line for a
restart. That will most likely happen tomorrow and result in temporary
freezes of shared filesystems -- watch this list for updates.
Thanks for your patience with today's action -- security issues like
this are always painful to deal with, but not dealing with them would be
even more painful.
-Andrew

On 1/27/15 3:29 PM, Andrew Bogott wrote:
Update:
Most, but not all, Labs instances have now been restarted. Those
instances essential for Tools operation are working -- some tools are
back up, but those that depend on web access are still failing. Yuvi
is working hard on this issue. Coren is unfortunately out sick, so
it's difficult to predict when he'll be available to assist.
One of the hardware nodes, virt1009, is suffering a serious OS
problem, so VMs local to that host are still shut off. We're working
frantically on resolving this issue and I'll update you as soon as
there is any news.
-Andrew

On 1/27/15 9:41 AM, Andrew Bogott wrote:
Due to a newly discovered security vulnerability[1], I will be
patching and rebooting ALL Labs instances today. I'll coordinate
with Coren on Tool Labs to minimize the breakage. However, everyone
should check the status of their services, as some things may need
hand-tuning.
I'll send another email when the reboots are completed -- don't
hesitate to speak up on IRC if you have concerns.
-Andrew
[1] http://www.ubuntu.com/usn/usn-2485-1/
_______________________________________________
Labs-l mailing list
https://lists.wikimedia.org/mailman/listinfo/labs-l