In order to rebuild a server of questionable stability, I'm going to
move the following instances on Wednesday:
| Name | Tenant ID | Status |
Reminder: I'm going to start migrating these hosts shortly.
On 9/29/17 3:57 PM, Andrew Bogott wrote:
In order to rebuild a server of questionable stability, I'm going to
move the following instances on Wednesday:
This is done now.
On 10/4/17 8:15 AM, Andrew Bogott wrote:
Reminder: I'm going to start migrating these hosts shortly.
On 9/29/17 3:57 PM, Andrew Bogott wrote:
In order to rebuild a server of questionable stability, I'm going to
move the following instances on
Toolforge users can ignore this email; it only concerns VPS
project owners.
Long ago, the Wikimedia Operations team made the decision to phase
out use of Ubuntu servers in favor of Debian. It's a long, slow process
that is still ongoing, but in production Trusty is running on an
eve
As discussed previously in this list [1] and on phabricator [2], I've
just removed the Ubuntu Trusty image as a default option when creating
new VMs. This is part of a long-term, foundation-wide process to
standardize on Debian as the distribution of choice.
Existing Trusty VMs are unaffected b
On 11/29/17 1:31 PM, Chase Pettet wrote:
A series of upgrades and changes have left instances with
'role::puppetmaster::standalone' applied in a broken state. This is
unfortunate because Puppet is unable to fix itself. There is a small
manual update required.
I believe that I've now fixed all
Hello all,
Some tools running on the Toolforge Kubernetes cluster are currently
suffering from network failures. It's not yet fully diagnosed, although
we have some ideas as to how to at least reduce the impact. The
tracking bug is https://phabricator.wikimedia.org/T182722.
We'll send anot
Toolforge services should be back to normal now. The problem is not yet
fully understood, but details will trickle in on the tracking task, below.
On 12/12/17 6:08 PM, Andrew Bogott wrote:
Hello all,
Some tools running on the Toolforge Kubernetes cluster are currently
suffering from
Sometime soon (probably in the next day or two) we will be applying
kernel patches to all VMs and physical hosts in WMCS. This is to address
an urgent security issue[1], so we'll be skipping the traditional 7-day
warning period -- basically as soon as proper fixes are available we'll
start pat
xy: yandex-proxy01
On 1/4/18 9:28 AM, Andrew Bogott wrote:
Sometime soon (probably in the next day or two) we will be applying
kernel patches to all VMs and physical hosts in WMCS. This is to
address an urgent security issue[1], so we'll be skipping the
traditional 7-day warning period
-Andrew
On 1/11/18 1:02 PM, Andrew Bogott wrote:
In a few minutes I'm going to start the first round of reboots. We're
going to do a subset of the cloud and then make sure there are no bad
effects before doing the remainder on Monday.
The following VMs will be upgraded and reboot
make sure your jobs are still
running after windows like this. The list of VMs from last week
(attached below) are already good to go so they should be unaffected today.
-Andrew
On 1/11/18 3:15 PM, Andrew Bogott wrote:
Today's round of reboots is now finished -- the hosts rebooted are
The reboots are now done and everything is upgraded. So far things seem
back to normal, but visit us in #wikimedia-cloud if you find things amiss.
-Andrew (+ WMCS team)
On 1/16/18 8:57 AM, Andrew Bogott wrote:
Good morning!
The canary reboots last week went well, so we'll be upgradin
As part of a security upgrade, I'll be rebooting the systems that host
Wikitech and Horizon in about two hours, at 14:00 PST (16:00 CST).
Those websites will be briefly unavailable, as will be the Nova api.
This last will cause a brief interruption to the WMF Continuous
Integration system. T
On 1/17/18 2:19 PM, Andrew Bogott wrote:
As part of a security upgrade, I'll be rebooting the systems that host
Wikitech and Horizon in about two hours, at 14:00 PST (16:00 CST).
These reboots are done and everything is back up. Sorry for any
inconvenience caused!
-Andrew
Those web
On 2/14/18 6:58 AM, Chase Pettet wrote:
We lost a KVM host at around 7:20 UTC. Because we use local storage
for instances there are a number of them that are down. Toolforge
suffered a few losses but it seems to have been few enough that
GridEngine and Kubernetes users are unaffected at thi
Sorry for the downtime!
-Andrew + the WMCS team
On 2/14/18 8:29 AM, Andrew Bogott wrote:
On 2/14/18 6:58 AM, Chase Pettet wrote:
We lost a KVM host at around 7:20 UTC. Because we use local storage
for instances there are a number of them that are down. Toolforge
suffered a few losses but it seems to
On Friday morning my time (10:00 CST, 8:00 PST, 16:00 UTC) I'll be
switching the dns record for wikitech.wikimedia.org to point to a new
server. This change should be largely invisible to users, but there are
a few things to be ready for:
- Most importantly, YOU WILL BE LOGGED OUT of Wikitech
Reminder: This is happening in about an hour.
On 3/7/18 4:14 PM, Andrew Bogott wrote:
On Friday morning my time (10:00 CST, 8:00 PST, 16:00 UTC) I'll be
switching the dns record for wikitech.wikimedia.org to point to a new
server. This change should be largely invisible to users, but
This is done. Quick tests suggest that everything is working fine, but
don't hesitate to contact me if you see any strange behavior.
-Andrew
On 3/9/18 8:57 AM, Andrew Bogott wrote:
Reminder: This is happening in about an hour.
On 3/7/18 4:14 PM, Andrew Bogott wrote:
On Friday morni
tl;dr: Starting on Wednesday, the Horizon UI is going to look a bit
different.
--
On Wednesday next week I'm going to switch Horizon and Toolsadmin
traffic away from their current physical host and over to new hardware.
The change to Toolsadmin will be largely invisible, but the Horizon
swi
About 24 hours from now we're going to reboot a couple of servers[1] in
the cloud infrastructure to apply security updates.
Few WMCS users (and, in particular, no tools users) should notice any
interruption. Nonetheless, a few services will be down:
- New instance creation will fail
- CI t
All done!
On 3/27/18 9:34 AM, Andrew Bogott wrote:
About 24 hours from now we're going to reboot a couple of servers[1]
in the cloud infrastructure to apply security updates.
Few WMCS users (and, in particular, no tools users) should notice any
interruption. Nonetheless, a few services
Next Friday we'll be upgrading our OpenStack cluster. The upgrade
should not interrupt any existing tools or instances, but during the
upgrade it will be impossible to create, delete, or modify WMCS VMs.
I'll start the process at around 02:00 UTC (7AM PDT). The complete
upgrade may t
On 4/5/18 1:17 PM, Andrew Bogott wrote:
Next Friday we'll be upgrading our OpenStack cluster. The upgrade
should not interrupt any existing tools or instances, but during the
upgrade it will be impossible to create, delete, or modify WMCS VMs.
I'll start the process at ar
Next Friday we'll be upgrading our OpenStack cluster. The upgrade
should not interrupt any existing tools or instances, but during the
upgrade it will be impossible to create, delete, or modify WMCS VMs.
I'll start the process at around 14:00 UTC (7AM PDT). The complete
upgrade may t
Reminder: This is starting in a few minutes.
-A
On 4/12/18 8:58 AM, Andrew Bogott wrote:
Next Friday we'll be upgrading our OpenStack cluster. The upgrade
should not interrupt any existing tools or instances, but during the
upgrade it will be impossible to create, delete, or modify
The upgrades are done, and Horizon and CI are re-enabled. Please let me
know if you find any new problems.
-Andrew
On 4/12/18 8:58 AM, Andrew Bogott wrote:
Next Friday we'll be upgrading our OpenStack cluster. The upgrade
should not interrupt any existing tools or instances, but d
As part of some long-deferred routine maintenance, we need to update
(and, in one case, physically move) the network servers that handle all
traffic between WMCS instances. During each change, all WMCS network
traffic (including network access to all tools and VMs) will be
interrupted for seve
Reminder: this outage is happening tomorrow.
On 5/2/18 10:22 AM, Andrew Bogott wrote:
As part of some long-deferred routine maintenance, we need to update
(and, in one case, physically move) the network servers that handle
all traffic between WMCS instances. During each change, all WMCS
The first of these outages is coming up in a few minutes.
On 5/14/18 12:02 PM, Andrew Bogott wrote:
Reminder: this outage is happening tomorrow.
On 5/2/18 10:22 AM, Andrew Bogott wrote:
As part of some long-deferred routine maintenance, we need to update
(and, in one case, physically move
The first of these tasks is done and the network is back up and
running. The outage lasted a bit less than 10 minutes.
There will be another similar outage in a few hours.
-Andrew
On 5/2/18 10:22 AM, Andrew Bogott wrote:
As part of some long-deferred routine maintenance, we need to update
Things are back up and running for the moment. The last switch-over
went poorly so we haven't actually reached our goals yet; there may be
another interruption yet coming up.
-A
On 5/15/18 8:33 AM, Andrew Bogott wrote:
The first of these tasks is done and the network is back up and
ru
de as much warning about that as I can. It's
unlikely to be today, in any case.
Sorry for any inconvenience caused!
-Andrew
On 5/15/18 12:04 PM, Andrew Bogott wrote:
Things are back up and running for the moment. The last switch-over
went poorly so we haven't actually reached our
The next step in this is scheduled for tomorrow at 15:00 UTC, 8:00AM
in SF. Again, all network service will be interrupted for 5-10 minutes.
Sorry for all the emails! With luck there will only be one more.
-Andrew
On 5/15/18 12:24 PM, Andrew Bogott wrote:
We're leaving things in th
We had a couple of minutes of downtime just now, and everything is back
up. This went a lot better today; this should be the last of these
network interruptions for a while.
-Andrew
On 5/15/18 3:31 PM, Andrew Bogott wrote:
The next step in this is scheduled for tomorrow at 15:00 UTC,
8
Hello!
The Cloud Services team is traveling quite a bit in the next few weeks:
the Hackathon, the OpenStack Summit, and some personal travel. There
will always be at least one person available for emergencies, but please
be patient if we're slow to respond to requests.
Everyone should be ba
As part of routine security maintenance, we'll be rebooting all VMs and
virtualization hosts next Wednesday starting at 14:00 UTC (7AM San
Francisco time).
Toolforge users should be largely unaffected by this activity. Other
projects (including deployment-prep) will experience sporadic downtim
Reminder: These reboots will start in about 12 hours.
On 5/30/18 10:46 AM, Andrew Bogott wrote:
As part of routine security maintenance, we'll be rebooting all VMs
and virtualization hosts next Wednesday starting at 14:00 UTC (7AM San
Francisco time).
Toolforge users should be la
Bogott wrote:
Reminder: These reboots will start in about 12 hours.
On 5/30/18 10:46 AM, Andrew Bogott wrote:
As part of routine security maintenance, we'll be rebooting all VMs
and virtualization hosts next Wednesday starting at 14:00 UTC (7AM
San Francisco time).
Toolforge users should be la
Hello!
Much of the Cloud Services staff will be traveling and attending
meetings next week. There will always be someone available for
emergencies, but routine support requests may get handled more slowly
than usual.
Things will be back to normal the following Monday, the 25th.
- Andrew +
We're drawing close to a painful migration event[1], during which we
will (probably) have to copy VMs between hosts one project at a time,
largely by hand. For that reason, I'm feeling even stingier than usual
about preserving unused and/or abandoned projects and instances.
It's been a couple
In an attempt to identify abandoned VPS projects, I've created a wiki
page that lists all existing projects, here:
https://wikitech.wikimedia.org/wiki/Cloud_VPS_2018_Purge
Currently 85 projects[2] on that list are unclaimed. If you are a VPS
user, please visit that page and mark any projects
There are currently 57 unclaimed projects on
https://wikitech.wikimedia.org/wiki/Cloud_VPS_2018_Purge. I will start
shutting down unclaimed projects at the beginning of next month, and
those projects will be left behind in the future network migration[1]
and, eventually, deleted.
If you see
Routine network upgrades are scheduled for Thursday which may result in
brief WMCS service interruptions. In particular, Wikitech and Horizon
may stop working, and instance creation/deletion/updating may briefly fail.
The network engineers have reserved a two-hour window beginning at 16:00
UT
18 13:19:36 -0500
From: Andrew Bogott
Reply-To: andrewbog...@gmail.com
To: cloud-annou...@lists.wikimedia.org
Routine network upgrades are scheduled for Thursday which may result in
brief WMCS service interruptions. In particular, Wikitech and Horizon
may stop working, and ins
Hello!
The Wikimedia datacenter team will be performing some routine network
maintenance[1] next Wednesday. This will cause brief, rolling network
interruptions for essentially all tools, services, and virtual servers
-- each physical server will be briefly unplugged as its network cable
is
Beginning next week, the cloud team will start migrating projects to
Neutron[1] in earnest. I will attempt to reach out individually to
affected project admins as well, but here is the upcoming migration
schedule:
Friday, 2018-10-19: analytics
Monday, 2018-10-22: antiharassment, catgra
, mwstake
Thursday, 2018-11-01: planet, pluggableauth, privpol-captcha, qna
Friday, 2018-11-02: reading-web-staging, suggestbot, test-twemproxy,
wikibase-registry, wikibrain, wikidata-primary-sources-tool
On 10/19/18 10:18 AM, Andrew Bogott wrote:
Beginning next week, the cloud team will start
Reminder: This maintenance is starting in about 15 minutes.
On 10/17/18 9:18 AM, Andrew Bogott wrote:
Hello!
The Wikimedia datacenter team will be performing some routine network
maintenance[1] next Wednesday. This will cause brief, rolling network
interruptions for essentially all tools
UPDATE: This maintenance has been postponed a week due to our
datacenter engineer being injured and unable to complete the work.
This will be happening next Wednesday instead, at the same time
as previously scheduled.
On 10/24/18 8:44 AM, Andrew Bogott wrote:
Reminder: This
Region migration is going smoothly, and it's time to plan out the next
week of moves. For details about what's happening here, consult the
link below[1].
Here is the schedule for the next week of moves:
Monday, 2018-11-05: discovery-stats, globaleducation, hat-imagescalers,
language
Tues
Reminder: This maintenance is happening in about 45 minutes. If all
goes well it should be quick and largely unnoticeable.
On 10/24/18 8:57 AM, Andrew Bogott wrote:
UPDATE: This maintenance has been postponed a week due to our
datacenter engineer being injured and unable to complete the
On 10/31/18 8:16 AM, Andrew Bogott wrote:
Reminder: This maintenance is happening in about 45 minutes. If all
goes well it should be quick and largely unnoticeable.
All done! Let me know if you encounter any network issues; things look
good from my end.
On 10/24/18 8:57 AM, Andrew
We'll be shuffling the VMs that host the Quarry service over to a
new corner of the cloud today. During the move the service will be
unavailable and/or behave erratically.
I don't expect the move to take more than an hour. I'll send a
further notice when things are done.
-Andrew +
On 11/5/18 11:10 AM, Andrew Bogott wrote:
We'll be shuffling the VMs that host the Quarry service over to a new
corner of the cloud today. During the move the service will be
unavailable and/or behave erratically.
This is all done -- Quarry is back up and working.
I don't
It's Monday, which means it's time to schedule another round of project
migrations. For more info about what this is, consult the link below[1].
Here is the schedule for the next week of moves:
Monday, 2018-11-12: Holiday, no activity :)
Tuesday, 2018-11-13: commtech, design, discourse, ger
Next week is a short week in the US, so no project moves will happen.
Here is the schedule for project moves in the following week:
Monday, 2018-11-26: collection-alt-renderer, dumps, extdist, glampipe,
google-api-proxy, hound, lizenzhinweisgenerator
Tuesday, 2018-11-27: osmit, pagemigratio
Hello!
I need to shut down the tools-dev host in order to move it to a
different server. The downtime will be brief, but in the meantime I
recommend people move their work to a different bastion (e.g.
tools-login.wmflabs.org) in order to avoid interruption.
This will happen on or near 15:00
Reminder: This is happening in one hour.
On 11/17/18 1:06 PM, Andrew Bogott wrote:
Hello!
I need to shut down the tools-dev host in order to move it to a
different server. The downtime will be brief, but in the meantime I
recommend people move their work to a different bastion (e.g.
tools
With any luck we'll have some more hardware installed by next week, so
it's time to move more projects! This is probably the last round of
bulk moves; after this it's all special cases for which I'll contact
people directly.
Tuesday, 2018-12-11: maps, wm-bot
Wednesday, 2018-12-12: mwoffline
I recently noticed that some of our standard kvm/nova monitoring never
got copied over from the labvirt puppet code to the cloudvirt puppet
code. Tomorrow I will merge
https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/478113/ to fix that.
Once that patch is merged, icinga will be a bit t
Sorry, all, this was meant for a different list. Feel free to ignore!
-A
On 12/6/18 5:16 PM, Andrew Bogott wrote:
I recently noticed that some of our standard kvm/nova monitoring never
got copied over from the labvirt puppet code to the cloudvirt puppet
code. Tomorrow I will merge
https
Tomorrow I'll be moving the grid engine master node to a new virt host.
That will cause a 15-minute outage during which new jobs (crons, or
things submitted by hand) will fail.
Existing jobs or webservices will be unaffected by the downtime.
I'll start the move at 16:00 UTC on Friday, 2018-
Reminder: this interruption will start in about 30 minutes.
On 12/20/18 2:39 PM, Andrew Bogott wrote:
Tomorrow I'll be moving the grid engine master node to a new virt
host. That will cause a 15-minute outage during which new jobs
(crons, or things submitted by hand) will fail.
Exi
I'm starting this move now.
On 12/21/18 9:32 AM, Andrew Bogott wrote:
Reminder: this interruption will start in about 30 minutes.
On 12/20/18 2:39 PM, Andrew Bogott wrote:
Tomorrow I'll be moving the grid engine master node to a new virt
host. That will cause a 15-minute out
This is done. Please let us know if you encounter new/unexpected
after-effects from this move.
-Andrew
On 12/21/18 10:01 AM, Andrew Bogott wrote:
I'm starting this move now.
On 12/21/18 9:32 AM, Andrew Bogott wrote:
Reminder: this interruption will start in about 30 minutes.
On 12/
I've just moved the VPS bastions to new hosts. These hosts are in the
new network region, and are also running Debian Stretch. This should be
an almost fully-transparent change for all users. On your first use of
ssh you may see a warning like:
Warning: Permanently added the ECDSA host key
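If ssh instead refuses to connect because the old host key is still cached, the usual fix is to drop the stale entry with `ssh-keygen -R`. A minimal sketch, run against a scratch file so nothing real is touched (the hostname and paths here are illustrative, not the actual WMCS bastion names):

```shell
# Work in a scratch directory so we don't touch the real ~/.ssh/known_hosts
tmp=$(mktemp -d)

# Simulate the situation: an old host key recorded for the bastion
ssh-keygen -q -t ed25519 -N '' -f "$tmp/oldkey"
printf 'bastion.wmflabs.org %s\n' "$(cut -d' ' -f1-2 "$tmp/oldkey.pub")" \
    > "$tmp/known_hosts"

# Remove the outdated entry; the next ssh connection will then
# prompt to record the new host's key
ssh-keygen -R bastion.wmflabs.org -f "$tmp/known_hosts"
```

In practice you would run `ssh-keygen -R <bastion hostname>` with no `-f`, which edits your own `~/.ssh/known_hosts` by default and keeps a `.old` backup.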
We're currently experiencing a mysterious hardware failure in our
datacenter -- three different SSDs failed overnight, two of them in
cloudvirt1018 and one of them in cloudvirt1024. The VMs on 1018 are
down entirely. We may move those on 1024 to another host shortly in
order to guard against
this hardware anyway, out of an abundance
of caution, but that's unlikely to produce further downtime. With luck,
this is the last you'll hear about this.
-Andrew
On 2/13/19 7:25 AM, Andrew Bogott wrote:
We're currently experiencing a mysterious hardware failure in our
datace
I spoke too soon -- we're still working on this. Most of these VMs will
remain down in the meantime.
Sorry for the outage!
On 2/13/19 8:21 AM, Andrew Bogott wrote:
We don't fully understand what happened, but after Giovanni performed
a classic "turning it off and on again"
access it then you're in
luck! If not, stay tuned.
-Andrew
On 2/13/19 9:15 AM, Andrew Bogott wrote:
I spoke too soon -- we're still working on this. Most of these VMs
will remain down in the meantime.
Sorry for the outage!
On 2/13/19 8:21 AM, Andrew Bogott wrote:
We don't
-85b7-37643f03bfea | wikidata-misc |
wikidata-dev
On 2/13/19 11:23 AM, Andrew Bogott wrote:
Here's the latest:
cloudvirt1018 is up and running, and many of its VMs are fine. Many
other VMs are corrupted and won't start up. Some of those VMs will
probably be los
Arturo will
appear there in a few hours.
-Andrew
On 2/13/19 1:50 PM, Andrew Bogott wrote:
Now cloudvirt1024 is dying in earnest, so VMs hosted there will be
down for a while as well. This is, as far as anyone can tell, just a
stupid coincidence.
So far it appears that we are going to be ab
Because bad things come in threes (I'm hoping it's threes and not
sevens) the server that hosts toolsdb is now also misbehaving. Brooke
just now disabled a troubled drive which may have resolved things, but
if the last few hours are any indication then the vast majority of
connection or query a
tl;dr: We're about to disable self-service creation of Debian Jessie
VMs. To request an exception, open a Phabricator ticket specifying your
need and reasons.
--
We're close to polishing off the last few Ubuntu Trusty VMs in the
cloud, which means it's time to start thinking about the upcomi
Good morning!
As a side-effect of our response to the current gerrit vandalism
epidemic, the 2fa integration between Horizon and Wikitech has been
disabled. That means that existing Horizon sessions are still valid but
fresh logins will fail.
This problem is being actively worked on. In th
This issue is resolved now, and Horizon should work as usual. Sorry for
the interruption!
On 3/19/19 9:54 AM, Andrew Bogott wrote:
Good morning!
As a side-effect of our response to the current gerrit vandalism
epidemic, the 2fa integration between Horizon and Wikitech has been
disabled
Tuesday starting at around 17:00 UTC I'm going to relocate the paws and
kubernetes masters to the new network region. While the VMs are
copying, launches of new kubernetes jobs and creation of new PAWS
notebooks will fail.
The outage should last about an hour -- less if everything goes well,
My apologies, the earlier version of this email had an incorrect subject
line. This outage will be happening on Tuesday, not Monday.
-Andrew
On 4/11/19 8:30 PM, Andrew Bogott wrote:
Tuesday starting at around 17:00 UTC I'm going to relocate the paws
and kubernetes masters to the new ne
Reminder: this is happening today, in about three hours.
-Andrew
On 4/11/19 8:30 PM, Andrew Bogott wrote:
Tuesday starting at around 17:00 UTC I'm going to relocate the paws
and kubernetes masters to the new network region. While the VMs are
copying, launches of new kubernetes job
On 4/16/19 7:59 AM, Andrew Otto wrote:
Great! Is this just for Wikitech itself or all ldap/wikitech
authentication?
This notice is related to a change in MediaWiki code, so it concerns
direct logins to wikitech itself. That said, the 2fa key used by Horizon
is stored in the wikitech database
This work is still underway. There are some unforeseen issues but we
should be back to normal shortly.
On 4/16/19 9:04 AM, Andrew Bogott wrote:
Reminder: this is happening today, in about three hours.
-Andrew
On 4/11/19 8:30 PM, Andrew Bogott wrote:
Tuesday starting at around 17:00 UTC
This is done now. Paws broke in a thousand ways after the move so it
lagged well behind the expected timeline, but normal function of the
toolforge k8s grid and Paws should be restored.
Let us know if you run into unexpected issues.
-Andrew
On 4/16/19 1:03 PM, Andrew Bogott wrote:
This
The latest Debian version, 10.0 "buster", was officially released a
few days ago[0]. Today, I've built a new Debian buster base image and
made it available in all projects.
The Stretch base image will remain available for some time to
permit compatibility with existing setups, but any
As part of routine networking and OS upgrades, I'll be emptying two
hypervisors (cloudvirt1016 and cloudvirt1017) on Monday and Tuesday, the
22nd and 23rd. This will result in downtime for many VMs as they are
copied and restarted. A complete list of affected instances follows.
I'll
Apologies! This is happening in July rather than August -- about 12 days
from now.
On 7/10/19 2:24 PM, Andrew Bogott wrote:
As part of routine networking and OS upgrades, I'll be emptying two
hypervisors (cloudvirt1016 and cloudvirt1017) on Monday and Tuesday,
the 22nd and 23rd. This
On Friday I'll be moving the toolforge cron server to new hardware.
During the move, any uses of the 'crontab' command will fail
gracelessly. Any cron jobs scheduled to launch during the downtime will
be skipped.
The move should take 5-10 minutes but may take as long as 30 if there
are compl
This is done.
On 7/17/19 3:25 PM, Andrew Bogott wrote:
On Friday I'll be moving the toolforge cron server to new hardware.
During the move, any uses of the 'crontab' command will fail
gracelessly. Any cron jobs scheduled to launch during the downtime
will be skipped.
The mov
Reminder: This is happening today, starting right now.
-Andrew
On 7/10/19 2:24 PM, Andrew Bogott wrote:
As part of routine networking and OS upgrades, I'll be emptying two
hypervisors (cloudvirt1016 and cloudvirt1017) on Monday and Tuesday,
the 22nd and 23rd. This will result in dow
Thanks to 10GbE, this went quite a bit faster than I expected and is now
done. I've confirmed that all affected VMs are up and reachable, but
please let me know if you encounter any unexpected problems from the move.
-Andrew
On 7/22/19 8:30 AM, Andrew Bogott wrote:
Reminder: Th
As part of routine networking and OS upgrades, I'll be emptying two
more hypervisors (cloudvirt1021 and cloudvirt1022) on Monday and Tuesday
next week, the 5th and 6th. This will result in downtime for many VMs
as they are copied and restarted. A complete list of affected instances
follow
55 AM, Andrew Bogott wrote:
As part of routine networking and OS upgrades, I'll be emptying two
more hypervisors (cloudvirt1021 and cloudvirt1022) on Monday and
Tuesday next week, the 5th and 6th. This will result in downtime for
many VMs as they are copied and restarted. A comple
a pre-arranged window, though, if there's sometime that's
better for you. Over the weekend is possible, even.
-A
Cyberpower678
English Wikipedia Account Creation Team
English Wikipedia Administrator
Global User Renamer
On Jul 31, 2019, at 10:26, Andrew Bogott <mailto:abog...@wik
On Jul 31, 2019, at 10:56, Andrew Bogott <mailto:abog...@wikimedia.org>> wrote:
On 7/31/19 9:46 AM, Maximilian Doerr wrote:
Oh please no. cyberbot-db-01 needs to remain up in the next two
weeks at least.
Postponing isn
I will be moving the toolforge grid master on Monday. That will mean
that for a few minutes it will be impossible to submit new grid jobs.
Jobs that are already running will be unaffected.
I'll make the move at 14:00UTC, which is about 7AM Pacific time.
-Andrew
Reminder: This is happening today, starting in a few minutes.
-Andrew
On 7/31/19 9:26 AM, Andrew Bogott wrote:
In the interest of finishing up this stage of upgrades, I'm going to
try to also drain cloudvirt1023 during the same window. That includes
the following additional VMs:
ser
These VMs have all finished copying. Please let me know if you see any
ongoing problems that result from the move.
-Andrew
On 8/5/19 7:40 AM, Andrew Bogott wrote:
Reminder: This is happening today, starting in a few minutes.
-Andrew
On 7/31/19 9:26 AM, Andrew Bogott wrote:
In the interest
Later today (starting in a few hours around 18:00 UTC) we'll be
rearranging the puppetmaster setup for most cloud VMs[0]. No tools or
services (other than puppet) should be affected, but some of you might
get grumpy emails about broken puppet runs during the transition, which
I encourage you t
The user-facing parts of this are all done now. New VM creation was
broken for a few hours but should be working properly now.
-Andrew
On 9/9/19 10:09 AM, Andrew Bogott wrote:
Later today (starting in a few hours around 18:00 UTC) we'll be
rearranging the puppetmaster setup for most