In the tripleo meeting today we re-affirmed that the tripleo-cd-admins
team is aimed at delivering production-availability clouds - that's how
we know the tripleo program is succeeding (or not!).

So if you're a member of that team, you're on the hook - effectively
on call, where production issues will take precedence over development
/ bug fixing etc.

We have the following clouds today:
 - cd-undercloud (baremetal, one per region)
 - cd-overcloud (KVM in the HP region, not sure yet for the RH region);
multi-region
 - ci-overcloud (same as cd-overcloud, and will go away when cd-overcloud
is robust enough)

And we have two groups of users:
 - TripleO ATCs, all of whom are eligible for accounts on *-overcloud
 - TripleO reviewers, indirectly via openstack-infra who provide 99%
of the load on the clouds

Right now when there is a problem, there's no clearly defined 'get
hold of someone' mechanism other than IRC in #tripleo.

And that's pretty good since most of the admins are on IRC most of the time.

But.

There are two holes - a) what if it's Sunday evening :) and b) what if
someone (for instance Derek) has been troubleshooting a problem, but
needs to go do personal stuff, or you know, sleep. There's no reliable,
defined handoff mechanism.

So - I think we need to define two things:
  - a stock way for $randoms to ask for support w/ these clouds that
will be fairly low latency and reliable.
  - a way for us to escalate to each other *even if folk happen to be
away from the keyboard at the time*.
And possibly a third:
  - a way for openstack-infra admins to escalate to us in the event of
OMG things happening. Like, we send 1000 VMs all at once at their git
mirrors or something.

And with that, let's open the door for ideas!

-Rob
-- 
Robert Collins <rbtcoll...@hp.com>
Distinguished Technologist
HP Converged Cloud
