On 29.8.2017 13:22, Giulio Fidente wrote:
> On 08/29/2017 11:14 AM, Jiří Stránský wrote:
>> Hi owls,
>>
>> the CI for containerized deployments with Pacemaker is close! In fact,
>> it works [1][2] (but there are pending changes to merge).
>
> cool :D
>
> I also spotted this which we need for ceph
> https://review.openstack.org/#/c/498356/
> but I am not sure if we want to enable ceph in this job as we have it
> already in a couple of scenarios, more below ...

+1 on keeping it in scenarios if that covers our needs.

>> The way it's proposed in gerrit currently is to switch the
>> centos-7-containers-multinode job (featureset010) to deploy with
>> Pacemaker. What do you think about making this switch as a first step?
>> (The OVB job is an option too, but that one is considerably closer to
>> timeouts already, so it may be better left as is.)
>
> +1 on switching the existing job
>
>> Later it would be nice to get a proper clustering test with 3
>> controllers. Should we try and switch the centos-7-ovb-ha-oooq job to
>> deploy containers on master and stable/pike? (Probably by adding a new
>> job that only runs on master + Pike, and making the old ovb-ha-oooq only
>> run up to Ocata, to keep the OVB capacity demands unchanged?) I'd be +1
>> on that since containers are the intended way of deploying Pike and
>> beyond. WDYT?
>
> switching OVB to containers from pike seems fine because that's the
> intended way, as you pointed out, yet I would like to enable ceph in the
> upgrade job, and it requires multiple MON instances (multiple controllers)
>
> would it make any sense to deploy the pacemaker / ceph combination using
> multiple controllers in the upgrade job and drop the standard ovb job
> (which doesn't do upgrade) or use it for other purposes?

It makes sense feature-wise to test the upgrade with Ceph, I'd say it's a
pretty common and important use case.

However, I'm not sure how we can achieve it time-wise in CI. Is it
possible to estimate how much time the Ceph upgrade might add?

A bit of context: currently our only upgrade check job is a non-OVB one,
containers-multinode-upgrades-nv. Lately we've started hitting
timeouts, and the job only does a mixed-version deploy + a 1-node AIO
overcloud upgrade (just the main step). It doesn't do undercloud
upgrade, compute upgrade, or converge, and it still times out...

It's a bit difficult to find things to cut off here. :D We could look
into speeding things up (e.g. try to reintroduce selective container
image upload etc.), but I think we might also be approaching the
"natural" deploy+upgrade limits. We might need to bump up the timeouts
if we want to test more things. Though it's not only about HW capacity,
it could also get unwieldy for devs if we keep increasing the feedback
time from CI, so we're kinda in a tough spot with upgrade CI...

Jirka