On 29.8.2017 13:22, Giulio Fidente wrote:
On 08/29/2017 11:14 AM, Jiří Stránský wrote:
Hi owls,

the CI for containerized deployments with Pacemaker is close! In fact,
it works [1][2] (but there are pending changes to merge).

cool :D

I also spotted this which we need for ceph
https://review.openstack.org/#/c/498356/

but I am not sure if we want to enable ceph in this job, as we already
have it in a couple of scenarios; more below ...

+1 on keeping it in scenarios if that covers our needs.


The way it's proposed in gerrit currently is to switch the
centos-7-containers-multinode job (featureset010) to deploy with
Pacemaker. What do you think about making this switch as a first step?
(The OVB job is an option too, but that one is considerably closer to
timeouts already, so it may be better left as is.)

+1 on switching the existing job
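
For reference, the switch itself should be small on the quickstart side,
roughly just flipping the pacemaker flag in the featureset file. A sketch
from memory (not the pending review, and the variable names may not match
exactly):

  # config/general_config/featureset010.yml (sketch only)
  containerized_overcloud: true   # already the case for this job
  enable_pacemaker: true          # the proposed change: run the HA services under pacemaker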

Later it would be nice to get a proper clustering test with 3
controllers. Should we try to switch the centos-7-ovb-ha-oooq job to
deploy containers on master and stable/pike? (Probably by adding a new
job that only runs on master + Pike, and making the old ovb-ha-oooq only
run up to Ocata, to keep the OVB capacity demands unchanged?) I'd be +1
on that since containers are the intended way of deploying Pike and
beyond. WDYT?

switching OVB to containers from Pike seems fine because that's the
intended way, as you pointed out, yet I would like to enable ceph in the
upgrade job, and it requires multiple MON instances (multiple controllers)

would it make any sense to deploy the pacemaker / ceph combination using
multiple controllers in the upgrade job and drop the standard ovb job
(which doesn't do upgrade) or use it for other purposes?

It makes sense feature-wise to test upgrades with Ceph; I'd say it's a pretty common and important use case.
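
For context on what that would mean on the heat side, a 3-controller containerized HA + Ceph overcloud roughly boils down to something like the following. This is a hand-written sketch, not the actual job config (the real job would wire this through quickstart featureset/nodes files), and the parameter names are from memory:

  # Environments pulled in from tripleo-heat-templates (illustrative list):
  #   environments/docker.yaml                      # containerized services
  #   environments/docker-ha.yaml                   # pacemaker-managed HA containers
  #   environments/ceph-ansible/ceph-ansible.yaml   # ceph deployed via ceph-ansible
  parameter_defaults:
    ControllerCount: 3      # gives ceph the multiple MONs you mention
    ComputeCount: 1
    CephStorageCount: 1     # OSDs on a dedicated node (could also be hyperconverged)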

However, I'm not sure how we can achieve it time-wise in CI. Is it possible to estimate how much time the Ceph upgrade might add?

A bit of context: currently our only upgrade check job is non-OVB: containers-multinode-upgrades-nv. Lately we've started hitting timeouts, and the job only does a mixed-version deploy + a 1-node AIO overcloud upgrade (just the main step). It doesn't do an undercloud upgrade, nor a compute upgrade, nor converge, and it still times out... It's a bit difficult to find things to cut off here. :D

We could look into speeding things up (e.g. try to reintroduce selective container image upload etc.), but I think we might also be approaching the "natural" deploy+upgrade limits. We might need to bump up the timeouts if we want to test more things. Though it's not only about HW capacity; it could also get unwieldy for devs if we keep increasing the feedback time from CI, so we're kinda in a tough spot with upgrade CI...
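
If we do end up bumping timeouts, IIRC the knob lives with the job definitions in project-config. Roughly like this (a sketch only; the job template name below is illustrative rather than the real one, and the current value may differ):

  # project-config JJB sketch (illustrative job name and value)
  - job-template:
      name: 'gate-tripleo-ci-centos-7-containers-multinode-upgrades'
      wrappers:
        - build-timeout:
            timeout: 180    # minutes; raising this buys coverage at the cost of feedback time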

Jirka
