On 23/03/17 16:24 +0100, Martin André wrote:
On Wed, Mar 22, 2017 at 2:20 PM, Dan Prince <dpri...@redhat.com> wrote:
On Wed, 2017-03-22 at 13:35 +0100, Flavio Percoco wrote:
On 22/03/17 13:32 +0100, Flavio Percoco wrote:
> On 21/03/17 23:15 -0400, Emilien Macchi wrote:
> > Hey,
> >
> > I've noticed that container jobs look pretty unstable lately; to me,
> > it sounds like a timeout:
> > http://logs.openstack.org/19/447319/2/check-tripleo/gate-tripleo-ci-centos-7-ovb-containers-oooq-nv/bca496a/console.html#_2017-03-22_00_08_55_358973
>
> There are different hypotheses on what is going on here. Some patches have
> landed to improve the write performance on containers by using hostpath mounts
> but we think the real slowness is coming from the images download.
>
> This said, this is still under investigation and the containers squad will
> report back as soon as there are new findings.

Also, to be more precise, Martin André is looking into this. He also fixed the gate in the last 2 weeks.

I spoke w/ Martin on IRC. He seems to think this is the cause of some of the failures:

http://logs.openstack.org/32/446432/1/check-tripleo/gate-tripleo-ci-centos-7-ovb-containers-oooq-nv/543bc80/logs/oooq/overcloud-controller-0/var/log/extra/docker/containers/heat_engine/log/heat/heat-engine.log.txt.gz#_2017-03-21_20_26_29_697

Looks like Heat isn't able to create Nova instances in the overcloud due to "Host 'overcloud-novacompute-0' is not mapped to any cell". This means our cells initialization code for containers may not be quite right... or there is a race somewhere.

Here are some findings.
I've looked at time measures from CI for https://review.openstack.org/#/c/448533/ which provided the most recent results (times in minutes):

* gate-tripleo-ci-centos-7-ovb-ha [1]
    undercloud install: 23
    overcloud deploy: 72
    total time: 125
* gate-tripleo-ci-centos-7-ovb-nonha [2]
    undercloud install: 25
    overcloud deploy: 48
    total time: 122
* gate-tripleo-ci-centos-7-ovb-updates [3]
    undercloud install: 24
    overcloud deploy: 57
    total time: 152
* gate-tripleo-ci-centos-7-ovb-containers-oooq-nv [4]
    undercloud install: 28
    overcloud deploy: 48
    total time: 165 (timeout)

Looking at the undercloud & overcloud install times, the most time-consuming tasks, the containers job isn't doing that badly compared to other OVB jobs. But looking closer I could see that:

- the containers job pulls docker images from dockerhub; this process takes roughly 18 min.
I think we can optimize this a bit by having the script that populates the local registry in the overcloud job run in parallel. The docker daemon can do multiple pulls w/o problems.
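To make the parallel-pull idea concrete, here is a minimal sketch in Python. It is not the actual registry-population script: the function names, the worker count and the image names in the usage comment are all hypothetical, and the only external call is the standard `docker pull IMAGE` command.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def docker_pull(image):
    """Pull one image with the docker CLI; return True on success."""
    result = subprocess.run(
        ["docker", "pull", image],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def pull_all(images, pull=docker_pull, max_workers=4):
    """Pull several images concurrently.

    `pull` is injectable so the logic can be exercised without a docker
    daemon. `max_workers=4` is an arbitrary assumption, not a tuned value.
    Returns a dict mapping each image name to True/False.
    """
    images = list(images)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs names with results.
        results = list(pool.map(pull, images))
    return dict(zip(images, results))


# Usage (placeholder image names, not the ones the job really pulls):
#   status = pull_all(["example/image-a", "example/image-b"])
#   failed = [name for name, ok in status.items() if not ok]
```

Threads are enough here because each worker just waits on a `docker pull` subprocess. Whether this actually shortens the 18 min depends on where the bottleneck is (network bandwidth vs. per-image latency), which would need measuring.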
- the overcloud validate task takes 10 min more than it should because of the bug Dan mentioned (a fix is in the queue at https://review.openstack.org/#/c/448575/)
+A
- the postci takes a long time with quickstart, 13 min (4 min alone spent on docker log collection) whereas it takes only 3 min when using tripleo.sh
mmh, does this have anything to do with ansible being in between? Or is that time specifically for the part that gets the logs?
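As a rough tally of the three overheads listed above, the arithmetic can be written out as follows. How the postci time is counted is an assumption on my side: either as the difference between quickstart and tripleo.sh, or as the full 13 min.

```python
# Extra minutes in the containers job, taken from the figures in this thread.
image_pull = 18           # pulling docker images from dockerhub
validate_overhead = 10    # overcloud validate, slowed down by the cell mapping bug
postci_quickstart = 13    # postci with quickstart
postci_tripleo_sh = 3     # postci with tripleo.sh

# Assumption 1: count only the postci time added relative to tripleo.sh.
extra_with_postci_delta = image_pull + validate_overhead + (
    postci_quickstart - postci_tripleo_sh
)

# Assumption 2: count the whole quickstart postci time.
extra_with_full_postci = image_pull + validate_overhead + postci_quickstart

print(extra_with_postci_delta, extra_with_full_postci)
```

Either way of counting lands close to the 40 min estimate given in this message.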
Adding all these numbers, we're at about 40 min of additional time for the oooq containers job, which is enough to cross the CI job limit. There is certainly a lot of room for optimization here and there and I'll explore how we can speed up the containers CI job over the next weeks.

Thanks a lot for the update. The time breakdown is fantastic,
Flavio

Martin

[1] http://logs.openstack.org/33/448533/2/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha/d2c1b16/
[2] http://logs.openstack.org/33/448533/2/check-tripleo/gate-tripleo-ci-centos-7-ovb-nonha/d6df760/
[3] http://logs.openstack.org/33/448533/2/check-tripleo/gate-tripleo-ci-centos-7-ovb-updates/3b1f795/
[4] http://logs.openstack.org/33/448533/2/check-tripleo/gate-tripleo-ci-centos-7-ovb-containers-oooq-nv/b816f20/

Dan

Flavio

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
-- @flaper87 Flavio Percoco