just landed (and it didn't time out in check or gate, without a
recheck, which is a good sign that it helped to mitigate).

I've restored and rechecked some of the patches that I evacuated from the
gate. Please do not restore other patches, recheck, or approve anything for
now; let's see how it goes with a few patches first.
We're still working with Steve on his patches to optimize the way we deploy
containers on the registry and are investigating how we could make it
faster with a proxy.

Stay tuned and thanks for your patience.

On Wed, Jun 13, 2018 at 5:50 PM, Emilien Macchi <> wrote:

> TL;DR: gate queue was 25h+, we put all patches from gate on standby, do
> not restore/recheck until further announcement.
> We recently enabled the containerized undercloud for multinode jobs and we
> believe this was a bit premature: the container download process isn't
> optimized yet, so it still pulls the same containers from the mirrors
> multiple times.
> This increased the job runtime and probably put extra load on the mirrors
> hosted by OpenStack Infra, making them a bit slower when serving the same
> containers multiple times. The time taken to prepare containers on the
> undercloud and then for the overcloud caused jobs to time out randomly and
> therefore the gate to fail frequently, so we decided to remove all jobs
> from the gate by temporarily abandoning the patches (I have them in my
> browser and will restore them when things are stable again; please do not
> touch anything).
> Steve Baker has been working on a series of patches that optimize the way
> we prepare the containers but basically the workflow will be:
> - pull containers needed for the undercloud into a local registry, using
> infra mirror if available
> - deploy the containerized undercloud
> - pull containers needed for the overcloud minus the ones already pulled
> for the undercloud, using infra mirror if available
> - update containers on the overcloud
> - deploy the overcloud
> With that process, we hope to reduce the runtime of the deployment and
> therefore reduce the timeouts in the gate.
> To enable it, we need to land, in this order: https://review.
> and https://review.openstack.org/#/c/568403.
> In the meantime, we are disabling the containerized undercloud recently
> enabled on all scenarios: as a
> mitigation, in the hope of stabilizing things until Steve's patches land.
> Hopefully, we can merge Steve's work tonight/tomorrow and re-enable the
> containerized undercloud on scenarios after checking that we no longer
> have timeouts and that deployment runtimes are reasonable.
> That's the plan we came up with. If you have any questions or feedback,
> please share them.
> --
> Emilien, Steve and Wes
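
To illustrate the "pull for the overcloud minus what was already pulled for
the undercloud" step in the workflow quoted above, here is a minimal Python
sketch. It is not taken from Steve's patches; the function name plan_pulls
and the image names are hypothetical and only show the de-duplication idea.

```python
def plan_pulls(undercloud_images, overcloud_images):
    """Split image pulls into two phases so no image is pulled twice.

    Hypothetical helper for illustration only; not TripleO code.
    """
    seen = set()

    # Phase 1: everything the undercloud needs, each image once.
    undercloud_pulls = []
    for image in undercloud_images:
        if image not in seen:
            seen.add(image)
            undercloud_pulls.append(image)

    # Phase 2: overcloud images, skipping anything already in the
    # local registry from phase 1 (or repeated in the overcloud list).
    overcloud_pulls = []
    for image in overcloud_images:
        if image not in seen:
            seen.add(image)
            overcloud_pulls.append(image)

    return undercloud_pulls, overcloud_pulls


# Example with made-up image names.
under, over = plan_pulls(
    ["keystone", "mariadb"],
    ["keystone", "nova-compute", "mariadb", "nova-compute"],
)
print(under)
print(over)
```

With this split, only the images missing from the local registry are fetched
in the second phase, which is where the runtime saving is expected to come
from.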

Emilien Macchi
OpenStack Development Mailing List (not for usage questions)
