There's probably some minimal gain in cross-compatibility testing from
sticking with the status quo.  The Swift API is old and stable, but I
believe there was some bug in recent history where some return value in
swiftclient changed from an iterable to a generator or something, and some
aggressive non-duck type checking broke something somewhere...
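
As a rough illustration of that class of breakage (a minimal sketch only;
the function names below are hypothetical and are not swiftclient's actual
API), a consumer that checks for a concrete type stops working as soon as
the producer switches to a generator, while a duck-typed consumer keeps
working:

```python
def get_body_as_list():
    # Hypothetical "old" behaviour: the body comes back as a list of chunks.
    return [b"abc", b"def"]


def get_body_as_generator():
    # Hypothetical "new" behaviour: the same chunks, yielded lazily.
    yield b"abc"
    yield b"def"


def strict_consume(body):
    # Aggressive, non-duck-typed check: only a concrete list is accepted.
    if not isinstance(body, list):
        raise TypeError("expected list, got %s" % type(body).__name__)
    return b"".join(body)


def duck_consume(body):
    # Duck-typed: anything that iterates over bytes chunks is fine.
    return b"".join(body)


if __name__ == "__main__":
    print(duck_consume(get_body_as_list()))
    print(duck_consume(get_body_as_generator()))
    print(strict_consume(get_body_as_list()))
    try:
        strict_consume(get_body_as_generator())
    except TypeError as exc:
        print("strict check broke:", exc)
```

Testing against more than one backend is the kind of thing that would
surface a strict check like this early.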

I find that bug report sorta interesting; the reported memory pressure
there doesn't make sense.  Maybe there's some non-essential middleware
configured on that proxy that's causing the workers to bloat up like that?

-clayg

On Mon, Jun 27, 2016 at 12:30 PM, Emilien Macchi <emil...@redhat.com> wrote:

> Hi,
>
> Today we're re-investigating a CI failure that we had multiple times [1]:
> Swift memory usage grows until it is OOM-killed.
>
> The scope of this thread is our CI and not production
> environments.
> Indeed, our CI is running with limited resources, while production
> environments should not hit this problem.
>
> After some investigation on #tripleo, we found out this scenario has
> recently been happening almost every time:
>
> * undercloud is deployed, glance and swift are running. Glance is
> configured with Swift backend to store images.
> * tripleo CI uploads the overcloud image into Glance; the image is
> successfully uploaded.
> * when the overcloud starts deploying, some nodes randomly fail to deploy
> because the undercloud OOM-kills swift-proxy-server that is still
> sending the overcloud image requested by Glance API. Swift fails,
> Glance fails, overcloud deployment fails with a "No valid hosts
> found".
>
> It's likely due to performance issues in our CI, and there is nothing
> we can do but add more resources or reduce the number of
> environments, something we won't do at this time, given the recent
> improvements to our CI (more RAM, SSD, etc.).
>
> As a first iteration, I propose [2] that we stop using Swift as a
> backend for Glance. Indeed, our undercloud is currently single-node, so I
> see zero value in using Swift to store the overcloud image.
> If there is value, then we can add an option for whether or not to
> use it (and set it to False in our CI to use the file backend, which
> won't lead to OOM).
>
> Note: on the overcloud, we currently support file, swift and rbd
> backends, which you can easily select during your deployment.
>
> [1] https://bugs.launchpad.net/tripleo/+bug/1595916
> [2] https://review.openstack.org/#/c/334555/
> --
> Emilien Macchi
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>