Re: [OpenStack-Infra] Zuul memory leak

Joshua Hesketh Mon, 07 Mar 2016 15:29:45 -0800

Hi Mikhail,

Thank you for the extra details. I'll continue to look into this.


Regarding the daily bumps during log rotation: I assume you aren't
reloading zuul at that point, so the freed memory is likely due to another
process?
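
One way to check would be to log zuul-server's own resident set size around
the time of the rotation and see whether it actually drops. A rough sketch,
assuming a Linux host where /proc/<pid>/status is available; the helper
names here are made up for illustration and are not part of Zuul:

```python
def parse_vm_rss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:    123456 kB"; the number is field 1.
            return int(line.split()[1])
    return None


def read_rss_kb(pid):
    """Read the resident set size of a running process (Linux only)."""
    with open("/proc/%d/status" % pid) as f:
        return parse_vm_rss_kb(f.read())
```

Calling read_rss_kb() with the zuul-server pid shortly before and after the
cron job runs would show whether the bump comes from zuul-server itself or
from something else on the box.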

Cheers,
Josh

On Tue, Mar 8, 2016 at 10:17 AM, Mikhail Medvedev <[email protected]>
wrote:

> On Wed, Feb 10, 2016 at 10:57 AM, James E. Blair <[email protected]>
> wrote:
> > Michael Still <[email protected]> writes:
> >
> >> On Tue, Feb 9, 2016 at 4:59 AM, Joshua Hesketh <[email protected]>
> >> wrote:
> >>
> >>> On Thu, Feb 4, 2016 at 2:44 AM, James E. Blair <[email protected]>
> >>> wrote:
> >>>>
> >>>> On the subject of clearing the cache more often, I think we may not want
> >>>> to wipe out the cache more often than we do now -- in fact, I think we
> >>>> may want to look into ways to keep from doing even that, because
> >>>> whenever we reload now, Zuul slows down considerably as it has to query
> >>>> Gerrit again for all of the data previously in its cache.
> >>>>
> >>>
> >>> I can see a lot of 3rd parties or simpler CIs not needing to reload zuul
> >>> very often, so this cache would never get cleared. Perhaps cached objects
> >>> should have an expiry time (of a day or so) and can be cleaned up
> >>> periodically? Additionally, if clearing the cache on a reload is causing
> >>> pain, maybe we should move the cache into the scheduler and keep it between
> >>> reloads?
> >>>
> >>
> >> Do you guys use oslo at all? I ask because the oslo memcache stuff does
> >> exactly this, so it should be trivial to implement if you don't mind
> >> depending on oslo.
> >
> > One of the main things we use the cache for is to ensure that every
> > change is represented by a single Change object in Zuul's memory.  The
> > graph of enqueued Items links to their respective Changes, which may link
> > to each other due to dependencies.  When something changes in Gerrit, we
> > want that reflected immediately and consistently in all of the objects
> > in that graph.  Using the cache means that every time we add a new
> > Change object to that graph, we use the same object for a given change.
> >
> > This is why we can't use time-based expiry -- we must not drop objects
> > from the cache if they are still in the graph.  Otherwise we will create
> > new duplicative objects and the ones still in the graph will not be
> > updated.
> >
> > Perhaps we should change these objects to something more ephemeral that
> > can proxy for some other mechanism that can operate more like a
> > traditional cache (with time-based expiry).  But I think changes to this
> > system should happen in Zuulv3 -- it works well enough for Zuulv2 for
> > now.
> >
> > -Jim
> >
>
> We are one of the third-party CIs and are using "Zuul version: 2.1.1.dev123",
> which is one commit after [1]. That one commit is not in tree - I am
> applying [2] on top.
>
> The VM has 8GB of RAM. The zuul-server memory footprint goes up consistently
> over the course of a week. Normally it takes about 3-4 days to reach 3GB.
> About a week ago I witnessed zuul-server get to 95% of RAM, at which point
> the kernel started killing other processes. The graph [3] shows memory
> usage, and it reflects zuul-server consumption. The daily bumps on the graph
> are the daily cron doing log rotation etc., possibly flushing caches.
>
> I cannot say with 100% certainty that it is still the leak. It could simply
> be that zuul-server requires more RAM now.
>
> [1]
> https://review.openstack.org/#q,I81ee47524cda71a500c55a95a2280f491b1b63d9,n,z
> [2]
> https://review.openstack.org/#q,If3a418fa2d4993a149d454e02a9b26529e4b6825,n,z
> [3] http://imgur.com/SzqSA1H
>
> Mikhail Medvedev (mmedvede)
>
> _______________________________________________
> OpenStack-Infra mailing list
> [email protected]
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-infra
>
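
On Jim's point above about keeping a single Change object per change: here
is a minimal sketch of a cache that gives that guarantee without time-based
expiry. It assumes weak references are acceptable, so an entry lives exactly
as long as something else (such as the graph of enqueued items) still holds
the object. This is not how Zuul v2 implements its cache; Change and
ChangeCache below are hypothetical stand-ins.

```python
import weakref


class Change:
    """Hypothetical stand-in for a Zuul Change object."""

    def __init__(self, number):
        self.number = number
        self.status = "NEW"


class ChangeCache:
    """Hand out one shared Change per change number.

    Entries are held weakly, so they disappear once nothing else
    references the Change, and never while it is still in use.
    """

    def __init__(self):
        self._changes = weakref.WeakValueDictionary()

    def get(self, number):
        change = self._changes.get(number)
        if change is None:
            # First request for this change: create it and remember it.
            change = Change(number)
            self._changes[number] = change
        return change

    def __len__(self):
        return len(self._changes)
```

Two lookups for the same change number return the same object, so an update
made through one reference is visible through every other. Once the last
strong reference goes away the entry is dropped, which is the property that
time-based expiry cannot provide.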

