On Thu, Sep 22, 2016 at 1:40 PM, Steven Hardy <sha...@redhat.com> wrote:
> On Thu, Sep 22, 2016 at 04:36:30PM +0200, Gabriele Cerami wrote:
>> Hi,
>> As reported in this bug:
>> https://bugs.launchpad.net/tripleo/+bug/1626483
>> the HA gate and periodic jobs for master, and sometimes newton, have
>> started failing with errors related to memory shortage. Memory on the
>> undercloud instance was increased to 8G less than a month ago, so the
>> problem needs a different approach.
>> We have some solutions in store. However, with the release date so
>> close, I don't think this is the time for that kind of change. So I
>> thought a good compromise would be to temporarily increase the undercloud
>> instance memory to 12G, just for this week, unless there's a quick way
>> to reduce the memory footprint of heat-engine (usually the biggest memory
>> consumer on the undercloud instance).
> If we can avoid it, I'd rather not increase the RAM again - I
> suspect a Heat regression, as I'm seeing much higher
> memory usage in my local test environment too.
> I quickly re-ran some local monitoring I did earlier in the cycle,
> when we experienced high memory usage:
> http://people.redhat.com/~shardy/heat/plots/heat_before_after_end_newton.png
> There are three plots there: one from early in the cycle, one after some
> fixes which reduced memory usage a lot, and the highest, leaky-looking
> plot, which is the one I just did today.
> So I'm pretty sure we have another heat memory leak to track down.
> If anyone has any historical data on memory usage, e.g. from periodic CI
> runs, that would be helpful; otherwise we'll have to bisect by testing
> locally or derive it by scraping the dstat data from CI run logs.
> Steve.
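
As a rough illustration of the dstat scraping idea above, here is a minimal
sketch, not the actual CI tooling. It assumes the job saved dstat's CSV output
(`dstat --output`) and that memory usage appears under a column named "used";
the function name and the sample data are hypothetical.

```python
import csv
import io


def peak_memory_used(csv_text):
    """Return the largest value in the first 'used' column of dstat-style CSV."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    # Assumption: the column-header row is the first row with a "used" cell.
    # If several columns are named "used" (e.g. memory and swap), the first
    # one is taken.
    for index, row in enumerate(rows):
        if "used" in row:
            column = row.index("used")
            data_rows = rows[index + 1:]
            break
    else:
        raise ValueError("no 'used' column found")

    peak = 0.0
    for row in data_rows:
        if len(row) <= column:
            continue  # skip short or empty rows
        try:
            peak = max(peak, float(row[column]))
        except ValueError:
            continue  # skip non-numeric cells
    return peak


# Hypothetical sample shaped like a dstat CSV file (preamble, headers, data).
SAMPLE = (
    '"Dstat CSV output"\n'
    '"Host:","undercloud"\n'
    '\n'
    '"total cpu usage",,"memory usage",,\n'
    '"usr","sys","used","buff","free"\n'
    '1.0,2.0,1000.0,10.0,500.0\n'
    '1.5,2.5,3000.0,10.0,300.0\n'
    '1.2,2.1,2000.0,10.0,400.0\n'
)

if __name__ == "__main__":
    print(peak_memory_used(SAMPLE))
```

Comparing the peak per run across periodic jobs would give a coarse history
to bisect against.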

Steve, I dropped a comment in your Heat bug report that might be
related to our CI problem:

I hope it helps,
Emilien Macchi

OpenStack Development Mailing List (not for usage questions)