On 03/28/2017 08:57 AM, Clark Boylan wrote:
1. Libvirt crashes: http://status.openstack.org/elastic-recheck/#1643911
and http://status.openstack.org/elastic-recheck/#1646779

Libvirt is randomly crashing during the job which causes things to fail
(for obvious reasons). To address this will likely require someone with
experience debugging libvirt since it's most likely a bug isolated to
libvirt. We're looking for someone familiar with libvirt internals to
drive the effort to fix this issue,

Ok, from the bug [1] we're seeing malloc() corruption.

While I agree that a coredump is not that likely to help, I would also
like to come to that conclusion after inspecting a coredump :) I've
found things in the heap before that give clues as to what the real
problems are.

To this end, I've proposed [2] to keep coredumps.  It's a little
hackish but I think it gets the job done. [3] enables this and saves
any dumps to the logs in devstack-gate (d-g).
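
For anyone unfamiliar with what "keeping coredumps" needs at the
process level, here is a minimal sketch of the rlimit side.  This is
not what [2] does verbatim; allow_core_dumps is a made-up name for
illustration, and it assumes a Linux-like host where the resource
module is available.

```python
import resource


def allow_core_dumps():
    """Raise this process's soft core-size limit up to its hard limit.

    Hypothetical helper for illustration only. Child processes inherit
    the limit, so a daemon started from here could leave a core file
    when it crashes.
    """
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # An unprivileged process may raise the soft limit only as far as
    # the hard limit, so use the hard limit for both values.
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


if __name__ == "__main__":
    print(allow_core_dumps())
```

Where the dump actually lands is decided by kernel.core_pattern, which
this sketch does not touch.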

As suggested, running under valgrind would be great but probably
impractical until we narrow it down a little.  Another thing I've had
some success with is Electric Fence [4], which puts boundaries around
allocations so out-of-bounds access hits at the time of access.  I've
proposed [5] to try this out, but it's not looking particularly
promising unfortunately.  I'm open to suggestions, for example maybe
something like tcmalloc might give us a different failure and could be
another clue.  If we get something vaguely reliable here, our best bet
might be to run a parallel non-voting job on all changes to see what
we can pick up.
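
To make the Electric Fence idea concrete, here is a rough sketch of
the layout arithmetic only: each allocation gets its own pages, is
pushed up against the end of the last one, and is followed by an
inaccessible guard page.  guarded_layout and its parameters are
hypothetical names, not Electric Fence's API, and the align parameter
is my simplified stand-in for its alignment setting.

```python
import mmap


def guarded_layout(size, page=mmap.PAGESIZE, align=1):
    """Return (pages, offset, guard_offset) for one guarded allocation.

    Hypothetical model of a guard-page allocator, for illustration:
      pages        - data pages mapped for this allocation
      offset       - where the user buffer starts inside those pages
      guard_offset - where the inaccessible guard page starts
    """
    if size <= 0 or page <= 0 or align <= 0:
        raise ValueError("size, page and align must be positive")
    # Round the request up to the alignment; any padding added here is
    # slack in which a small overrun would go unnoticed.
    usable = -(-size // align) * align
    pages = -(-usable // page)  # ceiling division
    guard_offset = pages * page
    # Place the buffer so that its end touches the guard page.
    offset = guard_offset - usable
    return pages, offset, guard_offset


if __name__ == "__main__":
    # With 4096-byte pages and no alignment padding, a 100-byte buffer
    # starts at 3996, so touching buf[100] lands on the guard page.
    print(guarded_layout(100, page=4096))
```

With align=1 the first byte past the buffer is already in the guard
page, which is why the failure shows up at the bad access rather than
at some later malloc() or free().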

-i

[1] https://bugs.launchpad.net/nova/+bug/1643911
[2] https://review.openstack.org/451128
[3] https://review.openstack.org/451219
[4] http://elinux.org/Electric_Fence
[5] https://review.openstack.org/451136

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: [email protected]?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev