Re: [openstack-dev] [nova] How to debug no valid host failures with placement

Chris Friesen Wed, 01 Aug 2018 10:25:05 -0700

On 08/01/2018 11:17 AM, Ben Nemec wrote:



On 08/01/2018 11:23 AM, Chris Friesen wrote:

The fact that there is no real way to get the equivalent of the old detailed
scheduler logs is a known shortcoming in placement, and will become more of a
problem if/when we move more complicated things like CPU pinning, hugepages,
and NUMA-awareness into placement.

The problem is that getting useful logs out of placement would require
significant development work.


Yeah, in my case I only had one compute node so it was obvious what the problem
was, but if I had a scheduling failure on a busy cloud with hundreds of nodes I
don't see how you would ever track it down.  Maybe we need to have a discussion
with operators about how often they do post-mortem debugging of this sort of 
thing?

For Wind River's Titanium Cloud it was enough of an issue that we customized the
scheduler to emit detailed logs on scheduler failure.

We started upstreaming it[1] but the effort stalled out when the upstream folks
requested major implementation changes.


Chris


[1] https://blueprints.launchpad.net/nova/+spec/improve-sched-logging

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: [email protected]?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
