On 08/01/2018 11:17 AM, Ben Nemec wrote:
> On 08/01/2018 11:23 AM, Chris Friesen wrote:
>> The fact that there is no real way to get the equivalent of the old detailed
>> scheduler logs is a known shortcoming in placement, and will become more of a
>> problem if/when we move more complicated things like CPU pinning, hugepages,
>> and NUMA-awareness into placement.
>>
>> The problem is that getting useful logs out of placement would require
>> significant development work.
>
> Yeah, in my case I only had one compute node so it was obvious what the problem
> was, but if I had a scheduling failure on a busy cloud with hundreds of nodes I
> don't see how you would ever track it down. Maybe we need to have a discussion
> with operators about how often they do post-mortem debugging of this sort of

For Wind River's Titanium Cloud it was enough of an issue that we customized the
scheduler to emit detailed logs on scheduler failure.
We started upstreaming it but the effort stalled out when the upstream folks
requested major implementation changes.
OpenStack Development Mailing List (not for usage questions)