On 22 February 2016 at 17:38, Sean Dague <s...@dague.net> wrote:
> On 02/22/2016 12:20 PM, Daniel P. Berrange wrote:
>> On Mon, Feb 22, 2016 at 12:07:37PM -0500, Sean Dague wrote:
>>> On 02/22/2016 10:43 AM, Chris Friesen wrote:
>>>> Hi all,
>>>>
>>>> We've recently run into some interesting behaviour that I thought I
>>>> should bring up to see if we want to do anything about it.
>>>>
>>>> Basically the problem seems to be that nova-compute is doing disk I/O
>>>> from the main thread, and if it blocks then it can block all of
>>>> nova-compute (since all eventlets will be blocked).  Examples that we've
>>>> found include glance image download, file renaming, instance directory
>>>> creation, opening the instance xml file, etc.  We've seen nova-compute
>>>> block for upwards of 50 seconds.
>>>>
>>>> Now the specific case where we hit this is not a production
>>>> environment.  It's only got one spinning disk shared by all the guests,
>>>> the guests were hammering on the disk pretty hard, and the IO scheduler for
>>>> the instance disk was CFQ, which seems to be buggy in our kernel.
>>>>
>>>> But the fact remains that nova-compute is doing disk I/O from the main
>>>> thread, and if the guests push that disk hard enough then nova-compute
>>>> is going to suffer.
>>>>
>>>> Given the above...would it make sense to use eventlet.tpool or similar
>>>> to perform all disk access in a separate OS thread?  There'd likely be a
>>>> bit of a performance hit, but at least it would isolate the main thread
>>>> from IO blocking.
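
For reference, a minimal sketch of what the eventlet.tpool idea could look like. The helper names run_blocking and read_instance_xml are hypothetical and not taken from Nova. eventlet.tpool.execute is the real eventlet call; the stdlib fallback is only there so the sketch runs where eventlet is not installed.

```python
try:
    # eventlet.tpool.execute runs a callable in a native OS thread while
    # the calling greenthread yields, so other greenthreads keep running.
    from eventlet import tpool

    def run_blocking(func, *args, **kwargs):
        return tpool.execute(func, *args, **kwargs)
except ImportError:
    # Assumption: fallback for environments without eventlet. This is a
    # plain stdlib thread pool standing in for eventlet's, not the same thing.
    from concurrent.futures import ThreadPoolExecutor

    _POOL = ThreadPoolExecutor(max_workers=4)

    def run_blocking(func, *args, **kwargs):
        return _POOL.submit(func, *args, **kwargs).result()


def read_instance_xml(path):
    """Hypothetical helper: read a file without blocking the main thread."""
    def _read():
        with open(path) as f:
            return f.read()
    return run_blocking(_read)
```

Each disk-touching call site would need wrapping like this, which is where the extra complexity comes from.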
>>>
>>> Making nova-compute more robust is fine, though the reality is once you
>>> IO starve a system, a lot of stuff is going to fall over weird.
>>>
>>> So there has to be a tradeoff of the complexity of any new code vs. what
>>> it gains. I think individual patches should be evaluated as such, or a
>>> spec if this is going to get really invasive.
>>
>> There are OS level mechanisms (eg cgroups blkio controller) for doing
>> I/O prioritization that you could use to give Nova higher priority over
>> the VMs, to reduce (if not eliminate) the possibility that a busy VM
>> can inflict a denial of service on the mgmt layer.  Of course figuring
>> out how to use that mechanism correctly is not entirely trivial.
>>
>> I think it is probably worth focusing effort in that area, before jumping
>> into making all the I/O related code in Nova more complicated. eg have
>> someone investigate & write up a recommendation in Nova docs for how to
>> configure the host OS & Nova such that VMs cannot inflict an I/O denial
>> of service attack on the mgmt service.
>
> +1 that would be much nicer.
>
> We've got some set of bugs in the tracker right now which are basically
> "after the compute node being at loadavg of 11 for an hour, nova-compute
> starts failing". Having some basic methodology to use Linux
> prioritization on the worker process would mitigate those quite a bit,
> and could be used by all users immediately, vs. complex nova-compute
> changes which would only apply to new / upgraded deploys.
>

+1

Does that turn into improved deployment docs that cover how you do
that on various platforms?

Maybe some tools to help with that also go in here?
http://git.openstack.org/cgit/openstack/osops-tools-generic/
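
As a rough idea of what such a tool might do, here is a minimal sketch that puts a process into a blkio cgroup with a higher weight. It assumes the cgroup v1 layout (a blkio.weight file and a tasks file per group) and, as far as I know, weights in the 10-1000 range that only take effect with the CFQ scheduler. The function name and the example path are hypothetical.

```python
import os


def prioritise_pid(cgroup_dir, pid, weight=1000):
    """Place pid in the blkio cgroup at cgroup_dir with the given weight.

    Assumes a cgroup v1 blkio hierarchy; cgroup_dir is created if missing.
    """
    if not 10 <= weight <= 1000:
        raise ValueError("blkio weight must be between 10 and 1000")
    os.makedirs(cgroup_dir, exist_ok=True)
    # Relative share of disk time this group gets under contention.
    with open(os.path.join(cgroup_dir, "blkio.weight"), "w") as f:
        f.write(str(weight))
    # Writing a PID to the tasks file moves that process into the group.
    with open(os.path.join(cgroup_dir, "tasks"), "a") as f:
        f.write("%d\n" % pid)
```

Usage would be something like prioritise_pid("/sys/fs/cgroup/blkio/nova", nova_compute_pid), run as root, where both the path and the PID lookup are assumptions that will differ per platform.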

Thanks,
John

PS
FWIW, how xenapi runs nova-compute in a VM has a similar outcome, albeit
in a more heavy-handed way.

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
