On Mon, Feb 22, 2016, at 12:15 PM, Mike Bayer wrote:
>
>
> On 02/22/2016 11:30 AM, Chris Friesen wrote:
> > On 02/22/2016 11:17 AM, Jay Pipes wrote:
> >> On 02/22/2016 10:43 AM, Chris Friesen wrote:
> >>> Hi all,
> >>>
> >>> We've recently run into some interesting behaviour that I thought I
> >>> should bring up to see if we want to do anything about it.
> >>>
> >>> Basically the problem seems to be that nova-compute is doing disk I/O
> >>> from the main thread, and if it blocks then it can block all of
> >>> nova-compute (since all eventlets will be blocked). Examples that we've
> >>> found include glance image download, file renaming, instance directory
> >>> creation, opening the instance xml file, etc. We've seen nova-compute
> >>> block for upwards of 50 seconds.
> >>>
> >>> Now the specific case where we hit this is not a production
> >>> environment. It's only got one spinning disk shared by all the guests,
> >>> the guests were hammering on the disk pretty hard, the IO scheduler for
> >>> the instance disk was CFQ which seems to be buggy in our kernel.
> >>>
> >>> But the fact remains that nova-compute is doing disk I/O from the main
> >>> thread, and if the guests push that disk hard enough then nova-compute
> >>> is going to suffer.
> >>>
> >>> Given the above...would it make sense to use eventlet.tpool or similar
> >>> to perform all disk access in a separate OS thread? There'd likely be a
> >>> bit of a performance hit, but at least it would isolate the main thread
> >>> from IO blocking.
> >>
> >> This is probably a good idea, but will require quite a bit of code
> >> change. I think in the past we've taken the expedient route of just
> >> exec'ing problematic code in a greenthread using utils.spawn().
> >
> > I'm not an expert on eventlet, but from what I've seen this isn't
> > sufficient to deal with disk access in a robust way.
> >
> > It's my understanding that utils.spawn() will result in the code running
> > in the same OS thread, but in a separate eventlet greenthread. If that
> > code tries to access the disk via a potentially-blocking call the
> > eventlet subsystem will not jump to another greenthread. Because of
> > this it can potentially block the whole OS thread (and thus all other
> > greenthreads running in that OS thread).
>
> not sure what utils.spawn() does but if it is in fact an "exec" (or if
> Jay is suggesting that an exec() be used within) then the code would be
> in a different process entirely, and communicating with it becomes an
> issue of pipe IO over unix sockets which IIRC can do non blocking.
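
For illustration, here is a minimal sketch of the greenthread-versus-OS-thread point raised above. It is not nova code and it does not use eventlet: so that it runs on the standard library alone, asyncio's event loop stands in for the eventlet hub, loop.run_in_executor() stands in for eventlet.tpool.execute(), and time.sleep() stands in for a slow disk call. The names blocking_disk_io, heartbeat, and measure are hypothetical.

```python
import asyncio
import time


def blocking_disk_io(seconds):
    # Stand-in (assumption) for a slow, blocking disk operation such as a
    # rename or open on a saturated disk.
    time.sleep(seconds)
    return "done"


async def heartbeat(ticks, interval=0.05):
    # Cooperative task: records a timestamp each time the scheduler runs it.
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)


async def _run(offload):
    ticks = []
    hb = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0)  # let the heartbeat run once before we start

    start = time.monotonic()
    if offload:
        # Run the blocking call in a worker OS thread; the scheduler keeps
        # servicing other tasks while we wait for the result.
        loop = asyncio.get_running_loop()
        await loop.run_in_executor(None, blocking_disk_io, 0.5)
    else:
        # Blocking call issued directly from the scheduler's own OS thread:
        # nothing else can run until it returns.
        blocking_disk_io(0.5)
    end = time.monotonic()

    hb.cancel()
    # Number of heartbeats that ran while the "disk I/O" was in progress.
    return sum(1 for t in ticks if start < t < end)


def measure(offload):
    return asyncio.run(_run(offload))
```

Calling measure(offload=False) should report zero heartbeats during the blocking call, because the cooperative scheduler never gets control back. measure(offload=True) should report several, because the call runs in a worker OS thread. The same reasoning is why spawning a greenthread alone would not be expected to help with blocking disk syscalls.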

utils.spawn() is just a wrapper around eventlet.spawn(), mostly there to
be stubbed out in testing.

> >
> >
> > I think we need to eventlet.tpool for disk IO (or else fork a whole
> > separate process). Basically we need to ensure that the main OS thread
> > never issues a potentially-blocking syscall.
>
> tpool would probably be easier (and more performant because no socket
> needed).
>
> >
> > Chris

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev