On Mon, Mar 05, 2018 at 02:52:16PM +0000, Dr. David Alan Gilbert wrote: > * Peter Lieven (p...@kamp.de) wrote: > > Am 05.03.2018 um 12:45 schrieb Stefan Hajnoczi: > > > On Thu, Feb 22, 2018 at 12:13:50PM +0100, Peter Lieven wrote: > > >> I stumbled across the MAX_INFLIGHT_IO field that was introduced in 2015 > > >> and was curious what was the reason > > >> to choose 512MB as readahead? The question is that I found that the > > >> source VM gets very unresponsive I/O wise > > >> while the initial 512MB are read and furthermore seems to stay > > >> unreasponsive if we choose a high migration speed > > >> and have a fast storage on the destination VM. > > >> > > >> In our environment I modified this value to 16MB which seems to work > > >> much smoother. I wonder if we should make > > >> this a user configurable value or define a different rate limit for the > > >> block transfer in bulk stage at least? > > > I don't know if benchmarks were run when choosing the value. From the > > > commit description it sounds like the main purpose was to limit the > > > amount of memory that can be consumed. > > > > > > 16 MB also fulfills that criteria :), but why is the source VM more > > > responsive with a lower value? > > > > > > Perhaps the issue is queue depth on the storage device - the block > > > migration code enqueues up to 512 MB worth of reads, and guest I/O has > > > to wait? > > > > That is my guess. Especially if the destination storage is faster we > > basically alsways have > > 512 I/Os in flight on the source storage. > > > > Does anyone mind if the reduce that value to 16MB or do we need a better > > mechanism? > > We've got migration-parameters these days; you could connect it to one > of those fairly easily I think. > Try: grep -i 'cpu[-_]throttle[-_]initial' for an example of one that's > already there. > Then you can set it to whatever you like.
It would be nice to solve the performance problem without adding a tuneable. On the other hand, QEMU has no idea what the queue depth of the device is. Therefore it cannot prioritize guest I/O over block migration I/O. 512 parallel requests is much too high. Most parallel I/O benchmarking is done at 32-64 queue depth. I think that 16 parallel requests is a reasonable maximum number for a background job. We need to be clear though that the purpose of this change is unrelated to the original 512 MB memory footprint goal. It just happens to touch the same constant but the goal is now to submit at most 16 I/O requests in parallel to avoid monopolizing the I/O device. Stefan
Description: PGP signature