------- Comment From [email protected] 2017-08-15 09:26 EDT-------
Here is what I am observing, and what leads me to think that "cfq" may not 
(yet) be a good choice for the default io-sched.

The test exerciser, HTX (https://github.com/open-power/HTX - POWER arch
only), stresses CFQ during certain cycles. I set the debug timeout
threshold for completion of I/Os at 60 seconds (upon timeout, debugging
is printed, then io_schedule() is called, after which more debugging is
printed). The I/O delays seem to vary continuously throughout the
range, but using a timeout lower than 60 seconds just produced too much
output.
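
One way to look below the 60-second threshold without printing a line
per I/O would be to count delays into coarse buckets. The following is
only a user-space sketch of that idea, not the kernel debugging used for
this report; the bucket edges and the names EDGES, LABELS, bucket() and
histogram() are assumptions made for illustration.

```python
import bisect
from collections import Counter

# Hypothetical bucket edges in seconds; 60 matches the timeout used in this report.
EDGES = [1, 10, 30, 60]
LABELS = ["<1s", "1-10s", "10-30s", "30-60s", ">=60s"]


def bucket(latency_s):
    """Return the label of the bucket that latency_s (seconds) falls into."""
    # bisect_right puts a value equal to an edge into the higher bucket,
    # so exactly 60 seconds counts as ">=60s".
    return LABELS[bisect.bisect_right(EDGES, latency_s)]


def histogram(latencies_s):
    """Count how many latencies land in each bucket, including empty buckets."""
    counts = Counter(bucket(x) for x in latencies_s)
    return {label: counts.get(label, 0) for label in LABELS}
```

Fed with measured submit-to-completion delays, histogram() would give
one small summary per cycle instead of a log line per slow I/O.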

During certain cycles, where about 1 million I/Os per hour are being
performed on each disk, we see timeouts being triggered. Essentially,
the timeout happens because CFQ has not even submitted the I/O to SCSI
yet. Earlier debugging showed that once the I/O actually gets submitted
to SCSI, it completes promptly.

Without the patch 5be6b75610ce, these I/Os could (sometimes) take an
hour or more to get submitted to SCSI. With the patch, that delay seems
to max out at around 110 seconds, which is a great improvement but
still indicates a problem.

I typically see about 400-500 I/Os trip the 60-second timeout during a
given ~2 hour cycle (estimated 4 million I/Os total), so it is not a
huge percentage. However, the I/Os affected seem to be related, possibly
by process or thread, and so this could be detrimental to an
application. Note, the number of I/Os taking between 30 and 60 seconds
is not known, but is expected to be much higher. Even 30 seconds may be
an undesirable number. It's not clear just how CFQ chooses the delay
value and what overrides it.
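
To put "not a huge percentage" into numbers, here is the arithmetic for
the figures above (400-500 timeouts out of an estimated 4 million I/Os).
The helper name timeout_fraction() is only illustrative.

```python
def timeout_fraction(timed_out, total):
    """Fraction of I/Os that tripped the timeout."""
    return timed_out / total


# Figures from this report: 400-500 timeouts, ~4 million I/Os per cycle (estimate).
low = timeout_fraction(400, 4_000_000)
high = timeout_fraction(500, 4_000_000)
print(f"{low:.4%} to {high:.4%}")  # prints "0.0100% to 0.0125%"
```

That is roughly one I/O in 8,000 to 10,000, counting only those that
exceed 60 seconds.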

On a run with the scheduler set to "deadline", I never see any I/Os trip
the 60-second timeout.
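
For anyone reproducing the comparison: on kernels with the legacy block
layer, the active scheduler for a disk is shown in brackets in
/sys/block/<dev>/queue/scheduler. A small sketch for reading it follows;
the device name passed in (e.g. "sda") and the function names are
assumptions for illustration.

```python
import re
from pathlib import Path


def active_scheduler(text):
    """Return the bracketed (active) scheduler name, or None if none is marked."""
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else None


def read_scheduler(device):
    """Read the scheduler line for a block device name such as 'sda'."""
    return Path(f"/sys/block/{device}/queue/scheduler").read_text().strip()
```

For example, active_scheduler("noop deadline [cfq]") returns "cfq".
Writing a scheduler name such as "deadline" to the same sysfs file (as
root) switches the scheduler for that device.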

I think this shows undesirable behavior in CFQ, possibly a bug, and that
it should not be the default scheduler - especially for servers. Is
there some evidence that shows CFQ to be better than deadline in
general?

https://bugs.launchpad.net/bugs/1709889

Title:
  Ubuntu 17.04: Bug in cfq scheduler, I/Os do not get submitted to
  adapter for a very long time.
