Jens,

While working on an issue of low IOPS for sequential READ/WRITE, I found 
some interesting behavior that was causing a performance drop for sequential 
IO.  I did some reverse engineering on the block layer code to understand 
whether any sysfs parameter settings could help, but could not find anything 
useful to solve this issue. 

I have described the problem statement and the root cause of this issue in 
this mail thread. 

Problem statement - "Cannot achieve expected sequential read/write 
performance, because back merging is not happening frequently enough"

Here is my understanding of how back merging is done in the elevator.

The Linux block layer is responsible for merging/sorting IO, with the help of 
the elevator hooks plus the IO scheduler.
The IO scheduler itself has no role in merging sequential IO; that happens in 
the elevator hooks, so choosing a different IO scheduler will not help (put 
another way, the behavior is unchanged irrespective of the IO scheduler). Any 
sequential IO is merged in the elevator code path.

1. When IO comes from the upper layer, it is queued at the elevator/IO 
scheduler level.  The IO is also added to a hash lookup table, which is used 
for merging and other purposes.
2. The elevator code searches for an outstanding IO (in the queue at the same 
level). If a merge is possible, it performs a BACK MERGE.
3. If no merge is possible, the IO is queued to the next level (the IO 
scheduler).
4. In the IO completion path, the IO scheduler posts IO to the driver queue, 
if there is any outstanding IO. (There are many other conditions, but this is 
the most common code path.)

To merge more commands, step #2 must find more outstanding IOs in the hash 
table lookup. This is only possible if flow control kicks in at the block 
layer or driver level.
That is, the driver/block layer must deliberately delay IO submission to the 
next level, giving the elevator code more chance to merge by accumulating 
more IO from user space.

If I manually lower the queue depth of the device (to somewhere between 1 and 
8) while it is doing only sequential IO, I see the maximum amount of IO 
arriving at the driver already merged, and that eventually increases the IOPS.
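For reference, the queue depth change above can be applied from user space 
through sysfs on a SCSI device; sdX below is a placeholder for the device 
under test:

```shell
# Placeholder device name: replace sdX with the device doing sequential IO.
cat /sys/block/sdX/device/queue_depth       # current SCSI device queue depth
echo 4 > /sys/block/sdX/device/queue_depth  # throttle to 4 outstanding commands
```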

Is there any way to increase the likelihood of merged IO coming from the 
block layer to the low-level driver?

Thanks, Kashyap
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
