On Fri, Nov 14 2008, Rusty Russell wrote:
> This allows more requests to fit in the descriptor ring.
> 
> Copying 1.7M kernel image 100 times (with sync between)
>   Before: totsegs = 55661 totlen = 148859962 avg: 2674
>   After:  totsegs = 36097 totlen = 139439355 avg: 3862
> 
> Unfortunately, this coalescing is done at blk_rq_map_sg() which is too
> late to be optimal: requests have already been limited to the value set
> by blk_queue_max_hw_segments().  For us, that value reflects the
> number of sg slots we can handle (i.e. after clustering).
> 
> I suspect other drivers have the same issue.  Jens?

blk_queue_max_hw_segments() sets the size of your sg list, so yes, the
block layer will stop merging more into a request if we go beyond that.
But it tracks merging along the way, so I don't see why there's a
discrepancy between the two ends. Unless there's a bug there, of
course...
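
To illustrate where the two points sit in a driver (a sketch only; the
MAX_SG constant and the function names below are made up):

    #include <linux/blkdev.h>
    #include <linux/scatterlist.h>

    #define MAX_SG  64      /* made up: sg slots the device ring can take */

    static struct scatterlist sg[MAX_SG];

    static void setup_limits(struct request_queue *q)
    {
            sg_init_table(sg, MAX_SG);
            /* Merging stops once a request would need more segments. */
            blk_queue_max_hw_segments(q, MAX_SG);
    }

    static int map_one_request(struct request_queue *q, struct request *req)
    {
            /*
             * Clustering coalesces adjacent segments here, at map time,
             * after the request has already been limited to MAX_SG by
             * the setting above.
             */
            return blk_rq_map_sg(q, req, sg);
    }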

Queue clustering is on by default when you allocate your queue, though,
so I'm surprised you see a difference from adding:

+       /* Gather adjacent buffers to minimize sg length. */
+       queue_flag_set(QUEUE_FLAG_CLUSTER, vblk->disk->queue);

Did test_bit(QUEUE_FLAG_CLUSTER, &vblk->disk->queue->queue_flags) really
return 0 before?
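
If not, a quick check along these lines just before the queue_flag_set()
would confirm it (debug sketch only; the printk text is made up):

    if (!test_bit(QUEUE_FLAG_CLUSTER, &vblk->disk->queue->queue_flags))
            printk(KERN_DEBUG "virtio_blk: clustering was off\n");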

-- 
Jens Axboe
