http://lwn.net/Articles/157208/

Some block layer patches

Linux I/O schedulers are charged with presenting I/O requests to block devices in an optimal order. There are currently four schedulers in the kernel, each with a different notion of "optimal." All of them, however, maintain a "dispatch queue," being the list of requests which have been selected for submission to the device. Each scheduler currently maintains its own dispatch queue. Tejun Heo has decided that the proliferation of dispatch queues is a wasteful duplication of code, so he has implemented a generic dispatch queue to bring things back together.

The unification of the dispatch queues helps to ensure that all I/O schedulers implement queues with the same semantics. It also simplifies the schedulers by freeing them of the need to deal with non-filesystem requests. In general, the developers have been heard to say, recently, that the block subsystem is not really about block devices; it is, instead, a generic message queueing mechanism. The generic dispatch queue code helps to take things in that direction.

Tejun Heo has also reimplemented the I/O barrier code. The result should be much improved barrier handling, but it also involves some API changes visible to block drivers. The new code recognizes that different devices will support barriers in different ways. There are three variables which are taken into account:

A block driver will tell the system about how its device operates with blk_queue_ordered(), which has a new prototype:

    typedef void (prepare_flush_fn)(request_queue_t *q,
                                    struct request *rq);
    int blk_queue_ordered(request_queue_t *q, unsigned ordered,
                          prepare_flush_fn *prepare_flush_fn,
                          unsigned gfp_mask);
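
As a rough illustration of how a driver might use this call, here is a minimal userspace sketch. It is not kernel code: the structure layouts, the numeric values of the QUEUE_ORDERED_* constants, the mock body of blk_queue_ordered() (including its refusal of a flush-based mode without a prepare function), and the mydrv_* names are all assumptions made up for the example. Only the prototype itself comes from the patch as quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types; the real kernel structures are far larger. */
struct request {
    int is_flush;               /* hypothetical flag set by the prepare hook */
};

typedef struct request_queue request_queue_t;
typedef void (prepare_flush_fn)(request_queue_t *q, struct request *rq);

struct request_queue {
    unsigned ordered;           /* ordering mode chosen by the driver */
    prepare_flush_fn *prepare_flush;
};

/* Placeholder values, NOT the kernel's actual definitions. */
#define QUEUE_ORDERED_DRAIN_FLUSH 1u
#define QUEUE_ORDERED_TAG         2u

/*
 * Mock with the same parameter list as the quoted prototype. It only
 * records the driver's choices; the error check is an assumption.
 */
int blk_queue_ordered(request_queue_t *q, unsigned ordered,
                      prepare_flush_fn *prep, unsigned gfp_mask)
{
    (void)gfp_mask;             /* unused in this mock */
    assert(q != NULL);
    if (ordered == QUEUE_ORDERED_DRAIN_FLUSH && prep == NULL)
        return -1;              /* flush-based mode needs a prepare hook */
    q->ordered = ordered;
    q->prepare_flush = prep;
    return 0;
}

/* Hypothetical driver hook: turn rq into a "flush the cache" command. */
static void mydrv_prepare_flush(request_queue_t *q, struct request *rq)
{
    (void)q;
    rq->is_flush = 1;
}

/* Hypothetical driver setup: drain the queue, flush before and after. */
static int mydrv_setup_queue(request_queue_t *q)
{
    /* 0 stands in for a real allocation mask such as GFP_KERNEL. */
    return blk_queue_ordered(q, QUEUE_ORDERED_DRAIN_FLUSH,
                             mydrv_prepare_flush, 0);
}
```

A caller would zero-initialize a request_queue_t, pass it to mydrv_setup_queue(), and then invoke the stored hook on a request whenever a flush has to be issued. A device with working ordered tags would pass QUEUE_ORDERED_TAG instead.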

The ordered parameter describes how barriers are to be implemented; it has values like QUEUE_ORDERED_DRAIN_FLUSH to indicate that barriers are implemented by stopping the queue, and that flushes are required both before and after the barrier; or QUEUE_ORDERED_TAG, which says that ordered tags handle everything. The prepare_flush_fn() will be called to do whatever is required to make a specific operation force a flush to physical media. See Tejun's documentation patch for more details.

With the above information in hand, the block layer can handle the implementation of barrier requests. As long as the driver implements flushes when requested and recognizes I/O requests requiring the FUA mode (a helper function blk_fua_rq() is provided for this purpose), the rest is taken care of at the higher levels.

The barrier patch also adds an uptodate parameter to end_that_request_last(). This API change, which will affect most block drivers, is necessary to enable drivers to signal errors for non-filesystem requests.

The conversation on the lists suggests that both of the above patches are headed for the mainline sooner or later. Mike Christie's block layer multipath patch may take a little longer, however. The question of where multipath support should be implemented has often been discussed; more recently, the seeming consensus was that the device mapper layer was the right place. The result was that the device mapper multipath patches were merged early this year. So it is a bit surprising to see the issue come back now.

Mike has a few reasons for wanting to implement multipath at the lower level. These include:

A number of code simplifications are also said to result from the new organization. The new multipath code is essentially a repackaging of the device mapper code, reworked to deal with the block layer from underneath. It is not being proposed for merging at this time, or even for serious review. So far, there has been little discussion of this patch.
