From: Jens Axboe <[EMAIL PROTECTED]>
Subject: Re: [PATCH 4/4] bidi support: bidirectional request
Date: Mon, 30 Apr 2007 13:11:57 +0200
> On Sun, Apr 29 2007, James Bottomley wrote:
> > On Sun, 2007-04-29 at 18:48 +0300, Boaz Harrosh wrote:
> > > FUJITA Tomonori wrote:
> > > > From: Boaz Harrosh <[EMAIL PROTECTED]>
> > > > Subject: [PATCH 4/4] bidi support: bidirectional request
> > > > Date: Sun, 15 Apr 2007 20:33:28 +0300
> > > >
> > > >> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> > > >> index 645d24b..16a02ee 100644
> > > >> --- a/include/linux/blkdev.h
> > > >> +++ b/include/linux/blkdev.h
> > > >> @@ -322,6 +322,7 @@ struct request {
> > > >> void *end_io_data;
> > > >>
> > > >> struct request_io_part uni;
> > > >> + struct request_io_part bidi_read;
> > > >> };
> > > >
> > > > Would be more straightforward to have:
> > > >
> > > > struct request_io_part in;
> > > > struct request_io_part out;
> > > >
> > >
> > > Yes I wish I could do that. For bidi supporting drivers this is the most
> > > logical.
> > > But for the 99.9% of uni-directional drivers, calling rq_uni(), and being
> > > some what on
> > > the hotish paths, this means we will need a pointer to a uni
> > > request_io_part.
> > > This is bad because:
> > > 1st- There is no defined stage in a request life where to definitely set
> > > that pointer,
> > > specially in the preparation stages.
> > > 2nd- hacks like scsi_error.c/scsi_send_eh_cmnd() will not work at all.
> > > Now this is a
> > > very bad spot already, and I have a short term fix for it in the
> > > SCSI-bidi patches
> > > (not sent yet) but a more long term solution is needed. Once such
> > > hacks are
> > > cleaned up we can do what you say. This is exactly why I use the
> > > access functions
> > > rq_uni/rq_io/rq_in/rq_out and not open code access.
> >
> > I'm still not really convinced about this approach. The primary job of
> > the block layer is to manage and merge READ and WRITE requests. It
> > serves a beautiful secondary function of queueing for arbitrary requests
> > it doesn't understand (REQ_TYPE_BLOCK_PC or REQ_TYPE_SPECIAL ... or
> > indeed any non REQ_TYPE_FS).
> >
> > bidirectional requests fall into the latter category (there's nothing
> > really we can do to merge them ... they're just transported by the block
> > layer). The only unusual feature is that they carry two bios. I think
> > the drivers that actually support bidirectional will be a rarity, so it
> > might even be advisable to add it to the queue capability (refuse
> > bidirectional requests at the top rather than perturbing all the drivers
> > to process them).
> >
> > So, what about REQ_TYPE_BIDIRECTIONAL rather than REQ_BIDI? That will
> > remove it from the standard path and put it on the special command type
> > path where we can process it specially. Additionally, if you take this
> > approach, you can probably simply chain the second bio through
> > req->special as an additional request in the stream. The only thing
> > that would then need modification would be the dequeue of the block
> > driver (it would have to dequeue both requests and prepare them) and
> > that needs to be done only for drivers handling bidirectional requests.
>
> I agree, I'm really not crazy about shuffling the entire request setup
> around just for something as exotic as bidirection commands. How about
> just keeping it simple - have a second request linked off the first one
> for the second data phase? So keep it completely seperate, not just
> overload ->special for 2nd bio list.
>
> So basically just add a struct request pointer, so you can do rq =
> rq->next_rq or something for the next data phase. I bet this would be a
> LOT less invasive as well, and we can get by with a few helpers to
> support it.
This patch tried this approach. It's just for seeing how it works.
I added bidi support to open-iscsi and bsg and tested this patch
lightly. I've attached only a patch for the block layer and scsl-ml.
You can get all the patches are:
http://www.kernel.org/pub/linux/kernel/people/tomo/bidi
If we go with this approach, we need just minor changes to the block
layer. The overloading rq->special approach needs more but it's
reasonable too.
I need to add the proper error handling code, which might be a bit
tricky, but I think that it will not be so complicated.
> And it should definitely be a request type.
I'm not sure about this. I think that bidi can't be a request type to
trace bidi pc requests (we have bidi special requests like SMP). I use
REQ_BIDI though I've not implemented bidi trace code.
>From 7d278323ff8aad86fb82c823538f7ddfb6ded11c Mon Sep 17 00:00:00 2001
From: FUJITA Tomonori <[EMAIL PROTECTED]>
Date: Wed, 2 May 2007 03:55:56 +0900
Subject: [PATCH] add bidi support
Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]>
---
block/ll_rw_blk.c | 1 +
drivers/scsi/scsi_lib.c | 72 +++++++++++++++++++++++++++++++++++++++-------
include/linux/blkdev.h | 7 ++++
include/scsi/scsi_cmnd.h | 9 ++++++
4 files changed, 78 insertions(+), 11 deletions(-)
diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
index cf8752a..82842d6 100644
--- a/block/ll_rw_blk.c
+++ b/block/ll_rw_blk.c
@@ -256,6 +256,7 @@ static void rq_init(request_queue_t *q,
rq->end_io = NULL;
rq->end_io_data = NULL;
rq->completion_data = NULL;
+ rq->next_rq = NULL;
}
/**
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index be8e655..96541cb 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -701,34 +701,36 @@ static struct scsi_cmnd *scsi_end_reques
return NULL;
}
-struct scatterlist *scsi_alloc_sgtable(struct scsi_cmnd *cmd, gfp_t gfp_mask)
+static struct scatterlist *do_scsi_alloc_sgtable(unsigned short use_sg,
+ unsigned short *sglist_len,
+ gfp_t gfp_mask)
{
struct scsi_host_sg_pool *sgp;
struct scatterlist *sgl;
- BUG_ON(!cmd->use_sg);
+ BUG_ON(!use_sg);
- switch (cmd->use_sg) {
+ switch (use_sg) {
case 1 ... 8:
- cmd->sglist_len = 0;
+ *sglist_len = 0;
break;
case 9 ... 16:
- cmd->sglist_len = 1;
+ *sglist_len = 1;
break;
case 17 ... 32:
- cmd->sglist_len = 2;
+ *sglist_len = 2;
break;
#if (SCSI_MAX_PHYS_SEGMENTS > 32)
case 33 ... 64:
- cmd->sglist_len = 3;
+ *sglist_len = 3;
break;
#if (SCSI_MAX_PHYS_SEGMENTS > 64)
case 65 ... 128:
- cmd->sglist_len = 4;
+ *sglist_len = 4;
break;
#if (SCSI_MAX_PHYS_SEGMENTS > 128)
case 129 ... 256:
- cmd->sglist_len = 5;
+ *sglist_len = 5;
break;
#endif
#endif
@@ -737,11 +739,15 @@ #endif
return NULL;
}
- sgp = scsi_sg_pools + cmd->sglist_len;
+ sgp = scsi_sg_pools + *sglist_len;
sgl = mempool_alloc(sgp->pool, gfp_mask);
return sgl;
}
+struct scatterlist *scsi_alloc_sgtable(struct scsi_cmnd *cmd, gfp_t gfp_mask)
+{
+ return do_scsi_alloc_sgtable(cmd->use_sg, &cmd->sglist_len, gfp_mask);
+}
EXPORT_SYMBOL(scsi_alloc_sgtable);
void scsi_free_sgtable(struct scatterlist *sgl, int index)
@@ -778,6 +784,9 @@ static void scsi_release_buffers(struct
if (cmd->use_sg)
scsi_free_sgtable(cmd->request_buffer, cmd->sglist_len);
+ if (cmd->ext_request_buffer.use_sg)
+ scsi_free_sgtable(cmd->ext_request_buffer.request_buffer,
+ cmd->ext_request_buffer.sglist_len);
/*
* Zero these out. They now point to freed memory, and it is
* dangerous to hang onto the pointers.
@@ -1106,9 +1115,48 @@ static int scsi_setup_blk_pc_cmnd(struct
BUG_ON(!req->nr_phys_segments);
+ if (blk_bidi_rq(req)) {
+ BUG_ON(!req->next_rq);
+
+ if (rq_data_dir(req) != WRITE ||
+ rq_data_dir(req->next_rq) != READ) {
+ scsi_release_buffers(cmd);
+ scsi_put_command(cmd);
+ return BLKPREP_KILL;
+ }
+ }
+
ret = scsi_init_io(cmd);
if (unlikely(ret))
return ret;
+
+ if (blk_bidi_rq(req)) {
+ struct scsi_data_buffer *sdb = &cmd->ext_request_buffer;
+ struct scatterlist *sgpnt;
+ int count;
+
+ sdb->use_sg = req->nr_phys_segments;
+
+ sgpnt = do_scsi_alloc_sgtable(sdb->use_sg,
+ &sdb->sglist_len,
+ GFP_ATOMIC);
+ if (unlikely(!sgpnt)) {
+ scsi_release_buffers(cmd);
+ scsi_unprep_request(req);
+ return BLKPREP_DEFER;
+ }
+
+ sdb->request_buffer = sgpnt;
+ sdb->request_bufflen = req->next_rq->data_len;
+ count = blk_rq_map_sg(req->q, req->next_rq, sgpnt);
+ if (unlikely(count > sdb->use_sg)) {
+ scsi_free_sgtable(sgpnt, sdb->sglist_len);
+ scsi_release_buffers(cmd);
+ scsi_put_command(cmd);
+ return BLKPREP_KILL;
+ }
+ sdb->use_sg = count;
+ }
} else {
BUG_ON(req->data_len);
BUG_ON(req->data);
@@ -1122,7 +1170,9 @@ static int scsi_setup_blk_pc_cmnd(struct
BUILD_BUG_ON(sizeof(req->cmd) > sizeof(cmd->cmnd));
memcpy(cmd->cmnd, req->cmd, sizeof(cmd->cmnd));
cmd->cmd_len = req->cmd_len;
- if (!req->data_len)
+ if (blk_bidi_rq(req))
+ cmd->sc_data_direction = DMA_BIDIRECTIONAL;
+ else if (!req->data_len)
cmd->sc_data_direction = DMA_NONE;
else if (rq_data_dir(req) == WRITE)
cmd->sc_data_direction = DMA_TO_DEVICE;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 83dcd8c..9d3bb4a 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -199,6 +199,7 @@ enum rq_flag_bits {
__REQ_ALLOCED, /* request came from our alloc pool */
__REQ_RW_META, /* metadata io request */
__REQ_NR_BITS, /* stops here */
+ __REQ_BIDI, /* bidirectional io request */
};
#define REQ_RW (1 << __REQ_RW)
@@ -219,6 +220,7 @@ #define REQ_ORDERED_COLOR (1 << __REQ_OR
#define REQ_RW_SYNC (1 << __REQ_RW_SYNC)
#define REQ_ALLOCED (1 << __REQ_ALLOCED)
#define REQ_RW_META (1 << __REQ_RW_META)
+#define REQ_BIDI (1 << __REQ_BIDI)
#define BLK_MAX_CDB 16
@@ -313,6 +315,9 @@ struct request {
*/
rq_end_io_fn *end_io;
void *end_io_data;
+
+ /* for bidi */
+ struct request *next_rq;
};
/*
@@ -478,6 +483,7 @@ #define QUEUE_FLAG_DEAD 5 /* queue bein
#define QUEUE_FLAG_REENTER 6 /* Re-entrancy avoidance */
#define QUEUE_FLAG_PLUGGED 7 /* queue is plugged */
#define QUEUE_FLAG_ELVSWITCH 8 /* don't use elevator, just do FIFO */
+#define QUEUE_FLAG_BIDI 9 /* bidirectional requests */
enum {
/*
@@ -542,6 +548,7 @@ #define blk_pm_request(rq) \
#define blk_sorted_rq(rq) ((rq)->cmd_flags & REQ_SORTED)
#define blk_barrier_rq(rq) ((rq)->cmd_flags & REQ_HARDBARRIER)
#define blk_fua_rq(rq) ((rq)->cmd_flags & REQ_FUA)
+#define blk_bidi_rq(rq) ((rq)->cmd_flags & REQ_BIDI)
#define list_entry_rq(ptr) list_entry((ptr), struct request, queuelist)
diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h
index a2e0c10..0e259e5 100644
--- a/include/scsi/scsi_cmnd.h
+++ b/include/scsi/scsi_cmnd.h
@@ -11,6 +11,12 @@ struct scatterlist;
struct Scsi_Host;
struct scsi_device;
+struct scsi_data_buffer {
+ unsigned short use_sg; /* Number of pieces of scatter-gather */
+ unsigned short sglist_len; /* size of malloc'd scatter-gather list */
+ void *request_buffer; /* Actual requested buffer */
+ unsigned request_bufflen; /* Actual request size */
+};
/* embedded in scsi_cmnd */
struct scsi_pointer {
@@ -117,6 +123,9 @@ #define SCSI_SENSE_BUFFERSIZE 96
unsigned char tag; /* SCSI-II queued command tag */
unsigned long pid; /* Process ID, starts at 0. Unique per host. */
+
+ /* bidi in buffer */
+ struct scsi_data_buffer ext_request_buffer;
};
extern struct scsi_cmnd *scsi_get_command(struct scsi_device *, gfp_t);
--
1.4.3.2
-
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html