[PATCH 2/4] bidi support: fix req-cmd == INT cases
- we have unearthed very old bugs in stale drivers that still used request-cmd as a READ|WRITE int - these drivers should probably go away... Signed-off-by: Boaz Harrosh [EMAIL PROTECTED] Signed-off-by: Benny Halevy [EMAIL PROTECTED] --- drivers/acorn/block/fd1772.c |2 +- drivers/acorn/block/mfmhd.c |8 drivers/block/amiflop.c |2 +- drivers/block/nbd.c |2 +- drivers/cdrom/aztcd.c|2 +- drivers/cdrom/cm206.c|2 +- drivers/cdrom/gscd.c |2 +- drivers/cdrom/mcdx.c |2 +- drivers/cdrom/optcd.c|2 +- drivers/cdrom/sjcd.c |2 +- drivers/ide/legacy/hd.c |4 ++-- 11 files changed, 15 insertions(+), 15 deletions(-) diff --git a/drivers/acorn/block/fd1772.c b/drivers/acorn/block/fd1772.c index 674bf81..1717679 100644 --- a/drivers/acorn/block/fd1772.c +++ b/drivers/acorn/block/fd1772.c @@ -1246,7 +1246,7 @@ repeat: del_timer(motor_off_timer); ReqCnt = 0; -ReqCmd = CURRENT-cmd; +ReqCmd = rq_uni_rw_dir(CURRENT); ReqBlock = CURRENT-sector; ReqBuffer = CURRENT-buffer; setup_req_params(drive); diff --git a/drivers/acorn/block/mfmhd.c b/drivers/acorn/block/mfmhd.c index 689a4c3..50001eb 100644 --- a/drivers/acorn/block/mfmhd.c +++ b/drivers/acorn/block/mfmhd.c @@ -439,7 +439,7 @@ static void mfm_rw_intr(void) a choice of command end or some data which is ready to be collected */ /* I think we have to transfer data while the interrupt line is on and its not any other type of interrupt */ -if (CURRENT-cmd == WRITE) { +if (rq_uni_rw_dir(CURRENT) == WRITE) { extern void hdc63463_writedma(void); if ((hdc63463_dataleft = 0) (!(mfm_status STAT_CED))) { printk(mfm_rw_intr: Apparent DMA write request when no more to DMA\n); @@ -799,7 +799,7 @@ static void issue_request(unsigned int block, unsigned int nsect, raw_cmd.head = start_head; raw_cmd.cylinder = track / p-heads; raw_cmd.cmdtype = CURRENT-cmd; -raw_cmd.cmdcode = CURRENT-cmd == WRITE ? CMD_WD : CMD_RD; +raw_cmd.cmdcode = rq_uni_rw_dir(CURRENT) == WRITE ? CMD_WD : CMD_RD; raw_cmd.cmddata[0] = dev + 1;/* DAG: +1 to get US */ raw_cmd.cmddata[1] = raw_cmd.head; raw_cmd.cmddata[2] = raw_cmd.cylinder 8; @@ -830,7 +830,7 @@ static void issue_request(unsigned int block, unsigned int nsect, hdc63463_dataleft = nsect * 256;/* Better way? */ DBG(mfm%c: %sing: CHS=%d/%d/%d, sectors=%d, buffer=0x%08lx (%p)\n, - raw_cmd.dev + 'a', (CURRENT-cmd == READ) ? read : writ, + raw_cmd.dev + 'a', dma_dir_to_string(rq_dma_dir(CURRENT)), raw_cmd.cylinder, raw_cmd.head, raw_cmd.sector, nsect, (unsigned long) Copy_buffer, CURRENT); @@ -917,7 +917,7 @@ static void mfm_request(void) DBG(mfm_request: block after offset=%d\n, block); -if (CURRENT-cmd != READ CURRENT-cmd != WRITE) { +if (!dma_uni_dir(rq_dma_dir(CURRENT))) { printk(unknown mfm-command %d\n, CURRENT-cmd); end_request(CURRENT, 0); Busy = 0; diff --git a/drivers/block/amiflop.c b/drivers/block/amiflop.c index 54f2fb3..fa0da1f 100644 --- a/drivers/block/amiflop.c +++ b/drivers/block/amiflop.c @@ -1363,7 +1363,7 @@ static void redo_fd_request(void) #ifdef DEBUG printk(fd: sector %ld + %d requested for %s\n, CURRENT-sector,cnt, - (CURRENT-cmd==READ)?read:write); + (rq_uni_rw_dir(CURRENT) == READ) ? read : write); #endif block = CURRENT-sector + cnt; if ((int)block floppy-blocks) { diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c index 411e138..fc5e1b2 100644 --- a/drivers/block/nbd.c +++ b/drivers/block/nbd.c @@ -411,7 +411,7 @@ static void nbd_clear_que(struct nbd_device *lo) /* * We always wait for result of write, for now. It would be nice to make it optional * in future - * if ((req-cmd == WRITE) (lo-flags NBD_WRITE_NOCHK)) + * if ((rq_uni_rw_dir(req) == WRITE) (lo-flags NBD_WRITE_NOCHK)) * { printk( Warning: Ignoring result!\n); nbd_end_request( req ); } */ diff --git a/drivers/cdrom/aztcd.c b/drivers/cdrom/aztcd.c index 1f9fb7a..8ccae77 100644 --- a/drivers/cdrom/aztcd.c +++ b/drivers/cdrom/aztcd.c @@ -229,7 +229,7 @@ static struct request_queue *azt_queue; static int current_valid(void) { return CURRENT -CURRENT-cmd == READ +rq_rw_dir(CURRENT) == READ CURRENT-sector != -1; } diff --git a/drivers/cdrom/cm206.c b/drivers/cdrom/cm206.c index 2301311..1bdf0b7 100644 --- a/drivers/cdrom/cm206.c +++ b/drivers/cdrom/cm206.c @@ -851,7 +851,7 @@ static void do_cm206_request(request_queue_t * q) if (!req) return; -if (req-cmd != READ) { +if (rq_rw_dir(req) != READ) { debug((Non-read command %d on cdrom\n, req-cmd)); end_request(req, 0
Re: [PATCH 4/4] bidi support: bidirectional request
FUJITA Tomonori wrote: From: Boaz Harrosh [EMAIL PROTECTED] Subject: [PATCH 4/4] bidi support: bidirectional request Date: Sun, 15 Apr 2007 20:33:28 +0300 diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 645d24b..16a02ee 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -322,6 +322,7 @@ struct request { void *end_io_data; struct request_io_part uni; +struct request_io_part bidi_read; }; Would be more straightforward to have: struct request_io_part in; struct request_io_part out; Yes I wish I could do that. For bidi supporting drivers this is the most logical. But for the 99.9% of uni-directional drivers, calling rq_uni(), and being some what on the hotish paths, this means we will need a pointer to a uni request_io_part. This is bad because: 1st- There is no defined stage in a request life where to definitely set that pointer, specially in the preparation stages. 2nd- hacks like scsi_error.c/scsi_send_eh_cmnd() will not work at all. Now this is a very bad spot already, and I have a short term fix for it in the SCSI-bidi patches (not sent yet) but a more long term solution is needed. Once such hacks are cleaned up we can do what you say. This is exactly why I use the access functions rq_uni/rq_io/rq_in/rq_out and not open code access. /* @@ -600,6 +601,34 @@ static inline struct request_io_part* rq_uni(struct request* req) return req-uni; } +static inline struct request_io_part* rq_out(struct request* req) +{ +WARN_ON_BIDI_FLAG(req); +return req-uni; +} + +static inline struct request_io_part* rq_in(struct request* req) +{ +WARN_ON_BIDI_FLAG(req); +if (likely(rq_dma_dir(req) != DMA_BIDIRECTIONAL)) +return req-uni; + +if (likely(req-cmd_flags REQ_BIDI)) +return req-bidi_read; + +return req-uni; +} + +static inline struct request_io_part* rq_io(struct request* req, +enum dma_data_direction dir) +{ +if (dir == DMA_FROM_DEVICE) +return rq_in(req); + +WARN_ON( (dir != DMA_TO_DEVICE) (dir != DMA_NONE) ); +return req-uni; +} static inline struct request_io_part* rq_io(struct request* req) { return (req is WRITE) ? req-out : req-in; } Again I'm all for it. But this is a to deep of a change. Too many things changing at once. If we keep the access functions than it does not matter, we can do it later. Boaz - To unsubscribe from this list: send the line unsubscribe linux-ide in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/4] bidi support: request_io_part
Boaz Harrosh wrote: - Extract all I/O members of struct request into a request_io_part member. - Define API to access the I/O part - Adjust block layer accordingly. - Change all users to new API. Signed-off-by: Boaz Harrosh [EMAIL PROTECTED] Signed-off-by: Benny Halevy [EMAIL PROTECTED] --- Patch is attached compressed because of size It looks like this patch is very big and is hard for review/maintenance. I was thinking of a way it could be divided into small patches but still make it compile and run at each stage/patch for bisects. It could be done in three stages: 1. Make a dummy API that mimics the new API but still lets old drivers/code compile. 2. Stage 2 - convert driver by driver or group by group to new API. this can be done in an arbitrary number of patches. 3. Final stage. do the actual move of members and implement the new API. At this stage, if any drivers are not converted, (out-of-tree drivers), they will not compile. Please tell me if you need this done? should I send the all patchset or just this one divided? (Below is a demonstration of 1st and 3rd stages at blkdev.h) FIRST_STAGE diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index c1121d2..579ee2d 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -235,23 +235,6 @@ struct request { unsigned int cmd_flags; enum rq_cmd_type_bits cmd_type; - /* Maintain bio traversal state for part by part I/O submission. -* hard_* are block layer internals, no driver should touch them! -*/ - - sector_t sector;/* next sector to submit */ - sector_t hard_sector; /* next sector to complete */ - unsigned long nr_sectors; /* no. of sectors left to submit */ - unsigned long hard_nr_sectors; /* no. of sectors left to complete */ - /* no. of sectors left to submit in the current segment */ - unsigned int current_nr_sectors; - - /* no. of sectors left to complete in the current segment */ - unsigned int hard_cur_sectors; - - struct bio *bio; - struct bio *biotail; - struct hlist_node hash; /* merge hash */ /* * The rb_node is only used inside the io scheduler, requests @@ -273,22 +256,11 @@ struct request { struct gendisk *rq_disk; unsigned long start_time; - /* Number of scatter-gather DMA addr+len pairs after -* physical address coalescing is performed. -*/ - unsigned short nr_phys_segments; - - /* Number of scatter-gather addr+len pairs after -* physical and DMA remapping hardware coalescing is performed. -* This is the number of scatter-gather entries the driver -* will actually have to deal with after DMA mapping is done. -*/ - unsigned short nr_hw_segments; - unsigned short ioprio; void *special; - char *buffer; + char *buffer; /* FIXME: should be Deprecated */ + void *data; /* FIXME: should be Deprecated */ int tag; int errors; @@ -301,9 +273,7 @@ struct request { unsigned int cmd_len; unsigned char cmd[BLK_MAX_CDB]; - unsigned int data_len; unsigned int sense_len; - void *data; void *sense; unsigned int timeout; @@ -314,8 +284,49 @@ struct request { */ rq_end_io_fn *end_io; void *end_io_data; + +/* + request io members. FIXME: will go later in the patchset into a sub-structure +*/ +/* struct request_io_part uni; */ +/* struct request_io_part { */ + unsigned int data_len; + + /* Maintain bio traversal state for part by part I/O submission. +* hard_* are block layer internals, no driver should touch them! +*/ + sector_t sector;/* next sector to submit */ + sector_t hard_sector; /* next sector to complete */ + unsigned long nr_sectors; /* no. of sectors left to submit */ + unsigned long hard_nr_sectors; /* no. of sectors left to complete */ + /* no. of sectors left to submit in the current segment */ + unsigned int current_nr_sectors; + + /* no. of sectors left to complete in the current segment */ + unsigned int hard_cur_sectors; + + struct bio *bio; + struct bio *biotail; + + /* Number of scatter-gather DMA addr+len pairs after +* physical address coalescing is performed. +*/ + unsigned short nr_phys_segments; + + /* Number of scatter-gather addr+len pairs after +* physical and DMA remapping hardware coalescing is performed. +* This is the number of scatter-gather entries the driver +* will actually have to deal with after DMA mapping is done. +*/ + unsigned short nr_hw_segments; +/* }; */ }; +/* FIXME: here only for the duration of the patchset
Re: [PATCH 4/4] bidi support: bidirectional request
. I have tested these patches with IBM OSD driver with bidi-enabled SCSI-ml and iSCSI. All OSD and iSCSI tests pass. The overall adjustments are minor. Since I have basically used the old core bidi code I can say with confidence that I have not lost the stability gained by the testers/developers that are using bidi already. I would like to summarize on why a second request hanging on the first request is less than optimal solution. 1. A full request is not needed. Only the io members are needed. 2. A request is an expensive resource in the kernel. (Allocation, queuing, locking ...) which is a waste if you need to use bidi. 3. Error handling is a mess. Both in the building and in the recovery of io requests. Especially considering the double layering of SCSI-ml. (With struct scsi_cmnd already hanging on req-special) 4. Lots of other code, not at all touched by this patch, will have to change so they safely ignore the extra brain-dead request. 5. Bugs can always creep into ll_rw_blk.c since it is not clear from the code itself what functions are allowed and safe to use with the second io-only request and what functions are only allowed on the main request. With my approach the division of code is very clear. Concerning what James said about bidi capability been a property of the Q of only these devices that support it. Yes! Block level should not allow bidi access to devices that do not support it. Otherwise, through bsg, (when it will be available), user-mode can DOS the system by sending bidi commands to legacy devices. How should a device advertise this capability? Please note that these patches are over 2.6.21-rc5 linux-2.6-block tree and will need to be updated and cleaned for proper submission. Please every one comment so we can proceed in the direction of the final solution. Pros are as welcome as Cons ;) Thanks in advance Boaz Harrosh From 73c94d6b7e41523d44e7787617c8a1abb351326f Mon Sep 17 00:00:00 2001 From: Boaz Harrosh [EMAIL PROTECTED](none) Date: Sun, 29 Apr 2007 16:11:11 +0300 Subject: [PATCH] rq_direction - is_sync and rw flags cleanup - is_sync is it's own bool in call to elev{,ator}_may_queue{,fn} - set some policy on when rw flag is set - alloc starts as read (0) - get_request() or __make_request() will set to write acourding to parameter or bio information --- block/as-iosched.c |2 +- block/cfq-iosched.c |6 +++--- block/elevator.c |4 ++-- block/ll_rw_blk.c| 39 +-- include/linux/elevator.h |4 ++-- 5 files changed, 21 insertions(+), 34 deletions(-) diff --git a/block/as-iosched.c b/block/as-iosched.c index ef12627..824d93e 100644 --- a/block/as-iosched.c +++ b/block/as-iosched.c @@ -1285,7 +1285,7 @@ static void as_work_handler(struct work_struct *work) spin_unlock_irqrestore(q-queue_lock, flags); } -static int as_may_queue(request_queue_t *q, int rw) +static int as_may_queue(request_queue_t *q, int rw, int is_sync) { int ret = ELV_MQUEUE_MAY; struct as_data *ad = q-elevator-elevator_data; diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c index b6491c0..1392ee9 100644 --- a/block/cfq-iosched.c +++ b/block/cfq-iosched.c @@ -226,7 +226,7 @@ static inline pid_t cfq_queue_pid(struct task_struct *task, int rw, int is_sync) /* * Use the per-process queue, for read requests and syncronous writes */ - if (!(rw REQ_RW) || is_sync) + if (!(rw == WRITE) || is_sync) return task-pid; return CFQ_KEY_ASYNC; @@ -1787,14 +1787,14 @@ static inline int __cfq_may_queue(struct cfq_queue *cfqq) return ELV_MQUEUE_MAY; } -static int cfq_may_queue(request_queue_t *q, int rw) +static int cfq_may_queue(request_queue_t *q, int rw, int is_sync) { struct cfq_data *cfqd = q-elevator-elevator_data; struct task_struct *tsk = current; struct cfq_queue *cfqq; unsigned int key; - key = cfq_queue_pid(tsk, rw, rw REQ_RW_SYNC); + key = cfq_queue_pid(tsk, rw, is_sync); /* * don't force setup of a queue from here, as a call to may_queue diff --git a/block/elevator.c b/block/elevator.c index 96a00c8..eae857f 100644 --- a/block/elevator.c +++ b/block/elevator.c @@ -845,12 +845,12 @@ void elv_put_request(request_queue_t *q, struct request *rq) e-ops-elevator_put_req_fn(rq); } -int elv_may_queue(request_queue_t *q, int rw) +int elv_may_queue(request_queue_t *q, int rw, int is_sync) { elevator_t *e = q-elevator; if (e-ops-elevator_may_queue_fn) - return e-ops-elevator_may_queue_fn(q, rw); + return e-ops-elevator_may_queue_fn(q, rw, is_sync); return ELV_MQUEUE_MAY; } diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c index 3de0695..32daa55 100644 --- a/block/ll_rw_blk.c +++ b/block/ll_rw_blk.c @@ -1958,18 +1958,14 @@ static inline void blk_free_request
Re: [PATCH 6/7] blk_end_request: remove/unexport end_that_request_*
On Wed, Sep 05 2007 at 2:13 +0300, Kiyoshi Ueda [EMAIL PROTECTED] wrote: Hi, On Tue, 4 Sep 2007 17:25:14 -0400, Halevy, Benny [EMAIL PROTECTED] wrote: We suspect we'll still need the extern entry points for handling the bidi request in the scsi_io_completion() path as we only want to call end_that_request_chunk on req-next_rq and never end_that_request_last. (see http://www.bhalevy.com/open-osd/download/linux-2.6.23-rc2_and_iscsi-iscsi-2007_08_09/0005-SCSI-bidi-support.patch) If this patch-set is merged, there may be other way to do that. For tricky drivers, special interface, blk_end_request_callback(), is added in the patch 5/7. (http://marc.info/?l=linux-kernelm=118860027714753w=2) Currently, only user of the interface is ide-cd (cdrom_newpc_intr()). It needs to call only end_that_request_first() too. With the patch 7/7, you can set your own handler in rq-end_io() to complete the request by your own way. Thanks, Kiyoshi Ueda That will not work, as I will have no means of releasing the BIOs of the bidi request, which can not use end_request(). I guess as Jens said it's OK to remove them now, and later we can just add end_that_request_first(), will be enough. Or we can patch end_request() to also call __end_that_request_first(req-next_rq) if not NULL. Jens which method do you prefer? I will adjust my patches accordingly. Thanks Boaz Harrosh - To unsubscribe from this list: send the line unsubscribe linux-ide in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/7] blk_end_request: changing normal drivers
On Sat, Sep 01 2007 at 1:42 +0300, Kiyoshi Ueda [EMAIL PROTECTED] wrote: This patch converts normal drivers, which complete request in a standard way shown below, to use blk_end_request(). a) end_that_request_{chunk/first} spin_lock_irqsave() (add_disk_randomness(), blk_queue_end_tag(), blkdev_dequeue_request()) end_that_request_last() spin_unlock_irqrestore() = blk_end_request() b) spin_lock_irqsave() end_that_request_{chunk/first} (add_disk_randomness(), blk_queue_end_tag(), blkdev_dequeue_request()) end_that_request_last() spin_unlock_irqrestore() = spin_lock_irqsave() __blk_end_request() spin_unlock_irqsave() c) end_that_request_last() = __blk_end_request() Signed-off-by: Kiyoshi Ueda [EMAIL PROTECTED] Signed-off-by: Jun'ichi Nomura [EMAIL PROTECTED] --- arch/arm/plat-omap/mailbox.c|9 ++--- arch/um/drivers/ubd_kern.c | 10 +- block/elevator.c|4 ++-- block/ll_rw_blk.c | 15 +-- drivers/block/DAC960.c |6 ++ drivers/block/floppy.c |8 +++- drivers/block/lguest_blk.c |5 + drivers/block/nbd.c |4 +--- drivers/block/ps3disk.c |6 +- drivers/block/sunvdc.c |5 + drivers/block/sx8.c |4 +--- drivers/block/ub.c |4 ++-- drivers/block/viodasd.c |5 + drivers/block/xen-blkfront.c|5 ++--- drivers/cdrom/viocd.c |5 + drivers/ide/ide-cd.c|6 +++--- drivers/ide/ide-io.c| 22 +++--- drivers/message/i2o/i2o_block.c |8 ++-- drivers/mmc/card/block.c| 24 +--- drivers/mmc/card/queue.c|4 ++-- drivers/s390/block/dasd.c |4 +--- drivers/s390/char/tape_block.c |3 +-- drivers/scsi/ide-scsi.c |8 drivers/scsi/scsi_lib.c | 13 ++--- 24 files changed, 57 insertions(+), 130 deletions(-) diff -rupN 02-sect2byte-macro/arch/arm/plat-omap/mailbox.c 03-normal-caller-change/arch/arm/plat-omap/mailbox.c --- 02-sect2byte-macro/arch/arm/plat-omap/mailbox.c 2007-08-13 00:25:24.0 -0400 +++ 03-normal-caller-change/arch/arm/plat-omap/mailbox.c 2007-08-23 17:51:33.0 -0400 @@ -117,7 +117,8 @@ static void mbox_tx_work(struct work_str ... Please Separate this patch to at least three patches. 1. Block layer - block/elevator.c block/ll_rw_blk.c 2. SCSI-ML - drivers/scsi/scsi_lib.c 3. Normal drivers, can also be separated into logical groups like scsi/block etc.. This is because these patches introduce conflicts to lots of queued work I have, and if you separate them it will help me with my giting. Also I think that this is logical. Block-layer and scsi_lib are not drivers, the Subject of the patch does not match. Thanks Boaz - To unsubscribe from this list: send the line unsubscribe linux-ide in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH ver2 2/2] libata-scsi: convert to use the data buffer accessors
simple search-and-replace of direct scsi_cmnd access to use the data buffer accessors. Signed-off-by: Boaz Harrosh [EMAIL PROTECTED] --- drivers/ata/libata-scsi.c | 18 +- 1 files changed, 9 insertions(+), 9 deletions(-) diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c index 6145a64..d23a181 100644 --- a/drivers/ata/libata-scsi.c +++ b/drivers/ata/libata-scsi.c @@ -450,8 +450,8 @@ static struct ata_queued_cmd *ata_scsi_qc_new(struct ata_device *dev, qc-scsicmd = cmd; qc-scsidone = done; - qc-__sg = (struct scatterlist *) cmd-request_buffer; - qc-n_elem = cmd-use_sg; + qc-__sg = scsi_sglist(cmd); + qc-n_elem = scsi_sg_count(cmd); } else { cmd-result = (DID_OK 16) | (QUEUE_FULL 1); done(cmd); @@ -1493,13 +1493,13 @@ static int ata_scsi_translate(struct ata_device *dev, struct scsi_cmnd *cmd, /* data is present; dma-map it */ if (cmd-sc_data_direction == DMA_FROM_DEVICE || cmd-sc_data_direction == DMA_TO_DEVICE) { - if (unlikely(cmd-request_bufflen 1)) { + if (unlikely(scsi_bufflen(cmd) 1)) { ata_dev_printk(dev, KERN_WARNING, WARNING: zero len r/w req\n); goto err_did; } - ata_sg_init(qc, cmd-request_buffer, cmd-use_sg); + ata_sg_init(qc, scsi_sglist(cmd), scsi_sg_count(cmd)); qc-dma_dir = cmd-sc_data_direction; } @@ -1553,7 +1553,7 @@ static unsigned int ata_scsi_rbuf_get(struct scsi_cmnd *cmd, u8 **buf_out) u8 *buf; unsigned int buflen; - struct scatterlist *sg = (struct scatterlist *) cmd-request_buffer; + struct scatterlist *sg = scsi_sglist(cmd); if (sg) { buf = kmap_atomic(sg-page, KM_IRQ0) + sg-offset; @@ -1580,7 +1580,7 @@ static unsigned int ata_scsi_rbuf_get(struct scsi_cmnd *cmd, u8 **buf_out) static inline void ata_scsi_rbuf_put(struct scsi_cmnd *cmd, u8 *buf) { - struct scatterlist *sg = (struct scatterlist *) cmd-request_buffer; + struct scatterlist *sg = scsi_sglist(cmd); if (sg) kunmap_atomic(buf - sg-offset, KM_IRQ0); } @@ -2383,7 +2383,7 @@ static unsigned int atapi_xlat(struct ata_queued_cmd *qc) } qc-tf.command = ATA_CMD_PACKET; - qc-nbytes = scmd-request_bufflen; + qc-nbytes = scsi_bufflen(scmd); /* check whether ATAPI DMA is safe */ if (!using_pio ata_check_atapi_dma(qc)) @@ -2618,7 +2618,7 @@ static unsigned int ata_scsi_pass_thru(struct ata_queued_cmd *qc) case ATA_CMD_WRITE_LONG_ONCE: if (tf-protocol != ATA_PROT_PIO || tf-nsect != 1) goto invalid_fld; - qc-sect_size = scmd-request_bufflen; + qc-sect_size = scsi_bufflen(scmd); } /* @@ -2648,7 +2648,7 @@ static unsigned int ata_scsi_pass_thru(struct ata_queued_cmd *qc) * TODO: find out if we need to do more here to * cover scatter/gather case. */ - qc-nbytes = scmd-request_bufflen; + qc-nbytes = scsi_bufflen(scmd); /* request result TF */ qc-flags |= ATA_QCFLAG_RESULT_TF; -- 1.5.3.1 - To unsubscribe from this list: send the line unsubscribe linux-ide in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH ver2 1/2] libata-scsi: Remove !use_sg code paths
This is a minimal patch needed to remove use of !use_sg but it is not a complete clean up of the !use_sg paths. Libata-core still has the qc-flags ATA_QCFLAG_SG and !qc-n_elem code paths. Perhaps an ata maintainer would have a go at it. - TODO: further cleanup of qc-flags ATA_QCFLAG_SG and !qc-n_elem code paths in libata-core - TODO: Use scsi_dma_{map,unmap} where applicable. --- drivers/ata/libata-scsi.c | 31 +-- 1 files changed, 9 insertions(+), 22 deletions(-) diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c index e836476..6145a64 100644 --- a/drivers/ata/libata-scsi.c +++ b/drivers/ata/libata-scsi.c @@ -450,13 +450,8 @@ static struct ata_queued_cmd *ata_scsi_qc_new(struct ata_device *dev, qc-scsicmd = cmd; qc-scsidone = done; - if (cmd-use_sg) { - qc-__sg = (struct scatterlist *) cmd-request_buffer; - qc-n_elem = cmd-use_sg; - } else if (cmd-request_bufflen) { - qc-__sg = qc-sgent; - qc-n_elem = 1; - } + qc-__sg = (struct scatterlist *) cmd-request_buffer; + qc-n_elem = cmd-use_sg; } else { cmd-result = (DID_OK 16) | (QUEUE_FULL 1); done(cmd); @@ -1504,11 +1499,7 @@ static int ata_scsi_translate(struct ata_device *dev, struct scsi_cmnd *cmd, goto err_did; } - if (cmd-use_sg) - ata_sg_init(qc, cmd-request_buffer, cmd-use_sg); - else - ata_sg_init_one(qc, cmd-request_buffer, - cmd-request_bufflen); + ata_sg_init(qc, cmd-request_buffer, cmd-use_sg); qc-dma_dir = cmd-sc_data_direction; } @@ -1562,15 +1553,14 @@ static unsigned int ata_scsi_rbuf_get(struct scsi_cmnd *cmd, u8 **buf_out) u8 *buf; unsigned int buflen; - if (cmd-use_sg) { - struct scatterlist *sg; + struct scatterlist *sg = (struct scatterlist *) cmd-request_buffer; - sg = (struct scatterlist *) cmd-request_buffer; + if (sg) { buf = kmap_atomic(sg-page, KM_IRQ0) + sg-offset; buflen = sg-length; } else { - buf = cmd-request_buffer; - buflen = cmd-request_bufflen; + buf = NULL; + buflen = 0; } *buf_out = buf; @@ -1590,12 +1580,9 @@ static unsigned int ata_scsi_rbuf_get(struct scsi_cmnd *cmd, u8 **buf_out) static inline void ata_scsi_rbuf_put(struct scsi_cmnd *cmd, u8 *buf) { - if (cmd-use_sg) { - struct scatterlist *sg; - - sg = (struct scatterlist *) cmd-request_buffer; + struct scatterlist *sg = (struct scatterlist *) cmd-request_buffer; + if (sg) kunmap_atomic(buf - sg-offset, KM_IRQ0); - } } /** -- 1.5.3.1 - To unsubscribe from this list: send the line unsubscribe linux-ide in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] killing sg_last(), and discussion
On Wed, Oct 31 2007 at 10:49 +0200, Jeff Garzik [EMAIL PROTECTED] wrote: I looked into killing sg_last(), but really, this is the best its gonna get (moving sg_last to libata-core.c). You could maybe kill one use with caching, but in the other sg_last() callsites there isn't another s/g loop we can stick a last_sg = sg; into. libata is stuck because we undertake the highly unusual operation of fiddling with the final S/G element, to enforce 32-bit alignment. Of course we could eliminate all that nasty fiddling/padding completely, including sg_last(), if other areas of the kernel would guarantee ahead of time that buffer lengths are always a multiple of 4 Jeff OK Now I'm confused. I thought that ULD's can give you SG's that are actually longer than bufflen and that, at the end, the bufflen should govern the transfer length. Now FS_PC commands are sector aligned so you do not have problems with that. The BLOCK_PC commands have 2 main sources that I know of one is sg bsg from user mode that can easily enforce 4 bytes alignment. The second is kernel services which 80% of these are done by scsi_execute(). All These can be found and fixed. Starting with scsi_execute(). Another place can be blk_rq_map_sg(), since all IO's are bio based. It can enforce alignment too. I would start by sticking a WARN_ON(qc-pad_len) and see if it triggers, what are the sources of that. Please note that the code already has a 4 bytes alignment assumption about the start of the transfer, other wise the first SG can also have a none aligned length, which is not checked for. Boaz - To unsubscribe from this list: send the line unsubscribe linux-ide in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] killing sg_last(), and discussion
On Wed, Oct 31 2007 at 12:29 +0200, Jeff Garzik [EMAIL PROTECTED] wrote: Boaz Harrosh wrote: On Wed, Oct 31 2007 at 10:49 +0200, Jeff Garzik [EMAIL PROTECTED] wrote: I looked into killing sg_last(), but really, this is the best its gonna get (moving sg_last to libata-core.c). You could maybe kill one use with caching, but in the other sg_last() callsites there isn't another s/g loop we can stick a last_sg = sg; into. libata is stuck because we undertake the highly unusual operation of fiddling with the final S/G element, to enforce 32-bit alignment. Of course we could eliminate all that nasty fiddling/padding completely, including sg_last(), if other areas of the kernel would guarantee ahead of time that buffer lengths are always a multiple of 4 Jeff OK Now I'm confused. I thought that ULD's can give you SG's that are actually longer than bufflen and that, at the end, the bufflen should govern the transfer length. Now FS_PC commands are sector aligned so you do not have problems with that. The BLOCK_PC commands have 2 main sources that I know of one is sg bsg from user mode that can easily enforce 4 bytes alignment. The second is kernel services which 80% of these are done by scsi_execute(). All These can be found and fixed. Starting with scsi_execute(). Another place can be blk_rq_map_sg(), since all IO's are bio based. It can enforce alignment too. I would start by sticking a WARN_ON(qc-pad_len) and see if it triggers, what are the sources of that. The whole qc-pad_len etc. machinery was added because it solved problems in the field with ATAPI devices. So sr or some userland application is sending lengths that are not padded to 32-bit boundary, probably because plenty of trivial commands can send or return odd amounts of data. This used to be irrelevant, but now with SATA, even PIO data xfer (normally what is used for non-READ/WRITE CDBs) must be 32-bit aligned because both SATA DMA and SATA PIO are converted into dword-based SATA FIS's on the wire. Jeff 2 things 1. Than why not fix blk_rq_map_sg() to enforce the alignment. Also I bet that these problems in the field are from pre 2.6.18 kernels, and this is no longer the case. Why not put that WARN_ON(qc-pad_len) and prove me wrong. 2. Just checking bufflen is enough. Since you are already assuming that first SG's offset is aligned, than if last SG's length is odd than so is bufflen. (You are already assuming that SG's total length matches bufflen) Boaz - To unsubscribe from this list: send the line unsubscribe linux-ide in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: 2.6.24-rc3-mm2: Result: hostbyte=0x01 driverbyte=0x00\nend_request: I/O error
On Thu, Nov 29 2007 at 1:36 +0200, Andrew Morton [EMAIL PROTECTED] wrote: On Wed, 28 Nov 2007 16:14:21 -0700 Matthew Wilcox [EMAIL PROTECTED] wrote: On Wed, Nov 28, 2007 at 01:40:36PM -0800, Andrew Morton wrote: On Wed, 28 Nov 2007 23:01:31 +0300 Alexey Dobriyan [EMAIL PROTECTED] wrote: Reliably spams dmesg with end_request() horrors. This happens when git starts checking out linux tree to fresh ext2 partition. Disk is several month old and there were no prolems with, say, 2.6.24-rc3: Could you try reverting 6f5391c283d7fdcf24bf40786ea79061919d1e1d and see if the problem still exists? That's not completely trivial.. I did a hand-made revert against 2.6.24-rc3-mm2 (below) but some other patch in there causes: drivers/scsi/scsi_lib.c: In function 'scsi_blk_pc_done': drivers/scsi/scsi_lib.c:1251: error: 'struct scsi_cmnd' has no member named 'request_bufflen' That would be the bidi patches. You need to use scsi_bufflen(cmd) instead of cmd-request_bufflen. (See below) Do you need that I send a patch? --- a/drivers/scsi/scsi.c~revert-6f5391c283d7fdcf24bf40786ea79061919d1e1d +++ a/drivers/scsi/scsi.c @@ -59,7 +59,6 @@ #include scsi/scsi_cmnd.h #include scsi/scsi_dbg.h #include scsi/scsi_device.h -#include scsi/scsi_driver.h #include scsi/scsi_eh.h #include scsi/scsi_host.h #include scsi/scsi_tcq.h @@ -379,8 +378,9 @@ void scsi_log_send(struct scsi_cmnd *cmd scsi_print_command(cmd); if (level 3) { printk(KERN_INFO buffer = 0x%p, bufflen = %d, - queuecommand 0x%p\n, + done = 0x%p, queuecommand 0x%p\n, scsi_sglist(cmd), scsi_bufflen(cmd), + cmd-done, cmd-device-host-hostt-queuecommand); } @@ -667,12 +667,6 @@ void __scsi_done(struct scsi_cmnd *cmd) blk_complete_request(rq); } -/* Move this to a header if it becomes more generally useful */ -static struct scsi_driver *scsi_cmd_to_driver(struct scsi_cmnd *cmd) -{ - return *(struct scsi_driver **)cmd-request-rq_disk-private_data; -} - /** * scsi_finish_command - cleanup and pass command back to upper layer * @cmd: the command @@ -685,8 +679,6 @@ void scsi_finish_command(struct scsi_cmn { struct scsi_device *sdev = cmd-device; struct Scsi_Host *shost = sdev-host; - struct scsi_driver *drv; - unsigned int good_bytes; scsi_device_unbusy(sdev); @@ -712,13 +704,7 @@ void scsi_finish_command(struct scsi_cmn Notifying upper driver of completion (result %x)\n, cmd-result)); - good_bytes = scsi_bufflen(cmd); -if (cmd-request-cmd_type != REQ_TYPE_BLOCK_PC) { - drv = scsi_cmd_to_driver(cmd); - if (drv-done) - good_bytes = drv-done(cmd); - } - scsi_io_completion(cmd, good_bytes); + cmd-done(cmd); } EXPORT_SYMBOL(scsi_finish_command); diff -puN drivers/scsi/scsi_error.c~revert-6f5391c283d7fdcf24bf40786ea79061919d1e1d drivers/scsi/scsi_error.c --- a/drivers/scsi/scsi_error.c~revert-6f5391c283d7fdcf24bf40786ea79061919d1e1d +++ a/drivers/scsi/scsi_error.c @@ -1697,6 +1697,7 @@ scsi_reset_provider(struct scsi_device * scmd-scsi_done = scsi_reset_provider_done_command; memset(scmd-sdb, 0, sizeof(scmd-sdb)); + scmd-done = NULL; scmd-cmd_len = 0; diff -puN drivers/scsi/scsi_lib.c~revert-6f5391c283d7fdcf24bf40786ea79061919d1e1d drivers/scsi/scsi_lib.c --- a/drivers/scsi/scsi_lib.c~revert-6f5391c283d7fdcf24bf40786ea79061919d1e1d +++ a/drivers/scsi/scsi_lib.c @@ -944,6 +944,7 @@ void scsi_end_bidi_request(struct scsi_c scsi_finalize_request(cmd, 1); } +EXPORT_SYMBOL(scsi_io_completion); /* * Function:scsi_io_completion() @@ -1238,6 +1239,18 @@ static struct scsi_cmnd *scsi_get_cmd_fr return cmd; } +static void scsi_blk_pc_done(struct scsi_cmnd *cmd) +{ + BUG_ON(!blk_pc_request(cmd-request)); + /* + * This will complete the whole command with uptodate=1 so + * as far as the block layer is concerned the command completed + * successfully. Since this is a REQ_BLOCK_PC command the + * caller should check the request's errors value + */ + scsi_io_completion(cmd, cmd-request_bufflen); + scsi_io_completion(cmd, scsi_bufflen(cmd)); +} + int scsi_setup_blk_pc_cmnd(struct scsi_device *sdev, struct request *req) { struct scsi_cmnd *cmd; @@ -1285,6 +1298,7 @@ int scsi_setup_blk_pc_cmnd(struct scsi_d cmd-transfersize = req-data_len; cmd-allowed = req-retries; cmd-timeout_per_command = req-timeout; + cmd-done =
Re: [PATCH 01/28] blk_end_request: add new request completion interface (take 3)
On Sat, Dec 01 2007 at 1:24 +0200, Kiyoshi Ueda [EMAIL PROTECTED] wrote: This patch adds 2 new interfaces for request completion: o blk_end_request() : called without queue lock o __blk_end_request() : called with queue lock held Some device drivers call some generic functions below between end_that_request_{first/chunk} and end_that_request_last(). o add_disk_randomness() o blk_queue_end_tag() o blkdev_dequeue_request() These are called in the blk_end_request() as a part of generic request completion. So all device drivers become to call above functions. Normal drivers can be converted to use blk_end_request() in a standard way shown below. a) end_that_request_{chunk/first} spin_lock_irqsave() (add_disk_randomness(), blk_queue_end_tag(), blkdev_dequeue_request()) end_that_request_last() spin_unlock_irqrestore() = blk_end_request() b) spin_lock_irqsave() end_that_request_{chunk/first} (add_disk_randomness(), blk_queue_end_tag(), blkdev_dequeue_request()) end_that_request_last() spin_unlock_irqrestore() = spin_lock_irqsave() __blk_end_request() spin_unlock_irqsave() c) end_that_request_last() = __blk_end_request() Signed-off-by: Kiyoshi Ueda [EMAIL PROTECTED] Signed-off-by: Jun'ichi Nomura [EMAIL PROTECTED] comments in body --- block/ll_rw_blk.c | 67 + include/linux/blkdev.h |2 + 2 files changed, 69 insertions(+) Index: 2.6.24-rc3-mm2/block/ll_rw_blk.c === --- 2.6.24-rc3-mm2.orig/block/ll_rw_blk.c +++ 2.6.24-rc3-mm2/block/ll_rw_blk.c @@ -3769,6 +3769,73 @@ void end_request(struct request *req, in } EXPORT_SYMBOL(end_request); +static void complete_request(struct request *rq, int uptodate) +{ + if (blk_rq_tagged(rq)) + blk_queue_end_tag(rq-q, rq); + + /* rq-queuelist of dequeued request should be list_empty() */ + if (!list_empty(rq-queuelist)) + blkdev_dequeue_request(rq); + + end_that_request_last(rq, uptodate); +} + +/** + * blk_end_request - Helper function for drivers to complete the request. + * @rq: the request being processed + * @uptodate: 1 for success, 0 for I/O error, 0 for specific error + * @nr_bytes: number of bytes to complete + * + * Description: + * Ends I/O on a number of bytes attached to @rq. + * If @rq has leftover, sets it up for the next range of segments. + * + * Return: + * 0 - we are done with this request + * 1 - still buffers pending for this request + **/ +int blk_end_request(struct request *rq, int uptodate, int nr_bytes) I always hated that uptodate boolean with possible negative error value. I guess it was done for backward compatibility of then users of end_that_request_first(). But since you are introducing a new API then this is not the case. Just have regular status int where 0 means ALL_OK and negative value means error code. Just my $0.02. +{ + struct request_queue *q = rq-q; + unsigned long flags = 0UL; + + if (blk_fs_request(rq) || blk_pc_request(rq)) { + if (__end_that_request_first(rq, uptodate, nr_bytes)) + return 1; + } + + add_disk_randomness(rq-rq_disk); + + spin_lock_irqsave(q-queue_lock, flags); + complete_request(rq, uptodate); + spin_unlock_irqrestore(q-queue_lock, flags); + + return 0; +} +EXPORT_SYMBOL_GPL(blk_end_request); + +/** + * __blk_end_request - Helper function for drivers to complete the request. + * + * Description: + * Must be called with queue lock held unlike blk_end_request(). + **/ +int __blk_end_request(struct request *rq, int uptodate, int nr_bytes) +{ + if (blk_fs_request(rq) || blk_pc_request(rq)) { + if (__end_that_request_first(rq, uptodate, nr_bytes)) + return 1; + } + + add_disk_randomness(rq-rq_disk); + + complete_request(rq, uptodate); + + return 0; +} +EXPORT_SYMBOL_GPL(__blk_end_request); + static void blk_rq_bio_prep(struct request_queue *q, struct request *rq, struct bio *bio) { Index: 2.6.24-rc3-mm2/include/linux/blkdev.h === --- 2.6.24-rc3-mm2.orig/include/linux/blkdev.h +++ 2.6.24-rc3-mm2/include/linux/blkdev.h @@ -725,6 +725,8 @@ static inline void blk_run_address_space * for parts of the original function. This prevents * code duplication in drivers. */ +extern int blk_end_request(struct request *rq, int uptodate, int nr_bytes); +extern int __blk_end_request(struct request *rq, int uptodate, int nr_bytes); extern int end_that_request_first(struct request *, int, int); extern int end_that_request_chunk(struct request *, int, int); extern void end_that_request_last(struct request *, int); I
Re: [PATCH 27/28] blk_end_request: changing scsi mid-layer for bidi (take 3)
On Sat, Dec 01 2007 at 1:35 +0200, Kiyoshi Ueda [EMAIL PROTECTED] wrote: This patch converts bidi of scsi mid-layer to use blk_end_request(). rq-next_rq represents a pair of bidi requests. (There are no other use of 'next_rq' of struct request.) For both requests in the pair, end_that_request_chunk() should be called before end_that_request_last() is called for one of them. Since the calls to end_that_request_first()/chunk() and end_that_request_last() are packaged into blk_end_request(), the handling of next_rq completion has to be moved into blk_end_request(), too. Bidi sets its specific value to rq-data_len before the request is completed so that upper-layer can read it. This setting must be between end_that_request_chunk() and end_that_request_last(), because rq-data_len may be used in end_that_request_chunk() by blk_trace and so on. To satisfy the requirement, use blk_end_request_callback() which is added in PATCH 25 only for the tricky drivers. If bidi didn't reuse rq-data_len and added new members to request for the specific value, it could set before end_that_request_chunk() and use the standard blk_end_request() like below. void scsi_end_bidi_request(struct scsi_cmnd *cmd) { struct request *req = cmd-request; rq-resid = scsi_out(cmd)-resid; rq-next_rq-resid = scsi_in(cmd)-resid; if (blk_end_request(req, 1, req-data_len)) BUG(); scsi_release_buffers(cmd); scsi_next_command(cmd); } Signed-off-by: Kiyoshi Ueda [EMAIL PROTECTED] Signed-off-by: Jun'ichi Nomura [EMAIL PROTECTED] --- block/ll_rw_blk.c | 18 + drivers/scsi/scsi_lib.c | 66 2 files changed, 52 insertions(+), 32 deletions(-) Index: 2.6.24-rc3-mm2/drivers/scsi/scsi_lib.c === --- 2.6.24-rc3-mm2.orig/drivers/scsi/scsi_lib.c +++ 2.6.24-rc3-mm2/drivers/scsi/scsi_lib.c @@ -629,28 +629,6 @@ void scsi_run_host_queues(struct Scsi_Ho scsi_run_queue(sdev-request_queue); } -static void scsi_finalize_request(struct scsi_cmnd *cmd, int uptodate) -{ - struct request_queue *q = cmd-device-request_queue; - struct request *req = cmd-request; - unsigned long flags; - - add_disk_randomness(req-rq_disk); - - spin_lock_irqsave(q-queue_lock, flags); - if (blk_rq_tagged(req)) - blk_queue_end_tag(q, req); - - end_that_request_last(req, uptodate); - spin_unlock_irqrestore(q-queue_lock, flags); - - /* - * This will goose the queue request function at the end, so we don't - * need to worry about launching another command. - */ - scsi_next_command(cmd); -} - /* * Function:scsi_end_request() * @@ -921,6 +899,20 @@ void scsi_release_buffers(struct scsi_cm EXPORT_SYMBOL(scsi_release_buffers); /* + * Called from blk_end_request_callback() after all DATA in rq and its next_rq + * are completed before rq is completed/freed. + */ +static int scsi_end_bidi_request_cb(struct request *rq) +{ + struct scsi_cmnd *cmd = rq-special; + + rq-data_len = scsi_out(cmd)-resid; + rq-next_rq-data_len = scsi_in(cmd)-resid; + + return 0; +} + +/* * Bidi commands Must be complete as a whole, both sides at once. * If part of the bytes were written and lld returned * scsi_in()-resid and/or scsi_out()-resid this information will be left @@ -931,22 +923,32 @@ void scsi_end_bidi_request(struct scsi_c { struct request *req = cmd-request; - end_that_request_chunk(req, 1, req-data_len); - req-data_len = scsi_out(cmd)-resid; - - end_that_request_chunk(req-next_rq, 1, req-next_rq-data_len); - req-next_rq-data_len = scsi_in(cmd)-resid; - - scsi_release_buffers(cmd); - /* *FIXME: If ll_rw_blk.c is changed to also put_request(req-next_rq) - * in end_that_request_last() then this WARN_ON must be removed. + * in blk_end_request() then this WARN_ON must be removed. * for now, upper-driver must have registered an end_io. */ WARN_ON(!req-end_io); - scsi_finalize_request(cmd, 1); + /* + * blk_end_request() family take care of data completion of next_rq. + * + * req-data_len and req-next_rq-data_len must be set after + * all data are completed, since they may be referenced during + * the data completion process. + * So use the callback feature of blk_end_request() here. + * + * NOTE: If bidi doesn't reuse the data_len field for upper-layer's + * reference (e.g. adds new members for it to struct request), + * we can use the standard blk_end_request() interface here. + */ + if (blk_end_request_callback(req, 1, req-data_len, + scsi_end_bidi_request_cb)) + /*
Re: [PATCH 00/28] blk_end_request: full I/O completion handler (take 3)
On Tue, Dec 04 2007 at 14:16 +0200, Jens Axboe [EMAIL PROTECTED] wrote: On Fri, Nov 30 2007, Kiyoshi Ueda wrote: Hello Jens, The following is the updated patch-set for blk_end_request(). Changes since the last version are only minor updates to catch up with the base kernel changes. Do you agree the implementation of blk_end_request()? If there's no problem, could you merge it to your tree? Or does it have to be merged to -mm tree first? Boaz, Could you review the newly added PATCH 27 which converts the bidi part, and give me your comments? It uses blk_end_request_callback() in PATCH 25, which was only for the tricky ide-cd driver. If bidi added a 'resid' member to struct request instead of reusing 'data_len' for the other purpose, it could use the standard blk_end_request() instead. -- Changes from the previous post - Changes between take2 and take3: o Rebased on top of 2.6.24-rc3-mm2 OK, so this means that I can't apply it unfortunately. It depends on other patches in -mm (bidi). SCSI sits on block, so the best approach imho is to base this patchset on mainline so I can include the block bits. I was wishing that the bidi work can go into 2.6.25, I guess that's James to say. If so than it is not important what order they go in. Or the patchset can be submitted in two parts, with scsi and remove-of- old-API in a later stage. Or *rant* Boaz can just rebase his work again *rant*. Kiyoshi, It's OK, if you have a maintainer that is willing to accept your work then go head, My code can wait, no problem. Just resolve the resid issue in scsi_io_completion() (See my other mail) Thanks for doing this Boaz - To unsubscribe from this list: send the line unsubscribe linux-ide in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 27/28] blk_end_request: changing scsi mid-layer for bidi (take 3)
On Thu, Dec 06 2007 at 2:26 +0200, Kiyoshi Ueda [EMAIL PROTECTED] wrote: Hi Boaz, On Tue, 04 Dec 2007 15:39:12 +0200, Boaz Harrosh [EMAIL PROTECTED] wrote: On Sat, Dec 01 2007 at 1:35 +0200, Kiyoshi Ueda [EMAIL PROTECTED] wrote: This patch converts bidi of scsi mid-layer to use blk_end_request(). rq-next_rq represents a pair of bidi requests. (There are no other use of 'next_rq' of struct request.) For both requests in the pair, end_that_request_chunk() should be called before end_that_request_last() is called for one of them. Since the calls to end_that_request_first()/chunk() and end_that_request_last() are packaged into blk_end_request(), the handling of next_rq completion has to be moved into blk_end_request(), too. Bidi sets its specific value to rq-data_len before the request is completed so that upper-layer can read it. This setting must be between end_that_request_chunk() and end_that_request_last(), because rq-data_len may be used in end_that_request_chunk() by blk_trace and so on. To satisfy the requirement, use blk_end_request_callback() which is added in PATCH 25 only for the tricky drivers. If bidi didn't reuse rq-data_len and added new members to request for the specific value, it could set before end_that_request_chunk() and use the standard blk_end_request() like below. void scsi_end_bidi_request(struct scsi_cmnd *cmd) { struct request *req = cmd-request; rq-resid = scsi_out(cmd)-resid; rq-next_rq-resid = scsi_in(cmd)-resid; if (blk_end_request(req, 1, req-data_len)) BUG(); scsi_release_buffers(cmd); scsi_next_command(cmd); } ... snip ... rq-data_len = scsi_out(cmd)-resid is Not Just a problem of bidi it is a General problem of scsi residual handling, and user code. Even today before any bidi. at scsi_lib.c at scsi_io_completion() we do req-data_len = scsi_get_resid(cmd); ( or: req-data_len = cmd-resid; depends which version you look) And then call scsi_end_request() which calls __end_that_request_first/last So it is assumed even today that req-data_len is not touched by __end_that_request_first/last unless __end_that_request_first returned that there is more work to do and the command is resubmitted in which case the resid information is discarded. So if the regular resid handling is acceptable - Set req-data_len before the call to __end_that_request_first/last, or blk_end_request() in your case, then here goes your second client of the _callback and it can be removed. But if it is found that req-data_len is touched and the resid information gets lost, than it should be fixed for the common uni-io case, by - for example - pass resid to the blk_end_request() function. (So in any way the _callback can go) Thank you for the explanation of scsi's rq-data_len usage. I see that scsi usually uses rq-data_len for cmd-resid. I have investigated the possibility of setting data_len before the call to blk_end_request. But no matter whether data_len is touched or not, we need a callback for bidi. So I would like to go with the current patch. I explained the reason and some details below. As far as I can see, rq-data_len is just referenced by blk_add_trace_rq() in __end_that_request_first(), not modified. And I don't change any logic around there in the block-layer. So there shouldn't be any critical problem for scsi residual handing. (although I'm not sure that scsi expectes cmd-resid to be traced by blk_trace.) Anyway, I see that it is no critical problem for bidi to set cmd-resid to rq-data_len before blk_end_request() call. But if I do that, blk_end_request() can't get the next_rq's size to complete in its code below. +/* Bidi request must be completed as a whole */ +if (blk_bidi_rq(rq) +__end_that_request_first(rq-next_rq, uptodate, + blk_rq_bytes(rq-next_rq))) +return 1; So I will have to move next_rq completion to bidi and use _callback() anyway like the following. - static int dummy_cb(struct request *rq) { return 1; } void scsi_end_bidi_request(struct scsi_cmnd *cmd) { struct request *req = cmd-request; unsigned int dlen = req-data_len; unsigned int next_dlen = req-next_rq-data_len; req-data_len = scsi_out(cmd)-resid; req-next_rq-data_len = scsi_in(cmd)-resid; /* Complete only DATA of next_rq using _callback and dummy function */ if (!blk_end_request_callback(req-next_rq, 1, next_dlen, dummy_cb)) BUG(); if (blk_end_request(req, 1, dlen)) BUG(); scsi_release_buffers(cmd); scsi_next_command(cmd); } - I prefer the current patch rather than the code like above, since the code calls
Re: [PATCH 27/28] blk_end_request: changing scsi mid-layer for bidi (take 3)
On Thu, Dec 06 2007 at 21:44 +0200, Kiyoshi Ueda [EMAIL PROTECTED] wrote: Hi Boaz, Jens, On Thu, 06 Dec 2007 11:24:44 +0200, Boaz Harrosh [EMAIL PROTECTED] wrote: Index: 2.6.24-rc3-mm2/drivers/scsi/scsi_lib.c snip No I don't like it. The only client left for blk_end_request_callback() is bidi, ide-cd (cdrom_newpc_intr) is another client. So I can't drop blk_end_request_callback() even if bidi doesn't use it. It looks like all is needed for the ide-cd case is a flag that says don't complete the request. And the call-back is not actually used. (Inspecting the last: [PATCH 26/28] blk_end_request: changing ide-cd (take 3)) The same exact flag could also help the bidi case. Perhaps have an API for that? snip Index: 2.6.24-rc3-mm2/drivers/scsi/scsi_lib.c === --- 2.6.24-rc3-mm2.orig/drivers/scsi/scsi_lib.c +++ 2.6.24-rc3-mm2/drivers/scsi/scsi_lib.c @@ -629,28 +629,6 @@ void scsi_run_host_queues(struct Scsi_Ho scsi_run_queue(sdev-request_queue); } -static void scsi_finalize_request(struct scsi_cmnd *cmd, int uptodate) -{ - struct request_queue *q = cmd-device-request_queue; - struct request *req = cmd-request; - unsigned long flags; - - add_disk_randomness(req-rq_disk); - - spin_lock_irqsave(q-queue_lock, flags); - if (blk_rq_tagged(req)) - blk_queue_end_tag(q, req); - - end_that_request_last(req, uptodate); - spin_unlock_irqrestore(q-queue_lock, flags); - - /* - * This will goose the queue request function at the end, so we don't - * need to worry about launching another command. - */ - scsi_next_command(cmd); -} - /* * Function:scsi_end_request() * @@ -930,23 +908,25 @@ EXPORT_SYMBOL(scsi_release_buffers); void scsi_end_bidi_request(struct scsi_cmnd *cmd) { struct request *req = cmd-request; + unsigned int dlen = req-data_len; + unsigned int next_dlen = req-next_rq-data_len; - end_that_request_chunk(req, 1, req-data_len); req-data_len = scsi_out(cmd)-resid; - - end_that_request_chunk(req-next_rq, 1, req-next_rq-data_len); req-next_rq-data_len = scsi_in(cmd)-resid; - scsi_release_buffers(cmd); - /* *FIXME: If ll_rw_blk.c is changed to also put_request(req-next_rq) - * in end_that_request_last() then this WARN_ON must be removed. + * in blk_end_bidi_request() then this WARN_ON must be removed. * for now, upper-driver must have registered an end_io. */ WARN_ON(!req-end_io); - scsi_finalize_request(cmd, 1); + if (blk_end_bidi_request(req, 1, dlen, next_dlen)) + /* req has not been completed */ + BUG(); + + scsi_release_buffers(cmd); + scsi_next_command(cmd); } /* Index: 2.6.24-rc3-mm2/block/ll_rw_blk.c === --- 2.6.24-rc3-mm2.orig/block/ll_rw_blk.c +++ 2.6.24-rc3-mm2/block/ll_rw_blk.c @@ -3792,24 +3792,17 @@ static void complete_request(struct requ if (!list_empty(rq-queuelist)) blkdev_dequeue_request(rq); + if (blk_bidi_rq(rq) !rq-end_io) + __blk_put_request(rq-next_rq-q, rq-next_rq); + end_that_request_last(rq, uptodate); } -/** - * blk_end_request - Helper function for drivers to complete the request. - * @rq: the request being processed - * @uptodate: 1 for success, 0 for I/O error, 0 for specific error - * @nr_bytes: number of bytes to complete - * - * Description: - * Ends I/O on a number of bytes attached to @rq. - * If @rq has leftover, sets it up for the next range of segments. - * - * Return: - * 0 - we are done with this request - * 1 - still buffers pending for this request - **/ -int blk_end_request(struct request *rq, int uptodate, int nr_bytes) +/* + * Internal function + */ +static int blk_end_io(struct request *rq, int uptodate, int nr_bytes, + int bidi_bytes, int (drv_callback)(struct request *)) { struct request_queue *q = rq-q; unsigned long flags = 0UL; @@ -3817,8 +3810,17 @@ int blk_end_request(struct request *rq, if (blk_fs_request(rq) || blk_pc_request(rq)) { if (__end_that_request_first(rq, uptodate, nr_bytes)) return 1; + + /* Bidi request must be completed as a whole */ + if (blk_bidi_rq(rq) + __end_that_request_first(rq-next_rq, uptodate, bidi_bytes)) + return 1; } + /* Special feature for tricky drivers */ + if (drv_callback drv_callback(rq)) + return 1; + add_disk_randomness(rq-rq_disk); spin_lock_irqsave(q-queue_lock, flags); @@ -3827,6 +3829,32 @@ int blk_end_request(struct request *rq, return 0; } + +int blk_end_bidi_request
Re: [PATCH 20/30] blk_end_request: changing xsysace (take 4)
Kiyoshi Ueda wrote: This patch converts xsysace to use blk_end_request interfaces. Related 'uptodate' arguments are converted to 'error'. xsysace is a little bit different from normal drivers. xsysace driver has a state machine in it. It calls end_that_request_first() and end_that_request_last() from different states. (ACE_FSM_STATE_REQ_TRANSFER and ACE_FSM_STATE_REQ_COMPLETE, respectively.) However, those states are consecutive and without any interruption inbetween. So we can just follow the standard conversion rule (b) mentioned in the patch subject [PATCH 01/30] blk_end_request: add new request completion interface. Cc: Grant Likely [EMAIL PROTECTED] Signed-off-by: Kiyoshi Ueda [EMAIL PROTECTED] Signed-off-by: Jun'ichi Nomura [EMAIL PROTECTED] --- drivers/block/xsysace.c |5 + 1 files changed, 1 insertion(+), 4 deletions(-) Index: 2.6.24-rc4/drivers/block/xsysace.c === --- 2.6.24-rc4.orig/drivers/block/xsysace.c +++ 2.6.24-rc4/drivers/block/xsysace.c @@ -703,7 +703,7 @@ static void ace_fsm_dostate(struct ace_d /* bio finished; is there another one? */ i = ace-req-current_nr_sectors; - if (end_that_request_first(ace-req, 1, i)) { + if (__blk_end_request(ace-req, 0, i)) { end_that_request_first() took sectors __blk_end_request() now takes bytes /* dev_dbg(ace-dev, next block; h=%li c=%i\n, * ace-req-hard_nr_sectors, * ace-req-current_nr_sectors); @@ -718,9 +718,6 @@ static void ace_fsm_dostate(struct ace_d break; case ACE_FSM_STATE_REQ_COMPLETE: - /* Complete the block request */ - blkdev_dequeue_request(ace-req); - end_that_request_last(ace-req, 1); ace-req = NULL; /* Finished request; go to idle state */ - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Boaz - To unsubscribe from this list: send the line unsubscribe linux-ide in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 20/30] blk_end_request: changing xsysace (take 4)
On Wed, Dec 12 2007 at 19:06 +0200, Kiyoshi Ueda [EMAIL PROTECTED] wrote: Hi, On Wed, 12 Dec 2007 11:09:12 +0200, Boaz Harrosh [EMAIL PROTECTED] wrote: Index: 2.6.24-rc4/drivers/block/xsysace.c === --- 2.6.24-rc4.orig/drivers/block/xsysace.c +++ 2.6.24-rc4/drivers/block/xsysace.c @@ -703,7 +703,7 @@ static void ace_fsm_dostate(struct ace_d /* bio finished; is there another one? */ i = ace-req-current_nr_sectors; - if (end_that_request_first(ace-req, 1, i)) { + if (__blk_end_request(ace-req, 0, i)) { end_that_request_first() took sectors __blk_end_request() now takes bytes Thank you for pointing it out! And I'm very sorry for the bug. I have checked all conversions between sectors and bytes through all patches again, and I found no other miss conversions. Below is the revised patch for xsysace. Thanks, Kiyoshi Ueda NP, I know how it is, after you rebased your work for the who's counting times, you become blind. You need fresh eyes to look at it. Thanks for doing all this. Boaz - To unsubscribe from this list: send the line unsubscribe linux-ide in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Current git -- kaboom [bisect] seems IDE related.
On Sun, Feb 10 2008 at 16:43 +0200, Christoph Hellwig [EMAIL PROTECTED] wrote: On Sun, Feb 10, 2008 at 02:38:46PM +0100, Bartlomiej Zolnierkiewicz wrote: The OOPS is most likely (again) my fault - I was rushing out to push out the fix and memset() line didn't get converted. The new patch works fine for me. I prepared the new patch, documented it and started looking into SCSI build breakage... and I no longer feel comfortable with the hack :( It seems that fixing IDE properly will be easier than auditing the whole SCSI for all the weird assumptions on rq-cmd[] size (James?) so I'm back to the code, in the meantime here's the updated patch: Yeah, this is quite nasty. I'll attach the patch below which just rejects a command in scsi_setup_blk_pc_cmnd if it's too large for the scsi_cmnd cmnd array. This is probably enough but I haven't audited all of the scsi code yet. But as James said this is too much of a memory vastage to put it into the tree. Long-term the Panasas folks have looked into killing the scsi_cmnd.cmnd filed entirely and make the struct request.cmd field dynamically sized which would solve your problem, but probably won't be ready for 2.6.25. snip As far as I'm concerned it is very ready, and I have sent a last version for inclusion into 2.6.25. - There is a very minor patch-ability problem between last patchset and scsi-misc I will resend the pachset as a reply to this mail. - Since I never got any comments from Jens or James, this code was never accepted into -mm. So it was not widely tested. Though I have thrown every test I can on these patches. But that is still, a very limited testing. If people have a bit of spare time, please review. For some of us it is very important Thanks Boaz - To unsubscribe from this list: send the line unsubscribe linux-ide in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/3] scsi: varlen extended and vendor-specific cdbs
Add support for variable-length, extended, and vendor specific CDBs to scsi-ml. It is now possible for initiators and ULD's to issue these types of commands. LLDs need not change much. All they need is to raise the .max_cmd_len to the longest command they support (see iscsi patches). - clean-up some code paths that did not expect commands to be larger than 16, and change cmd_len members' type to short as char is not enough. - Add support for varlen_cdb in scsi_execute. Signed-off-by: Boaz Harrosh [EMAIL PROTECTED] Signed-off-by: Benny Halevy [EMAIL PROTECTED] --- block/scsi_ioctl.c |4 ++-- drivers/scsi/constants.c | 10 +++--- drivers/scsi/scsi.c | 22 +++--- drivers/scsi/scsi_lib.c | 27 ++- include/scsi/scsi.h | 40 +--- include/scsi/scsi_cmnd.h |2 +- include/scsi/scsi_host.h |8 +++- 7 files changed, 75 insertions(+), 38 deletions(-) diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c index 9675b34..a1d7070 100644 --- a/block/scsi_ioctl.c +++ b/block/scsi_ioctl.c @@ -33,13 +33,13 @@ #include scsi/scsi_cmnd.h /* Command group 3 is reserved and should never be used. */ -const unsigned char scsi_command_size[8] = +const unsigned char scsi_command_size_tbl[8] = { 6, 10, 10, 12, 16, 12, 10, 10 }; -EXPORT_SYMBOL(scsi_command_size); +EXPORT_SYMBOL(scsi_command_size_tbl); #include scsi/sg.h diff --git a/drivers/scsi/constants.c b/drivers/scsi/constants.c index 403a7f2..9785d73 100644 --- a/drivers/scsi/constants.c +++ b/drivers/scsi/constants.c @@ -28,7 +28,6 @@ #define SERVICE_ACTION_OUT_12 0xa9 #define SERVICE_ACTION_IN_16 0x9e #define SERVICE_ACTION_OUT_16 0x9f -#define VARIABLE_LENGTH_CMD 0x7f @@ -210,7 +209,7 @@ static void print_opcode_name(unsigned char * cdbp, int cdb_len) cdb0 = cdbp[0]; switch(cdb0) { case VARIABLE_LENGTH_CMD: - len = cdbp[7] + 8; + len = scsi_varlen_cdb_length(cdbp); if (len 10) { printk(short variable length command, len=%d ext_len=%d, len, cdb_len); @@ -300,7 +299,7 @@ static void print_opcode_name(unsigned char * cdbp, int cdb_len) cdb0 = cdbp[0]; switch(cdb0) { case VARIABLE_LENGTH_CMD: - len = cdbp[7] + 8; + len = scsi_varlen_cdb_length(cdbp); if (len 10) { printk(short opcode=0x%x command, len=%d ext_len=%d, cdb0, len, cdb_len); @@ -335,10 +334,7 @@ void __scsi_print_command(unsigned char *cdb) int k, len; print_opcode_name(cdb, 0); - if (VARIABLE_LENGTH_CMD == cdb[0]) - len = cdb[7] + 8; - else - len = COMMAND_SIZE(cdb[0]); + len = scsi_command_size(cdb); /* print out all bytes in cdb */ for (k = 0; k len; ++k) printk( %02x, cdb[k]); diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c index fecba05..944fafa 100644 --- a/drivers/scsi/scsi.c +++ b/drivers/scsi/scsi.c @@ -79,15 +79,6 @@ static void scsi_done(struct scsi_cmnd *cmd); #define MIN_RESET_PERIOD (15*HZ) /* - * Macro to determine the size of SCSI command. This macro takes vendor - * unique commands into account. SCSI commands in groups 6 and 7 are - * vendor unique and we will depend upon the command length being - * supplied correctly in cmd_len. - */ -#define CDB_SIZE(cmd) (cmd)-cmnd[0] 5) 7) 6) ? \ - COMMAND_SIZE((cmd)-cmnd[0]) : (cmd)-cmd_len) - -/* * Note - the initial logging level can be set here to log events at boot time. * After the system is up, you may enable logging via the /proc interface. */ @@ -525,6 +516,7 @@ int scsi_dispatch_cmd(struct scsi_cmnd *cmd) unsigned long flags = 0; unsigned long timeout; int rtn = 0; + unsigned cmd_len; /* check if the device is still usable */ if (unlikely(cmd-device-sdev_state == SDEV_DEL)) { @@ -606,9 +598,17 @@ int scsi_dispatch_cmd(struct scsi_cmnd *cmd) * Before we queue this command, check if the command * length exceeds what the host adapter can handle. */ - if (CDB_SIZE(cmd) cmd-device-host-max_cmd_len) { + cmd_len = cmd-cmd_len; + if (!cmd_len) { + BUG_ON(cmd-cmnd[0] == VARIABLE_LENGTH_CMD); + cmd_len = COMMAND_SIZE((cmd)-cmnd[0]); + } + + if (cmd_len cmd-device-host-max_cmd_len) { SCSI_LOG_MLQUEUE(3, - printk(queuecommand : command too long.\n)); + printk(queuecommand : command too long. + cdb_size=%d host-max_cmd_len=%d\n, + cmd-cmd_len, cmd-device-host-max_cmd_len)); cmd-result = (DID_ABORT 16); scsi_done
[PATCH 2/3] block layer varlen-cdb
- add varlen_cdb and varlen_cdb_len to hold a large user cdb if needed. They start as empty. Allocation of buffer must be done by user and held until request execution is done. - Since there can be either a fix_length command up to 16 bytes or a variable_length, larger then 16 bytes, commands but never both, we hold the two types in a union to save space. The presence of varlen_cdb_len and cmd_len==0 signals a varlen_cdb mode. - Please use added rq_{set,get}_varlen_cdb() to set every thing up in a way that will not confuse drivers that do not support varlen_cdb's. - Note that this patch does not add any size to struct request since the unsigned cmd_len is split here to 2 ushorts, which is more then enough. Signed-off-by: Boaz Harrosh [EMAIL PROTECTED] --- block/blk-core.c |2 ++ include/linux/blkdev.h | 27 +-- 2 files changed, 27 insertions(+), 2 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index 4afb39c..1c5cfa7 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -124,6 +124,8 @@ void rq_init(struct request_queue *q, struct request *rq) rq-end_io_data = NULL; rq-completion_data = NULL; rq-next_rq = NULL; + rq-varlen_cdb_len = 0; + rq-varlen_cdb = NULL; } static void req_bio_endio(struct request *rq, struct bio *bio, diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 90392a9..a8a6c20 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -211,8 +211,15 @@ struct request { /* * when request is used as a packet command carrier */ - unsigned int cmd_len; - unsigned char cmd[BLK_MAX_CDB]; + unsigned short cmd_len; + unsigned short varlen_cdb_len; /* length of varlen_cdb buffer */ + union { + unsigned char cmd[BLK_MAX_CDB]; + unsigned char *varlen_cdb;/* an optional variable-length cdb. + * points to a user buffer that must + * be valid until end of request + */ + }; unsigned int data_len; unsigned int sense_len; @@ -478,6 +485,22 @@ enum { #define rq_is_sync(rq) (rq_data_dir((rq)) == READ || (rq)-cmd_flags REQ_RW_SYNC) #define rq_is_meta(rq) ((rq)-cmd_flags REQ_RW_META) +static inline void rq_set_varlen_cdb(struct request *rq, u8 *cdb, short cdb_len) +{ + rq-cmd_len = 0; /* Make sure legacy drivers don't get confused */ + rq-varlen_cdb_len = cdb_len; + rq-varlen_cdb = cdb; +} + +/* If ! a varlen_cdb than return NULL */ +static inline u8 *rq_get_varlen_cdb(struct request *rq) +{ + if (!rq-cmd_len rq-varlen_cdb_len) + return rq-varlen_cdb; + else + return NULL; +} + static inline int blk_queue_full(struct request_queue *q, int rw) { if (rw == READ) -- 1.5.3.3 - To unsubscribe from this list: send the line unsubscribe linux-ide in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCHSET 0/3] varlen extended and vendor-specific cdbs
Submitted is a patchset for adding support for variable-length, extended, and vendor specific CDBs. It should now cover the entire range of the SCSI standard. (and/or any other use of command packets in block devices) They are based on scsi-misc. Difference from last time, is at struct request. I did a smallish hack to save the extra pointer space. The pointer now shares it's space with the request-cmd, as they are mutual exclusive. The flag to switch between them is when cmd_len == 0 and varlen_cdb_len != 0 I've added 2 accessors to hide the mess. I think this approach should be safe. I can easily go back to the separate pointer, but I figured a little hack is better then a bloat. This is on top of the size shrink to struct scsi_cmnd gained in the first patch. We save upto 12 bytes on 32 bit ARCHs So over all, this cleans things up, and add fixtures without any extra cost. [1/3] Let scsi_cmnd-cmnd use request-cmd buffer Here I let scsi_cmnd-cmnd point to the space allocated by request-cmd, instead of copying the data. The scsi_cmnd-cmd_len is guaranteed to contain the right length of the command. I have tried to go over every single place in the kernel that uses scsi_cmnd-cmnd and make sure it looks sane. Surprisingly to me, that was not at all bad. I hope I did not miss anything. I've tested on an x86_64 machine booting from a sata disk and ran the iscsi regression tests as well as my bidi and varlen tests on top of the complete patchset and all tests passed. [2/3] block layer varlen-cd Here I added an option to use a user supplied buffer for an arbitrary large command. Buffer must be kept valid until execution of request ends. The pointer to the buffer shares it's space with the fixed length command, so the size of struct request does not change. [3/3] scsi: varlen extended and vendor-specific cdbs Adds support for variable length, extended, and vendor-specific cdbs at the scsi mid-layer. Bart: If you need this infrastructure see second patch for an easy to use inline accessors on struct request. If you then need to use them in a scsi LLD, thats easy same as before only cmd_len will be bigger then 16. If you need to use them in a block LLD. You need to use the accessors and an extra if() statement. See 3rd patch on how scsi_lib.c uses it. Boaz - To unsubscribe from this list: send the line unsubscribe linux-ide in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Subject: [PATCH 1/3] Let scsi_cmnd-cmnd use request-cmd buffer
- struct scsi_cmnd had a 16 bytes command buffer of its own. This is an unnecessary duplication and copy of request's cmd. It is probably left overs from the time that scsi_cmnd could function without a request attached. So clean that up. - Once above is done, few places, apart from scsi-ml, needed adjustments due to changing the data type of scsi_cmnd-cmnd. - Lots of drivers still use MAX_COMMAND_SIZE. So I have left that #define but equate it to BLK_MAX_CDB. The way I see it and is reflected in the patch below is. MAX_COMMAND_SIZE - means: The longest fixed-length (*) SCSI CDB as per the SCSI standard and is not related to the implementation. BLK_MAX_CDB. - The allocated space at the request level (*)fixed-length here means commands that their size can be determined by their opcode and the CDB does not carry a length specifier, like the VARIABLE_LENGTH_CMD(0x7f) command. This is actually not exactly true and the SCSI standard also defines extended commands and vendor specific commands that can be bigger than 16 bytes. The kernel will support these using the same infrastructure used for VARLEN CDB's. So in effect MAX_COMMAND_SIZE means the maximum size command scsi-ml supports without specifying a cmd_len by ULD's Signed-off-by: Boaz Harrosh [EMAIL PROTECTED] --- drivers/firewire/fw-sbp2.c |2 +- drivers/s390/scsi/zfcp_dbf.c |2 +- drivers/s390/scsi/zfcp_fsf.c |2 +- drivers/scsi/53c700.c|6 +++--- drivers/scsi/a100u2w.c |2 +- drivers/scsi/hptiop.c|6 +++--- drivers/scsi/ibmvscsi/ibmvscsi.c |2 +- drivers/scsi/initio.c|2 +- drivers/scsi/qla1280.c |4 ++-- drivers/scsi/scsi_error.c| 14 -- drivers/scsi/scsi_lib.c |5 +++-- drivers/scsi/scsi_tgt_lib.c |2 ++ drivers/usb/storage/isd200.c |2 ++ include/scsi/scsi_cmnd.h |9 +++-- include/scsi/scsi_eh.h |4 ++-- 15 files changed, 38 insertions(+), 26 deletions(-) diff --git a/drivers/firewire/fw-sbp2.c b/drivers/firewire/fw-sbp2.c index 19ece9b..1d9602b 100644 --- a/drivers/firewire/fw-sbp2.c +++ b/drivers/firewire/fw-sbp2.c @@ -1254,7 +1254,7 @@ static int sbp2_scsi_queuecommand(struct scsi_cmnd *cmd, scsi_done_fn_t done) memset(orb-request.command_block, 0, sizeof(orb-request.command_block)); - memcpy(orb-request.command_block, cmd-cmnd, COMMAND_SIZE(*cmd-cmnd)); + memcpy(orb-request.command_block, cmd-cmnd, cmd-cmd_len); orb-base.callback = complete_command_orb; orb-base.request_bus = diff --git a/drivers/s390/scsi/zfcp_dbf.c b/drivers/s390/scsi/zfcp_dbf.c index 701046c..7a7f619 100644 --- a/drivers/s390/scsi/zfcp_dbf.c +++ b/drivers/s390/scsi/zfcp_dbf.c @@ -724,7 +724,7 @@ _zfcp_scsi_dbf_event_common(const char *tag, const char *tag2, int level, rec-scsi_result = scsi_cmnd-result; rec-scsi_cmnd = (unsigned long)scsi_cmnd; rec-scsi_serial = scsi_cmnd-serial_number; - memcpy(rec-scsi_opcode, scsi_cmnd-cmnd, + memcpy(rec-scsi_opcode, scsi_cmnd-cmnd, min((int)scsi_cmnd-cmd_len, ZFCP_DBF_SCSI_OPCODE)); rec-scsi_retries = scsi_cmnd-retries; diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c index 0dff058..1abbac5 100644 --- a/drivers/s390/scsi/zfcp_fsf.c +++ b/drivers/s390/scsi/zfcp_fsf.c @@ -4220,7 +4220,7 @@ zfcp_fsf_send_fcp_command_task_handler(struct zfcp_fsf_req *fsf_req) ZFCP_LOG_TRACE(scpnt-result =0x%x, command was:\n, scpnt-result); ZFCP_HEX_DUMP(ZFCP_LOG_LEVEL_TRACE, - (void *) scpnt-cmnd, scpnt-cmd_len); + scpnt-cmnd, scpnt-cmd_len); ZFCP_LOG_TRACE(%i bytes sense data provided by FCP\n, fcp_rsp_iu-fcp_sns_len); diff --git a/drivers/scsi/53c700.c b/drivers/scsi/53c700.c index f4c4fe9..f5a9add 100644 --- a/drivers/scsi/53c700.c +++ b/drivers/scsi/53c700.c @@ -599,7 +599,7 @@ NCR_700_scsi_done(struct NCR_700_Host_Parameters *hostdata, (struct NCR_700_command_slot *)SCp-host_scribble; dma_unmap_single(hostdata-dev, slot-pCmd, -sizeof(SCp-cmnd), DMA_TO_DEVICE); +MAX_COMMAND_SIZE, DMA_TO_DEVICE); if (slot-flags == NCR_700_FLAG_AUTOSENSE) { char *cmnd = NCR_700_get_sense_cmnd(SCp-device); #ifdef NCR_700_DEBUG @@ -1004,7 +1004,7 @@ process_script_interrupt(__u32 dsps, __u32 dsp, struct scsi_cmnd *SCp
Re: Subject: [PATCH 1/3] Let scsi_cmnd-cmnd use request-cmd buffer
On Tue, Feb 12 2008 at 19:45 +0200, Christoph Hellwig [EMAIL PROTECTED] wrote: On Sun, Feb 10, 2008 at 09:05:17PM +0200, Boaz Harrosh wrote: - Lots of drivers still use MAX_COMMAND_SIZE. So I have left that #define but equate it to BLK_MAX_CDB. The way I see it and is reflected in the patch below is. MAX_COMMAND_SIZE - means: The longest fixed-length (*) SCSI CDB as per the SCSI standard and is not related to the implementation. BLK_MAX_CDB. - The allocated space at the request level (*)fixed-length here means commands that their size can be determined by their opcode and the CDB does not carry a length specifier, like the VARIABLE_LENGTH_CMD(0x7f) command. This is actually not exactly true and the SCSI standard also defines extended commands and vendor specific commands that can be bigger than 16 bytes. The kernel will support these using the same infrastructure used for VARLEN CDB's. So in effect MAX_COMMAND_SIZE means the maximum size command scsi-ml supports without specifying a cmd_len by ULD's A comment like this should be near the declaration of MAX_COMMAND_SIZE +#define MAX_COMMAND_SIZE 16 +#if (MAX_COMMAND_SIZE BLK_MAX_CDB) +# error MAX_COMMAND_SIZE can not be smaller than BLK_MAX_CDB +#endif No tabs between the # and the rest of the cpp command, please. Either nothing or a single space as indentation instead. Except for those two small nitpicks this looks very good to me. Nice memory saving aswel. Agree with both comments. Thanks for the review, will fix in the next submission. Boaz - To unsubscribe from this list: send the line unsubscribe linux-ide in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3] block layer varlen-cdb
On Tue, Feb 12 2008 at 19:54 +0200, Boaz Harrosh [EMAIL PROTECTED] wrote: On Tue, Feb 12 2008 at 19:48 +0200, Christoph Hellwig [EMAIL PROTECTED] wrote: On Sun, Feb 10, 2008 at 09:09:41PM +0200, Boaz Harrosh wrote: - add varlen_cdb and varlen_cdb_len to hold a large user cdb if needed. They start as empty. Allocation of buffer must be done by user and held until request execution is done. - Since there can be either a fix_length command up to 16 bytes or a variable_length, larger then 16 bytes, commands but never both, we hold the two types in a union to save space. The presence of varlen_cdb_len and cmd_len==0 signals a varlen_cdb mode. this one I'm a bit confused by, why can't we just set the length of the variable length command in cmd_len aswell, and if cmd_len the length of the cmd array it's a variable length command? Note that this is both to keep the logic simpler and not to grow struct request further. Especially for the rather rare case of a bidi command. Because this will be dangerous for the Legacy block devices. Unlike scsi drivers block drivers do not have a .max_cmnd_len and upper layer will not check to make sure that the device supports the longer command. If such a command goes through, lets say bsg the drivers do blindly memcpy(,,rq-cmd_len) and will crash. Better safe then sorry, at no cost. Boaz - PS: the struct request does not grow. Because before cmd_len was unsigned but now both cmd_len and varlen_cdb_len are short so it is the same as before. Boaz - To unsubscribe from this list: send the line unsubscribe linux-ide in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3] block layer varlen-cdb
On Tue, Feb 12 2008 at 19:48 +0200, Christoph Hellwig [EMAIL PROTECTED] wrote: On Sun, Feb 10, 2008 at 09:09:41PM +0200, Boaz Harrosh wrote: - add varlen_cdb and varlen_cdb_len to hold a large user cdb if needed. They start as empty. Allocation of buffer must be done by user and held until request execution is done. - Since there can be either a fix_length command up to 16 bytes or a variable_length, larger then 16 bytes, commands but never both, we hold the two types in a union to save space. The presence of varlen_cdb_len and cmd_len==0 signals a varlen_cdb mode. this one I'm a bit confused by, why can't we just set the length of the variable length command in cmd_len aswell, and if cmd_len the length of the cmd array it's a variable length command? Note that this is both to keep the logic simpler and not to grow struct request further. Especially for the rather rare case of a bidi command. Because this will be dangerous for the Legacy block devices. Unlike scsi drivers block drivers do not have a .max_cmnd_len and upper layer will not check to make sure that the device supports the longer command. If such a command goes through, lets say bsg the drivers do blindly memcpy(,,rq-cmd_len) and will crash. Better safe then sorry, at no cost. Boaz - To unsubscribe from this list: send the line unsubscribe linux-ide in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Subject: [PATCH 1/3] Let scsi_cmnd-cmnd use request-cmd buffer
On Tue, Feb 12 2008 at 21:41 +0200, James Bottomley [EMAIL PROTECTED] wrote: On Sun, 2008-02-10 at 21:05 +0200, Boaz Harrosh wrote: - struct scsi_cmnd had a 16 bytes command buffer of its own. This is an unnecessary duplication and copy of request's cmd. It is probably left overs from the time that scsi_cmnd could function without a request attached. So clean that up. - Once above is done, few places, apart from scsi-ml, needed adjustments due to changing the data type of scsi_cmnd-cmnd. - Lots of drivers still use MAX_COMMAND_SIZE. So I have left that #define but equate it to BLK_MAX_CDB. The way I see it and is reflected in the patch below is. MAX_COMMAND_SIZE - means: The longest fixed-length (*) SCSI CDB as per the SCSI standard and is not related to the implementation. BLK_MAX_CDB. - The allocated space at the request level (*)fixed-length here means commands that their size can be determined by their opcode and the CDB does not carry a length specifier, like the VARIABLE_LENGTH_CMD(0x7f) command. This is actually not exactly true and the SCSI standard also defines extended commands and vendor specific commands that can be bigger than 16 bytes. The kernel will support these using the same infrastructure used for VARLEN CDB's. So in effect MAX_COMMAND_SIZE means the maximum size command scsi-ml supports without specifying a cmd_len by ULD's When we do this, what happens to the minority of drivers that need the command in DMAable memory ... or have you audited them all and we can now dump the DMA pool allocation for SCSI commands? James Am I right in assuming that I only need to audited the drivers that have .unchecked_isa_dma set? I will redo this audit again, and report back. Boaz - To unsubscribe from this list: send the line unsubscribe linux-ide in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html