Re: status of drivers/scsi/scsi_tgt_lib.c
On Sun, 26 Jan 2014 23:50:38 -0800 Christoph Hellwig h...@infradead.org wrote: Do you know if either SRP or ibmvscsi are used regularly these days? Does tgtd userspace still support this interface? It doesn't. I think that ibm pseries switched to virtual fc driver from virtual srp long ago. Only very old pseries could use the driver. I'm not sure there are any users of the driver. I guess that you could try to remove the driver to see if someone would complain. Should we try to throw in a deprecation warnings for this merge window and see if anyone cares, or go ahead and remove it entirely? Either is fine by me. -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: status of drivers/scsi/scsi_tgt_lib.c
Hey, On Fri, 24 Jan 2014 05:15:25 -0800 Christoph Hellwig h...@infradead.org wrote: Do you know if either SRP or ibmvscsi are used regularly these days? Does tgtd userspace still support this interface? It doesn't. I think that ibm pseries switched to virtual fc driver from virtual srp long ago. Only very old pseries could use the driver. I'm not sure there are any users of the driver. I guess that you could try to remove the driver to see if someone would complain. -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [patch] [SCSI] bnx2fc: zero out sense buffer properly
On Sat, 18 Aug 2012 11:46:37 +0300 Dan Carpenter dan.carpen...@oracle.com wrote: -sense_buffer used to be an array but it changed to pointer in de25deb180 [SCSI] use dynamically allocated sense buffer. This call to memset() needs to be updated as well. Signed-off-by: Dan Carpenter dan.carpen...@oracle.com diff --git a/drivers/scsi/bnx2fc/bnx2fc_io.c b/drivers/scsi/bnx2fc/bnx2fc_io.c index 73f231c..8d4626c 100644 --- a/drivers/scsi/bnx2fc/bnx2fc_io.c +++ b/drivers/scsi/bnx2fc/bnx2fc_io.c @@ -1807,7 +1807,7 @@ static void bnx2fc_parse_fcp_rsp(struct bnx2fc_cmd *io_req, fcp_sns_len = SCSI_SENSE_BUFFERSIZE; } - memset(sc_cmd-sense_buffer, 0, sizeof(sc_cmd-sense_buffer)); + memset(sc_cmd-sense_buffer, 0, SCSI_SENSE_BUFFERSIZE); I guess that you can remove the line instead. IIRC, scsi-ml does it for LLDs. -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] ps3rom: Simplify fill_from_dev_buffer()
On Mon, 25 Feb 2008 14:24:31 +0100 (CET) Geert Uytterhoeven [EMAIL PROTECTED] wrote: Subject: [PATCH] ps3rom: Simplify fill_from_dev_buffer() As we no longer need to calculate the data length of the whole scatterlist, we can abort the loop earlier and coalesce req_len and act_len into one variable, making fill_from_dev_buffer() more similar to fetch_to_dev_buffer(). I'll add new APIs to copy data between a sg list and a buffer soon after cleaning up (and fixing) some drivers on this area. I plan to remove fill_from_dev_buffer and fetch_to_dev_buffer in ps3rom. As you know, they are same functions that scsi_debug uses. There are other drivers that need similar functions. I expect my ps3rom patch to be applied to the scsi-fixes tree so I didn't change much as you did in this patch. - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] aacraid: READ_CAPACITY_16 shouldn't trust allocation length in cdb
When aacraid spoofs READ_CAPACITY_16, it assumes that the data length in the sg list is equal to allocation length in cdb. But sg can put any value in scb so the driver needs to check both the data length in the sg list and allocation length in cdb. If allocation length is larger than the response length that the driver expects, it clears the data buffer in the sg list to zero but it doesn't need to do. Just setting resid is fine. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] Cc: Mark Salyzyn [EMAIL PROTECTED] Cc: James Bottomley [EMAIL PROTECTED] --- drivers/scsi/aacraid/aachba.c | 22 +++--- 1 files changed, 11 insertions(+), 11 deletions(-) diff --git a/drivers/scsi/aacraid/aachba.c b/drivers/scsi/aacraid/aachba.c index c05092f..b9fc9b1 100644 --- a/drivers/scsi/aacraid/aachba.c +++ b/drivers/scsi/aacraid/aachba.c @@ -2047,6 +2047,7 @@ int aac_scsi_cmd(struct scsi_cmnd * scsicmd) { u64 capacity; char cp[13]; + unsigned int alloc_len; dprintk((KERN_DEBUG READ CAPACITY_16 command.\n)); capacity = fsa_dev_ptr[cid].size - 1; @@ -2063,18 +2064,17 @@ int aac_scsi_cmd(struct scsi_cmnd * scsicmd) cp[10] = 2; cp[11] = 0; cp[12] = 0; - aac_internal_transfer(scsicmd, cp, 0, - min_t(size_t, scsicmd-cmnd[13], sizeof(cp))); - if (sizeof(cp) scsicmd-cmnd[13]) { - unsigned int len, offset = sizeof(cp); - memset(cp, 0, offset); - do { - len = min_t(size_t, scsicmd-cmnd[13] - offset, - sizeof(cp)); - aac_internal_transfer(scsicmd, cp, offset, len); - } while ((offset += len) scsicmd-cmnd[13]); - } + alloc_len = ((scsicmd-cmnd[10] 24) ++ (scsicmd-cmnd[11] 16) ++ (scsicmd-cmnd[12] 8) + scsicmd-cmnd[13]); + + alloc_len = min_t(size_t, alloc_len, sizeof(cp)); + aac_internal_transfer(scsicmd, cp, 0, alloc_len); + + if (alloc_len scsi_bufflen(scsicmd)) + scsi_set_resid(scsicmd, + scsi_bufflen(scsicmd) - alloc_len); /* Do not cache partition table for arrays */ scsicmd-device-removable = 1; -- 1.5.3.7 - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] scsi_debug: fix wrong resid calculation bug
sg driver rounds up the length in struct scatterlist to be a multiple of 512 in some conditions. So LLDs can't use the data length in a sg list to calculate residual. Instead, the length in struct scsi_cmnd should be used. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] Cc: Douglas Gilbert [EMAIL PROTECTED] Cc: James Bottomley [EMAIL PROTECTED] --- drivers/scsi/scsi_debug.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c index d1777a9..691efd9 100644 --- a/drivers/scsi/scsi_debug.c +++ b/drivers/scsi/scsi_debug.c @@ -651,7 +651,7 @@ static int fill_from_dev_buffer(struct scsi_cmnd * scp, unsigned char * arr, if (sdb-resid) sdb-resid -= act_len; else - sdb-resid = req_len - act_len; + sdb-resid = scsi_bufflen(scp) - act_len; return 0; } -- 1.5.3.7 - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] ps3rom: fix wrong resid calculation bug
sg driver rounds up the length in struct scatterlist to be a multiple of 512 in some conditions. So LLDs can't use the data length in a sg list to calculate residual. Instead, the length in struct scsi_cmnd should be used. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] Cc: Geert Uytterhoeven [EMAIL PROTECTED] Cc: James Bottomley [EMAIL PROTECTED] --- drivers/scsi/ps3rom.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/scsi/ps3rom.c b/drivers/scsi/ps3rom.c index 0cd614a..6944bda 100644 --- a/drivers/scsi/ps3rom.c +++ b/drivers/scsi/ps3rom.c @@ -124,7 +124,7 @@ static int fill_from_dev_buffer(struct scsi_cmnd *cmd, const void *buf) } req_len += sgpnt-length; } - scsi_set_resid(cmd, req_len - act_len); + scsi_set_resid(cmd, scsi_bufflen(cmd) - act_len); return 0; } -- 1.5.3.7 - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] stex: stex_direct_copy shouldn't call dma_map_sg
stex_direct_copy copies an in-kernel buffer to a sg list in order to spoof some SCSI commands. stex_direct_copy calls dma_map_sg and then stex_internal_copy with the value that dma_map_sg returned. It calls scsi_kmap_atomic_sg to copy data. scsi_kmap_atomic_sg doesn't see sg-dma_length so if dma_map_sg merges sg entries, stex_internal_copy gets the smaller number of sg entries than the acutual number, which means it wrongly think that the data length in the sg list is shorter than the actual length. stex_direct_copy shouldn't call dma_map_sg and it doesn't need since this code path doesn't involve dma transfers. This patch removes stex_direct_copy and simply calls stex_internal_copy with the actual number of sg entries. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] Cc: Ed Lin [EMAIL PROTECTED] Cc: James Bottomley [EMAIL PROTECTED] --- drivers/scsi/stex.c | 34 -- 1 files changed, 12 insertions(+), 22 deletions(-) diff --git a/drivers/scsi/stex.c b/drivers/scsi/stex.c index 72f6d80..4b6861c 100644 --- a/drivers/scsi/stex.c +++ b/drivers/scsi/stex.c @@ -461,23 +461,6 @@ static void stex_internal_copy(struct scsi_cmnd *cmd, } } -static int stex_direct_copy(struct scsi_cmnd *cmd, - const void *src, size_t count) -{ - size_t cp_len = count; - int n_elem = 0; - - n_elem = scsi_dma_map(cmd); - if (n_elem 0) - return 0; - - stex_internal_copy(cmd, src, cp_len, n_elem, ST_TO_CMD); - - scsi_dma_unmap(cmd); - - return cp_len == count; -} - static void stex_controller_info(struct st_hba *hba, struct st_ccb *ccb) { struct st_frame *p; @@ -569,8 +552,10 @@ stex_queuecommand(struct scsi_cmnd *cmd, void (* done)(struct scsi_cmnd *)) unsigned char page; page = cmd-cmnd[2] 0x3f; if (page == 0x8 || page == 0x3f) { - stex_direct_copy(cmd, ms10_caching_page, - sizeof(ms10_caching_page)); + size_t cp_len = sizeof(ms10_caching_page); + stex_internal_copy(cmd, ms10_caching_page, + cp_len, scsi_sg_count(cmd), + ST_TO_CMD); cmd-result = DID_OK 16 | COMMAND_COMPLETE 8; done(cmd); } else @@ -599,8 +584,10 @@ stex_queuecommand(struct scsi_cmnd *cmd, void (* done)(struct scsi_cmnd *)) if (id != host-max_id - 1) break; if (lun == 0 (cmd-cmnd[1] INQUIRY_EVPD) == 0) { - stex_direct_copy(cmd, console_inq_page, - sizeof(console_inq_page)); + size_t cp_len = sizeof(console_inq_page); + stex_internal_copy(cmd, console_inq_page, + cp_len, scsi_sg_count(cmd), + ST_TO_CMD); cmd-result = DID_OK 16 | COMMAND_COMPLETE 8; done(cmd); } else @@ -609,6 +596,7 @@ stex_queuecommand(struct scsi_cmnd *cmd, void (* done)(struct scsi_cmnd *)) case PASSTHRU_CMD: if (cmd-cmnd[1] == PASSTHRU_GET_DRVVER) { struct st_drvver ver; + size_t cp_len = sizeof(ver); ver.major = ST_VER_MAJOR; ver.minor = ST_VER_MINOR; ver.oem = ST_OEM; @@ -616,7 +604,9 @@ stex_queuecommand(struct scsi_cmnd *cmd, void (* done)(struct scsi_cmnd *)) ver.signature[0] = PASSTHRU_SIGNATURE; ver.console_id = host-max_id - 1; ver.host_no = hba-host-host_no; - cmd-result = stex_direct_copy(cmd, ver, sizeof(ver)) ? + stex_internal_copy(cmd, ver, cp_len, + scsi_sg_count(cmd), ST_TO_CMD); + cmd-result = sizeof(ver) == cp_len ? DID_OK 16 | COMMAND_COMPLETE 8 : DID_ERROR 16 | COMMAND_COMPLETE 8; done(cmd); -- 1.5.3.4 - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] stex: stex_internal_copy should be called with sg_count in struct st_ccb
stex_internal_copy copies an in-kernel buffer to a sg list by using scsi_kmap_atomic_sg. Some functions calls stex_internal_copy with sg_count in struct st_ccb, which is the value that dma_map_sg returned. However it might be shorter than the actual number of sg entries (if the IOMMU merged the sg entries). scsi_kmap_atomic_sg doesn't see sg-dma_length so stex_internal_copy should be called with the actual number of sg entries (i.e. scsi_sg_count), because if the sg entries were merged, stex_direct_copy wrongly think that the data length in the sg list is shorter than the actual length. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] Cc: Ed Lin [EMAIL PROTECTED] Cc: James Bottomley [EMAIL PROTECTED] --- drivers/scsi/stex.c | 10 ++ 1 files changed, 6 insertions(+), 4 deletions(-) diff --git a/drivers/scsi/stex.c b/drivers/scsi/stex.c index 4b6861c..654430e 100644 --- a/drivers/scsi/stex.c +++ b/drivers/scsi/stex.c @@ -467,7 +467,8 @@ static void stex_controller_info(struct st_hba *hba, struct st_ccb *ccb) size_t count = sizeof(struct st_frame); p = hba-copy_buffer; - stex_internal_copy(ccb-cmd, p, count, ccb-sg_count, ST_FROM_CMD); + stex_internal_copy(ccb-cmd, p, count, scsi_sg_count(ccb-cmd), + ST_FROM_CMD); memset(p-base, 0, sizeof(u32)*6); *(unsigned long *)(p-base) = pci_resource_start(hba-pdev, 0); p-rom_addr = 0; @@ -485,7 +486,8 @@ static void stex_controller_info(struct st_hba *hba, struct st_ccb *ccb) p-subid = hba-pdev-subsystem_vendor 16 | hba-pdev-subsystem_device; - stex_internal_copy(ccb-cmd, p, count, ccb-sg_count, ST_TO_CMD); + stex_internal_copy(ccb-cmd, p, count, scsi_sg_count(ccb-cmd), + ST_TO_CMD); } static void @@ -699,7 +701,7 @@ static void stex_copy_data(struct st_ccb *ccb, if (ccb-cmd == NULL) return; stex_internal_copy(ccb-cmd, - resp-variable, count, ccb-sg_count, ST_TO_CMD); + resp-variable, count, scsi_sg_count(ccb-cmd), ST_TO_CMD); } static void stex_ys_commands(struct st_hba *hba, @@ -724,7 +726,7 @@ static void stex_ys_commands(struct st_hba *hba, count = STEX_EXTRA_SIZE; stex_internal_copy(ccb-cmd, hba-copy_buffer, - count, ccb-sg_count, ST_FROM_CMD); + count, scsi_sg_count(ccb-cmd), ST_FROM_CMD); inq_data = (ST_INQ *)hba-copy_buffer; if (inq_data-DeviceTypeQualifier != 0) ccb-srb_status = SRB_STATUS_SELECTION_TIMEOUT; -- 1.5.3.4 - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] ips: sg chaining support to the path to non I/O commands
On Tue, 19 Feb 2008 04:39:00 -0800 Salyzyn, Mark [EMAIL PROTECTED] wrote: ACK Thanks! Other RAID drivers (eg: aacraid) makes the assumption that commands in these paths (INQUIRY, READ CAPACITY, MODE SENSE etc spoofing) are single scatter gather elements and have yet to be bitten. I agree with Fujita-san about the practical unlikelihood. The fix does not incur any change in code overhead, so it is worth hardening! Can one create a request via /dev/sg for a buffer smaller than 256 and manage to create a multi-element scatter gather? I think that we can do post 2.6.24 since we relaxed the default alignment requirements (from 511 to 3). So a buffer more than 8 bytes can leads to multi-element scatter gathers though it rarely happens. Shortly I will submit the helper functions to copy data between sg list and a buffer. It can replace aac_internal_transfer but it's not for scsi-rc-fixes. If you worry that aac_internal_transfer can't handle multiple sg entries, you can do something like this, I think: diff --git a/drivers/scsi/aacraid/linit.c b/drivers/scsi/aacraid/linit.c index ae5f74f..73fa7c2 100644 --- a/drivers/scsi/aacraid/linit.c +++ b/drivers/scsi/aacraid/linit.c @@ -455,6 +455,12 @@ static int aac_slave_configure(struct scsi_device *sdev) return 0; } +static int aac_slave_alloc(struct scsi_device *sdev) +{ + blk_queue_update_dma_alignment(sdev-request_queue, (512 - 1)); + return 0; +} + /** * aac_change_queue_depth - alter queue depths * @sdev: SCSI device we are considering @@ -1015,6 +1021,7 @@ static struct scsi_host_template aac_driver_template = { .queuecommand = aac_queuecommand, .bios_param = aac_biosparm, .shost_attrs= aac_attrs, + .slave_alloc= aac_slave_alloc, .slave_configure= aac_slave_configure, .change_queue_depth = aac_change_queue_depth, .sdev_attrs = aac_dev_attrs, = Here's a malicious version of sg_inq, which tries to create multiple sg entries: = #include stdio.h #include stdlib.h #include string.h #include malloc.h #include fcntl.h #include sys/ioctl.h #include sys/types.h #include sys/stat.h #include scsi/sg.h #define INQ_REPLY_LEN 96 #define INQ_CMD_LEN 6 int main(int argc, char **argv) { struct sg_io_hdr hdr; unsigned char scb[INQ_CMD_LEN] = {0x12, 0, 0, 0, INQ_REPLY_LEN, 0}; unsigned char sense[32]; void *buf; int offset = 4096 - 4; int fd, ret; buf = valloc(8192); memset(hdr, 0, sizeof(hdr)); hdr.interface_id = 'S'; hdr.cmd_len = sizeof(scb); hdr.mx_sb_len = sizeof(sense); hdr.dxfer_direction = SG_DXFER_FROM_DEV; hdr.dxfer_len = INQ_REPLY_LEN; hdr.dxferp = buf + offset; hdr.cmdp = scb; hdr.sbp = sense; hdr.flags |= SG_FLAG_DIRECT_IO; fd = open(argv[1], O_RDONLY); if (fd 0) { fprintf(stderr, fail to open %s\n, argv[1]); return fd; } ret = ioctl(fd, SG_IO, hdr); if (ret 0) { fprintf(stderr, fail to ioctl %d\n, ret); return ret; } return ret; } - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: ips.c broken since 2.6.23 on x86_64?
ips did scsi_add_host(sh, NULL) so scsi_dma_map uses shost_gendev.parent that isn't initialized properly, then the kernel crashes. 2.6.23 and 2.6.24 have this bug. We can fix this by calling scsi_add_host with pdev-dev, in the standard way (like the following way) but this bug was fixed in the current Linus tree by: commit 2551a13e61d3c3df6c2da6de5a3ece78e6d67111 Author: Jeff Garzik [EMAIL PROTECTED] Date: Thu Dec 13 16:14:10 2007 -0800 [SCSI] ips: handle scsi_add_host() failure, and other err cleanups James, the legitimate way to fix stable trees is sending this commit (not sending a patch that was not committed upstream)? On Mon, 18 Feb 2008 22:32:46 +0900 FUJITA Tomonori [EMAIL PROTECTED] wrote: On Sun, 17 Feb 2008 15:37:02 -0800 Tim Pepper [EMAIL PROTECTED] wrote: On Mon 19 Feb at 07:31:56 +0900 [EMAIL PROTECTED] said: Can you apply the 0001 and 0002 against 2.6.24 and see how it works? If it works well, then please apply the 0001, 0002 and 0003. Fujita-san, I've started through the patches in order, cumulatively and after applying 0005 things break. I wont be able to test anything else until tomorrow when I can phycisally reset the machine... Great, thanks a lot! Can you apply this patch after the 0005 patch and see how it works? If it works, then please continue to test 0006, 0007 ... diff --git a/drivers/scsi/ips.c b/drivers/scsi/ips.c index 05bb6ea..39cdd68 100644 --- a/drivers/scsi/ips.c +++ b/drivers/scsi/ips.c @@ -6906,7 +6906,7 @@ ips_register_scsi(int index) sh-max_channel = ha-nbus - 1; sh-can_queue = ha-max_cmds - 1; - scsi_add_host(sh, NULL); + scsi_add_host(sh, ha-pcidev-dev); scsi_scan_host(sh); return 0; -- 1.5.3.7 - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] ips: fix data buffer accessors conversion bug
There is one more bug in ips. I think that this needs to go to scsi-rc-fixes, 2.6.24-stable, and 2.6.23-stable though we might rarely hit this bug. = From: FUJITA Tomonori [EMAIL PROTECTED] Date: Tue, 19 Feb 2008 16:03:47 +0900 Subject: [PATCH] ips: fix data buffer accessors conversion bug This fixes a bug that can't handle a passthru command with more than two sg entries. Big thanks to Tim Pepper for debugging the problem. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] Cc: Tim Pepper [EMAIL PROTECTED] Cc: Salyzyn, Mark [EMAIL PROTECTED] Cc: James Bottomley [EMAIL PROTECTED] --- drivers/scsi/ips.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/scsi/ips.c b/drivers/scsi/ips.c index bb152fb..7ed568f 100644 --- a/drivers/scsi/ips.c +++ b/drivers/scsi/ips.c @@ -1576,7 +1576,7 @@ ips_make_passthru(ips_ha_t *ha, struct scsi_cmnd *SC, ips_scb_t *scb, int intr) METHOD_TRACE(ips_make_passthru, 1); scsi_for_each_sg(SC, sg, scsi_sg_count(SC), i) -length += sg[i].length; + length += sg-length; if (length sizeof (ips_passthru_t)) { /* wrong size */ -- 1.5.3.7 - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: ips.c broken since 2.6.23 on x86_64?
On Mon, 18 Feb 2008 19:48:49 -0800 Tim Pepper [EMAIL PROTECTED] wrote: On Feb 18, 2008 4:11 PM, FUJITA Tomonori [EMAIL PROTECTED] wrote: Can you please help me just once more? 2.6.25-rc2 fixed this bug in a bit different way by chance. Please test 2.6.25-rc2 with the attached patch to make sure that ips in 2.6.25 works well. Confirmed...the patch below against 2.6.25-rc2 also works for me. Thanks a lot! I'll make sure that the ips driver in 2.6.23-stable, 2.6.24-stable (and 2.6.25) will work well. - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] ips: sg chaining support to the path to non I/O commands
Here is another ips patch, but not a bug fix. = From: FUJITA Tomonori [EMAIL PROTECTED] Subject: [PATCH] ips: sg chaining support to the path to non I/O commands I overlooked ips_scmd_buf_write and ips_scmd_buf_read when I converted ips to use the data buffer accessors. ips is unlikely to use sg chaining (especially in this path) since a) this path is used only for non I/O commands (with little data transfer), b) ips's sg_tablesize is set to just 17. Thanks to Tim Pepper for testing this patch. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] Cc: Salyzyn, Mark [EMAIL PROTECTED] Cc: James Bottomley [EMAIL PROTECTED] --- drivers/scsi/ips.c | 18 ++ 1 files changed, 10 insertions(+), 8 deletions(-) diff --git a/drivers/scsi/ips.c b/drivers/scsi/ips.c index 7ed568f..e5467a4 100644 --- a/drivers/scsi/ips.c +++ b/drivers/scsi/ips.c @@ -3510,15 +3510,16 @@ ips_scmd_buf_write(struct scsi_cmnd *scmd, void *data, unsigned int count) struct scatterlist *sg = scsi_sglist(scmd); for (i = 0, xfer_cnt = 0; - (i scsi_sg_count(scmd)) (xfer_cnt count); i++) { -min_cnt = min(count - xfer_cnt, sg[i].length); + (i scsi_sg_count(scmd)) (xfer_cnt count); + i++, sg = sg_next(sg)) { +min_cnt = min(count - xfer_cnt, sg-length); /* kmap_atomic() ensures addressability of the data buffer.*/ /* local_irq_save() protects the KM_IRQ0 address slot. */ local_irq_save(flags); -buffer = kmap_atomic(sg_page(sg[i]), KM_IRQ0) + sg[i].offset; + buffer = kmap_atomic(sg_page(sg), KM_IRQ0) + sg-offset; memcpy(buffer, cdata[xfer_cnt], min_cnt); -kunmap_atomic(buffer - sg[i].offset, KM_IRQ0); + kunmap_atomic(buffer - sg-offset, KM_IRQ0); local_irq_restore(flags); xfer_cnt += min_cnt; @@ -3543,15 +3544,16 @@ ips_scmd_buf_read(struct scsi_cmnd *scmd, void *data, unsigned int count) struct scatterlist *sg = scsi_sglist(scmd); for (i = 0, xfer_cnt = 0; - (i scsi_sg_count(scmd)) (xfer_cnt count); i++) { -min_cnt = min(count - xfer_cnt, sg[i].length); + (i scsi_sg_count(scmd)) (xfer_cnt count); + i++, sg = sg_next(sg)) { + min_cnt = min(count - xfer_cnt, sg-length); /* kmap_atomic() ensures addressability of the data buffer.*/ /* local_irq_save() protects the KM_IRQ0 address slot. */ local_irq_save(flags); -buffer = kmap_atomic(sg_page(sg[i]), KM_IRQ0) + sg[i].offset; + buffer = kmap_atomic(sg_page(sg), KM_IRQ0) + sg-offset; memcpy(cdata[xfer_cnt], buffer, min_cnt); -kunmap_atomic(buffer - sg[i].offset, KM_IRQ0); + kunmap_atomic(buffer - sg-offset, KM_IRQ0); local_irq_restore(flags); xfer_cnt += min_cnt; -- 1.5.3.7 - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: SCSI RAM driver
On Tue, 19 Feb 2008 06:31:20 -0700 Matthew Wilcox [EMAIL PROTECTED] wrote: On Tue, Feb 19, 2008 at 10:14:53PM +0900, FUJITA Tomonori wrote: I see that two drivers have very different objectives but if we add use_thread option to scsi_debug (we can do easily), it seems that scsi_debug can provide all the features that scsi_ram does. It's not just use_thread. It's also discard_read/discard_write. scsi_debug has a similar option, fake_rw, which discards both read and write data. And scsi_ram has a different data storage model from scsi_debug -- scsi_debug simulates an arbitrarily sized disc by wrapping around some small (virtually) contiguous allocation of pages; scsi_ram actually allocates the amount of ram that it's told to. This can be solved with another module parameter, of course. IIRC, if virtual_gb option is set to zero, scsi_debug allocates the amount of ram that it's told to. I'm in no way opposed to merging the two; it's a question of whether Doug will mind me doing some surgery on his driver. - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: ips.c broken since 2.6.23 on x86_64?
On Tue, 19 Feb 2008 10:06:39 -0600 James Bottomley [EMAIL PROTECTED] wrote: On Tue, 2008-02-19 at 17:02 +0900, FUJITA Tomonori wrote: ips did scsi_add_host(sh, NULL) so scsi_dma_map uses shost_gendev.parent that isn't initialized properly, then the kernel crashes. 2.6.23 and 2.6.24 have this bug. We can fix this by calling scsi_add_host with pdev-dev, in the standard way (like the following way) but this bug was fixed in the current Linus tree by: commit 2551a13e61d3c3df6c2da6de5a3ece78e6d67111 Author: Jeff Garzik [EMAIL PROTECTED] Date: Thu Dec 13 16:14:10 2007 -0800 [SCSI] ips: handle scsi_add_host() failure, and other err cleanups James, the legitimate way to fix stable trees is sending this commit (not sending a patch that was not committed upstream)? Well, the upstream patch doesn't look so bad as a stable candidate to my eye. Just because it's an unintended bugfix doesn't automatically invalidate it. The reason stable likes backports of existing upstream patches is because they've supposedly been well tested in upstream. Although that doesn't apply in this case because the other bug rather prevented testing, the principle is still sound. So, would there be any problems simply backporting this? Thanks, I see. There is no problem. Please backport this. - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: ips.c broken since 2.6.23 on x86_64?
On Sun, 17 Feb 2008 15:37:02 -0800 Tim Pepper [EMAIL PROTECTED] wrote: On Mon 19 Feb at 07:31:56 +0900 [EMAIL PROTECTED] said: Can you apply the 0001 and 0002 against 2.6.24 and see how it works? If it works well, then please apply the 0001, 0002 and 0003. Fujita-san, I've started through the patches in order, cumulatively and after applying 0005 things break. I wont be able to test anything else until tomorrow when I can phycisally reset the machine... Great, thanks a lot! Can you apply this patch after the 0005 patch and see how it works? If it works, then please continue to test 0006, 0007 ... diff --git a/drivers/scsi/ips.c b/drivers/scsi/ips.c index 05bb6ea..39cdd68 100644 --- a/drivers/scsi/ips.c +++ b/drivers/scsi/ips.c @@ -6906,7 +6906,7 @@ ips_register_scsi(int index) sh-max_channel = ha-nbus - 1; sh-can_queue = ha-max_cmds - 1; - scsi_add_host(sh, NULL); + scsi_add_host(sh, ha-pcidev-dev); scsi_scan_host(sh); return 0; -- 1.5.3.7 - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] bsg: bidi bio map failure fix
On Mon, 18 Feb 2008 13:55:08 +0100 Jens Axboe [EMAIL PROTECTED] wrote: On Tue, Feb 12 2008, James Bottomley wrote: On Tue, 2008-02-12 at 15:40 -0500, Pete Wyckoff wrote: If blk_rq_map_user requires more than one bio, and fails mapping somewhere after the first bio, it will return with rq-bio set to non-NULL, but it will have already unmapped the partial bio. The out: error exit section will see the non-null bio and try to unmap it again, triggering a mapcount bug via bad_page(). Signed-off-by: Pete Wyckoff [EMAIL PROTECTED] --- block/bsg.c |4 +++- 1 files changed, 3 insertions(+), 1 deletions(-) diff --git a/block/bsg.c b/block/bsg.c index 3337125..bba7154 100644 --- a/block/bsg.c +++ b/block/bsg.c @@ -295,8 +295,10 @@ bsg_map_hdr(struct bsg_device *bd, struct sg_io_v4 *hdr) dxferp = (void*)(unsigned long)hdr-din_xferp; ret = blk_rq_map_user(q, next_rq, dxferp, hdr-din_xfer_len); - if (ret) + if (ret) { + next_rq-bio = NULL; /* do not unmap twice */ Nice ... that's a nasty asymmetry of the blk_rq_map_user API. The map takes a request gets a ref and fills in the bio. The unmap has to be called on the bio leaving you to clear the now removed bio reference manually. It is nasty, how about fixing that instead? Yeah, looks better for me though the blk_rq_map_user API is still a bit hacky, as James said. James, Pete's patch is still in scsi-fixes, so how about dropping it and sending this patch via the block? diff --git a/block/blk-map.c b/block/blk-map.c index 955d75c..bc5ce60 100644 --- a/block/blk-map.c +++ b/block/blk-map.c @@ -143,6 +143,7 @@ int blk_rq_map_user(struct request_queue *q, struct request *rq, return 0; unmap_rq: blk_rq_unmap_user(bio); + rq-bio = NULL; return ret; } EXPORT_SYMBOL(blk_rq_map_user); -- Jens Axboe - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: ips.c broken since 2.6.23 on x86_64?
On Mon, 18 Feb 2008 15:30:58 -0800 Tim Pepper [EMAIL PROTECTED] wrote: On Mon 18 Feb at 22:32:46 +0900 [EMAIL PROTECTED] said: diff --git a/drivers/scsi/ips.c b/drivers/scsi/ips.c index 05bb6ea..39cdd68 100644 --- a/drivers/scsi/ips.c +++ b/drivers/scsi/ips.c @@ -6906,7 +6906,7 @@ ips_register_scsi(int index) sh-max_channel = ha-nbus - 1; sh-can_queue = ha-max_cmds - 1; - scsi_add_host(sh, NULL); + scsi_add_host(sh, ha-pcidev-dev); scsi_scan_host(sh); return 0; Fujita-san, This applies and runs well on top of your 0005 patch! The rest of the patches also then apply in order and run successfully. Great, thanks a lot! Just to confirm, I applied the above alone to a clean 2.6.24 and things again build and run successfully. For completeness I also reproduced the problem against 2.6.23.16 and verified the above patch fixes on that kernel version as well. Nice. There is another bug on 2.6.24 but we rarely hit this so 2.6.24 works most of the time: http://marc.info/?l=linux-scsim=120303487528875w=2 Assuming this patch is accepted for 2.6.25, please also queue it for the 2.6.23/24 stable trees. Yes, I will take care about it. Can you please help me just once more? 2.6.25-rc2 fixed this bug in a bit different way by chance. Please test 2.6.25-rc2 with the attached patch to make sure that ips in 2.6.25 works well. Thank you very much for your help in tracking this issue down! No problem. I should have fixed it long time ago. Really sorry about it. diff --git a/drivers/scsi/ips.c b/drivers/scsi/ips.c index bb152fb..429592a 100644 --- a/drivers/scsi/ips.c +++ b/drivers/scsi/ips.c @@ -1576,7 +1576,7 @@ ips_make_passthru(ips_ha_t *ha, struct scsi_cmnd *SC, ips_scb_t *scb, int intr) METHOD_TRACE(ips_make_passthru, 1); scsi_for_each_sg(SC, sg, scsi_sg_count(SC), i) -length += sg[i].length; +length += sg-length; if (length sizeof (ips_passthru_t)) { /* wrong size */ @@ -3510,15 +3510,16 @@ ips_scmd_buf_write(struct scsi_cmnd *scmd, void *data, unsigned int count) struct scatterlist *sg = scsi_sglist(scmd); for (i = 0, xfer_cnt = 0; - (i scsi_sg_count(scmd)) (xfer_cnt count); i++) { -min_cnt = min(count - xfer_cnt, sg[i].length); + (i scsi_sg_count(scmd)) (xfer_cnt count); + i++, sg = sg_next(sg)) { +min_cnt = min(count - xfer_cnt, sg-length); /* kmap_atomic() ensures addressability of the data buffer.*/ /* local_irq_save() protects the KM_IRQ0 address slot. */ local_irq_save(flags); -buffer = kmap_atomic(sg_page(sg[i]), KM_IRQ0) + sg[i].offset; +buffer = kmap_atomic(sg_page(sg), KM_IRQ0) + sg-offset; memcpy(buffer, cdata[xfer_cnt], min_cnt); -kunmap_atomic(buffer - sg[i].offset, KM_IRQ0); +kunmap_atomic(buffer - sg-offset, KM_IRQ0); local_irq_restore(flags); xfer_cnt += min_cnt; @@ -3543,15 +3544,16 @@ ips_scmd_buf_read(struct scsi_cmnd *scmd, void *data, unsigned int count) struct scatterlist *sg = scsi_sglist(scmd); for (i = 0, xfer_cnt = 0; - (i scsi_sg_count(scmd)) (xfer_cnt count); i++) { -min_cnt = min(count - xfer_cnt, sg[i].length); + (i scsi_sg_count(scmd)) (xfer_cnt count); + i++, sg = sg_next(sg)) { +min_cnt = min(count - xfer_cnt, sg-length); /* kmap_atomic() ensures addressability of the data buffer.*/ /* local_irq_save() protects the KM_IRQ0 address slot. */ local_irq_save(flags); -buffer = kmap_atomic(sg_page(sg[i]), KM_IRQ0) + sg[i].offset; +buffer = kmap_atomic(sg_page(sg), KM_IRQ0) + sg-offset; memcpy(cdata[xfer_cnt], buffer, min_cnt); -kunmap_atomic(buffer - sg[i].offset, KM_IRQ0); +kunmap_atomic(buffer - sg-offset, KM_IRQ0); local_irq_restore(flags); xfer_cnt += min_cnt; - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] scsi_debug: disable clustering
On Sun, 17 Feb 2008 07:28:48 -0700 Matthew Wilcox [EMAIL PROTECTED] wrote: On Sun, Feb 17, 2008 at 08:18:11AM -0600, James Bottomley wrote: No, he means that kmap_atomic can only map a page of data. This makes single page only sg list entries and input assumption into this loop. with ENABLE_CLUSTERING, that's potentially not true. Of course, this accidentally works most of the time because of the way kmap functions. Ah, right. I'm on the verge of releasing a ram-based scsi driver I've been working on ... this loop should work fine with clustering as it takes account of the sg potentially having multiple pages: scsi_for_each_sg(cmnd, sg, scsi_sg_count(cmnd), i) { struct page *sgpage = sg_page(sg); unsigned int to_off = sg-offset; unsigned int sg_copy = sg-length; if (sg_copy len) sg_copy = len; len -= sg_copy; stex driver has a similar function to copies data between a buffer and a scatter list. I think that scsi_kmap_atomic_sg is a bit primitive (and not very popular). I'll send a patch to add a helper function to scsi_lib.c that copies data between a buffer and a scatter list. It would be useful for several drivers. - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] ps3rom: disable clustering
ps3rom does: scsi_for_each_sg(cmd, sgpnt, scsi_sg_count(cmd), k) { kaddr = kmap_atomic(sg_page(sgpnt), KM_IRQ0); We cannot do something like that with the clustering enabled (or we can use scsi_kmap_atomic_sg). Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] Cc: Geert Uytterhoeven [EMAIL PROTECTED] Cc: James Bottomley [EMAIL PROTECTED] --- drivers/scsi/ps3rom.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/scsi/ps3rom.c b/drivers/scsi/ps3rom.c index 0cd614a..90985cd 100644 --- a/drivers/scsi/ps3rom.c +++ b/drivers/scsi/ps3rom.c @@ -427,7 +427,7 @@ static struct scsi_host_template ps3rom_host_template = { .cmd_per_lun = 1, .emulated = 1, /* only sg driver uses this */ .max_sectors = PS3ROM_MAX_SECTORS, - .use_clustering = ENABLE_CLUSTERING, + .use_clustering = DISABLE_CLUSTERING, .module = THIS_MODULE, }; -- 1.5.3.7 - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] scsi_debug: disable clustering
On Sun, 17 Feb 2008 09:02:14 -0600 James Bottomley [EMAIL PROTECTED] wrote: On Sun, 2008-02-17 at 23:52 +0900, FUJITA Tomonori wrote: On Sun, 17 Feb 2008 07:28:48 -0700 Matthew Wilcox [EMAIL PROTECTED] wrote: On Sun, Feb 17, 2008 at 08:18:11AM -0600, James Bottomley wrote: No, he means that kmap_atomic can only map a page of data. This makes single page only sg list entries and input assumption into this loop. with ENABLE_CLUSTERING, that's potentially not true. Of course, this accidentally works most of the time because of the way kmap functions. Ah, right. I'm on the verge of releasing a ram-based scsi driver I've been working on ... this loop should work fine with clustering as it takes account of the sg potentially having multiple pages: scsi_for_each_sg(cmnd, sg, scsi_sg_count(cmnd), i) { struct page *sgpage = sg_page(sg); unsigned int to_off = sg-offset; unsigned int sg_copy = sg-length; if (sg_copy len) sg_copy = len; len -= sg_copy; stex driver has a similar function to copies data between a buffer and a scatter list. I think that scsi_kmap_atomic_sg is a bit primitive (and not very popular). I'll send a patch to add a helper function to scsi_lib.c that copies data between a buffer and a scatter list. It would be useful for several drivers. Actually, if you're going to sweep up them all, libata also does this. Thanks, I'll look at it. However, mapping and copying data isn't a SCSI specific function, it's one any virtual block driver should do, so I think block might be the correct location for such a function. I see. It could go to lib/scatterlist.c or block/ I guess. - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: ips.c broken since 2.6.23 on x86_64?
On Sun, 17 Feb 2008 13:15:40 -0800 Tim Pepper [EMAIL PROTECTED] wrote: On Sat 16 Feb at 09:41:48 +0900 [EMAIL PROTECTED] said: Do you mean that you applied only the following two patches against 2.6.24, and then it doesn't work? 0001-ips-revert-the-changes-for-the-data-buffer-accessor.patch I only applies the 0001 patch and the driver appeared to function fine. I see, thanks. It would be nice if we can push only the 0001 patch to mainline, however as I explained, we can't unfortunately. The 0002-0009 patches are necessary for post 2.6.24. If so, the second patch is broken. Did you saw BUG_ON message (I added some BUG_ON to the patch)? I did not apply the 0002 patch. Can you apply the 0001 and 0002 against 2.6.24 and see how it works? If it works well, then please apply the 0001, 0002 and 0003. Please the patches in a step-by-step way and tell me which patch breaks the driver. Thanks, - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] scsi_debug: disable clustering
From: FUJITA Tomonori [EMAIL PROTECTED] Subject: [PATCH] scsi_debug: disable clustering scsi_debug does at several places: for_each_sg(sdb-table.sgl, sg, sdb-table.nents, k) { kaddr = (unsigned char *) kmap_atomic(sg_page(sg), KM_USER0); We cannot do something like that with the clustering enabled (or we can use scsi_kmap_atomic_sg). Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- drivers/scsi/scsi_debug.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c index 1541c17..d1777a9 100644 --- a/drivers/scsi/scsi_debug.c +++ b/drivers/scsi/scsi_debug.c @@ -222,7 +222,7 @@ static struct scsi_host_template sdebug_driver_template = { .cmd_per_lun = 16, .max_sectors = 0x, .unchecked_isa_dma =0, - .use_clustering = ENABLE_CLUSTERING, + .use_clustering = DISABLE_CLUSTERING, .module = THIS_MODULE, }; -- 1.5.3.7 - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: ips.c broken since 2.6.23 on x86_64?
On Thu, 14 Feb 2008 17:16:36 -0800 Tim Pepper [EMAIL PROTECTED] wrote: On Fri 15 Feb at 09:13:16 +0900 [EMAIL PROTECTED] said: Thanks. So we surely have a bug in the non-breakup part. I've just found one bug. Can you try this patch against 2.6.24? Tested and unfortunately no change. Behaves same as the breakup-revert patch. Thanks and sorry about that. Now I don't have any idea. I split the patch. Can you please apply them in a step-by-step way and tell me which patch breaks the driver? There are nine patches against 2.6.24: http://www.kernel.org/pub/linux/kernel/people/tomo/ips/ The first one is just reverting the data buffer accessors conversion. It would be nice if we could just revert it but we can't. These changes are necessary to compile the driver against post 2.6.24. Really sorry about the troubles, - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: ips.c broken since 2.6.23 on x86_64?
On Fri, 15 Feb 2008 14:50:57 -0800 Tim Pepper [EMAIL PROTECTED] wrote: On Sat 16 Feb at 01:09:43 +0900 [EMAIL PROTECTED] said: The first one is just reverting the data buffer accessors conversion. It would be nice if we could just revert it but we can't. These changes are necessary to compile the driver against post 2.6.24. Fujita-san, Unfortunately (and not too surprisingly given what we've tried so far) with only the first of your series reverted the driver is working fine for me again. Do you mean that you applied only the following two patches against 2.6.24, and then it doesn't work? 0001-ips-revert-the-changes-for-the-data-buffer-accessor.patch 0002-ips-kill-the-map_single-path-in-ips_scmd_buf_write.patch If so, the second patch is broken. Did you saw BUG_ON message (I added some BUG_ON to the patch)? I saw (eg: replies to http://lkml.org/lkml/2007/5/11/132) some possibly similar sounding issues with other drivers. Could there be some memory uninitialised? I did try changing all the ips.c kmalloc's to kzalloc's, but that didn't help. Also that thread ties into pci gart. The machines we've been using are liable to getting pci calgary although given my .config has: CONFIG_GART_IOMMU=y CONFIG_CALGARY_IOMMU=y # CONFIG_CALGARY_IOMMU_ENABLED_BY_DEFAULT is not set and that when booting this mainline I don't see any Calgary related messages like I get from eg: Ubuntu's 2.6.22-14-server...I'm probably not actually running the calgary iommu code in these repros. Yes, probabaly, your machine doesn't use any IOMMU hardware (nommu_map_sg function was in your crash log). Anyway, I greatly appreciate your efforts so far in trying to find what could be wrong here! Really sorry about the troubles and thanks for testing. - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] qla2xxx: fix compilation compile
scsi/qla2xxx/qla_dfs.c: In function 'qla2x00_dfs_fce_show': scsi/qla2xxx/qla_dfs.c:26: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'uint64_t' Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- drivers/scsi/qla2xxx/qla_dfs.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/scsi/qla2xxx/qla_dfs.c b/drivers/scsi/qla2xxx/qla_dfs.c index 1479c60..2cd899b 100644 --- a/drivers/scsi/qla2xxx/qla_dfs.c +++ b/drivers/scsi/qla2xxx/qla_dfs.c @@ -23,7 +23,7 @@ qla2x00_dfs_fce_show(struct seq_file *s, void *unused) mutex_lock(ha-fce_mutex); seq_printf(s, FCE Trace Buffer\n); - seq_printf(s, In Pointer = %llx\n\n, ha-fce_wr); + seq_printf(s, In Pointer = %llx\n\n, (unsigned long long)ha-fce_wr); seq_printf(s, Base = %llx\n\n, (unsigned long long) ha-fce_dma); seq_printf(s, FCE Enable Registers\n); seq_printf(s, %08x %08x %08x %08x %08x %08x\n, -- 1.5.3.4 - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: ips.c broken since 2.6.23 on x86_64?
On Wed, 13 Feb 2008 13:43:24 -0800 Tim Pepper [EMAIL PROTECTED] wrote: We recently upgraded a production x86_64 machine with serveraid cards to 2.6.24 and noted that /proc/scsi/scsi showed garbage for our serveraid service processors. sg_inq also returned garbage from the service processors' sg devices. After a few iterations I started seeing meaninful stuff in the garbage. Not sure if it was returning live memory or just unzero'd. Either way not good so we went back to a known good, older kernel and tried to repro on a similar machine. We got different, but still bad results in terms of pointing at memory badness. FWIW, the original machine had the following hardware: scsi0 : IBM PCI ServeRAID 7.12.05 Build 761 ServeRAID 4H scsi1 : IBM PCI ServeRAID 7.12.05 Build 761 ServeRAID 4M and the repro's have been on a machine with just: scsi0 : IBM PCI ServeRAID 7.12.05 Build 761 ServeRAID 4Mx On the repro machine I'm getting a hang on ips driver load with the following logged: Feb 13 13:16:08 ipstest kernel: [ 915.236563] scsi3 : IBM PCI ServeRAID 7.12.05 Build 761 ServeRAID 4Mx Feb 13 13:16:08 ipstest kernel: [ 915.236839] Unable to handle kernel NULL pointer dereference at RIP: Feb 13 13:16:08 ipstest kernel: [ 915.236863] [check_addr+16/80] check_addr+0x10/0x50 Feb 13 13:16:08 ipstest kernel: [ 915.237209] PGD 79fff067 PUD 7a898067 PMD 0 Feb 13 13:16:08 ipstest kernel: [ 915.237341] Oops: [1] SMP Feb 13 13:16:08 ipstest kernel: [ 915.237463] CPU 1 Feb 13 13:16:08 ipstest kernel: [ 915.239436] Modules linked in: ips aic94xx Feb 13 13:16:08 ipstest kernel: [ 915.239559] Pid: 5213, comm: scsi_scan_3 Not tainted 2.6.23-ips_as_module #3 Feb 13 13:16:08 ipstest kernel: [ 915.239692] RIP: 0010:[check_addr+16/80] [check_addr+16/80] check_addr+0x10/0x50 Feb 13 13:16:08 ipstest kernel: [ 915.239932] RSP: 0018:810076d87900 EFLAGS: 00010082 Feb 13 13:16:08 ipstest kernel: [ 915.240059] RAX: RBX: 81007b636300 RCX: 0024 Feb 13 13:16:08 ipstest kernel: [ 915.240196] RDX: 7b636b00 RSI: 8077cde0 RDI: 806c4ed5 Feb 13 13:16:08 ipstest kernel: [ 915.240332] RBP: 810076d87900 R08: 0500 R09: Feb 13 13:16:08 ipstest kernel: [ 915.240468] R10: 81007aa33b40 R11: 0060 R12: Feb 13 13:16:08 ipstest kernel: [ 915.240605] R13: 0001 R14: 8077cde0 R15: 81007aa33a80 Feb 13 13:16:08 ipstest kernel: [ 915.240741] FS: () GS:810001039300() knlGS: Feb 13 13:16:08 ipstest kernel: [ 915.240981] CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b Feb 13 13:16:08 ipstest kernel: [ 915.24] CR2: CR3: 78a98000 CR4: 06e0 Feb 13 13:16:08 ipstest kernel: [ 915.241248] DR0: DR1: DR2: Feb 13 13:16:08 ipstest kernel: [ 915.241384] DR3: DR6: 0ff0 DR7: 0400 Feb 13 13:16:08 ipstest kernel: [ 915.241520] Process scsi_scan_3 (pid: 5213, threadinfo 810076d86000, task 81007be26720) Feb 13 13:16:08 ipstest kernel: [ 915.241761] Stack: 810076d87930 802125c3 81007aa33a80 81007480cf50 Feb 13 13:16:08 ipstest kernel: [ 915.242006] 81007ba38ca8 810076d87940 8046fb42 Feb 13 13:16:08 ipstest kernel: [ 915.242248] 810076d879c0 8801c2ee 81007aa33af0 00017aa33af0 Feb 13 13:16:08 ipstest kernel: [ 915.242389] Call Trace: Feb 13 13:16:08 ipstest kernel: [ 915.242606] [nommu_map_sg+115/144] nommu_map_sg+0x73/0x90 Feb 13 13:16:08 ipstest kernel: [ 915.242736] [scsi_dma_map+66/96] scsi_dma_map+0x42/0x60 Feb 13 13:16:08 ipstest kernel: [ 915.242867] [_end+124884230/2127548952] :ips:ips_next+0x33e/0xc00 Feb 13 13:16:08 ipstest kernel: [ 915.242986] [scsi_done+0/48] scsi_done+0x0/0x30 Feb 13 13:16:08 ipstest kernel: [ 915.243114] [_end+124896894/2127548952] :ips:ips_queue+0x106/0x1f0 Feb 13 13:16:08 ipstest kernel: [ 915.243240] [scsi_dispatch_cmd+498/784] scsi_dispatch_cmd+0x1f2/0x310 Feb 13 13:16:08 ipstest kernel: [ 915.243370] [scsi_request_fn+491/976] scsi_request_fn+0x1eb/0x3d0 Feb 13 13:16:08 ipstest kernel: [ 915.243500] [__generic_unplug_device+37/48] __generic_unplug_device+0x25/0x30 Feb 13 13:16:08 ipstest kernel: [ 915.243630] [blk_execute_rq_nowait+99/176] blk_execute_rq_nowait+0x63/0xb0 Feb 13 13:16:08 ipstest kernel: [ 915.243761] [blk_execute_rq+122/224] blk_execute_rq+0x7a/0xe0 Feb 13 13:16:08 ipstest kernel: [ 915.243889] [scsi_execute+240/288] scsi_execute+0xf0/0x120 Feb 13 13:16:08 ipstest kernel: [ 915.244016] [scsi_execute_req+134/240] scsi_execute_req+0x86/0xf0 Feb 13 13:16:08 ipstest kernel: [ 915.244145] [scsi_probe_and_add_lun+594/3472]
Re: [PATCH] bsg: bidi bio map failure fix
On Tue, 12 Feb 2008 15:40:24 -0500 Pete Wyckoff [EMAIL PROTECTED] wrote: If blk_rq_map_user requires more than one bio, and fails mapping somewhere after the first bio, it will return with rq-bio set to non-NULL, but it will have already unmapped the partial bio. The out: error exit section will see the non-null bio and try to unmap it again, triggering a mapcount bug via bad_page(). Signed-off-by: Pete Wyckoff [EMAIL PROTECTED] --- block/bsg.c |4 +++- 1 files changed, 3 insertions(+), 1 deletions(-) diff --git a/block/bsg.c b/block/bsg.c index 3337125..bba7154 100644 --- a/block/bsg.c +++ b/block/bsg.c @@ -295,8 +295,10 @@ bsg_map_hdr(struct bsg_device *bd, struct sg_io_v4 *hdr) dxferp = (void*)(unsigned long)hdr-din_xferp; ret = blk_rq_map_user(q, next_rq, dxferp, hdr-din_xfer_len); - if (ret) + if (ret) { + next_rq-bio = NULL; /* do not unmap twice */ goto out; + } } if (hdr-dout_xfer_len) { Thanks! Acked-by: FUJITA Tomonori [EMAIL PROTECTED] James, please put this to the scsi-fixes tree. - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: ips.c broken since 2.6.23 on x86_64?
On Thu, 14 Feb 2008 15:55:49 -0800 Tim Pepper [EMAIL PROTECTED] wrote: On Thu 14 Feb at 20:48:38 +0900 [EMAIL PROTECTED] said: I have a slight doubt on the breakup code though I'm not sure you hit the code. Reverting only the breakup part works? The patch is against 2.6.24. I've tested this revert you posted. Essentially the same trace is logged (see below). The machine doesn't die though. But /proc/scsi/scsi doesn't show the hardware and I am getting an additional trace dumped now (see futher below). Thanks. So we surely have a bug in the non-breakup part. I've just found one bug. Can you try this patch against 2.6.24? Thanks, diff --git a/drivers/scsi/ips.c b/drivers/scsi/ips.c index 5c5a9b2..d8f5eb5 100644 --- a/drivers/scsi/ips.c +++ b/drivers/scsi/ips.c @@ -1575,12 +1575,12 @@ ips_make_passthru(ips_ha_t *ha, struct scsi_cmnd *SC, ips_scb_t *scb, int intr) ips_passthru_t *pt; int length = 0; int i, ret; -struct scatterlist *sg = scsi_sglist(SC); +struct scatterlist *sg; METHOD_TRACE(ips_make_passthru, 1); scsi_for_each_sg(SC, sg, scsi_sg_count(SC), i) -length += sg[i].length; +length += sg-length; if (length sizeof (ips_passthru_t)) { /* wrong size */ - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Latest git oopses during boot
On Thu, 7 Feb 2008 12:14:56 +0100 Harald Arnesen [EMAIL PROTECTED] wrote: On 2/7/08, Andrew Morton [EMAIL PROTECTED] wrote: (cc's restored, and expanded a bit) Ah, sorry, not used to gmail's web interface. Clicked the wrong button. Seems to be the advansys driver, so I tried to remove it - and indeed, the kernel now boots. So I guess it's either that driver or my ancient Nikon Coolscan II that is the only thing attached to the board. Thanks. I uploaded the oops picture to http://userweb.kernel.org/~akpm/oops.jpg Cc to the Matthew Wilcox added. mm... looks like all Matthew's changes were in 2.6.23. And 2.6.23 worked OK, yes? Both 2.6.23 and 2.6.24 are ok. The only recent changes to drivers/scsi/advansys.c are commit b80ca4f7ee36c26d300c5a8f429e73372d153379 Author: FUJITA Tomonori [EMAIL PROTECTED] Date: Sun Jan 13 15:46:13 2008 +0900 [SCSI] replace sizeof sense_buffer with SCSI_SENSE_BUFFERSIZE This replaces sizeof sense_buffer with SCSI_SENSE_BUFFERSIZE in several LLDs. It's a preparation for the future changes to remove sense_buffer array in scsi_cmnd structure. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] Signed-off-by: James Bottomley [EMAIL PROTECTED] :100644 100644 9dd3952... 492702b... M drivers/scsi/advansys.c commit 747d016e7e25e216b31022fe2b012508d99fb682 Author: Randy Dunlap [EMAIL PROTECTED] Date: Mon Jan 14 00:55:18 2008 -0800 advansys: fix section mismatch warning Fix section mismatch warning: WARNING: vmlinux.o(.exit.text+0x152a): Section mismatch: reference to .init. Signed-off-by: Randy Dunlap [EMAIL PROTECTED] Cc: Matthew Wilcox [EMAIL PROTECTED] Cc: James Bottomley [EMAIL PROTECTED] Signed-off-by: Andrew Morton [EMAIL PROTECTED] Signed-off-by: Linus Torvalds [EMAIL PROTECTED] which seem fairly benign. gcc inlining is going to make it rather a lot of work to find out which statement has actually oopsed there. -- Can you try this? Thanks, diff --git a/drivers/scsi/advansys.c b/drivers/scsi/advansys.c index 374ed02..f5dde12 100644 --- a/drivers/scsi/advansys.c +++ b/drivers/scsi/advansys.c @@ -566,7 +566,7 @@ typedef struct asc_dvc_var { ASC_SCSI_BIT_ID_TYPE unit_not_ready; ASC_SCSI_BIT_ID_TYPE queue_full_or_busy; ASC_SCSI_BIT_ID_TYPE start_motor; - uchar overrun_buf[ASC_OVERRUN_BSIZE] __aligned(8); + uchar *overrun_buf; dma_addr_t overrun_dma; uchar scsi_reset_wait; uchar chip_no; @@ -13833,6 +13833,12 @@ static int __devinit advansys_board_found(struct Scsi_Host *shost, */ if (ASC_NARROW_BOARD(boardp)) { ASC_DBG(2, AscInitAsc1000Driver()\n); + + asc_dvc_varp-overrun_buf = kzalloc(ASC_OVERRUN_BSIZE, GFP_KERNEL); + if (!asc_dvc_varp-overrun_buf) { + ret = -ENOMEM; + goto err_free_wide_mem; + } warn_code = AscInitAsc1000Driver(asc_dvc_varp); if (warn_code || asc_dvc_varp-err_code) { @@ -13840,8 +13846,10 @@ static int __devinit advansys_board_found(struct Scsi_Host *shost, warn 0x%x, error 0x%x\n, asc_dvc_varp-init_state, warn_code, asc_dvc_varp-err_code); - if (asc_dvc_varp-err_code) + if (asc_dvc_varp-err_code) { ret = -ENODEV; + kfree(asc_dvc_varp-overrun_buf); + } } } else { if (advansys_wide_init_chip(shost)) @@ -13891,9 +13899,11 @@ static int advansys_release(struct Scsi_Host *shost) free_dma(shost-dma_channel); } if (ASC_NARROW_BOARD(board)) { + ASC_DVC_VAR *asc_dvc_varp = board-dvc_var.asc_dvc_var; dma_unmap_single(board-dev, board-dvc_var.asc_dvc_var.overrun_dma, ASC_OVERRUN_BSIZE, DMA_FROM_DEVICE); + kfree(asc_dvc_varp-overrun_buf); } else { iounmap(board-ioremap_addr); advansys_wide_free_mem(board); - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Latest git oopses during boot
On Thu, 7 Feb 2008 23:24:00 +0100 Harald Arnesen [EMAIL PROTECTED] wrote: On 2/7/08, Linus Torvalds [EMAIL PROTECTED] wrote: On Thu, 7 Feb 2008, Harald Arnesen wrote: I'll try applying the patch to a freshly downloaded git-tree. Ok, good. Shall I try another compiler? I have at least these two: gcc version 3.4.6 (Ubuntu 3.4.6-6ubuntu2) gcc version 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2) I would suggest a patch mis-application problem first (or possibly even the patch itself being broken - I simply didn't look very closely at the patch, but it *looked* ok). If it's a compiler bug, it's a pretty big one, and quite frankly, I doubt it. Compiler bugs do happen, but they are pretty rare, and they tend to have more subtle effects than the one you see. However: in addition to the self-compiled 4.2.3 I used for the tests. 4.2.3? Really? That's pretty damn recent, and so almost totally untested. That does make a compiler bug at least more likely. So yes, if you already have other compilers installed, you should try them. If it really is a compiler bug, it's a really bad one, and you would want to let the gcc people know. Still, I'd double-check that the asc_dvc_varp-overrun_buf = kzalloc(ASC_OVERRUN_BSIZE, GFP_KERNEL); line was added properly first. You should see it way after the point where it did asc_dvc_varp = boardp-dvc_var.asc_dvc_var; to initialize it (and both statements should be inside a if (ASC_NARROW_BOARD(boardp)) { conditional - please check that the source code looks sane too). Linus I just re-downloaded an re-patched and re-compiled (with gcc 4.2.3), and now the kernel boots. I must have screwed up the previous patching. It now works, with Fujita's patch applied. Thanks Harald and Linus, The bug has been in the advansys driver. 2.6.23 and 2.6.24 works just because the size of Scsi_Host structure was multiples of 8. After 2.6.24, some patches change Scsi_Host structure and now the size is not multiples of 8. So we hit this bug. I'll resend the patch with a proper description. - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] advansys: fix overrun_buf aligned bug
struct asc_dvc_var needs overrun buffer to be placed on an 8 byte boundary. advansys defines struct asc_dvc_var: struct asc_dvc_var { ... uchar overrun_buf[ASC_OVERRUN_BSIZE] __aligned(8); The problem is that struct asc_dvc_var is placed on shost-hostdata. So if the hostdata is not on an 8 byte boundary, the advansys crashes. The hostdata is placed on a sizeof(unsigned long) boundary so the 8 byte boundary is not garanteed with x86_32. With 2.6.23 and 2.6.24, the hostdata is on an 8 byte boundary by chance, but with the current git, it's not. This patch removes overrun_buf static array and use kzalloc. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- drivers/scsi/advansys.c | 13 +++-- 1 files changed, 11 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/advansys.c b/drivers/scsi/advansys.c index ccef891..3c2d688 100644 --- a/drivers/scsi/advansys.c +++ b/drivers/scsi/advansys.c @@ -566,7 +566,7 @@ typedef struct asc_dvc_var { ASC_SCSI_BIT_ID_TYPE unit_not_ready; ASC_SCSI_BIT_ID_TYPE queue_full_or_busy; ASC_SCSI_BIT_ID_TYPE start_motor; - uchar overrun_buf[ASC_OVERRUN_BSIZE] __aligned(8); + uchar *overrun_buf; dma_addr_t overrun_dma; uchar scsi_reset_wait; uchar chip_no; @@ -13833,6 +13833,12 @@ static int __devinit advansys_board_found(struct Scsi_Host *shost, */ if (ASC_NARROW_BOARD(boardp)) { ASC_DBG(2, AscInitAsc1000Driver()\n); + + asc_dvc_varp-overrun_buf = kzalloc(ASC_OVERRUN_BSIZE, GFP_KERNEL); + if (!asc_dvc_varp-overrun_buf) { + ret = -ENOMEM; + goto err_free_wide_mem; + } warn_code = AscInitAsc1000Driver(asc_dvc_varp); if (warn_code || asc_dvc_varp-err_code) { @@ -13840,8 +13846,10 @@ static int __devinit advansys_board_found(struct Scsi_Host *shost, warn 0x%x, error 0x%x\n, asc_dvc_varp-init_state, warn_code, asc_dvc_varp-err_code); - if (asc_dvc_varp-err_code) + if (asc_dvc_varp-err_code) { ret = -ENODEV; + kfree(asc_dvc_varp-overrun_buf); + } } } else { if (advansys_wide_init_chip(shost)) @@ -13894,6 +13902,7 @@ static int advansys_release(struct Scsi_Host *shost) dma_unmap_single(board-dev, board-dvc_var.asc_dvc_var.overrun_dma, ASC_OVERRUN_BSIZE, DMA_FROM_DEVICE); + kfree(board-dvc_var.asc_dvc_var.overrun_buf); } else { iounmap(board-ioremap_addr); advansys_wide_free_mem(board); -- 1.5.3.7 - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] advansys: fix overrun_buf aligned bug
On Thu, 07 Feb 2008 19:01:55 -0600 James Bottomley [EMAIL PROTECTED] wrote: On Fri, 2008-02-08 at 09:50 +0900, FUJITA Tomonori wrote: struct asc_dvc_var needs overrun buffer to be placed on an 8 byte boundary. advansys defines struct asc_dvc_var: struct asc_dvc_var { ... uchar overrun_buf[ASC_OVERRUN_BSIZE] __aligned(8); The problem is that struct asc_dvc_var is placed on shost-hostdata. So if the hostdata is not on an 8 byte boundary, the advansys crashes. The hostdata is placed on a sizeof(unsigned long) boundary so the 8 byte boundary is not garanteed with x86_32. With 2.6.23 and 2.6.24, the hostdata is on an 8 byte boundary by chance, but with the current git, it's not. This patch removes overrun_buf static array and use kzalloc. It's a bit of a waste of a kmallocs. The usual way of fixing this type of cockup is to float the structure until it becomes aligned, but I suppose that involves changing all calls to shost_priv in the driver ... Yeah, agreed. It's better but I'm not familiar with the driver so I use kmalloc. It's not so bad as a short-term solution, I think. Any chance to push it to final SCSI updates for 2.6.24 merge window? Though we can push it any time since it's a bug fix. Anyway, I'm fine with dropping it if Matthew will fix the driver in a better way. I'm happy unless people blame my IOMMU or sense buffer patch for this bug. :) - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT PATCH] final SCSI updates for 2.6.24 merge window
On Thu, 07 Feb 2008 19:37:07 -0600 James Bottomley [EMAIL PROTECTED] wrote: On Thu, 2008-02-07 at 18:56 -0600, James Bottomley wrote: Quite a bit of this is fixing things broken previously (the advansys fix is still pending resolution, but I'll send it as an -rc fix when we have it). There's the final elimination of all drivers that are esp based but don't use the scsi_esp core (that's mostly m68k and alpha). Plus the usual bunch of driver updates and the addition of a new enclosure services driver and the corresponding ULD. OK, the advansys fix came in. I've added it to the patch. James From f983323fea178352ed3b69c70561a13825a3ce59 Mon Sep 17 00:00:00 2001 From: FUJITA Tomonori [EMAIL PROTECTED] Date: Fri, 8 Feb 2008 09:50:08 +0900 Subject: [SCSI] advansys: fix overrun_buf aligned bug Seems that it was a bit late, Linus pulled scsi-misc before the patch was added. - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
On Tue, 05 Feb 2008 08:14:01 +0100 Tomasz Chmielewski [EMAIL PROTECTED] wrote: James Bottomley schrieb: These are both features being independently worked on, are they not? Even if they weren't, the combination of the size of SCST in kernel plus the problem of having to find a migration path for the current STGT users still looks to me to involve the greater amount of work. I don't want to be mean, but does anyone actually use STGT in production? Seriously? In the latest development version of STGT, it's only possible to stop the tgtd target daemon using KILL / 9 signal - which also means all iSCSI initiator connections are corrupted when tgtd target daemon is started again (kernel upgrade, target daemon upgrade, server reboot etc.). I don't know what iSCSI initiator connections are corrupted mean. But if you reboot a server, how can an iSCSI target implementation keep iSCSI tcp connections? Imagine you have to reboot all your NFS clients when you reboot your NFS server. Not only that - your data is probably corrupted, or at least the filesystem deserves checking... - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
On Tue, 05 Feb 2008 05:43:10 +0100 Matteo Tescione [EMAIL PROTECTED] wrote: Hi all, And sorry for intrusion, i am not a developer but i work everyday with iscsi and i found it fantastic. Altough Aoe, Fcoe and so on could be better, we have to look in real world implementations what is needed *now*, and if we look at vmware world, virtual iron, microsoft clustering etc, the answer is iSCSI. And now, SCST is the best open-source iSCSI target. So, from an end-user point of view, what are the really problems to not integrate scst in the mainstream kernel? Currently, the best open-source iSCSI target implemenation in Linux is Nicholas's LIO, I guess. - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: new scsi sense handling
On Mon, 4 Feb 2008 18:39:22 -0800 (PST) Luben Tuikov [EMAIL PROTECTED] wrote: --- On Mon, 2/4/08, Boaz Harrosh [EMAIL PROTECTED] wrote: There are 3 usages of sense handling in drivers 1. sense is available in driver internal structure and is mem-copied to upper level 2. A CHECK_CONDITION status was returned and the driver uses the scsi_eh_prep_cmnd() for a REQUEST_SENSE invocation to the target. Then returning the sense in scsi_eh_return_cmnd(). A variation on this is when the driver does nothing the queue is frozen an the scsi watchdog timer does the above. 3. The underline host adapter does the REQUEST_SENSE and a pre-allocated and DMA mapped sense buffer receives the sense information from HW. Many years ago when ACA had a constructive meaning, so did Autosense. Then about 5 years ago, Autosense disappeared completely since it became the de facto implementation of the then SCSI Execute Command RPC, now just SCSI Execute Command procedure call. At that point in time, the SCSI mid-layer decided to embrace this model and give the LLDD a scsi command structure which included the sense data buffer to a size that the SCSI mid-layer was interested in, at the moment 96 bytes, macro defined in include/scsi/scsi_cmnd.h. The concept of Autosense was off-loaded to LLDD to emulate it if the specific target device to which the command was issued, didn't supply the sense data on CHECK CONDITION, and more so relevant to target devices which implemented queuing, thus the ACA. And the mid-layer would consider extracting the sense data via REQUEST SENSE command as a _special case_ if the LLDD/transport layer didn't implement the autosense model. Only SPI and USB? The most of LLDs using the transport protocol that we care about today uses sense buffer in their own internal structure. I think that the issue to solve to kill scsi_cmnd:sense_buffer is how to share (or export) such sense buffer with the scsi mid-layer. For the old transport protocols, we could do something that James said in this thread to to kill scsi_cmnd:sense_buffer. - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
On Mon, 4 Feb 2008 20:07:01 -0600 Chris Weiss [EMAIL PROTECTED] wrote: On Feb 4, 2008 11:30 AM, Douglas Gilbert [EMAIL PROTECTED] wrote: Alan Cox wrote: better. So for example, I personally suspect that ATA-over-ethernet is way better than some crazy SCSI-over-TCP crap, but I'm biased for simple and low-level, and against those crazy SCSI people to begin with. Current ATAoE isn't. It can't support NCQ. A variant that did NCQ and IP would probably trash iSCSI for latency if nothing else. And a variant that doesn't do ATA or IP: http://www.fcoe.com/ however, and interestingly enough, the open-fcoe software target depends on scst (for now anyway) STGT also supports software FCoE target driver though it's still experimental stuff. http://www.mail-archive.com/linux-scsi@vger.kernel.org/msg12705.html It works in user space like STGT's iSCSI (and iSER) target driver (i.e. no kernel/user space interaction). - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
On Tue, 05 Feb 2008 17:07:07 +0100 Tomasz Chmielewski [EMAIL PROTECTED] wrote: FUJITA Tomonori schrieb: On Tue, 05 Feb 2008 08:14:01 +0100 Tomasz Chmielewski [EMAIL PROTECTED] wrote: James Bottomley schrieb: These are both features being independently worked on, are they not? Even if they weren't, the combination of the size of SCST in kernel plus the problem of having to find a migration path for the current STGT users still looks to me to involve the greater amount of work. I don't want to be mean, but does anyone actually use STGT in production? Seriously? In the latest development version of STGT, it's only possible to stop the tgtd target daemon using KILL / 9 signal - which also means all iSCSI initiator connections are corrupted when tgtd target daemon is started again (kernel upgrade, target daemon upgrade, server reboot etc.). I don't know what iSCSI initiator connections are corrupted mean. But if you reboot a server, how can an iSCSI target implementation keep iSCSI tcp connections? The problem with tgtd is that you can't start it (configured) in an atomic way. Usually, one will start tgtd and it's configuration in a script (I replaced some parameters with ... to make it shorter and more readable): Thanks for the details. So the way to stop the daemon is not related with your problem. It's easily fixable. Can you start a new thread about this on stgt-devel mailing list? When we agree on the interface to start the daemon, I'll implement it. tgtd tgtadm --op new ... tgtadm --lld iscsi --op new ... (snip) So the only way to start/restart tgtd reliably is to do hacks which are needed with yet another iSCSI kernel implementation (IET): use iptables. iptables block iSCSI traffic tgtd sleep 1 tgtadm --op new ... tgtadm --lld iscsi --op new ... iptables unblock iSCSI traffic A bit ugly, isn't it? Having to tinker with a firewall in order to start a daemon is by no means a sign of a well-tested and mature project. That's why I asked how many people use stgt in a production environment - James was worried about a potential migration path for current users. I don't know how many people use stgt in a production environment but I'm not sure that this problem prevents many people from using it in a production environment. You want to reboot a server running target devices while initiators connect to it. Rebooting the target server behind the initiators seldom works. System adminstorators in my workplace reboot storage devices once a year and tell us to shut down the initiator machines that use them before that. - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 21/24][RFC] scsi_tgt: use of sense accessors
On Tue, 5 Feb 2008 11:21:33 -0500 Pete Wyckoff [EMAIL PROTECTED] wrote: [EMAIL PROTECTED] wrote on Mon, 04 Feb 2008 19:53 +0200: FIXME: I need help with this driver (Pete?) I used scsi_sense() in a none const way. But since scsi_tgt is the ULD here, it can just access it's own sense buffer directly. I did not use scsi_eh_cpy_sense() because I did not want the extra copy. Pete will want to use a 260 bytes buffer here. Signed-off-by: Boaz Harrosh [EMAIL PROTECTED] Need-help-from: Pete Wyckoff [EMAIL PROTECTED] FYI, I never use scsi_tgt. Only just pure userspace on the target, and a dumb ethernet NIC that does not know it is speaking any form of SCSI. Seems that many people misunderstand STGT iSCSI (and iSER), FCoE, and SRP (not implemented yet) software target drivers. They don't use the tgt kernel module. They just run in user space like user-space nfs daemon. - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: new scsi sense handling
On Tue, 5 Feb 2008 11:43:58 -0800 (PST) Luben Tuikov [EMAIL PROTECTED] wrote: --- On Tue, 2/5/08, FUJITA Tomonori [EMAIL PROTECTED] wrote: On Mon, 4 Feb 2008 18:39:22 -0800 (PST) Luben Tuikov [EMAIL PROTECTED] wrote: --- On Mon, 2/4/08, Boaz Harrosh [EMAIL PROTECTED] wrote: There are 3 usages of sense handling in drivers 1. sense is available in driver internal structure and is mem-copied to upper level 2. A CHECK_CONDITION status was returned and the driver uses the scsi_eh_prep_cmnd() for a REQUEST_SENSE invocation to the target. Then returning the sense in scsi_eh_return_cmnd(). A variation on this is when the driver does nothing the queue is frozen an the scsi watchdog timer does the above. 3. The underline host adapter does the REQUEST_SENSE and a pre-allocated and DMA mapped sense buffer receives the sense information from HW. Many years ago when ACA had a constructive meaning, so did Autosense. Then about 5 years ago, Autosense disappeared completely since it became the de facto implementation of the then SCSI Execute Command RPC, now just SCSI Execute Command procedure call. At that point in time, the SCSI mid-layer decided to embrace this model and give the LLDD a scsi command structure which included the sense data buffer to a size that the SCSI mid-layer was interested in, at the moment 96 bytes, macro defined in include/scsi/scsi_cmnd.h. The concept of Autosense was off-loaded to LLDD to emulate it if the specific target device to which the command was issued, didn't supply the sense data on CHECK CONDITION, and more so relevant to target devices which implemented queuing, thus the ACA. And the mid-layer would consider extracting the sense data via REQUEST SENSE command as a _special case_ if the LLDD/transport layer didn't implement the autosense model. Only SPI and USB? I don't understand this question. I meant, 'what transport protocols are categorized into the transport protocol that doesn't implement the autosense model?' The most of LLDs using the transport protocol that we care about today uses sense buffer in their own internal structure. Yes. I think that the issue to solve to kill scsi_cmnd:sense_buffer is how to share (or export) such sense buffer with the scsi mid-layer. And therein lies the problem. Sense data is SCSI specific, it should be left to SCSI, unless of course you can stipulate that _all_ block devices return sense data. Yeah, sense data is SCSI specific and it should be left to SCSI. But I'm not sure we need to stipulate that _all_ block devices return sense data. Today the block layer users (sg, bsg, etc) use it only when it's appropriate (or only if they want to use it). If that's not the case and you move it to the block layer, then you get a whole bunch of other problems, like does this device want/use it, should we allocate it, etc. OTOH, if that _is_ the case, then you don't have to worry about this and the model is pretty much as the SCSI mid-layer has it, i.e. sense buffer always present. So I guess the question is, can you stipulate that _all_ block devices return sense data? - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
On Tue, 05 Feb 2008 18:09:15 +0100 Matteo Tescione [EMAIL PROTECTED] wrote: On 5-02-2008 14:38, FUJITA Tomonori [EMAIL PROTECTED] wrote: On Tue, 05 Feb 2008 08:14:01 +0100 Tomasz Chmielewski [EMAIL PROTECTED] wrote: James Bottomley schrieb: These are both features being independently worked on, are they not? Even if they weren't, the combination of the size of SCST in kernel plus the problem of having to find a migration path for the current STGT users still looks to me to involve the greater amount of work. I don't want to be mean, but does anyone actually use STGT in production? Seriously? In the latest development version of STGT, it's only possible to stop the tgtd target daemon using KILL / 9 signal - which also means all iSCSI initiator connections are corrupted when tgtd target daemon is started again (kernel upgrade, target daemon upgrade, server reboot etc.). I don't know what iSCSI initiator connections are corrupted mean. But if you reboot a server, how can an iSCSI target implementation keep iSCSI tcp connections? Imagine you have to reboot all your NFS clients when you reboot your NFS server. Not only that - your data is probably corrupted, or at least the filesystem deserves checking... Don't know if matters, but in my setup (iscsi on top of drbd+heartbeat) rebooting the primary server doesn't affect my iscsi traffic, SCST correctly manages stop/crash, by sending unit attention to clients on reconnect. Drbd+heartbeat correctly manages those things too. Still from an end-user POV, i was able to reboot/survive a crash only with SCST, IETD still has reconnect problems and STGT are even worst. Please tell us on stgt-devel mailing list if you see problems. We will try to fix them. Thanks, - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Integration of SCST in the mainstream Linux kernel
On Wed, 30 Jan 2008 09:38:04 +0100 Bart Van Assche [EMAIL PROTECTED] wrote: On Jan 30, 2008 12:32 AM, FUJITA Tomonori [EMAIL PROTECTED] wrote: iSER has parameters to limit the maximum size of RDMA (it needs to repeat RDMA with a poor configuration)? Please specify which parameters you are referring to. As you know I Sorry, I can't say. I don't know much about iSER. But seems that Pete and Robin can get the better I/O performance - line speed ratio with STGT. The version of OpenIB might matters too. For example, Pete said that STGT reads loses about 100 MB/s for some transfer sizes for some transfer sizes due to the OpenIB version difference or other unclear reasons. http://article.gmane.org/gmane.linux.iscsi.tgt.devel/135 It's fair to say that it takes long time and need lots of knowledge to get the maximum performance of SAN, I think. I think that it would be easier to convince James with the detailed analysis (e.g. where does it take so long, like Pete did), not just 'dd' performance results. Pushing iSCSI target code into mainline failed four times: IET, SCST, STGT (doing I/Os in kernel in the past), and PyX's one (*1). iSCSI target code is huge. You said SCST comprises 14,000 lines, but it's not iSCSI target code. The SCSI engine code comprises 14,000 lines. You need another 10,000 lines for the iSCSI driver. Note that SCST's iSCSI driver provides only basic iSCSI features. PyX's iSCSI target code implemenents more iSCSI features (like MC/S, ERL2, etc) and comprises about 60,000 lines and it still lacks some features like iSER, bidi, etc. I think that it's reasonable to say that we need more than 'dd' results before pushing about possible more than 60,000 lines to mainline. (*1) http://linux-iscsi.org/ - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Integration of SCST in the mainstream Linux kernel
On Wed, 30 Jan 2008 14:10:47 +0100 Bart Van Assche [EMAIL PROTECTED] wrote: On Jan 30, 2008 11:56 AM, FUJITA Tomonori [EMAIL PROTECTED] wrote: On Wed, 30 Jan 2008 09:38:04 +0100 Bart Van Assche [EMAIL PROTECTED] wrote: Please specify which parameters you are referring to. As you know I Sorry, I can't say. I don't know much about iSER. But seems that Pete and Robin can get the better I/O performance - line speed ratiwo with STGT. Robin Humble was using a DDR InfiniBand network, while my tests were performed with an SDR InfiniBand network. Robin's results can't be directly compared to my results. I know that you use different hardware. I used 'ratio' word. BTW, you said the performance difference of dio READ is 38% but I think it's 27.3 %, though it's still large. Pete Wyckoff's results (http://www.osc.edu/~pw/papers/wyckoff-iser-snapi07-talk.pdf) are hard to interpret. I have asked Pete which of the numbers in his test can be compared with what I measured, but Pete did not reply. The version of OpenIB might matters too. For example, Pete said that STGT reads loses about 100 MB/s for some transfer sizes for some transfer sizes due to the OpenIB version difference or other unclear reasons. http://article.gmane.org/gmane.linux.iscsi.tgt.devel/135 Pete wrote about a degradation from 600 MB/s to 500 MB/s for reads with STGT+iSER. In my tests I measured 589 MB/s for reads (direct I/O), which matches with the better result obtained by Pete. I don't know he used the same benchmark software so I don't think that we can compare them. All I tried to say is the OFED version might has big effect on the performance. So you might need to find the best one. Note: the InfiniBand kernel modules I used were those from the 2.6.22.9 kernel, not from the OFED distribution. I'm talking about a target machine (I think that Pete was also talking about OFED on his target machine). STGT uses OFED libraries, I think. - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Integration of SCST in the mainstream Linux kernel
On Tue, 29 Jan 2008 13:31:52 -0800 Roland Dreier [EMAIL PROTECTED] wrote: . . STGT read SCST read.STGT read SCST read. . . performance performance . performance performance . . . (0.5K, MB/s) (0.5K, MB/s) . (1 MB, MB/s) (1 MB, MB/s) . . iSER (8 Gb/s network) . 250N/A . 360 N/A . . SRP (8 Gb/s network) . N/A421 . N/A 683 . On the comparable figures, which only seem to be IPoIB they're showing a 13-18% variance, aren't they? Which isn't an incredible difference. Maybe I'm all wet, but I think iSER vs. SRP should be roughly comparable. The exact formatting of various messages etc. is different but the data path using RDMA is pretty much identical. So the big difference between STGT iSER and SCST SRP hints at some big difference in the efficiency of the two implementations. iSER has parameters to limit the maximum size of RDMA (it needs to repeat RDMA with a poor configuration)? Anyway, here's the results from Robin Humble: iSER to 7G ramfs, x86_64, centos4.6, 2.6.22 kernels, git tgtd, initiator end booted with mem=512M, target with 8G ram direct i/o dd write/read 800/751 MB/s dd if=/dev/zero of=/dev/sdc bs=1M count=5000 oflag=direct dd of=/dev/null if=/dev/sdc bs=1M count=5000 iflag=direct http://www.mail-archive.com/linux-scsi@vger.kernel.org/msg13502.html I think that STGT is pretty fast with the fast backing storage. I don't think that there is the notable perfornace difference between kernel-space and user-space SRP (or ISER) implementations about moving data between hosts. IB is expected to enable user-space applications to move data between hosts quickly (if not, what can IB provide us?). I think that the question is how fast user-space applications can do I/Os ccompared with I/Os in kernel space. STGT is eager for the advent of good asynchronous I/O and event notification interfances. One more possible optimization for STGT is zero-copy data transfer. STGT uses pre-registered buffers and move data between page cache and thsse buffers, and then does RDMA transfer. If we implement own caching mechanism to use pre-registered buffers directly with (AIO and O_DIRECT), then STGT can move data without data copies. - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] zfcp: fix sense_buffer access bug
On Mon, 28 Jan 2008 08:46:25 +0100 Christof Schmitt [EMAIL PROTECTED] wrote: On Sun, Jan 27, 2008 at 12:41:50PM +0900, FUJITA Tomonori wrote: The commit de25deb18016f66dcdede165d07654559bb332bc changed scsi_cmnd.sense_buffer from a static array to a dynamically allocated buffer. We can't access to sense_buffer in 'cmd-sense_buffer' way. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- drivers/s390/scsi/zfcp_fsf.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c index fe57941..a9a147d 100644 --- a/drivers/s390/scsi/zfcp_fsf.c +++ b/drivers/s390/scsi/zfcp_fsf.c @@ -4224,10 +4224,10 @@ zfcp_fsf_send_fcp_command_task_handler(struct zfcp_fsf_req *fsf_req) ZFCP_LOG_TRACE(%i bytes sense data provided by FCP\n, fcp_rsp_iu-fcp_sns_len); - memcpy(scpnt-sense_buffer, + memcpy(scpnt-sense_buffer, zfcp_get_fcp_sns_info_ptr(fcp_rsp_iu), sns_len); ZFCP_HEX_DUMP(ZFCP_LOG_LEVEL_TRACE, - (void *) scpnt-sense_buffer, sns_len); + (void *)scpnt-sense_buffer, sns_len); } /* check for overrun */ ACK for fixing the access to the sense buffer. We are working internally on cleaning up the zfcp messages. With this change, the 'trace' and 'hex dump' messages will disappear. So, could you simply remove the ZFCP_HEX_DUMP message above, instead of fixing it? I can but James has already merged the above patch to scsi-misc. So it would be more convenient for everyone if you could rebase your patchset on top of scsi-misc? - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] aic79xx: fix warnings with CONFIG_PM disabled
CC [M] drivers/scsi/aic7xxx/aic79xx_osm_pci.o drivers/scsi/aic7xxx/aic79xx_osm_pci.c:101: warning: 'ahd_linux_pci_dev_suspend' defined but not used drivers/scsi/aic7xxx/aic79xx_osm_pci.c:121: warning: 'ahd_linux_pci_dev_resume' defined but not used This moves aic79xx_pci_driver struct, removes some forward declarations, and adds some ifdef CONFIG_PM. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- drivers/scsi/aic7xxx/aic79xx.h |5 +++- drivers/scsi/aic7xxx/aic79xx_core.c|2 + drivers/scsi/aic7xxx/aic79xx_osm_pci.c | 33 +++ drivers/scsi/aic7xxx/aic79xx_pci.c |2 + 4 files changed, 20 insertions(+), 22 deletions(-) diff --git a/drivers/scsi/aic7xxx/aic79xx.h b/drivers/scsi/aic7xxx/aic79xx.h index ce638aa..2f00467 100644 --- a/drivers/scsi/aic7xxx/aic79xx.h +++ b/drivers/scsi/aic7xxx/aic79xx.h @@ -1340,8 +1340,10 @@ struct ahd_pci_identity *ahd_find_pci_device(ahd_dev_softc_t); int ahd_pci_config(struct ahd_softc *, struct ahd_pci_identity *); intahd_pci_test_register_access(struct ahd_softc *); +#ifdef CONFIG_PM void ahd_pci_suspend(struct ahd_softc *); void ahd_pci_resume(struct ahd_softc *); +#endif /** SCB and SCB queue management **/ void ahd_qinfifo_requeue_tail(struct ahd_softc *ahd, @@ -1352,8 +1354,10 @@ struct ahd_softc *ahd_alloc(void *platform_arg, char *name); int ahd_softc_init(struct ahd_softc *); voidahd_controller_info(struct ahd_softc *ahd, char *buf); int ahd_init(struct ahd_softc *ahd); +#ifdef CONFIG_PM int ahd_suspend(struct ahd_softc *ahd); voidahd_resume(struct ahd_softc *ahd); +#endif int ahd_default_config(struct ahd_softc *ahd); int ahd_parse_vpddata(struct ahd_softc *ahd, struct vpd_config *vpd); @@ -1361,7 +1365,6 @@ intahd_parse_cfgdata(struct ahd_softc *ahd, struct seeprom_config *sc); voidahd_intr_enable(struct ahd_softc *ahd, int enable); voidahd_pause_and_flushwork(struct ahd_softc *ahd); -int ahd_suspend(struct ahd_softc *ahd); voidahd_set_unit(struct ahd_softc *, int); voidahd_set_name(struct ahd_softc *, char *); struct scb *ahd_get_scb(struct ahd_softc *ahd, u_int col_idx); diff --git a/drivers/scsi/aic7xxx/aic79xx_core.c b/drivers/scsi/aic7xxx/aic79xx_core.c index a7dd8cd..ade0fb8 100644 --- a/drivers/scsi/aic7xxx/aic79xx_core.c +++ b/drivers/scsi/aic7xxx/aic79xx_core.c @@ -7175,6 +7175,7 @@ ahd_pause_and_flushwork(struct ahd_softc *ahd) ahd-flags = ~AHD_ALL_INTERRUPTS; } +#ifdef CONFIG_PM int ahd_suspend(struct ahd_softc *ahd) { @@ -7197,6 +7198,7 @@ ahd_resume(struct ahd_softc *ahd) ahd_intr_enable(ahd, TRUE); ahd_restart(ahd); } +#endif /** Busy Target Table */ /* diff --git a/drivers/scsi/aic7xxx/aic79xx_osm_pci.c b/drivers/scsi/aic7xxx/aic79xx_osm_pci.c index 66f0259..4150c8a 100644 --- a/drivers/scsi/aic7xxx/aic79xx_osm_pci.c +++ b/drivers/scsi/aic7xxx/aic79xx_osm_pci.c @@ -43,17 +43,6 @@ #include aic79xx_inline.h #include aic79xx_pci.h -static int ahd_linux_pci_dev_probe(struct pci_dev *pdev, - const struct pci_device_id *ent); -static int ahd_linux_pci_reserve_io_regions(struct ahd_softc *ahd, -u_long *base, u_long *base2); -static int ahd_linux_pci_reserve_mem_region(struct ahd_softc *ahd, -u_long *bus_addr, -uint8_t __iomem **maddr); -static int ahd_linux_pci_dev_suspend(struct pci_dev *pdev, pm_message_t mesg); -static int ahd_linux_pci_dev_resume(struct pci_dev *pdev); -static voidahd_linux_pci_dev_remove(struct pci_dev *pdev); - /* Define the macro locally since it's different for different class of chips. */ #define ID(x)\ @@ -85,17 +74,7 @@ static struct pci_device_id ahd_linux_pci_id_table[] = { MODULE_DEVICE_TABLE(pci, ahd_linux_pci_id_table); -static struct pci_driver aic79xx_pci_driver = { - .name = aic79xx, - .probe = ahd_linux_pci_dev_probe, #ifdef CONFIG_PM - .suspend= ahd_linux_pci_dev_suspend, - .resume = ahd_linux_pci_dev_resume, -#endif - .remove = ahd_linux_pci_dev_remove, - .id_table = ahd_linux_pci_id_table -}; - static int ahd_linux_pci_dev_suspend(struct pci_dev *pdev, pm_message_t mesg) { @@ -139,6 +118,7 @@ ahd_linux_pci_dev_resume(struct pci_dev *pdev
[PATCH] aic7xxx: fix warnings with CONFIG_PM disabled
CC [M] drivers/scsi/aic7xxx/aic7xxx_osm_pci.o drivers/scsi/aic7xxx/aic7xxx_osm_pci.c:148: warning: 'ahc_linux_pci_dev_suspend' defined but not used drivers/scsi/aic7xxx/aic7xxx_osm_pci.c:166: warning: 'ahc_linux_pci_dev_resume' defined but not used This moves aic7xxx_pci_driver struct, removes some forward declarations, and adds some ifdef CONFIG_PM. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- drivers/scsi/aic7xxx/aic7xxx.h |4 +++ drivers/scsi/aic7xxx/aic7xxx_core.c|3 +- drivers/scsi/aic7xxx/aic7xxx_osm_pci.c | 33 +++ drivers/scsi/aic7xxx/aic7xxx_pci.c |2 + 4 files changed, 20 insertions(+), 22 deletions(-) diff --git a/drivers/scsi/aic7xxx/aic7xxx.h b/drivers/scsi/aic7xxx/aic7xxx.h index 3d4e42d..c0344e6 100644 --- a/drivers/scsi/aic7xxx/aic7xxx.h +++ b/drivers/scsi/aic7xxx/aic7xxx.h @@ -1143,7 +1143,9 @@ struct ahc_pci_identity *ahc_find_pci_device(ahc_dev_softc_t); int ahc_pci_config(struct ahc_softc *, struct ahc_pci_identity *); int ahc_pci_test_register_access(struct ahc_softc *); +#ifdef CONFIG_PM voidahc_pci_resume(struct ahc_softc *ahc); +#endif /*** EISA/VL Front End / struct aic7770_identity *aic7770_find_device(uint32_t); @@ -1170,8 +1172,10 @@ int ahc_chip_init(struct ahc_softc *ahc); int ahc_init(struct ahc_softc *ahc); voidahc_intr_enable(struct ahc_softc *ahc, int enable); voidahc_pause_and_flushwork(struct ahc_softc *ahc); +#ifdef CONFIG_PM int ahc_suspend(struct ahc_softc *ahc); int ahc_resume(struct ahc_softc *ahc); +#endif voidahc_set_unit(struct ahc_softc *, int); voidahc_set_name(struct ahc_softc *, char *); voidahc_alloc_scbs(struct ahc_softc *ahc); diff --git a/drivers/scsi/aic7xxx/aic7xxx_core.c b/drivers/scsi/aic7xxx/aic7xxx_core.c index f350b5e..6d2ae64 100644 --- a/drivers/scsi/aic7xxx/aic7xxx_core.c +++ b/drivers/scsi/aic7xxx/aic7xxx_core.c @@ -5078,6 +5078,7 @@ ahc_pause_and_flushwork(struct ahc_softc *ahc) ahc-flags = ~AHC_ALL_INTERRUPTS; } +#ifdef CONFIG_PM int ahc_suspend(struct ahc_softc *ahc) { @@ -5113,7 +5114,7 @@ ahc_resume(struct ahc_softc *ahc) ahc_restart(ahc); return (0); } - +#endif /** Busy Target Table */ /* * Return the untagged transaction id for a given target/channel lun. diff --git a/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c b/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c index 4488946..dd6e21d 100644 --- a/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c +++ b/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c @@ -42,17 +42,6 @@ #include aic7xxx_osm.h #include aic7xxx_pci.h -static int ahc_linux_pci_dev_probe(struct pci_dev *pdev, - const struct pci_device_id *ent); -static int ahc_linux_pci_reserve_io_region(struct ahc_softc *ahc, - u_long *base); -static int ahc_linux_pci_reserve_mem_region(struct ahc_softc *ahc, -u_long *bus_addr, -uint8_t __iomem **maddr); -static int ahc_linux_pci_dev_suspend(struct pci_dev *pdev, pm_message_t mesg); -static int ahc_linux_pci_dev_resume(struct pci_dev *pdev); -static voidahc_linux_pci_dev_remove(struct pci_dev *pdev); - /* Define the macro locally since it's different for different class of chips. */ #define ID(x) ID_C(x, PCI_CLASS_STORAGE_SCSI) @@ -132,17 +121,7 @@ static struct pci_device_id ahc_linux_pci_id_table[] = { MODULE_DEVICE_TABLE(pci, ahc_linux_pci_id_table); -static struct pci_driver aic7xxx_pci_driver = { - .name = aic7xxx, - .probe = ahc_linux_pci_dev_probe, #ifdef CONFIG_PM - .suspend= ahc_linux_pci_dev_suspend, - .resume = ahc_linux_pci_dev_resume, -#endif - .remove = ahc_linux_pci_dev_remove, - .id_table = ahc_linux_pci_id_table -}; - static int ahc_linux_pci_dev_suspend(struct pci_dev *pdev, pm_message_t mesg) { @@ -182,6 +161,7 @@ ahc_linux_pci_dev_resume(struct pci_dev *pdev) return (ahc_resume(ahc)); } +#endif static void ahc_linux_pci_dev_remove(struct pci_dev *pdev) @@ -289,6 +269,17 @@ ahc_linux_pci_dev_probe(struct pci_dev *pdev, const struct pci_device_id *ent) return (0); } +static struct pci_driver aic7xxx_pci_driver = { + .name = aic7xxx, + .probe = ahc_linux_pci_dev_probe, +#ifdef CONFIG_PM + .suspend= ahc_linux_pci_dev_suspend, + .resume = ahc_linux_pci_dev_resume, +#endif + .remove
[PATCH] hptiop: fix sense_buffer
Sorry, there was another place that I overlooked in the sense buffer conversion. = From: FUJITA Tomonori [EMAIL PROTECTED] Subject: [PATCH] hptiop: fix sense_buffer access bug Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- drivers/scsi/hptiop.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/scsi/hptiop.c b/drivers/scsi/hptiop.c index e7b2f35..890f44f 100644 --- a/drivers/scsi/hptiop.c +++ b/drivers/scsi/hptiop.c @@ -573,7 +573,7 @@ static void hptiop_finish_scsi_req(struct hptiop_hba *hba, u32 tag, scsi_set_resid(scp, scsi_bufflen(scp) - le32_to_cpu(req-dataxfer_length)); scp-result = SAM_STAT_CHECK_CONDITION; - memcpy(scp-sense_buffer, req-sg_list, + memcpy(scp-sense_buffer, req-sg_list, min_t(size_t, SCSI_SENSE_BUFFERSIZE, le32_to_cpu(req-dataxfer_length))); break; -- 1.5.3.4 - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
aic79xx: fix sense_buffer access bug
Ah, I overlooked more LLDs... = From: FUJITA Tomonori [EMAIL PROTECTED] Subject: [PATCH] aic79xx: fix sense_buffer access bug The commit de25deb18016f66dcdede165d07654559bb332bc changed scsi_cmnd.sense_buffer from a static array to a dynamically allocated buffer. We can't access to sense_buffer in 'cmd-sense_buffer' way. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- drivers/scsi/aic7xxx/aic79xx_osm.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/scsi/aic7xxx/aic79xx_osm.c b/drivers/scsi/aic7xxx/aic79xx_osm.c index 0e4708f..3c4efa4 100644 --- a/drivers/scsi/aic7xxx/aic79xx_osm.c +++ b/drivers/scsi/aic7xxx/aic79xx_osm.c @@ -1922,7 +1922,7 @@ ahd_linux_queue_cmd_complete(struct ahd_softc *ahd, struct scsi_cmnd *cmd) struct scsi_sense_data *sense; sense = (struct scsi_sense_data *) - cmd-sense_buffer; + cmd-sense_buffer; if (sense-extra_len = 5 (sense-add_sense_code == 0x47 || sense-add_sense_code == 0x48)) -- 1.5.3.4 - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] zfcp: fix sense_buffer access bug
The commit de25deb18016f66dcdede165d07654559bb332bc changed scsi_cmnd.sense_buffer from a static array to a dynamically allocated buffer. We can't access to sense_buffer in 'cmd-sense_buffer' way. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- drivers/s390/scsi/zfcp_fsf.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c index fe57941..a9a147d 100644 --- a/drivers/s390/scsi/zfcp_fsf.c +++ b/drivers/s390/scsi/zfcp_fsf.c @@ -4224,10 +4224,10 @@ zfcp_fsf_send_fcp_command_task_handler(struct zfcp_fsf_req *fsf_req) ZFCP_LOG_TRACE(%i bytes sense data provided by FCP\n, fcp_rsp_iu-fcp_sns_len); - memcpy(scpnt-sense_buffer, + memcpy(scpnt-sense_buffer, zfcp_get_fcp_sns_info_ptr(fcp_rsp_iu), sns_len); ZFCP_HEX_DUMP(ZFCP_LOG_LEVEL_TRACE, - (void *) scpnt-sense_buffer, sns_len); + (void *)scpnt-sense_buffer, sns_len); } /* check for overrun */ -- 1.5.3.4 - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] ncr53c8xx: fix sense_buffer access bug
The commit de25deb18016f66dcdede165d07654559bb332bc changed scsi_cmnd.sense_buffer from a static array to a dynamically allocated buffer. We can't access to sense_buffer in 'cmd-sense_buffer' way. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- drivers/scsi/ncr53c8xx.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/scsi/ncr53c8xx.c b/drivers/scsi/ncr53c8xx.c index c02771a..c5ebf01 100644 --- a/drivers/scsi/ncr53c8xx.c +++ b/drivers/scsi/ncr53c8xx.c @@ -4967,7 +4967,7 @@ void ncr_complete (struct ncb *np, struct ccb *cp) sizeof(cp-sense_buf))); if (DEBUG_FLAGS (DEBUG_RESULT|DEBUG_TINY)) { - u_char * p = (u_char*) cmd-sense_buffer; + u_char *p = cmd-sense_buffer; int i; PRINT_ADDR(cmd, sense data:); for (i=0; i14; i++) printk ( %x, *p++); -- 1.5.3.4 - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] use the cmd_type of a leading request for scsi_init_sgtable
On Fri, 25 Jan 2008 20:05:55 -0600 James Bottomley [EMAIL PROTECTED] wrote: On Sat, 2008-01-26 at 09:57 +0900, FUJITA Tomonori wrote: This is against the scsi-bidi tree. We need to use the cmd_type of a leading request for scsi_init_sgtable to set up scsi_data_buffer:length of a bidi request properly. An alternative approach is setting the cmd_type of a leading request and its bidi request (*1). But the block layer and scsi-ml don't expect that the leading request and its sub-requests have the different command types. Note that scsi_debug's XDWRITEREAD_10 support is fine without this patch since req-nr_sectors works for it but req-nr_sectors doesn't work for everyone. (*1) http://www.mail-archive.com/linux-scsi@vger.kernel.org/msg12669.html = From: FUJITA Tomonori [EMAIL PROTECTED] Subject: [PATCH] use the cmd_type of a leading request for scsi_init_sgtable We need to use the cmd_type of a leading request for scsi_init_sgtable to set up scsi_data_buffer:length of its bidi request properly. This seems to be a very convoluted work around for the fact that we forgot to set the cmd_type on the subordinate request. Wouldn't this be a better fix? I'm fine with this. I have no big preference in this issue. Acked-by: FUJITA Tomonori [EMAIL PROTECTED] I just thought that the approach (the block layer and scsi-ml look at only the type of a leading request) show better how the block layer and scsi-ml see a leading request and its subordinate requests (all the requests need to have the same command type). James --- diff --git a/block/bsg.c b/block/bsg.c index 69b0a9d..8917c51 100644 --- a/block/bsg.c +++ b/block/bsg.c @@ -279,6 +279,7 @@ bsg_map_hdr(struct bsg_device *bd, struct sg_io_v4 *hdr) goto out; } rq-next_rq = next_rq; + next_rq-cmd_type = rq-cmd_type; dxferp = (void*)(unsigned long)hdr-din_xferp; ret = blk_rq_map_user(q, next_rq, dxferp, hdr-din_xfer_len); - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] use the cmd_type of a leading request for scsi_init_sgtable
On Sat, 26 Jan 2008 11:22:47 +0900 FUJITA Tomonori [EMAIL PROTECTED] wrote: On Fri, 25 Jan 2008 20:05:55 -0600 James Bottomley [EMAIL PROTECTED] wrote: On Sat, 2008-01-26 at 09:57 +0900, FUJITA Tomonori wrote: This is against the scsi-bidi tree. We need to use the cmd_type of a leading request for scsi_init_sgtable to set up scsi_data_buffer:length of a bidi request properly. An alternative approach is setting the cmd_type of a leading request and its bidi request (*1). But the block layer and scsi-ml don't expect that the leading request and its sub-requests have the different command types. Note that scsi_debug's XDWRITEREAD_10 support is fine without this patch since req-nr_sectors works for it but req-nr_sectors doesn't work for everyone. (*1) http://www.mail-archive.com/linux-scsi@vger.kernel.org/msg12669.html = From: FUJITA Tomonori [EMAIL PROTECTED] Subject: [PATCH] use the cmd_type of a leading request for scsi_init_sgtable We need to use the cmd_type of a leading request for scsi_init_sgtable to set up scsi_data_buffer:length of its bidi request properly. This seems to be a very convoluted work around for the fact that we forgot to set the cmd_type on the subordinate request. Wouldn't this be a better fix? I'm fine with this. I have no big preference in this issue. Acked-by: FUJITA Tomonori [EMAIL PROTECTED] I just thought that the approach (the block layer and scsi-ml look at only the type of a leading request) show better how the block layer and scsi-ml see a leading request and its subordinate requests (all the requests need to have the same command type). I meant that the block layer and scsi-ml use its subordinate requests just to hook data buffer. Anyway, I'm fine with your patch. - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] use the cmd_type of a leading request for scsi_init_sgtable
This is against the scsi-bidi tree. We need to use the cmd_type of a leading request for scsi_init_sgtable to set up scsi_data_buffer:length of a bidi request properly. An alternative approach is setting the cmd_type of a leading request and its bidi request (*1). But the block layer and scsi-ml don't expect that the leading request and its sub-requests have the different command types. Note that scsi_debug's XDWRITEREAD_10 support is fine without this patch since req-nr_sectors works for it but req-nr_sectors doesn't work for everyone. (*1) http://www.mail-archive.com/linux-scsi@vger.kernel.org/msg12669.html = From: FUJITA Tomonori [EMAIL PROTECTED] Subject: [PATCH] use the cmd_type of a leading request for scsi_init_sgtable We need to use the cmd_type of a leading request for scsi_init_sgtable to set up scsi_data_buffer:length of its bidi request properly. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- drivers/scsi/scsi_lib.c | 23 ++- 1 files changed, 14 insertions(+), 9 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index e1c7eeb..61093ea 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -1001,8 +1001,9 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) scsi_end_request(cmd, -EIO, this_count, !result); } -static int scsi_init_sgtable(struct request *req, struct scsi_data_buffer *sdb, -gfp_t gfp_mask) +static int scsi_init_sgtable(struct request *req, +enum rq_cmd_type_bits cmd_type, +struct scsi_data_buffer *sdb, gfp_t gfp_mask) { int count; @@ -1015,12 +1016,12 @@ static int scsi_init_sgtable(struct request *req, struct scsi_data_buffer *sdb, } req-buffer = NULL; - if (blk_pc_request(req)) + if (cmd_type == REQ_TYPE_BLOCK_PC) sdb-length = req-data_len; else sdb-length = req-nr_sectors 9; - /* + /* * Next, walk the list, and fill in the addresses and sizes of * each segment. */ @@ -1043,21 +1044,25 @@ static int scsi_init_sgtable(struct request *req, struct scsi_data_buffer *sdb, */ int scsi_init_io(struct scsi_cmnd *cmd, gfp_t gfp_mask) { - int error = scsi_init_sgtable(cmd-request, cmd-sdb, gfp_mask); + int error; + enum rq_cmd_type_bits cmd_type = cmd-request-cmd_type; + + error = scsi_init_sgtable(cmd-request, cmd_type, cmd-sdb, gfp_mask); if (error) goto err_exit; if (blk_bidi_rq(cmd-request)) { - struct scsi_data_buffer *bidi_sdb = kmem_cache_zalloc( - scsi_bidi_sdb_cache, GFP_ATOMIC); + struct scsi_data_buffer *bidi_sdb; + + bidi_sdb = kmem_cache_zalloc(scsi_bidi_sdb_cache, GFP_ATOMIC); if (!bidi_sdb) { error = BLKPREP_DEFER; goto err_exit; } cmd-request-next_rq-special = bidi_sdb; - error = scsi_init_sgtable(cmd-request-next_rq, bidi_sdb, - GFP_ATOMIC); + error = scsi_init_sgtable(cmd-request-next_rq, cmd_type, + bidi_sdb, GFP_ATOMIC); if (error) goto err_exit; } -- 1.5.3.4 - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] destroy scsi_bidi_sdb_cache in scsi_exit_queue
This patchset is against the scsi-bidi tree. = From: FUJITA Tomonori [EMAIL PROTECTED] Subject: [PATCH 1/2] destroy scsi_bidi_sdb_cache in scsi_exit_queue Needs to call kmem_cache_destroy for scsi_bidi_sdb_cache in scsi_exit_queue. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- drivers/scsi/scsi_lib.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index e1c7eeb..7bfec7e 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -1712,6 +1712,7 @@ void scsi_exit_queue(void) int i; kmem_cache_destroy(scsi_io_context_cache); + kmem_cache_destroy(scsi_bidi_sdb_cache); for (i = 0; i SG_MEMPOOL_NR; i++) { struct scsi_host_sg_pool *sgp = scsi_sg_pools + i; -- 1.5.3.4 - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] handle scsi_init_queue failure properly
scsi_init_queue is expected to clean up allocated things when it fails. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- drivers/scsi/scsi_lib.c | 18 +- 1 files changed, 17 insertions(+), 1 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 7bfec7e..b12fb31 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -1682,7 +1682,7 @@ int __init scsi_init_queue(void) 0, 0, NULL); if (!scsi_bidi_sdb_cache) { printk(KERN_ERR SCSI: can't init scsi bidi sdb cache\n); - return -ENOMEM; + goto cleanup_io_context; } for (i = 0; i SG_MEMPOOL_NR; i++) { @@ -1694,6 +1694,7 @@ int __init scsi_init_queue(void) if (!sgp-slab) { printk(KERN_ERR SCSI: can't init sg slab %s\n, sgp-name); + goto cleanup_bidi_sdb; } sgp-pool = mempool_create_slab_pool(SG_MEMPOOL_SIZE, @@ -1701,10 +1702,25 @@ int __init scsi_init_queue(void) if (!sgp-pool) { printk(KERN_ERR SCSI: can't init sg mempool %s\n, sgp-name); + goto cleanup_bidi_sdb; } } return 0; + +cleanup_bidi_sdb: + for (i = 0; i SG_MEMPOOL_NR; i++) { + struct scsi_host_sg_pool *sgp = scsi_sg_pools + i; + if (sgp-pool) + mempool_destroy(sgp-pool); + if (sgp-slab) + kmem_cache_destroy(sgp-slab); + } + kmem_cache_destroy(scsi_bidi_sdb_cache); +cleanup_io_context: + kmem_cache_destroy(scsi_io_context_cache); + + return -ENOMEM; } void scsi_exit_queue(void) -- 1.5.3.4 - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] ch: fix device minor number management bug
Oops, sorry, I sent the patches to linux-kernel... = From: FUJITA Tomonori [EMAIL PROTECTED] Subject: [PATCH 1/2] ch: fix device minor number management bug ch_probe uses the total number of ch devices as minor. ch_probe: ch-minor = ch_devcount; ... ch_devcount++; Then ch_remove decreases ch_devcount: ch_remove: ch_devcount--; If you have two ch devices, sch0 and sch1, and remove sch0, ch_devcount is 1. Then if you add another ch device, ch_probe tries to create sch1. So you get a warning and fail to create sch1: Jan 24 16:01:05 nice kernel: sysfs: duplicate filename 'sch1' can not be created Jan 24 16:01:05 nice kernel: WARNING: at fs/sysfs/dir.c:424 sysfs_add_one() Jan 24 16:01:05 nice kernel: Pid: 2571, comm: iscsid Not tainted 2.6.24-rc7-ga3d2c2e8-dirty #1 Jan 24 16:01:05 nice kernel: Jan 24 16:01:05 nice kernel: Call Trace: Jan 24 16:01:05 nice kernel: [802a22b8] sysfs_add_one+0x54/0xbd Jan 24 16:01:05 nice kernel: [802a283c] create_dir+0x4f/0x87 Jan 24 16:01:05 nice kernel: [802a28a9] sysfs_create_dir+0x35/0x4a Jan 24 16:01:05 nice kernel: [803069a1] kobject_get+0x12/0x17 Jan 24 16:01:05 nice kernel: [80306ece] kobject_add+0xf3/0x1a6 Jan 24 16:01:05 nice kernel: [8034252b] class_device_add+0xaa/0x39d Jan 24 16:01:05 nice kernel: [803428fb] class_device_create+0xcb/0xfa Jan 24 16:01:05 nice kernel: [80229e09] printk+0x4e/0x56 Jan 24 16:01:05 nice kernel: [802a2054] sysfs_ilookup_test+0x0/0xf Jan 24 16:01:05 nice kernel: [88022580] :ch:ch_probe+0xbe/0x61a (snip) This patch converts ch to use a standard minor number management way, idr like sg and bsg. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- drivers/scsi/ch.c | 71 +--- 1 files changed, 40 insertions(+), 31 deletions(-) diff --git a/drivers/scsi/ch.c b/drivers/scsi/ch.c index 765f2fc..2b07014 100644 --- a/drivers/scsi/ch.c +++ b/drivers/scsi/ch.c @@ -21,6 +21,7 @@ #include linux/compat.h #include linux/chio.h/* here are all the ioctls */ #include linux/mutex.h +#include linux/idr.h #include scsi/scsi.h #include scsi/scsi_cmnd.h @@ -33,6 +34,7 @@ #define CH_DT_MAX 16 #define CH_TYPES8 +#define CH_MAX_DEVS 128 MODULE_DESCRIPTION(device driver for scsi media changer devices); MODULE_AUTHOR(Gerd Knorr [EMAIL PROTECTED]); @@ -113,9 +115,8 @@ typedef struct { struct mutexlock; } scsi_changer; -static LIST_HEAD(ch_devlist); -static DEFINE_SPINLOCK(ch_devlist_lock); -static int ch_devcount; +static DEFINE_IDR(ch_index_idr); +static DEFINE_SPINLOCK(ch_index_lock); static struct scsi_driver ch_template = { @@ -598,20 +599,17 @@ ch_release(struct inode *inode, struct file *file) static int ch_open(struct inode *inode, struct file *file) { - scsi_changer *tmp, *ch; + scsi_changer *ch; int minor = iminor(inode); - spin_lock(ch_devlist_lock); - ch = NULL; - list_for_each_entry(tmp,ch_devlist,list) { - if (tmp-minor == minor) - ch = tmp; - } + spin_lock(ch_index_lock); + ch = idr_find(ch_index_idr, minor); + if (NULL == ch || scsi_device_get(ch-device)) { - spin_unlock(ch_devlist_lock); + spin_unlock(ch_index_lock); return -ENXIO; } - spin_unlock(ch_devlist_lock); + spin_unlock(ch_index_lock); file-private_data = ch; return 0; @@ -914,6 +912,7 @@ static int ch_probe(struct device *dev) { struct scsi_device *sd = to_scsi_device(dev); struct class_device *class_dev; + int minor, ret = -ENOMEM; scsi_changer *ch; if (sd-type != TYPE_MEDIUM_CHANGER) @@ -923,7 +922,22 @@ static int ch_probe(struct device *dev) if (NULL == ch) return -ENOMEM; - ch-minor = ch_devcount; + if (!idr_pre_get(ch_index_idr, GFP_KERNEL)) + goto free_ch; + + spin_lock(ch_index_lock); + ret = idr_get_new(ch_index_idr, ch, minor); + spin_unlock(ch_index_lock); + + if (ret) + goto free_ch; + + if (minor CH_MAX_DEVS) { + ret = -ENODEV; + goto remove_idr; + } + + ch-minor = minor; sprintf(ch-name,ch%d,ch-minor); class_dev = class_device_create(ch_sysfs_class, NULL, @@ -932,8 +946,8 @@ static int ch_probe(struct device *dev) if (IS_ERR(class_dev)) { printk(KERN_WARNING ch%d: class_device_create failed\n, ch-minor); - kfree(ch); - return PTR_ERR(class_dev); + ret = PTR_ERR(class_dev); + goto remove_idr; } mutex_init(ch-lock); @@ -942,35 +956,29 @@ static int ch_probe(struct device *dev) if (init) ch_init_elem(ch
[PATCH 2/2] ch: remove forward declarations
This moves ch_template and changer_fops structs to the end of file and removes forward declarations. This also removes some trailing whitespace. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- drivers/scsi/ch.c | 120 - 1 files changed, 54 insertions(+), 66 deletions(-) diff --git a/drivers/scsi/ch.c b/drivers/scsi/ch.c index 2b07014..7aad154 100644 --- a/drivers/scsi/ch.c +++ b/drivers/scsi/ch.c @@ -90,16 +90,6 @@ static const char * vendor_labels[CH_TYPES-4] = { #define MAX_RETRIES 1 -static int ch_probe(struct device *); -static int ch_remove(struct device *); -static int ch_open(struct inode * inode, struct file * filp); -static int ch_release(struct inode * inode, struct file * filp); -static long ch_ioctl(struct file *filp, unsigned int cmd, unsigned long arg); -#ifdef CONFIG_COMPAT -static long ch_ioctl_compat(struct file * filp, - unsigned int cmd, unsigned long arg); -#endif - static struct class * ch_sysfs_class; typedef struct { @@ -118,27 +108,6 @@ typedef struct { static DEFINE_IDR(ch_index_idr); static DEFINE_SPINLOCK(ch_index_lock); -static struct scsi_driver ch_template = -{ - .owner = THIS_MODULE, - .gendrv = { - .name = ch, - .probe = ch_probe, - .remove = ch_remove, - }, -}; - -static const struct file_operations changer_fops = -{ - .owner = THIS_MODULE, - .open = ch_open, - .release= ch_release, - .unlocked_ioctl = ch_ioctl, -#ifdef CONFIG_COMPAT - .compat_ioctl = ch_ioctl_compat, -#endif -}; - static const struct { unsigned char sense; unsigned char asc; @@ -207,7 +176,7 @@ ch_do_scsi(scsi_changer *ch, unsigned char *cmd, { int errno, retries = 0, timeout, result; struct scsi_sense_hdr sshdr; - + timeout = (cmd[0] == INITIALIZE_ELEMENT_STATUS) ? timeout_init : timeout_move; @@ -245,7 +214,7 @@ static int ch_elem_to_typecode(scsi_changer *ch, u_int elem) { int i; - + for (i = 0; i CH_TYPES; i++) { if (elem = ch-firsts[i] elem ch-firsts[i] + @@ -261,15 +230,15 @@ ch_read_element_status(scsi_changer *ch, u_int elem, char *data) u_char cmd[12]; u_char *buffer; int result; - + buffer = kmalloc(512, GFP_KERNEL | GFP_DMA); if(!buffer) return -ENOMEM; - + retry: memset(cmd,0,sizeof(cmd)); cmd[0] = READ_ELEMENT_STATUS; - cmd[1] = (ch-device-lun 5) | + cmd[1] = (ch-device-lun 5) | (ch-voltags ? 0x10 : 0) | ch_elem_to_typecode(ch,elem); cmd[2] = (elem 8) 0xff; @@ -296,7 +265,7 @@ ch_read_element_status(scsi_changer *ch, u_int elem, char *data) return result; } -static int +static int ch_init_elem(scsi_changer *ch) { int err; @@ -322,7 +291,7 @@ ch_readconfig(scsi_changer *ch) buffer = kzalloc(512, GFP_KERNEL | GFP_DMA); if (!buffer) return -ENOMEM; - + memset(cmd,0,sizeof(cmd)); cmd[0] = MODE_SENSE; cmd[1] = ch-device-lun 5; @@ -365,7 +334,7 @@ ch_readconfig(scsi_changer *ch) } else { vprintk(reading element address assigment page failed!\n); } - + /* vendor specific element types */ for (i = 0; i 4; i++) { if (0 == vendor_counts[i]) @@ -443,7 +412,7 @@ static int ch_position(scsi_changer *ch, u_int trans, u_int elem, int rotate) { u_char cmd[10]; - + dprintk(position: 0x%x\n,elem); if (0 == trans) trans = ch-firsts[CHET_MT]; @@ -462,7 +431,7 @@ static int ch_move(scsi_changer *ch, u_int trans, u_int src, u_int dest, int rotate) { u_char cmd[12]; - + dprintk(move: 0x%x = 0x%x\n,src,dest); if (0 == trans) trans = ch-firsts[CHET_MT]; @@ -484,7 +453,7 @@ ch_exchange(scsi_changer *ch, u_int trans, u_int src, u_int dest1, u_int dest2, int rotate1, int rotate2) { u_char cmd[12]; - + dprintk(exchange: 0x%x = 0x%x = 0x%x\n, src,dest1,dest2); if (0 == trans) @@ -501,7 +470,7 @@ ch_exchange(scsi_changer *ch, u_int trans, u_int src, cmd[8] = (dest2 8) 0xff; cmd[9] = dest20xff; cmd[10] = (rotate1 ? 1 : 0) | (rotate2 ? 2 : 0); - + return ch_do_scsi(ch, cmd, NULL,0, DMA_NONE); } @@ -539,14 +508,14 @@ ch_set_voltag(scsi_changer *ch, u_int elem, elem, tag); memset(cmd,0,sizeof(cmd)); cmd[0] = SEND_VOLUME_TAG; - cmd[1] = (ch-device-lun 5) | + cmd[1] = (ch-device-lun 5) | ch_elem_to_typecode(ch,elem); cmd[2] = (elem 8) 0xff; cmd[3] = elem
Re: Performance of SCST versus STGT
On Tue, 22 Jan 2008 14:33:13 +0300 Vladislav Bolkhovitin [EMAIL PROTECTED] wrote: FUJITA Tomonori wrote: The big problem of stgt iSER is disk I/Os (move data between disk and page cache). We need a proper asynchronous I/O mechanism, however, Linux doesn't provide such and we use a workaround, which incurs large latency. I guess, we cannot solve this until syslets is merged into mainline. Hmm, SCST also doesn't have ability to use asynchronous I/O, but that doesn't prevent it from showing good performance. I don't know how SCST performs I/Os, but surely, in kernel space, you can performs I/Os asynchronously. Or you use an event notification mechanism with multiple kernel threads performing I/Os synchronously. Xen blktap has the same problem as stgt. IIRC, Xen mainline uses a kernel patch to add a proper event notification to AIO though redhat uses the same workaround as stgt instead of applying the kernel patch. - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/3] scsi_debug: add get_data_transfer_info helper function
This adds get_data_transfer_info helper function that get lha and sectors for READ_* and WRITE_* commands (and XDWRITEREAD_10 later). Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- drivers/scsi/scsi_debug.c | 83 1 files changed, 38 insertions(+), 45 deletions(-) diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c index 82c06f0..31f7378 100644 --- a/drivers/scsi/scsi_debug.c +++ b/drivers/scsi/scsi_debug.c @@ -311,12 +311,47 @@ static void sdebug_max_tgts_luns(void); static struct device pseudo_primary; static struct bus_type pseudo_lld_bus; +static void get_data_transfer_info(unsigned char *cmd, + unsigned long long *lba, unsigned int *num) +{ + int i; + + switch (*cmd) { + case WRITE_16: + case READ_16: + for (*lba = 0, i = 0; i 8; ++i) { + if (i 0) + *lba = 8; + *lba += cmd[2 + i]; + } + *num = cmd[13] + (cmd[12] 8) + + (cmd[11] 16) + (cmd[10] 24); + break; + case WRITE_12: + case READ_12: + *lba = cmd[5] + (cmd[4] 8) + (cmd[3] 16) + (cmd[2] 24); + *num = cmd[9] + (cmd[8] 8) + (cmd[7] 16) + (cmd[6] 24); + break; + case WRITE_10: + case READ_10: + *lba = cmd[5] + (cmd[4] 8) + (cmd[3] 16) + (cmd[2] 24); + *num = cmd[8] + (cmd[7] 8); + break; + case WRITE_6: + case READ_6: + *lba = cmd[3] + (cmd[2] 8) + ((cmd[1] 0x1f) 16); + *num = (0 == cmd[4]) ? 256 : cmd[4]; + break; + default: + break; + } +} static int scsi_debug_queuecommand(struct scsi_cmnd * SCpnt, done_funct_t done) { unsigned char *cmd = (unsigned char *) SCpnt-cmnd; - int len, k, j; + int len, k; unsigned int num; unsigned long long lba; int errsts = 0; @@ -452,28 +487,7 @@ int scsi_debug_queuecommand(struct scsi_cmnd * SCpnt, done_funct_t done) break; if (scsi_debug_fake_rw) break; - if ((*cmd) == READ_16) { - for (lba = 0, j = 0; j 8; ++j) { - if (j 0) - lba = 8; - lba += cmd[2 + j]; - } - num = cmd[13] + (cmd[12] 8) + - (cmd[11] 16) + (cmd[10] 24); - } else if ((*cmd) == READ_12) { - lba = cmd[5] + (cmd[4] 8) + - (cmd[3] 16) + (cmd[2] 24); - num = cmd[9] + (cmd[8] 8) + - (cmd[7] 16) + (cmd[6] 24); - } else if ((*cmd) == READ_10) { - lba = cmd[5] + (cmd[4] 8) + - (cmd[3] 16) + (cmd[2] 24); - num = cmd[8] + (cmd[7] 8); - } else {/* READ (6) */ - lba = cmd[3] + (cmd[2] 8) + - ((cmd[1] 0x1f) 16); - num = (0 == cmd[4]) ? 256 : cmd[4]; - } + get_data_transfer_info(cmd, lba, num); errsts = resp_read(SCpnt, lba, num, devip); if (inj_recovered (0 == errsts)) { mk_sense_buffer(devip, RECOVERED_ERROR, @@ -500,28 +514,7 @@ int scsi_debug_queuecommand(struct scsi_cmnd * SCpnt, done_funct_t done) break; if (scsi_debug_fake_rw) break; - if ((*cmd) == WRITE_16) { - for (lba = 0, j = 0; j 8; ++j) { - if (j 0) - lba = 8; - lba += cmd[2 + j]; - } - num = cmd[13] + (cmd[12] 8) + - (cmd[11] 16) + (cmd[10] 24); - } else if ((*cmd) == WRITE_12) { - lba = cmd[5] + (cmd[4] 8) + - (cmd[3] 16) + (cmd[2] 24); - num = cmd[9] + (cmd[8] 8) + - (cmd[7] 16) + (cmd[6] 24); - } else if ((*cmd) == WRITE_10) { - lba = cmd[5] + (cmd[4] 8) + - (cmd[3] 16) + (cmd[2] 24); - num = cmd[8] + (cmd[7] 8); - } else {/* WRITE (6) */ - lba = cmd[3] + (cmd[2] 8) + - ((cmd[1] 0x1f) 16); - num = (0 == cmd[4]) ? 256 : cmd[4]; - } + get_data_transfer_info(cmd, lba, num
Re: [PATCH 0/3] update bidirectional series to sit on top of sg_table
From: James Bottomley [EMAIL PROTECTED] Subject: [PATCH 0/3] update bidirectional series to sit on top of sg_table Date: Fri, 11 Jan 2008 21:09:00 -0600 OK, I suppose in the scheme of things, it's my turn to bear some of the pain. the SCSI bidirectional series rejects pretty badly with sg_table, and since sg_table has to go in, I rebased the series on top of it. Additionally, I tidied up the patches to take advantages of some of the features of sg_table. I killed both use_sg and sg_count in favour of using sg_table.nseg for the count. Just so you can test all of this to make sure I got it right, you can pull the patch series from master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-bidi-2.6.git I've just started to look at the bidi tree. I suppose that only panasas guys have tested the bidi tree with their OSD target devices. To test the bidi tree, I added XDWRITEREAD_10 support to scsi_debug and sgv4_xdwriteread tool to my makeshift bsg tool collections: git://git.kernel.org/pub/scm/linux/kernel/git/tomo/sgv4-tools.git It just sends XDWRITEREAD_10 commands like this: tulip:~# sgv4_xdwriteread --length 16384 /sys/class/bsg/1:0:0:0 driver:0, transport:0, device:0, din_resid: 0, dout_resid: 0 No errors. I'll send the patchset (over the bidi tree) to add XDWRITEREAD_10 support to scsi_debug though I'm not sure whether it's worth adding it to mainline. - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/3] scsi_debug: add bidi data transfer support
This enables fill_from_dev_buffer and fetch_to_dev_buffer to handle bidi commands. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- drivers/scsi/scsi_debug.c | 21 ++--- 1 files changed, 10 insertions(+), 11 deletions(-) diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c index 31f7378..d810aa7 100644 --- a/drivers/scsi/scsi_debug.c +++ b/drivers/scsi/scsi_debug.c @@ -594,18 +594,18 @@ static int fill_from_dev_buffer(struct scsi_cmnd * scp, unsigned char * arr, int k, req_len, act_len, len, active; void * kaddr; void * kaddr_off; - struct scatterlist * sg; + struct scatterlist *sg; + struct scsi_data_buffer *sdb = scsi_in(scp); - if (0 == scsi_bufflen(scp)) + if (!sdb-length) return 0; - if (NULL == scsi_sglist(scp)) + if (!sdb-table.sgl) return (DID_ERROR 16); - if (! ((scp-sc_data_direction == DMA_BIDIRECTIONAL) || - (scp-sc_data_direction == DMA_FROM_DEVICE))) + if (!(scsi_bidi_cmnd(scp) || scp-sc_data_direction == DMA_FROM_DEVICE)) return (DID_ERROR 16); active = 1; req_len = act_len = 0; - scsi_for_each_sg(scp, sg, scsi_sg_count(scp), k) { + for_each_sg(sdb-table.sgl, sg, sdb-table.nents, k) { if (active) { kaddr = (unsigned char *) kmap_atomic(sg_page(sg), KM_USER0); @@ -623,10 +623,10 @@ static int fill_from_dev_buffer(struct scsi_cmnd * scp, unsigned char * arr, } req_len += sg-length; } - if (scsi_get_resid(scp)) - scsi_set_resid(scp, scsi_get_resid(scp) - act_len); + if (sdb-resid) + sdb-resid -= act_len; else - scsi_set_resid(scp, req_len - act_len); + sdb-resid = req_len - act_len; return 0; } @@ -643,8 +643,7 @@ static int fetch_to_dev_buffer(struct scsi_cmnd * scp, unsigned char * arr, return 0; if (NULL == scsi_sglist(scp)) return -1; - if (! ((scp-sc_data_direction == DMA_BIDIRECTIONAL) || - (scp-sc_data_direction == DMA_TO_DEVICE))) + if (!(scsi_bidi_cmnd(scp) || scp-sc_data_direction == DMA_TO_DEVICE)) return -1; req_len = fin = 0; scsi_for_each_sg(scp, sg, scsi_sg_count(scp), k) { -- 1.5.3.4 - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/3] scsi_debug: add XDWRITEREAD_10 support
Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- drivers/scsi/scsi_debug.c | 70 + include/scsi/scsi.h |1 + 2 files changed, 71 insertions(+), 0 deletions(-) diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c index d810aa7..1541c17 100644 --- a/drivers/scsi/scsi_debug.c +++ b/drivers/scsi/scsi_debug.c @@ -280,6 +280,8 @@ static int resp_write(struct scsi_cmnd * SCpnt, unsigned long long lba, unsigned int num, struct sdebug_dev_info * devip); static int resp_report_luns(struct scsi_cmnd * SCpnt, struct sdebug_dev_info * devip); +static int resp_xdwriteread(struct scsi_cmnd *scp, unsigned long long lba, + unsigned int num, struct sdebug_dev_info *devip); static int fill_from_dev_buffer(struct scsi_cmnd * scp, unsigned char * arr, int arr_len); static int fetch_to_dev_buffer(struct scsi_cmnd * scp, unsigned char * arr, @@ -334,6 +336,7 @@ static void get_data_transfer_info(unsigned char *cmd, break; case WRITE_10: case READ_10: + case XDWRITEREAD_10: *lba = cmd[5] + (cmd[4] 8) + (cmd[3] 16) + (cmd[2] 24); *num = cmd[8] + (cmd[7] 8); break; @@ -542,6 +545,28 @@ int scsi_debug_queuecommand(struct scsi_cmnd * SCpnt, done_funct_t done) case WRITE_BUFFER: errsts = check_readiness(SCpnt, 1, devip); break; + case XDWRITEREAD_10: + if (!scsi_bidi_cmnd(SCpnt)) { + mk_sense_buffer(devip, ILLEGAL_REQUEST, + INVALID_FIELD_IN_CDB, 0); + errsts = check_condition_result; + break; + } + + errsts = check_readiness(SCpnt, 0, devip); + if (errsts) + break; + if (scsi_debug_fake_rw) + break; + get_data_transfer_info(cmd, lba, num); + errsts = resp_read(SCpnt, lba, num, devip); + if (errsts) + break; + errsts = resp_write(SCpnt, lba, num, devip); + if (errsts) + break; + errsts = resp_xdwriteread(SCpnt, lba, num, devip); + break; default: if (SCSI_DEBUG_OPT_NOISE scsi_debug_opts) printk(KERN_INFO scsi_debug: Opcode: 0x%x not @@ -1948,6 +1973,50 @@ static int resp_report_luns(struct scsi_cmnd * scp, min((int)alloc_len, SDEBUG_RLUN_ARR_SZ)); } +static int resp_xdwriteread(struct scsi_cmnd *scp, unsigned long long lba, + unsigned int num, struct sdebug_dev_info *devip) +{ + int i, j, ret = -1; + unsigned char *kaddr, *buf; + unsigned int offset; + struct scatterlist *sg; + struct scsi_data_buffer *sdb = scsi_in(scp); + + /* better not to use temporary buffer. */ + buf = kmalloc(scsi_bufflen(scp), GFP_ATOMIC); + if (!buf) + return ret; + + offset = 0; + scsi_for_each_sg(scp, sg, scsi_sg_count(scp), i) { + kaddr = (unsigned char *)kmap_atomic(sg_page(sg), KM_USER0); + if (!kaddr) + goto out; + + memcpy(buf + offset, kaddr + sg-offset, sg-length); + offset += sg-length; + kunmap_atomic(kaddr, KM_USER0); + } + + offset = 0; + for_each_sg(sdb-table.sgl, sg, sdb-table.nents, i) { + kaddr = (unsigned char *)kmap_atomic(sg_page(sg), KM_USER0); + if (!kaddr) + goto out; + + for (j = 0; j sg-length; j++) + *(kaddr + sg-offset + j) ^= *(buf + offset + j); + + offset += sg-length; + kunmap_atomic(kaddr, KM_USER0); + } + ret = 0; +out: + kfree(buf); + + return ret; +} + /* When timer goes off this function is called. */ static void timer_intr_handler(unsigned long indx) { @@ -1981,6 +2050,7 @@ static int scsi_debug_slave_alloc(struct scsi_device * sdp) if (SCSI_DEBUG_OPT_NOISE scsi_debug_opts) printk(KERN_INFO scsi_debug: slave_alloc %u %u %u %u\n, sdp-host-host_no, sdp-channel, sdp-id, sdp-lun); + set_bit(QUEUE_FLAG_BIDI, sdp-request_queue-queue_flags); return 0; } diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h index 056c5af..19ca9e9 100644 --- a/include/scsi/scsi.h +++ b/include/scsi/scsi.h @@ -102,6 +102,7 @@ extern const unsigned char scsi_command_size[8]; #define READ_TOC 0x43 #define LOG_SELECT0x4c #define LOG_SENSE 0x4d +#define XDWRITEREAD_100x53 #define MODE_SELECT_100x55 #define RESERVE_10
Re: Performance of SCST versus STGT
On Sun, 20 Jan 2008 10:36:18 +0100 Bart Van Assche [EMAIL PROTECTED] wrote: On Jan 18, 2008 1:08 PM, Vladislav Bolkhovitin [EMAIL PROTECTED] wrote: [ ... ] So, seems I understood your slides correctly: the more valuable data for our SCST SRP vs STGT iSER comparison should be on page 26 for 1 command read (~480MB/s, i.e. ~60% from Bart's result on the equivalent hardware). At least in my tests SCST performed significantly better than STGT. These tests were performed with the currently available implementations of SCST and STGT. Which performance improvements are First, I recommend you to examine iSER stuff more since it has some parameters unlike SRP, which effects the performance, IIRC. At least, you could get the iSER performances similar to Pete's. possible for these projects (e.g. zero-copying), and by how much is it expected that these performance improvements will increase throughput and will decrease latency ? The major bottleneck about RDMA transfer is registering the buffer before transfer. stgt's iSER driver has pre-registered buffers and move data between page cache and thsse buffers, and then does RDMA transfer. The big problem of stgt iSER is disk I/Os (move data between disk and page cache). We need a proper asynchronous I/O mechanism, however, Linux doesn't provide such and we use a workaround, which incurs large latency. I guess, we cannot solve this until syslets is merged into mainline. The above approach still needs one memory copy (between the pre-registered buffers and page cahce). If we need more performance, we have to implement a new caching mechanism using the pre-registered buffers instead of just using page cache. AIO with O_DIRECT enables us to implement such caching mechanism (we can use eventfd so we don't need something like syslets, that is, we can implement such now). I'm not sure someone will implement such RDMA caching mechanism for stgt. Pete and his colleagues implemented stgt iSER driver (thanks!) but they are not interested in block I/Os (they are OSD people). - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3] use dynamically allocated sense buffer
On Sun, 20 Jan 2008 21:37:41 -0700 Matthew Wilcox [EMAIL PROTECTED] wrote: On Mon, Jan 21, 2008 at 01:08:58PM +0900, FUJITA Tomonori wrote: On Sun, 20 Jan 2008 09:40:11 -0700 Matthew Wilcox [EMAIL PROTECTED] wrote: Longer-term, I want to allow low-level drivers to allocate the sense_buffer themselves so they can DMA directly into it (ie grown-up dma mapping, rather than this quaint x86 __GFP_DMA). This patch doesn't get Yeah, I think that the approach is one of candidates. If we go with it, I think that the major issue is that LLDs don't know when they can reclaim sense_buffer from scsi-ml; scsi-ml uses sense_buffer after scmd-scsi_done. The midlayer would call a function in the scsi_host_template to free the command. The sense_buffer would be freed at the same time. Yeah, it would work though I'm a bit concerned about adding another phase to the scsi_cmnd interaction between the midlayer and LLDs. Another possible option is that putting some sense_buffer information to the host template and the midlayer allocates sense buffers for LLDs. LLDs can ask for them when it's necessary. One disadvantage of this is the many LLDs have sense buffer in their own data structure so it's not fit well for these LLDs, I think. - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3] use dynamically allocated sense buffer
On Sun, 20 Jan 2008 09:40:11 -0700 Matthew Wilcox [EMAIL PROTECTED] wrote: On Wed, Jan 16, 2008 at 01:32:17PM +0900, FUJITA Tomonori wrote: This removes static array sense_buffer in scsi_cmnd and uses dynamically allocated sense_buffer (with GFP_DMA). The reason for doing this is that some architectures need cacheline aligned buffer for DMA: http://lkml.org/lkml/2007/11/19/2 The problems are that scsi_eh_prep_cmnd puts scsi_cmnd::sense_buffer to sglist and some LLDs directly DMA to scsi_cmnd::sense_buffer. It's necessary to DMA to scsi_cmnd::sense_buffer safely. This patch solves these issues. __scsi_get_command allocates sense_buffer via kmem_cache_alloc and attaches it to a scsi_cmnd so everything just work as before. I think this is fine for the moment. Longer-term, I want to allow low-level drivers to allocate the sense_buffer themselves so they can DMA directly into it (ie grown-up dma mapping, rather than this quaint x86 __GFP_DMA). This patch doesn't get Yeah, I think that the approach is one of candidates. If we go with it, I think that the major issue is that LLDs don't know when they can reclaim sense_buffer from scsi-ml; scsi-ml uses sense_buffer after scmd-scsi_done. us any closer to that, but it doesn't get us further away from it either. - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3] use dynamically allocated sense buffer
On Sun, 20 Jan 2008 10:36:56 -0600 James Bottomley [EMAIL PROTECTED] wrote: On Wed, 2008-01-16 at 13:32 +0900, FUJITA Tomonori wrote: This is the third version of: http://marc.info/?l=linux-scsim=120038907123706w=2 [...] diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c index 54ff611..0a4a5b8 100644 --- a/drivers/scsi/scsi.c +++ b/drivers/scsi/scsi.c @@ -186,6 +190,21 @@ struct scsi_cmnd *__scsi_get_command(struct Scsi_Host *shost, gfp_t gfp_mask) list_del_init(cmd-list); } spin_unlock_irqrestore(shost-free_list_lock, flags); + + if (cmd) { + buf = cmd-sense_buffer; + memset(cmd, 0, sizeof(*cmd)); + cmd-sense_buffer = buf; + } + } else { + buf = kmem_cache_alloc(sense_buffer_slab, __GFP_DMA|gfp_mask); This is going to cause the enterprise some angst because ZONE_DMA can be very small, and the enterprise requrements for commands in flight very large. I also think its unnecessary if we know the host isn't requiring ISA DMA. Yes, I should have done properly. How about the below to fix this, it's based on the existing infrastructure for solving the very same problem. Looks nice. Integrating sense_buffer_pool into struct scsi_host_cmd_pool looks much better. James --- From e7ffbd81684779f5eb41d66d5f499b82af40e12b Mon Sep 17 00:00:00 2001 From: James Bottomley [EMAIL PROTECTED] Date: Sun, 20 Jan 2008 09:28:24 -0600 Subject: [SCSI] don't use __GFP_DMA for sense buffers if not required Only hosts which actually have ISA DMA requirements need sense buffers coming out of ZONE_DMA, so only use the __GFP_DMA flag for that case to avoid allocating this scarce resource if it's not necessary. Signed-off-by: James Bottomley [EMAIL PROTECTED] --- drivers/scsi/hosts.c |9 + drivers/scsi/scsi.c | 106 +++-- drivers/scsi/scsi_priv.h |2 - 3 files changed, 46 insertions(+), 71 deletions(-) (snip) @@ -310,7 +313,6 @@ int scsi_setup_command_freelist(struct Scsi_Host *shost) { struct scsi_host_cmd_pool *pool; struct scsi_cmnd *cmd; - unsigned char *sense_buffer; spin_lock_init(shost-free_list_lock); INIT_LIST_HEAD(shost-free_list); @@ -322,10 +324,13 @@ int scsi_setup_command_freelist(struct Scsi_Host *shost) mutex_lock(host_cmd_pool_mutex); pool = (shost-unchecked_isa_dma ? scsi_cmd_dma_pool : scsi_cmd_pool); if (!pool-users) { - pool-slab = kmem_cache_create(pool-name, - sizeof(struct scsi_cmnd), 0, - pool-slab_flags, NULL); - if (!pool-slab) + pool-cmd_slab = kmem_cache_create(pool-cmd_name, +sizeof(struct scsi_cmnd), 0, +pool-slab_flags, NULL); + pool-sense_slab = kmem_cache_create(pool-sense_name, + SCSI_SENSE_BUFFERSIZE, 0, + pool-slab_flags, NULL); + if (!pool-cmd_slab || !pool-sense_slab) goto fail; If one of the above allocations fails, the allocated slab leaks? diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c index 045e69d..1a9fba6 100644 --- a/drivers/scsi/scsi.c +++ b/drivers/scsi/scsi.c @@ -327,11 +327,16 @@ int scsi_setup_command_freelist(struct Scsi_Host *shost) pool-cmd_slab = kmem_cache_create(pool-cmd_name, sizeof(struct scsi_cmnd), 0, pool-slab_flags, NULL); + if (!pool-cmd_slab) + goto fail; + pool-sense_slab = kmem_cache_create(pool-sense_name, SCSI_SENSE_BUFFERSIZE, 0, pool-slab_flags, NULL); - if (!pool-cmd_slab || !pool-sense_slab) + if (!pool-sense_slab) { + kmem_cache_destroy(pool-cmd_slab); goto fail; + } } pool-users++; - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] remove use_sg_chaining
On Sun, 20 Jan 2008 21:54:21 +0200 Boaz Harrosh [EMAIL PROTECTED] wrote: On Sun, Jan 20 2008 at 21:24 +0200, James Bottomley [EMAIL PROTECTED] wrote: On Sun, 2008-01-20 at 21:18 +0200, Boaz Harrosh wrote: On Tue, Jan 15 2008 at 19:52 +0200, James Bottomley [EMAIL PROTECTED] wrote: this patch depends on the sg branch of the block tree James --- From: James Bottomley [EMAIL PROTECTED] Date: Tue, 15 Jan 2008 11:11:46 -0600 Subject: remove use_sg_chaining With the sg table code, every SCSI driver is now either chain capable or broken, so there's no need to have a check in the host template. Also tidy up the code by moving the scatterlist size defines into the SCSI includes and permit the last entry of the scatterlist pools not to be a power of two. --- I have a theoretical problem that BUGed me from the beginning. Could it happen that a memory critical IO, (that is needed to free memory), be collected into an sg-chained large IO, and the allocation of the multiple sg-pool-allocations fail, thous dead locking on out-of-memory? Is there a mechanism in place that will split large IO's into smaller chunks in the event of out-of-memory condition in prep_fn? Is it possible to call blk_rq_map_sg() with less then what is present at request to only map the starting portion? Obviously, that's why I was worrying about mempool size and default blocks a while ago. However, the deadlock only occurs if the device is swap or backing a filesystem with memory mapped files. The use cases for this are really tapes and other entities that need huge buffers. That's why we're keeping the system sector size at 1024 unless you alter it through sysfs (here gun, there foot ...) James OK Thanks for confirming my concern, In modern life with devices like iSCSI that have ~0 as it's max_sector, swapping over that should be considered and configured carefully. Once with pNFS over blocks/objects it should be addressed. Perhaps with a FAIL_FAST semantics for users like pNFS to split up the requests if they fail with out-of-memory. As James pointed out, with networked backed device, the things are much more complicated (I have no idea when such configuration will be possible). http://kerneltrap.org/Linux/Swap_Over_NFS - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Performance of SCST versus STGT
On Thu, 17 Jan 2008 10:27:08 +0100 Bart Van Assche [EMAIL PROTECTED] wrote: Hello, I have performed a test to compare the performance of SCST and STGT. Apparently the SCST target implementation performed far better than the STGT target implementation. This makes me wonder whether this is due to the design of SCST or whether STGT's performance can be improved to the level of SCST ? Test performed: read 2 GB of data in blocks of 1 MB from a target (hot cache -- no disk reads were performed, all reads were from the cache). Test command: time dd if=/dev/sde of=/dev/null bs=1M count=2000 STGT read SCST read performance (MB/s) performance (MB/s) Ethernet (1 Gb/s network)7789 IPoIB (8 Gb/s network) 82 229 SRP (8 Gb/s network)N/A 600 iSER (8 Gb/s network)80 N/A These results show that SCST uses the InfiniBand network very well (effectivity of about 88% via SRP), but that the current STGT version is unable to transfer data faster than 82 MB/s. Does this mean that there is a severe bottleneck present in the current STGT implementation ? I don't know about the details but Pete said that he can achieve more than 900MB/s read performance with tgt iSER target using ramdisk. http://www.mail-archive.com/[EMAIL PROTECTED]/msg4.html - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Performance of SCST versus STGT
On Thu, 17 Jan 2008 12:48:28 +0300 Vladislav Bolkhovitin [EMAIL PROTECTED] wrote: FUJITA Tomonori wrote: On Thu, 17 Jan 2008 10:27:08 +0100 Bart Van Assche [EMAIL PROTECTED] wrote: Hello, I have performed a test to compare the performance of SCST and STGT. Apparently the SCST target implementation performed far better than the STGT target implementation. This makes me wonder whether this is due to the design of SCST or whether STGT's performance can be improved to the level of SCST ? Test performed: read 2 GB of data in blocks of 1 MB from a target (hot cache -- no disk reads were performed, all reads were from the cache). Test command: time dd if=/dev/sde of=/dev/null bs=1M count=2000 STGT read SCST read performance (MB/s) performance (MB/s) Ethernet (1 Gb/s network)7789 IPoIB (8 Gb/s network) 82 229 SRP (8 Gb/s network)N/A 600 iSER (8 Gb/s network)80 N/A These results show that SCST uses the InfiniBand network very well (effectivity of about 88% via SRP), but that the current STGT version is unable to transfer data faster than 82 MB/s. Does this mean that there is a severe bottleneck present in the current STGT implementation ? I don't know about the details but Pete said that he can achieve more than 900MB/s read performance with tgt iSER target using ramdisk. http://www.mail-archive.com/[EMAIL PROTECTED]/msg4.html Please don't confuse multithreaded latency insensitive workload with single threaded, hence latency sensitive one. Seems that he can get good performance with single threaded workload: http://www.osc.edu/~pw/papers/wyckoff-iser-snapi07-talk.pdf But I don't know about the details so let's wait for Pete to comment on this. Perhaps Voltaire people could comment on the tgt iSER performances. - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] atp870u: don't zero out sense_buffer in queuecommand
LLDs don't need to zero out scsi_cmnd::sense_buffer in queuecommand since scsi-ml does. This is a preparation of the future changes to allocate the sense_buffer only when necessary. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- drivers/scsi/atp870u.c |1 - 1 files changed, 0 insertions(+), 1 deletions(-) diff --git a/drivers/scsi/atp870u.c b/drivers/scsi/atp870u.c index db6de5e..fb27f2c 100644 --- a/drivers/scsi/atp870u.c +++ b/drivers/scsi/atp870u.c @@ -613,7 +613,6 @@ static int atp870u_queuecommand(struct scsi_cmnd * req_p, struct Scsi_Host *host; c = scmd_channel(req_p); - req_p-sense_buffer[0]=0; scsi_set_resid(req_p, 0); if (scmd_channel(req_p) 1) { req_p-result = 0x0004; -- 1.5.3.4 - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] use dynamically allocated sense buffer
On Thu, 17 Jan 2008 09:58:11 -0600 James Bottomley [EMAIL PROTECTED] wrote: On Thu, 2008-01-17 at 18:13 +0900, FUJITA Tomonori wrote: On Wed, 16 Jan 2008 14:35:50 +0200 Benny Halevy [EMAIL PROTECTED] wrote: On Jan. 15, 2008, 17:20 +0200, James Bottomley [EMAIL PROTECTED] wrote: On Tue, 2008-01-15 at 18:23 +0900, FUJITA Tomonori wrote: This is the second version of http://marc.info/?l=linux-scsim=119933628210006w=2 I gave up once, but I found that the performance loss is negligible (within 1%) by using kmem_cache_alloc instead of mempool. I use scsi_debug with fake_rw=1 and disktest (DIO reads with 8 threads) again: scsi-misc (slub) | 486.9 MB/s IOPS 124652.9/s dynamic sense buf (slub) | 483.2 MB/s IOPS 123704.1/s scsi-misc (slab) | 467.0 MB/s IOPS 119544.3/s dynamic sense buf (slab) | 468.7 MB/s IOPS 119986.0/s The results are the averages of three runs with a server using two dual-core 1.60 GHz Xeon processors with DDR2 memory. I doubt think that someone will complain about the performance regression due to this patch. In addition, unlike scsi_debug, the real LLDs allocate the own data structure per scsi_cmnd so the performance differences would be smaller (and with the real hard disk overheads). Here's the full results: http://www.kernel.org/pub/linux/kernel/people/tomo/sense/results.txt Heh, that's one of those good news, bad news things. Certainly good news for you. The bad news for the rest of us is that you just implicated mempool in a performance problem and since they're the core of the SCSI scatterlist allocations and sit at the heart of the critical path in SCSI, we have a potential performance issue in the whole of SCSI. Looking at mempool's code this is peculiar as what seems to be its critical path for alloc and free looks pretty harmless and lightweight. Maybe an extra memory barrier, spin_{,un}lock_* and two extra function call (one of them can be eliminated BTW if the order of arguments to the mempool_{alloc,free}_t functions were the same as for kmem_cache_{alloc,free}). Yeah, so I wondered why the change made a big difference. After more testing, it turned out that mempool is not so slow. v1 patch reserves as many buffers as can_queue per shost. My test server allocates 1519 sense buffers in total and then needs to allocate more. Seems that it hurts the performance. I would bet it does. Mempools aren't a performance enhancer, they're a deadlock avoider. So you don't prefill them with 1519 entries per host, you prefill them with at most two so that we can always guarantee getting a writeout command down in the event the system is totally out of GFP_ATOMIC memory and needs to free something. Yes, I misunderstood how mempool works. BTW, I preallocated as many buffers as can_queue an then the server allocates 1519 buffers in total (with five scsi hosts). Plus, pool allocations of that size will get me hunted down and shot by the linux tiny (or other embedded) community. Definitely. I modified v3 patch to allocate unused 1519 sense buffers via kmem_cache_alloc. It achieved 96.2% of the scsi-misc performance (note that v1 patch achieved 94.6% of the scsi-misc). I modified v3 patch to use mempool to allocate one buffer per host. It achieved 98.3% of the scsi-misc (note that v3 patch achieved 99.3% of the scsi-misc). This is about the correct thing to do. Will you merge the v3 patch or the modified v3 patch to use mempool to allocate one buffer per host? Or always allocating sense buffer costs too much? - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] firewire: fw-sbp2: prepare for s/g chaining
On Tue, 15 Jan 2008 21:10:50 +0100 (CET) Stefan Richter [EMAIL PROTECTED] wrote: Signed-off-by: Stefan Richter [EMAIL PROTECTED] --- Replacement of patch firewire: fw-sbp2: enable s/g chaining. It's the same, minus '+ .use_sg_chaining = ENABLE_SG_CHAINING,' hunk to prevent conflicts when James is going to remove .use_sg_chaining. drivers/firewire/fw-sbp2.c |7 --- 1 file changed, 4 insertions(+), 3 deletions(-) Index: linux/drivers/firewire/fw-sbp2.c === --- linux.orig/drivers/firewire/fw-sbp2.c +++ linux/drivers/firewire/fw-sbp2.c @@ -1107,9 +1107,9 @@ sbp2_map_scatterlist(struct sbp2_command * elements larger than 65535 bytes, some IOMMUs may merge sg elements * during DMA mapping, and Linux currently doesn't prevent this. */ On a relate note, I fixed the IOMMU merge issue. The patches have been -mm though I'm not sure whether they will go into v2.6.25. The patches enable you to remove the following workaround if you configure the maximum sg element length. From a quick look, fw-sbp2 uses scsi-ml in a different way so it would be a bit trick to configure the maximum sg element length. You call dma_map_sg with pci_dev::dev but don't call scsi_add_host with pci_dev::dev. If you set the maximum sg element length to pci_dev::dev, and then call scsi_add_host with it, the block layer and the IOMMU send you proper size sg elements. - for (i = 0, j = 0; i count; i++) { - sg_len = sg_dma_len(sg + i); - sg_addr = sg_dma_address(sg + i); + for (i = 0, j = 0; i count; i++, sg = sg_next(sg)) { + sg_len = sg_dma_len(sg); + sg_addr = sg_dma_address(sg); while (sg_len) { /* FIXME: This won't get us out of the pinch. */ if (unlikely(j = ARRAY_SIZE(orb-page_table))) { -- Stefan Richter -=-==--- ---= - http://arcgraph.de/sr/ - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2] use dynamically allocated sense buffer
This is the second version of http://marc.info/?l=linux-scsim=119933628210006w=2 I gave up once, but I found that the performance loss is negligible (within 1%) by using kmem_cache_alloc instead of mempool. I use scsi_debug with fake_rw=1 and disktest (DIO reads with 8 threads) again: scsi-misc (slub) | 486.9 MB/s IOPS 124652.9/s dynamic sense buf (slub) | 483.2 MB/s IOPS 123704.1/s scsi-misc (slab) | 467.0 MB/s IOPS 119544.3/s dynamic sense buf (slab) | 468.7 MB/s IOPS 119986.0/s The results are the averages of three runs with a server using two dual-core 1.60 GHz Xeon processors with DDR2 memory. I doubt think that someone will complain about the performance regression due to this patch. In addition, unlike scsi_debug, the real LLDs allocate the own data structure per scsi_cmnd so the performance differences would be smaller (and with the real hard disk overheads). Here's the full results: http://www.kernel.org/pub/linux/kernel/people/tomo/sense/results.txt = From: FUJITA Tomonori [EMAIL PROTECTED] Subject: [PATCH] use dynamically allocated sense buffer This removes static array sense_buffer in scsi_cmnd and uses dynamically allocated sense_buffer (with GFP_DMA). The reason for doing this is that some architectures need cacheline aligned buffer for DMA: http://lkml.org/lkml/2007/11/19/2 The problems are that scsi_eh_prep_cmnd puts scsi_cmnd::sense_buffer to sglist and some LLDs directly DMA to scsi_cmnd::sense_buffer. It's necessary to DMA to scsi_cmnd::sense_buffer safely. This patch solves these issues. __scsi_get_command allocates sense_buffer via kmem_cache_alloc and attaches it to a scsi_cmnd so everything just work as before. A scsi_host reserves one sense buffer for the backup command (shost-backup_sense_buffer). Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- drivers/scsi/hosts.c | 10 ++- drivers/scsi/scsi.c | 67 - drivers/scsi/scsi_priv.h |2 + include/scsi/scsi_cmnd.h |2 +- include/scsi/scsi_host.h |3 ++ 5 files changed, 80 insertions(+), 4 deletions(-) diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c index 9a10b43..35c5f0e 100644 --- a/drivers/scsi/hosts.c +++ b/drivers/scsi/hosts.c @@ -205,10 +205,14 @@ int scsi_add_host(struct Scsi_Host *shost, struct device *dev) if (!shost-shost_gendev.parent) shost-shost_gendev.parent = dev ? dev : platform_bus; - error = device_add(shost-shost_gendev); + error = scsi_setup_command_sense_buffer(shost); if (error) goto out; + error = device_add(shost-shost_gendev); + if (error) + goto destroy_pool; + scsi_host_set_state(shost, SHOST_RUNNING); get_device(shost-shost_gendev.parent); @@ -248,6 +252,8 @@ int scsi_add_host(struct Scsi_Host *shost, struct device *dev) class_device_del(shost-shost_classdev); out_del_gendev: device_del(shost-shost_gendev); + destroy_pool: + scsi_destroy_command_sense_buffer(shost); out: return error; } @@ -267,6 +273,8 @@ static void scsi_host_dev_release(struct device *dev) scsi_free_queue(shost-uspace_req_q); } + scsi_destroy_command_sense_buffer(shost); + scsi_destroy_command_freelist(shost); if (shost-bqt) blk_free_tags(shost-bqt); diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c index 54ff611..d153da3 100644 --- a/drivers/scsi/scsi.c +++ b/drivers/scsi/scsi.c @@ -161,6 +161,9 @@ static struct scsi_host_cmd_pool scsi_cmd_dma_pool = { static DEFINE_MUTEX(host_cmd_pool_mutex); +static struct kmem_cache *sense_buffer_slab; +static int sense_buffer_slab_users; + /** * __scsi_get_command - Allocate a struct scsi_cmnd * @shost: host to transmit command @@ -186,6 +189,22 @@ struct scsi_cmnd *__scsi_get_command(struct Scsi_Host *shost, gfp_t gfp_mask) list_del_init(cmd-list); } spin_unlock_irqrestore(shost-free_list_lock, flags); + + if (cmd) { + memset(cmd, 0, sizeof(*cmd)); + cmd-sense_buffer = shost-backup_sense_buffer; + } + } else { + unsigned char *buf; + + buf = kmem_cache_alloc(sense_buffer_slab, __GFP_DMA|gfp_mask); + if (likely(buf)) { + memset(cmd, 0, sizeof(*cmd)); + cmd-sense_buffer = buf; + } else { + kmem_cache_free(shost-cmd_pool-slab, cmd); + cmd = NULL; + } } return cmd; @@ -212,7 +231,6 @@ struct scsi_cmnd *scsi_get_command(struct scsi_device *dev, gfp_t gfp_mask) if (likely(cmd != NULL)) { unsigned long flags; - memset(cmd, 0, sizeof(*cmd)); cmd-device = dev; init_timer(cmd-eh_timeout
Re: [PATCH v2] use dynamically allocated sense buffer
On Tue, 15 Jan 2008 15:56:56 +0200 Boaz Harrosh [EMAIL PROTECTED] wrote: On Tue, Jan 15 2008 at 11:23 +0200, FUJITA Tomonori [EMAIL PROTECTED] wrote: This is the second version of http://marc.info/?l=linux-scsim=119933628210006w=2 I gave up once, but I found that the performance loss is negligible (within 1%) by using kmem_cache_alloc instead of mempool. I use scsi_debug with fake_rw=1 and disktest (DIO reads with 8 threads) again: scsi-misc (slub) | 486.9 MB/s IOPS 124652.9/s dynamic sense buf (slub) | 483.2 MB/s IOPS 123704.1/s scsi-misc (slab) | 467.0 MB/s IOPS 119544.3/s dynamic sense buf (slab) | 468.7 MB/s IOPS 119986.0/s The results are the averages of three runs with a server using two dual-core 1.60 GHz Xeon processors with DDR2 memory. I doubt think that someone will complain about the performance regression due to this patch. In addition, unlike scsi_debug, the real LLDs allocate the own data structure per scsi_cmnd so the performance differences would be smaller (and with the real hard disk overheads). Here's the full results: http://www.kernel.org/pub/linux/kernel/people/tomo/sense/results.txt TOMO Hi. This is grate news. Thanks I like what you did here. and it's good to know. Why should a mempool be so slow ;) I have a small concern of a leak, please see below, but otherwise this is grate. = From: FUJITA Tomonori [EMAIL PROTECTED] Subject: [PATCH] use dynamically allocated sense buffer This removes static array sense_buffer in scsi_cmnd and uses dynamically allocated sense_buffer (with GFP_DMA). The reason for doing this is that some architectures need cacheline aligned buffer for DMA: http://lkml.org/lkml/2007/11/19/2 The problems are that scsi_eh_prep_cmnd puts scsi_cmnd::sense_buffer to sglist and some LLDs directly DMA to scsi_cmnd::sense_buffer. It's necessary to DMA to scsi_cmnd::sense_buffer safely. This patch solves these issues. __scsi_get_command allocates sense_buffer via kmem_cache_alloc and attaches it to a scsi_cmnd so everything just work as before. A scsi_host reserves one sense buffer for the backup command (shost-backup_sense_buffer). Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- drivers/scsi/hosts.c | 10 ++- drivers/scsi/scsi.c | 67 - drivers/scsi/scsi_priv.h |2 + include/scsi/scsi_cmnd.h |2 +- include/scsi/scsi_host.h |3 ++ 5 files changed, 80 insertions(+), 4 deletions(-) diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c index 9a10b43..35c5f0e 100644 --- a/drivers/scsi/hosts.c +++ b/drivers/scsi/hosts.c @@ -205,10 +205,14 @@ int scsi_add_host(struct Scsi_Host *shost, struct device *dev) if (!shost-shost_gendev.parent) shost-shost_gendev.parent = dev ? dev : platform_bus; - error = device_add(shost-shost_gendev); + error = scsi_setup_command_sense_buffer(shost); if (error) goto out; + error = device_add(shost-shost_gendev); + if (error) + goto destroy_pool; + scsi_host_set_state(shost, SHOST_RUNNING); get_device(shost-shost_gendev.parent); @@ -248,6 +252,8 @@ int scsi_add_host(struct Scsi_Host *shost, struct device *dev) class_device_del(shost-shost_classdev); out_del_gendev: device_del(shost-shost_gendev); + destroy_pool: + scsi_destroy_command_sense_buffer(shost); out: return error; } @@ -267,6 +273,8 @@ static void scsi_host_dev_release(struct device *dev) scsi_free_queue(shost-uspace_req_q); } + scsi_destroy_command_sense_buffer(shost); + scsi_destroy_command_freelist(shost); if (shost-bqt) blk_free_tags(shost-bqt); diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c index 54ff611..d153da3 100644 --- a/drivers/scsi/scsi.c +++ b/drivers/scsi/scsi.c @@ -161,6 +161,9 @@ static struct scsi_host_cmd_pool scsi_cmd_dma_pool = { static DEFINE_MUTEX(host_cmd_pool_mutex); +static struct kmem_cache *sense_buffer_slab; +static int sense_buffer_slab_users; + /** * __scsi_get_command - Allocate a struct scsi_cmnd * @shost: host to transmit command @@ -186,6 +189,22 @@ struct scsi_cmnd *__scsi_get_command(struct Scsi_Host *shost, gfp_t gfp_mask) list_del_init(cmd-list); } spin_unlock_irqrestore(shost-free_list_lock, flags); + + if (cmd) { + memset(cmd, 0, sizeof(*cmd)); + cmd-sense_buffer = shost-backup_sense_buffer; [1] If command was put on free_list in __put_command(), then this here will leak the sense_buffer that was allocated for that command. See explanations below. + } + } else { + unsigned char *buf; + + buf
Re: [PATCH v2] use dynamically allocated sense buffer
On Tue, 15 Jan 2008 17:44:14 +0200 Boaz Harrosh [EMAIL PROTECTED] wrote: If __scsi_put_command puts a command to shost-free_list, it doesn't free scmd-sense_buffer since it's the sense_buffer for the backup sense_buffer. If __scsi_put_command puts a command to shost-cmd_pool-slab (if shost-free_list isn't empty), it alos puts its sense_buffer to sense_buffer_slab. Yes, but these are not necessarily the same commands. Think of this, Ah, sorry, I need to update shost-backup_sense_buffer when __scsi_put_command puts a command to the free_list. The run queues have commands in them, a request comes that demands a cmnd, out-of-memory condition causes the spare from free_list cmnd to be issued, and is put at tail of some run queue. Now comes the first done cmnd, it is immediately put to free_list, but it's sense_buffer was from sense_buffer_slab. I think the solution is simple just immediately allocate the sense_buffer in scsi_setup_command_freelist() and put it on that first free_list command. Then make sure that also the sense_buffer is freed in scsi_destroy_command_freelist(). This way sense_buffer is always allocated/freed together with cmnd and you don't need the shost-backup_sense_buffer pointer. Yeah, it's more straightforward. I'll submit an update patch soon. Thanks, - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3] use dynamically allocated sense buffer
This is the third version of: http://marc.info/?l=linux-scsim=120038907123706w=2 The changes from the second version are: - Fixed the memory leak bug that Boaz pointed out. shots-backup_sense_buffer has gone. One sense buffer is allocated per host and it's always attached to the scsi_cmnd linked with freelist (backup command). - Calling scsi_setup_command_sense_buffer in scsi_host_alloc instead of scsi_add_host. The first version tries to allocates as many buffers as shost-can_queue so scsi_setup_command_sense_buffer is called in scsi_add_host. But, it's not the case any more, so this calls scsi_setup_command_sense_buffer in scsi_host_alloc like scsi_setup_command_freelist. I did the same tests against this patch (though I knew there were not the performnace differences): dynamic sense buf (slub) | 483.5 MB/s IOPS 123772.7/s For convenience, here are the previous results: scsi-misc (slub) | 486.9 MB/s IOPS 124652.9/s dynamic sense buf (slub) | 483.2 MB/s IOPS 123704.1/s I put the results and the kernel configuration: http://www.kernel.org/pub/linux/kernel/people/tomo/sense/ = From: FUJITA Tomonori [EMAIL PROTECTED] Subject: [PATCH] use dynamically allocated sense buffer This removes static array sense_buffer in scsi_cmnd and uses dynamically allocated sense_buffer (with GFP_DMA). The reason for doing this is that some architectures need cacheline aligned buffer for DMA: http://lkml.org/lkml/2007/11/19/2 The problems are that scsi_eh_prep_cmnd puts scsi_cmnd::sense_buffer to sglist and some LLDs directly DMA to scsi_cmnd::sense_buffer. It's necessary to DMA to scsi_cmnd::sense_buffer safely. This patch solves these issues. __scsi_get_command allocates sense_buffer via kmem_cache_alloc and attaches it to a scsi_cmnd so everything just work as before. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- drivers/scsi/hosts.c |9 ++- drivers/scsi/scsi.c | 61 - drivers/scsi/scsi_priv.h |2 + include/scsi/scsi_cmnd.h |2 +- 4 files changed, 70 insertions(+), 4 deletions(-) diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c index 9a10b43..f5d3fbb 100644 --- a/drivers/scsi/hosts.c +++ b/drivers/scsi/hosts.c @@ -268,6 +268,7 @@ static void scsi_host_dev_release(struct device *dev) } scsi_destroy_command_freelist(shost); + scsi_destroy_command_sense_buffer(shost); if (shost-bqt) blk_free_tags(shost-bqt); @@ -372,10 +373,14 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize) else shost-dma_boundary = 0x; - rval = scsi_setup_command_freelist(shost); + rval = scsi_setup_command_sense_buffer(shost); if (rval) goto fail_kfree; + rval = scsi_setup_command_freelist(shost); + if (rval) + goto fail_destroy_sense; + device_initialize(shost-shost_gendev); snprintf(shost-shost_gendev.bus_id, BUS_ID_SIZE, host%d, shost-host_no); @@ -399,6 +404,8 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize) fail_destroy_freelist: scsi_destroy_command_freelist(shost); + fail_destroy_sense: + scsi_destroy_command_sense_buffer(shost); fail_kfree: kfree(shost); return NULL; diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c index 54ff611..0a4a5b8 100644 --- a/drivers/scsi/scsi.c +++ b/drivers/scsi/scsi.c @@ -161,6 +161,9 @@ static struct scsi_host_cmd_pool scsi_cmd_dma_pool = { static DEFINE_MUTEX(host_cmd_pool_mutex); +static struct kmem_cache *sense_buffer_slab; +static int sense_buffer_slab_users; + /** * __scsi_get_command - Allocate a struct scsi_cmnd * @shost: host to transmit command @@ -172,6 +175,7 @@ static DEFINE_MUTEX(host_cmd_pool_mutex); struct scsi_cmnd *__scsi_get_command(struct Scsi_Host *shost, gfp_t gfp_mask) { struct scsi_cmnd *cmd; + unsigned char *buf; cmd = kmem_cache_alloc(shost-cmd_pool-slab, gfp_mask | shost-cmd_pool-gfp_mask); @@ -186,6 +190,21 @@ struct scsi_cmnd *__scsi_get_command(struct Scsi_Host *shost, gfp_t gfp_mask) list_del_init(cmd-list); } spin_unlock_irqrestore(shost-free_list_lock, flags); + + if (cmd) { + buf = cmd-sense_buffer; + memset(cmd, 0, sizeof(*cmd)); + cmd-sense_buffer = buf; + } + } else { + buf = kmem_cache_alloc(sense_buffer_slab, __GFP_DMA|gfp_mask); + if (likely(buf)) { + memset(cmd, 0, sizeof(*cmd)); + cmd-sense_buffer = buf; + } else { + kmem_cache_free(shost-cmd_pool-slab, cmd); + cmd = NULL; + } } return cmd; @@ -212,7
Re: INITIO scsi driver fails to work properly
On Tue, 15 Jan 2008 09:16:06 -0600 James Bottomley [EMAIL PROTECTED] wrote: On Sun, 2008-01-13 at 14:28 +0200, Filippos Papadopoulos wrote: On 1/11/08, James Bottomley [EMAIL PROTECTED] wrote: On Fri, 2008-01-11 at 18:44 +0200, Filippos Papadopoulos wrote: On Jan 11, 2008 5:44 PM, James Bottomley [EMAIL PROTECTED] wrote: I havent reported initio: I/O port range 0x0 is busy. Sorry ... we appear to have several reporters of different bugs in this thread. That message was copied by Chuck Ebbert from a Red Hat bugzilla ... I was assuming it was the same problem. I applied the patch on 2.6.24-rc6-git9 but unfortunatelly same thing happens. First off, has this driver ever worked for you in 2.6? Just booting SLES9 (2.6.5) or RHEL4 (2.6.9) ... or one of their open equivalents to check a really old kernel would be helpful. If you can get it to work, then we can proceed with a patch reversion regime based on the assumption that the problem is a recent commit. Yes it works under 2.6.16.13. See the beginning of this thread, i mention there some things about newer versions. Thanks, actually, I see this: I tried to install OpenSUSE 10.3 (kernel 2.6.22.5) and the latest OpenSUSE 11.0 Alpha 0 (kernel 2.6.24-rc4) but although the initio drivergets loaded during the installation process, yast reports that no hard disk is found. Could you try with a vanilla 2.6.22 kernel? The reason for all of this is that 2.6.22 predates Alan's conversion of this driver (which was my 95% candidate for the source of the bug). I want you to try the vanilla kernel just in case the opensuse one contains a backport. Yes you are right. I compiled the vanilla 2.6.22 and initio driver works. Tell me if you want to apply any patch to it. That's good news ... at least we know where the issue lies; now the problem comes: there are two candidate patches for this issue: Alan's driver update patch and Tomo's accessors patch. Unfortunately, due to merge conflicts the two are pretty hopelessly intertwined. I think I already spotted one bug in the accessor conversion, so I'll look at that again. Alan's also going to acquire an inito board and retest his conversions. I'm afraid it might be a while before we have anything for you to test. Can you try this patch? Thanks, diff --git a/drivers/scsi/initio.c b/drivers/scsi/initio.c index 01bf018..6891d2b 100644 --- a/drivers/scsi/initio.c +++ b/drivers/scsi/initio.c @@ -2609,6 +2609,7 @@ static void initio_build_scb(struct initio_host * host, struct scsi_ctrl_blk * c cblk-bufptr = cpu_to_le32((u32)dma_addr); cmnd-SCp.dma_handle = dma_addr; + cblk-sglen = nseg; cblk-flags |= SCF_SG; /* Turn on SG list flag */ total_len = 0; - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] sg: set class_data after success
If cdev_add fails in sg_add, sg_remove crashes since class_data is bogus. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- drivers/scsi/sg.c |3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index f1871ea..92b4367 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -1418,7 +1418,6 @@ sg_add(struct class_device *cl_dev, struct class_interface *cl_intf) goto out; } - class_set_devdata(cl_dev, sdp); error = cdev_add(cdev, MKDEV(SCSI_GENERIC_MAJOR, sdp-index), 1); if (error) goto cdev_add_err; @@ -1447,6 +1446,8 @@ sg_add(struct class_device *cl_dev, struct class_interface *cl_intf) Attached scsi generic sg%d type %d\n, sdp-index, scsidp-type); + class_set_devdata(cl_dev, sdp); + return 0; cdev_add_err: -- 1.5.3.4 - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] sg: handle class_device_create failure properly
Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- drivers/scsi/sg.c | 11 +++ 1 files changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 92b4367..527e2eb 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -1430,11 +1430,14 @@ sg_add(struct class_device *cl_dev, struct class_interface *cl_intf) MKDEV(SCSI_GENERIC_MAJOR, sdp-index), cl_dev-dev, %s, disk-disk_name); - if (IS_ERR(sg_class_member)) - printk(KERN_WARNING sg_add: - class_device_create failed\n); + if (IS_ERR(sg_class_member)) { + printk(KERN_ERR sg_add: + class_device_create failed\n); + error = PTR_ERR(sg_class_member); + goto cdev_add_err; + } class_set_devdata(sg_class_member, sdp); - error = sysfs_create_link(scsidp-sdev_gendev.kobj, + error = sysfs_create_link(scsidp-sdev_gendev.kobj, sg_class_member-kobj, generic); if (error) printk(KERN_ERR sg_add: unable to make symlink -- 1.5.3.4 - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: panic in sg_add when class_device_create() fails
On Mon, 14 Jan 2008 14:11:36 -0800 Michael Reed [EMAIL PROTECTED] wrote: We're seeing an occasional panic in sg_add() when class_device_create() fails. It's obvious in the code that it uses the pointer to sg_class_member even though it's invalid. We do see the class_device_create failed message. class_set_devdata(cl_dev, sdp); error = cdev_add(cdev, MKDEV(SCSI_GENERIC_MAJOR, sdp-index), 1); if (error) goto cdev_add_err; if (sg_sysfs_valid) { struct class_device * sg_class_member; sg_class_member = class_device_create(sg_sysfs_class, NULL, MKDEV(SCSI_GENERIC_MAJOR, sdp-index), cl_dev-dev, %s, disk-disk_name); if (IS_ERR(sg_class_member)) printk(KERN_WARNING sg_add: class_device_create failed\n); class_set_devdata(sg_class_member, sdp); error = sysfs_create_link(scsidp-sdev_gendev.kobj, sg_class_member-kobj, generic); if (error) printk(KERN_ERR sg_add: unable to make symlink 'generic' back to sg%d\n, sdp-index); } else printk(KERN_WARNING sg_add: sg_sys Invalid\n); I'm uncertain of the correct fix. Perhaps it involves a call to cdev_unmap()? The following patches work for me: http://marc.info/?l=linux-scsim=120037070303939w=2 http://marc.info/?l=linux-scsim=120037070303941w=2 I don't have a good way to test a fix as this problem is not easily reproduced. I used the attached patch (though the fault injection feature can do something better, I guess). diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 527e2eb..4cdd213 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -1424,12 +1424,16 @@ sg_add(struct class_device *cl_dev, struct class_interface *cl_intf) sdp-cdev = cdev; if (sg_sysfs_valid) { + static int count = 0; struct class_device * sg_class_member; - sg_class_member = class_device_create(sg_sysfs_class, NULL, - MKDEV(SCSI_GENERIC_MAJOR, sdp-index), - cl_dev-dev, %s, - disk-disk_name); + if (++count % 2) + sg_class_member = class_device_create(sg_sysfs_class, NULL, + MKDEV(SCSI_GENERIC_MAJOR, sdp-index), + cl_dev-dev, %s, + disk-disk_name); + else + sg_class_member = ERR_PTR(-ENOMEM); if (IS_ERR(sg_class_member)) { printk(KERN_ERR sg_add: class_device_create failed\n); - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] libsas: fix sense_buffer overrun
Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- drivers/scsi/libsas/sas_scsi_host.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c index b784089..828fed1 100644 --- a/drivers/scsi/libsas/sas_scsi_host.c +++ b/drivers/scsi/libsas/sas_scsi_host.c @@ -108,7 +108,7 @@ static void sas_scsi_task_done(struct sas_task *task) break; case SAM_CHECK_COND: memcpy(sc-sense_buffer, ts-buf, - max(SCSI_SENSE_BUFFERSIZE, ts-buf_valid_size)); + min(SCSI_SENSE_BUFFERSIZE, ts-buf_valid_size)); stat = SAM_CHECK_COND; break; default: -- 1.5.3.4 - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] don't zero out sense_buffer in queuecommand
LLDs don't need to zero out scsi_cmnd::sense_buffer in queuecommand since scsi-ml does. This is a preparation of the future changes to allocate the sense_buffer only when necessary. Many LLDs zero out the sense_buffer before touching it on the error case. This patch lets them alone for now because new APIs for them would be added later on. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- drivers/scsi/aic7xxx_old.c |1 - drivers/scsi/eata_pio.c |1 - drivers/scsi/ips.c |3 --- drivers/scsi/libsas/sas_scsi_host.c |1 - 4 files changed, 0 insertions(+), 6 deletions(-) diff --git a/drivers/scsi/aic7xxx_old.c b/drivers/scsi/aic7xxx_old.c index 8f8db5f..2b402fa 100644 --- a/drivers/scsi/aic7xxx_old.c +++ b/drivers/scsi/aic7xxx_old.c @@ -10293,7 +10293,6 @@ static int aic7xxx_queue(struct scsi_cmnd *cmd, void (*fn)(struct scsi_cmnd *)) aic7xxx_position(cmd) = scb-hscb-tag; cmd-scsi_done = fn; cmd-result = DID_OK; - memset(cmd-sense_buffer, 0, sizeof(cmd-sense_buffer)); aic7xxx_error(cmd) = DID_OK; aic7xxx_status(cmd) = 0; cmd-host_scribble = NULL; diff --git a/drivers/scsi/eata_pio.c b/drivers/scsi/eata_pio.c index 9579507..b5a6092 100644 --- a/drivers/scsi/eata_pio.c +++ b/drivers/scsi/eata_pio.c @@ -369,7 +369,6 @@ static int eata_pio_queue(struct scsi_cmnd *cmd, cp = hd-ccb[y]; memset(cp, 0, sizeof(struct eata_ccb)); - memset(cmd-sense_buffer, 0, sizeof(cmd-sense_buffer)); cp-status = USED; /* claim free slot */ diff --git a/drivers/scsi/ips.c b/drivers/scsi/ips.c index e54d30c..b1b2295 100644 --- a/drivers/scsi/ips.c +++ b/drivers/scsi/ips.c @@ -2736,8 +2736,6 @@ ips_next(ips_ha_t * ha, int intr) SC-result = DID_OK; SC-host_scribble = NULL; - memset(SC-sense_buffer, 0, sizeof (SC-sense_buffer)); - scb-target_id = SC-device-id; scb-lun = SC-device-lun; scb-bus = SC-device-channel; @@ -3821,7 +3819,6 @@ ips_send_cmd(ips_ha_t * ha, ips_scb_t * scb) /* attempted, a Check Condition occurred, and Sense */ /* Data indicating an Invalid CDB OpCode is returned. */ sp = (char *) scb-scsi_cmd-sense_buffer; - memset(sp, 0, sizeof (scb-scsi_cmd-sense_buffer)); sp[0] = 0x70; /* Error Code */ sp[2] = ILLEGAL_REQUEST;/* Sense Key 5 Illegal Req. */ diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c index 828fed1..9c04225 100644 --- a/drivers/scsi/libsas/sas_scsi_host.c +++ b/drivers/scsi/libsas/sas_scsi_host.c @@ -148,7 +148,6 @@ static struct sas_task *sas_create_task(struct scsi_cmnd *cmd, if (!task) return NULL; - *(u32 *)cmd-sense_buffer = 0; task-uldd_task = cmd; ASSIGN_SAS_TASK(cmd, task); -- 1.5.3.4 - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] replace sizeof sense_buffer with SCSI_SENSE_BUFFERSIZE
This is the second version of http://marc.info/?l=linux-scsim=119933628210012w=2 o I dropped fas216 since Boaz's patch in scsi-pending will be merged before solving the sense_buffer dma issue. o This is a 'grep and replace' style patch but cleans up dpt_i2o a bit as by permission of Mark (I use min macro). o The previous version overlooked some sizeof sense_buffer lines in aacraid and qla4xxx. o I overlooked the ncr53c8xx compile warning. = From: FUJITA Tomonori [EMAIL PROTECTED] Subject: [PATCH 2/2] replace sizeof sense_buffer with SCSI_SENSE_BUFFERSIZE This replaces sizeof sense_buffer with SCSI_SENSE_BUFFERSIZE in several LLDs. It's a preparation for the future changes to remove sense_buffer array in scsi_cmnd structure. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- drivers/ata/libata-scsi.c |4 ++-- drivers/message/fusion/mptscsih.c |2 +- drivers/message/i2o/i2o_scsi.c |2 +- drivers/scsi/53c700.c | 11 ++- drivers/scsi/BusLogic.c |2 +- drivers/scsi/aacraid/aachba.c | 32 drivers/scsi/advansys.c | 14 +++--- drivers/scsi/aha1542.c |4 ++-- drivers/scsi/aha1740.c |2 +- drivers/scsi/aic7xxx/aic79xx_osm.c |6 +++--- drivers/scsi/aic7xxx/aic7xxx_osm.c |6 +++--- drivers/scsi/aic7xxx_old.c | 10 +- drivers/scsi/arcmsr/arcmsr_hba.c|6 +++--- drivers/scsi/dc395x.c | 16 +++- drivers/scsi/dpt_i2o.c |5 ++--- drivers/scsi/eata.c |4 ++-- drivers/scsi/hptiop.c |2 +- drivers/scsi/ips.c |6 ++ drivers/scsi/ncr53c8xx.c|3 ++- drivers/scsi/qla1280.c |4 ++-- drivers/scsi/qla2xxx/qla_isr.c | 12 ++-- drivers/scsi/qla4xxx/ql4_isr.c | 11 --- drivers/scsi/qlogicpti.c|2 +- drivers/scsi/scsi_error.c |6 +++--- drivers/scsi/scsi_lib.c |2 +- drivers/scsi/sym53c8xx_2/sym_glue.c |5 ++--- drivers/scsi/tmscsim.c |6 +++--- drivers/scsi/u14-34f.c |4 ++-- drivers/scsi/ultrastor.c|2 +- 29 files changed, 92 insertions(+), 99 deletions(-) diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c index 4bb268b..b633341 100644 --- a/drivers/ata/libata-scsi.c +++ b/drivers/ata/libata-scsi.c @@ -2334,7 +2334,7 @@ static void atapi_request_sense(struct ata_queued_cmd *qc) DPRINTK(ATAPI request sense\n); /* FIXME: is this needed? */ - memset(cmd-sense_buffer, 0, sizeof(cmd-sense_buffer)); + memset(cmd-sense_buffer, 0, SCSI_SENSE_BUFFERSIZE); ap-ops-tf_read(ap, qc-tf); @@ -2344,7 +2344,7 @@ static void atapi_request_sense(struct ata_queued_cmd *qc) ata_qc_reinit(qc); - ata_sg_init_one(qc, cmd-sense_buffer, sizeof(cmd-sense_buffer)); + ata_sg_init_one(qc, cmd-sense_buffer, SCSI_SENSE_BUFFERSIZE); qc-dma_dir = DMA_FROM_DEVICE; memset(qc-cdb, 0, qc-dev-cdb_len); diff --git a/drivers/message/fusion/mptscsih.c b/drivers/message/fusion/mptscsih.c index 626bb3c..5c614ec 100644 --- a/drivers/message/fusion/mptscsih.c +++ b/drivers/message/fusion/mptscsih.c @@ -111,7 +111,7 @@ int mptscsih_suspend(struct pci_dev *pdev, pm_message_t state); intmptscsih_resume(struct pci_dev *pdev); #endif -#define SNS_LEN(scp) sizeof((scp)-sense_buffer) +#define SNS_LEN(scp) SCSI_SENSE_BUFFERSIZE /*=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=*/ /** diff --git a/drivers/message/i2o/i2o_scsi.c b/drivers/message/i2o/i2o_scsi.c index aa6fb94..1bcdbbb 100644 --- a/drivers/message/i2o/i2o_scsi.c +++ b/drivers/message/i2o/i2o_scsi.c @@ -370,7 +370,7 @@ static int i2o_scsi_reply(struct i2o_controller *c, u32 m, */ if (cmd-result) memcpy(cmd-sense_buffer, msg-body[3], - min(sizeof(cmd-sense_buffer), (size_t) 40)); + min(SCSI_SENSE_BUFFERSIZE, 40)); /* only output error code if AdapterStatus is not HBA_SUCCESS */ if ((error 8) 0xff) diff --git a/drivers/scsi/53c700.c b/drivers/scsi/53c700.c index 71ff3fb..f4c4fe9 100644 --- a/drivers/scsi/53c700.c +++ b/drivers/scsi/53c700.c @@ -608,7 +608,8 @@ NCR_700_scsi_done(struct NCR_700_Host_Parameters *hostdata, scsi_print_sense(53c700, SCp); #endif - dma_unmap_single(hostdata-dev, slot-dma_handle, sizeof(SCp-sense_buffer), DMA_FROM_DEVICE); + dma_unmap_single(hostdata-dev, slot-dma_handle, +SCSI_SENSE_BUFFERSIZE, DMA_FROM_DEVICE); /* restore the old result if the request sense was * successful
Re: Driver 'sd' needs updating
CC'ed linux-scsi and James, On Thu, 10 Jan 2008 08:51:50 + Nick Warne [EMAIL PROTECTED] wrote: Hi everybody - Happy New Year to you all! OK, updated to git rc7 yesterday - I now see this in syslog: Driver 'sd' needs updating - please use bus_type methods The warning never appeared in RC6, and all google reveals are other peoples logs that are posted about other issues. Do I need to fix up something here? No, you don't. It's harmless, a side effect of: commit 751bf4d7865e4ced406be93b04c7436d866d3684 Author: James Bottomley [EMAIL PROTECTED] Date: Wed Jan 2 11:14:30 2008 -0600 [SCSI] scsi_sysfs: restore prep_fn when ULD is removed It would be better to silence this warning. James, we need to reset prep_fn in each ULD? though it's not nice... - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] bsg : Add support for io vectors in bsg
On Thu, 10 Jan 2008 16:46:05 -0500 Pete Wyckoff [EMAIL PROTECTED] wrote: Is there another async I/O mechanism? Userspace builds the CDBs, just needs some way to drop them in SCSI ML. BSG is almost perfect for this, but doesn't do iovec, leading to lots of memcpy. syslets? - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT PATCH] SCSI bug fixes for 2.6.24-rc3
CC'ed Jes, On Tue, 8 Jan 2008 14:15:53 +0300 Evgeniy Dushistov [EMAIL PROTECTED] wrote: On Sun, Nov 25, 2007 at 01:24:25PM +0200, James Bottomley wrote: This is a bit of a rash of bug fixes. The qla1280 is actually a bug fix (in spite of the title---it's actually correcting an existing problem with the qla1280 implementation of accessors that broke the current driver). Recently I build last Linus's git tree, and got: req_cnt is used uninitialized in this function, and see bellow The patch is available here: master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6.git The short changelog is: ... Jes Sorensen (1): qla1280: convert to use the data buffer accessors ... scsi/qla1280.c | 387 + ... /* Calculate number of entries and segments required. */ - req_cnt = 1; Initilization of req_cnt was removed, but in this function there are places like req_cnt += (seg_cnt - 4) / 7; or req_cnt++; This is should be so? req_cnt should not be removed. Jes tested the patch but this critical bug appears only with BITS_PER_LONG != 64 CONFIG_HIGHMEM=n case. So he didn't see this, I think. qla1280_32bit_start_scsi also gives the following warning: drivers/scsi/qla1280.c: In function 'qla1280_32bit_start_scsi': drivers/scsi/qla1280.c:3044: warning: unused variable 'dma_handle' diff --git a/drivers/scsi/qla1280.c b/drivers/scsi/qla1280.c index 146d540..2886407 100644 --- a/drivers/scsi/qla1280.c +++ b/drivers/scsi/qla1280.c @@ -3041,7 +3041,6 @@ qla1280_32bit_start_scsi(struct scsi_qla_host *ha, struct srb * sp) int cnt; int req_cnt; int seg_cnt; - dma_addr_t dma_handle; u8 dir; ENTER(qla1280_32bit_start_scsi); @@ -3050,6 +3049,7 @@ qla1280_32bit_start_scsi(struct scsi_qla_host *ha, struct srb * sp) cmd-cmnd[0]); /* Calculate number of entries and segments required. */ + req_cnt = 1; seg_cnt = scsi_dma_map(cmd); if (seg_cnt) { /* - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] bsg : Add support for io vectors in bsg
On Tue, 8 Jan 2008 17:09:18 -0500 Pete Wyckoff [EMAIL PROTECTED] wrote: [EMAIL PROTECTED] wrote on Sat, 05 Jan 2008 14:01 +0900: From: Deepak Colluru [EMAIL PROTECTED] Subject: [PATCH] bsg : Add support for io vectors in bsg Date: Fri, 4 Jan 2008 21:47:34 +0530 (IST) From: Deepak Colluru [EMAIL PROTECTED] Add support for io vectors in bsg. Signed-off-by: Deepak Colluru [EMAIL PROTECTED] --- bsg.c | 52 +--- 1 file changed, 49 insertions(+), 3 deletions(-) Thanks, but I have to NACK this. You can find the discussion about bsg io vector support and a similar patch in linux-scsi archive. I have no plan to support it since it needs the compat hack. You may recall this is one of the patches I need to use bsg with OSD devices. OSDs overload the SCSI buffer model to put mulitple fields in dataout and datain. Some is user data, but some is more logically created by a library. Memcpying in userspace to wedge all the segments into a single buffer is painful, and is required both on outgoing and incoming data buffers. There are two approaches to add iovec to bsg. 1. Define a new sg_iovec_v4 that uses constant width types. Both 32- and 64-bit userspace would hand arrays of this to the kernel. struct sg_v4_iovec { __u64 iov_base; __u32 iov_len; __u32 __pad1; }; Old patch here: http://article.gmane.org/gmane.linux.scsi/30461/ As I said before, I don't think that inventing a new iovec is a good idea. sgv3 use the common iovec. In addition, sg_io_v4 can be used by other OSes like sg_io_v3. 2. Do as Deepak has done, using the existing sg_iovec, but then also work around the compat issue. Old v3 sg_iovec is: typedef struct sg_iovec /* same structure as used by readv() Linux system */ { /* call. It defines one scatter-gather element. */ void __user *iov_base; /* Starting address */ size_t iov_len; /* Length in bytes */ } sg_iovec_t; Old patch here: http://article.gmane.org/gmane.linux.scsi/30460/ I took another look at the compat approach, to see if it is feasible to keep the compat handling somewhere else, without the use of #ifdef CONFIG_COMPAT and size-comparison code inside bsg.c. I don't see how. The use of iovec is within a write operation on a char device. It's not amenable to a compat_sys_ or a .compat_ioctl approach. I'm partial to #1 because the use of architecture-independent fields matches the rest of struct sg_io_v4. But if you don't want to have another iovec type in the kernel, could we do #2 but just return -EINVAL if the need for compat is detected? I.e. change dout_iovec_count to dout_iovec_length and do the math? If you are ok with removing the write/read interface and just have ioctl, we could can handle comapt stuff like others do. But I think that you (OSD people) really want to keep the write/read interface. Sorry, I think that there is no workaround to support iovec in bsg. - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] scsi: Use new __dma_buffer to align sense buffer in scsi_cmnd
On Mon, 07 Jan 2008 15:25:36 +0200 Boaz Harrosh [EMAIL PROTECTED] wrote: On Mon, Jan 07 2008 at 8:53 +0200, FUJITA Tomonori [EMAIL PROTECTED] wrote: On Sun, 23 Dec 2007 13:09:05 +0200 Boaz Harrosh [EMAIL PROTECTED] wrote: On Fri, Dec 21 2007 at 4:30 +0200, Benjamin Herrenschmidt [EMAIL PROTECTED] wrote: The sense buffer ins scsi_cmnd can nowadays be DMA'ed into directly by some low level drivers (that typically happens with USB mass storage). This is a problem on non cache coherent architectures such as embedded PowerPCs where the sense buffer can share cache lines with other structure members, which leads to various forms of corruption. This uses the newly defined __dma_buffer annotation to enforce that on such platforms, the sense_buffer is contained within its own cache line. This has no effect on cache coherent architectures. Signed-off-by: Benjamin Herrenschmidt [EMAIL PROTECTED] --- include/scsi/scsi_cmnd.h |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- linux-merge.orig/include/scsi/scsi_cmnd.h 2007-12-21 13:07:14.0 +1100 +++ linux-merge/include/scsi/scsi_cmnd.h 2007-12-21 13:07:29.0 +1100 @@ -88,7 +88,7 @@ struct scsi_cmnd { working on */ #define SCSI_SENSE_BUFFERSIZE96 - unsigned char sense_buffer[SCSI_SENSE_BUFFERSIZE]; + unsigned char sense_buffer[SCSI_SENSE_BUFFERSIZE] __dma_buffer; /* obtained by REQUEST SENSE when * CHECK CONDITION is received on original * command (auto-sense) */ This has the potential of leaving a big fat ugly hole in the middle of scsi_cmnd. I would suggest of *just* moving the sense_buffer array to be the *first member* of struct scsi_cmnd. The command itself is already cache aligned, allocated by the proper flags to it's slab. And put a fat comment near it's definition. This is until a proper fix is sent. I have in my Q a proposition for a more prominent solution, which I will send next month. Do to short comings in the sense handling and optimizations, but should definitely cover this problem. The code should have time to be discussed and tested, so it is only 2.6.26 material. For the duration of the 2.6.25 kernel we can live with a reorder of scsi_cmnd members, if it solves such a grave bug for some ARCHs. Boaz [RFD below] My proposed solution will be has follows: 1. Since the scsi protocol mandates an immediate REQUEST_SENSE after an error in effect the Q is frozen until the REQUEST_SENSE command returns. 2. The scsi-ml needs the sense buffer for its normal operation, independent from the ULD's request of the sence_buffer or not at request-sense. But in effect, 90% of all scsi-requests come with ULD's allocated buffer for sense, that is copied to, on command completion. 3. 99% of all commands complete successfully, so if an optimization is proposed for the successful case, sacrificing a few cycles for the error case than thats a good thing. My suggestion is to have a per Q, driver-overridable, sense buffer that is DMAed/written to by drivers. At the end of the REQUEST_SENSE command one of 2 things will be done. Either copy the sense to the ULD's supplied buffer, or if not available, allocate one from a dedicated mem_cache pool. So we are completely saving 92 bytes from scsi_cmnd by replacing the buffer with a pointer. We can always read the sense into a per Q buffer. And 10% of %1 of the times we will need to allocate a sense buffer from a dedicated mem_cache I would say thats a nice optimization. The changes to scsi_error/scsi_cmnd and friends, is pretty strait forward. But it depends on a conversion of 4/5 drivers to the new scsi_eh API for REQUEST_SENSE. I have only converted these drivers that interfered with the accessors effort + 1 other places. But there are a few more places that are not converted. Once done. The implementation can easily change with no affect on drivers. I think that removing the sense_buffer array from scsi_cmnd effects lots of LLDs. As I wrote in other mail, many LLDs assume that scsi_cmnd:sense_buffer is always available. Another big task is to take care about auto sense. Have you already had some patches? I've just started to work on this and I'd like to push that fix into 2.6.25. Tomo Hi. Since you ask to push this into 2.6.25, I have ask permission to prioritize this effort, as until now it was on a back burner. There are no short-term solusions and seems that __dma_buffer will not be merged. It would be better to fix this soon though it's a bit hard to fix this before 2.6.25, I think. I have only done 3 drivers up to now. (out of something like 15) I have seen 4 patterns of sense use. 1
Re: [PATCH 2/2] scsi: Use new __dma_buffer to align sense buffer in scsi_cmnd
On Sun, 23 Dec 2007 13:09:05 +0200 Boaz Harrosh [EMAIL PROTECTED] wrote: On Fri, Dec 21 2007 at 4:30 +0200, Benjamin Herrenschmidt [EMAIL PROTECTED] wrote: The sense buffer ins scsi_cmnd can nowadays be DMA'ed into directly by some low level drivers (that typically happens with USB mass storage). This is a problem on non cache coherent architectures such as embedded PowerPCs where the sense buffer can share cache lines with other structure members, which leads to various forms of corruption. This uses the newly defined __dma_buffer annotation to enforce that on such platforms, the sense_buffer is contained within its own cache line. This has no effect on cache coherent architectures. Signed-off-by: Benjamin Herrenschmidt [EMAIL PROTECTED] --- include/scsi/scsi_cmnd.h |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- linux-merge.orig/include/scsi/scsi_cmnd.h 2007-12-21 13:07:14.0 +1100 +++ linux-merge/include/scsi/scsi_cmnd.h2007-12-21 13:07:29.0 +1100 @@ -88,7 +88,7 @@ struct scsi_cmnd { working on */ #define SCSI_SENSE_BUFFERSIZE 96 - unsigned char sense_buffer[SCSI_SENSE_BUFFERSIZE]; + unsigned char sense_buffer[SCSI_SENSE_BUFFERSIZE] __dma_buffer; /* obtained by REQUEST SENSE when * CHECK CONDITION is received on original * command (auto-sense) */ This has the potential of leaving a big fat ugly hole in the middle of scsi_cmnd. I would suggest of *just* moving the sense_buffer array to be the *first member* of struct scsi_cmnd. The command itself is already cache aligned, allocated by the proper flags to it's slab. And put a fat comment near it's definition. This is until a proper fix is sent. I have in my Q a proposition for a more prominent solution, which I will send next month. Do to short comings in the sense handling and optimizations, but should definitely cover this problem. The code should have time to be discussed and tested, so it is only 2.6.26 material. For the duration of the 2.6.25 kernel we can live with a reorder of scsi_cmnd members, if it solves such a grave bug for some ARCHs. Boaz [RFD below] My proposed solution will be has follows: 1. Since the scsi protocol mandates an immediate REQUEST_SENSE after an error in effect the Q is frozen until the REQUEST_SENSE command returns. 2. The scsi-ml needs the sense buffer for its normal operation, independent from the ULD's request of the sence_buffer or not at request-sense. But in effect, 90% of all scsi-requests come with ULD's allocated buffer for sense, that is copied to, on command completion. 3. 99% of all commands complete successfully, so if an optimization is proposed for the successful case, sacrificing a few cycles for the error case than thats a good thing. My suggestion is to have a per Q, driver-overridable, sense buffer that is DMAed/written to by drivers. At the end of the REQUEST_SENSE command one of 2 things will be done. Either copy the sense to the ULD's supplied buffer, or if not available, allocate one from a dedicated mem_cache pool. So we are completely saving 92 bytes from scsi_cmnd by replacing the buffer with a pointer. We can always read the sense into a per Q buffer. And 10% of %1 of the times we will need to allocate a sense buffer from a dedicated mem_cache I would say thats a nice optimization. The changes to scsi_error/scsi_cmnd and friends, is pretty strait forward. But it depends on a conversion of 4/5 drivers to the new scsi_eh API for REQUEST_SENSE. I have only converted these drivers that interfered with the accessors effort + 1 other places. But there are a few more places that are not converted. Once done. The implementation can easily change with no affect on drivers. I think that removing the sense_buffer array from scsi_cmnd effects lots of LLDs. As I wrote in other mail, many LLDs assume that scsi_cmnd:sense_buffer is always available. Another big task is to take care about auto sense. Have you already had some patches? I've just started to work on this and I'd like to push that fix into 2.6.25. - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: Open-FCoE on linux-scsi
On Fri, 4 Jan 2008 14:07:28 -0800 Dev, Vasu [EMAIL PROTECTED] wrote: _If_ there will indeed be dedicated FCoE HBAs in the future, the following stack could exist in addition to the one above: - SCSI core, scsi_transport_fc - FCoE HBA driver(s) Agreed. My FCoE initiator design would be something like: scsi-ml fcoe initiator driver libfcoe fc_transport_class (inclusing fcoe support) And FCoE HBA LLDs work like: scsi-ml FCoE HBA LLDs (some of them might use libfcoe) fc_transport_class (inclusing fcoe support) That's the way that other transport classes do, I think. For me, the current code tries to invent another fc class. For example, the code newly defines: struct fc_remote_port { struct list_head rp_list; /* list under fc_virt_fab */ struct fc_virt_fab *rp_vf; /* virtual fabric */ fc_wwn_trp_port_wwn;/* remote port world wide name */ fc_wwn_trp_node_wwn;/* remote node world wide name */ fc_fid_trp_fid; /* F_ID for remote_port if known */ atomic_trp_refcnt; /* reference count */ u_int rp_disc_ver;/* discovery instance */ u_int rp_io_limit;/* limit on outstanding I/Os */ u_int rp_io_count;/* count of outstanding I/Os */ u_int rp_fcp_parm;/* remote FCP service parameters */ u_int rp_local_fcp_parm; /* local FCP service parameters */ void*rp_client_priv; /* HBA driver private data */ void*rp_fcs_priv; /* FCS driver private data */ struct sa_event_list *rp_events; /* event list */ struct sa_hash_link rp_fid_hash_link; struct sa_hash_link rp_wwpn_hash_link; /* * For now, there's just one session per remote port. * Eventually, for multipathing, there will be more. */ u_char rp_sess_ready; /* session ready to be used */ struct fc_sess *rp_sess; /* session */ void*dns_lookup;/* private dns lookup */ int dns_lookup_count; /* number of attempted lookups */ }; /* * remote ports are created and looked up by WWPN. */ struct fc_remote_port *fc_remote_port_create(struct fc_virt_fab *, fc_wwn_t); struct fc_remote_port *fc_remote_port_lookup(struct fc_virt_fab *, fc_fid_t, fc_wwn_t wwpn); struct fc_remote_port *fc_remote_port_lookup_create(struct fc_virt_fab *, fc_fid_t, fc_wwn_t wwpn, fc_wwn_t wwnn); The FCoE LLD needs to exploit the exsting struct fc_rport and APIs. The openfc is software implementation of FC services such as FC login and target discovery and it is already using/exploiting existing fc transport class including fc_rport struct. You can see openfc using fc_rport in openfc_queuecommand() and using fc transport API fc_port_remote_add() for fc_rport. You just calls fc_remote_port_add. I don't think that reinventing the whole rport management like reference counting doesn't mean exploiting the exsting struct fc_rport and APIs. The fcoe module is just a first example of possible openfc transport but openfc can be used with other transports or HW HBAs also. The openfc does provide generic transport interface using fcdev which is currently used by FCoE module. One can certainly implement partly or fully openfc and fcoe modules in FCoE HBA. As pointed out in other mails, I believe that the similar job has done in other transport classes using scsi transport class infrastructure and the FCoE needs to follow the existing examples. - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Open-FCoE on linux-scsi
On Sat, 05 Jan 2008 00:41:05 +0100 Stefan Richter [EMAIL PROTECTED] wrote: Dev, Vasu wrote: [FUJITA Tomonori wrote:] Agreed. My FCoE initiator design would be something like: scsi-ml fcoe initiator driver libfcoe fc_transport_class (inclusing fcoe support) And FCoE HBA LLDs work like: scsi-ml FCoE HBA LLDs (some of them might use libfcoe) fc_transport_class (inclusing fcoe support) Wouldn't it make more sense to think of fc_transport_class as a FCP layer, sitting between scsi-ml and the various FC interconnect drivers (among them Openfc and maybe more FCoE drivers)? I.e. you have SCSI command set layer -- SCSI core -- SCSI transport layer -- interconnect layer.¹ Oops, I should have depicted: scsi-ml fc_transport_class (inclusing fcoe support) FCoE HBA LLDs (some of them might use libfcoe) As you pointed out, that's the correct layering from the perspective of SCSI architecture. I put FCoE HBA LLDs over fc_transport_class just because LLDs directly interact with scsi-ml to perform the main work, queuecommand/done (as you explained in 1). I am not familiar with FCP/ FCoE/ FC-DA et al, but I guess the FCoE support in the FCP transport layer should then go to the extent of target discovery, login, lifetime management and representation of remote ports and so on as far as it pertains to FCP (the SCSI transport protocol, FC-4 layer) independently of the interconnect (FC-3...FC-0 layers).² [...] The FCoE LLD needs to exploit the exsting struct fc_rport and APIs. The openfc is software implementation of FC services such as FC login and target discovery and it is already using/exploiting existing fc transport class including fc_rport struct. You can see openfc using fc_rport in openfc_queuecommand() and using fc transport API fc_port_remote_add() for fc_rport. Hence, aren't there interconnect independent parts of target discovery and login which should be implemented in fc_transport_class? The interconnect dependent parts would then live in LLD methods to be provided in struct fc_function_template. Agreed. Then FCoE helper functions that aren't useful for all the FCoE LLDs would go libfcoe like iscsi class does (and sas class also does, I guess). I.e. not only make full use of the API of fc_transport_class, also think about changing the API _if_ necessary to become a more useful implementation of the interface below FC-4. --- ¹) The transport classes are of course not layers in such a sense that they would completely hide SCSI core from interconnect drivers. They don't really have to; they nevertheless live at a higher level of abstraction than LLDs and a lower level of abstraction than SCSI core. (One obvious example that SCSI core is less hidden than it possibly could be can be seen by the struct fc_function_template methods having struct scsi_target * and struct Scsi_Host * arguments, instead of struct fc_xyz * arguments.) ²) I'm using the term interconnect from the SCSI perspective, not from the FC perspective. -- Stefan Richter -=-==--- ---= --=-= http://arcgraph.de/sr/ - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: Open-FCoE on linux-scsi
On Thu, 3 Jan 2008 13:58:29 -0800 Love, Robert W [EMAIL PROTECTED] wrote: Talking about stability is a bit premature, I think. The first thing to do is finding a design that can be accepted into mainline. How can we get this started? We've provided our current solution, but need feedback to guide us in the right direction. We've received little quips about libsa and libcrc and now it looks like we should look at what we can move to userspace (see below), but that's all the feedback we've got so far. Can you tell us what you think about our current architecture? Then we could discuss your concerns... I think that you have got littel feedback since few people have read the code. Hopefully, this discussion gives some information. My main concern is transport class integration. But they are just mine. The SCSI maintainer and FC people might have different opinions. 2) Abstractions- We consider libsa a big bug, which we're trying to strip down piece by piece. Vasu took out the LOG_SA code and I'm looking into changing the ASSERTs to BUG_ON/WARN_ONs. That isn't all of it, but that's how we're breaking it down. Agreed, libsa (and libcrc) should be removed. 3) Target- The duplicate code of the target is too much. I want to integrate the target into our -upstream tree. Without doing that, fixes to the -upstream tree won't benefit the target and it will get into worse shape than it already is, unless someone is porting those patches to the target too. I think that ideally we'd want to reduce the target's profile and move it to userspace under tgt. 4) Userspace/Kernel interaction- It's our belief that netlink is the preferred mechanism for kernel/userspace interaction. Yi has converted the FCoE ioctl code to netlink and is looking into openfc next. There are other options and I'm not sure that netlink is the best. I think that there is no general consensus about the best mechanism for kernel/userspace interaction. Even ioctl is still accepted into mainline (e.g. kvm). I expect you get an idea to use netlink from open-iscsi, but unlike open-iscsi, for now the FCoE code does just configuration with kernel/userspace interaction. open-iscsi has non-data path in user space so the kernel need to send variable-length data (PDUs, event, etc) to user space via netlink. So open-iscsi really needs netlink. If you have the FCoE non-data path in user space, netlink would work well for you. We definitely got the netlink direction from open-iscsi. Combining your comment that It's hard to convince the kernel maintainers to merge something into mainline that which can be implemented in user space with If you have the FCoE non-data path in user space, netlink would work well for you, makes it sound like this is an architectural change we should consider. I think they are different topics (though they are related). It's hard to convince the kernel maintainers to merge something into mainline that which can be implemented in user space applies to the target driver. You can fully implement FCoE target software in user space, right? So if so, it's hard to push it into kernel. The trend to push the non-data path to user space applies to the initiator driver. Initiator drivers are expected to run in kernel space but open-iscsi driver was split and the non-data part was moved to user space. The kernel space and user-space parts work together. It's completely different from iSCSI target drivers that can be implemented fully in user space. I'm not sure how strong the trend is though. Is moving non data-path code to userspace a requirement? (you might have answered me already by saying you had 2x failed upstream attempts) I don't know. You need to ask James. I would add one TODO item, better integration with scsi_transport_fc. If we have HW FCoE HBAs in the future, we need FCoE support in the fc transport class (you could use its netlink mechanism for event notification). What do you have in mind in particular? Our layers are, SCSI Openfc FCoE net_devive NIC driver So, it makes sense to me that we fit under scsi_transport_fc. I like our layering- we clearly have SCSI on our top edge and net_dev at our bottom edge. My initial reaction would be to resist merging openfc and fcoe and creating a scsi_transport_fcoe.h interface. As I wrote in another mail, this part is the major issue for me. BTW, I think that the name 'openfc' is a bit strange. Surely, the mainline iscsi initiator driver is called 'open-iscsi' but it doesn't have any functions or files called 'open*'. It's just the project name. Understood, but open-iscsi doesn't have the layering scheme that we do. Since we're providing a Fibre Channel protocol processing layer that different transport types can register with I think the generic name is appropriate. Anyway, I don't think anyone here is terribly stuck on the name; it's not a high priority at this time.
Re: Open-FCoE on linux-scsi
On Fri, 04 Jan 2008 12:45:45 +0100 Stefan Richter [EMAIL PROTECTED] wrote: On 1/3/2008 10:58 PM, Love, Robert W wrote: [FUJITA Tomonori wrote] I would add one TODO item, better integration with scsi_transport_fc. If we have HW FCoE HBAs in the future, we need FCoE support in the fc transport class (you could use its netlink mechanism for event notification). What do you have in mind in particular? Our layers are, SCSI Openfc FCoE net_devive NIC driver So, it makes sense to me that we fit under scsi_transport_fc. I like our layering- we clearly have SCSI on our top edge and net_dev at our bottom edge. My initial reaction would be to resist merging openfc and fcoe and creating a scsi_transport_fcoe.h interface. AFAIU the stack should be: - SCSI core, scsi_transport_fc - Openfc (an FCoE implementation) - net_device - NIC driver _If_ there will indeed be dedicated FCoE HBAs in the future, the following stack could exist in addition to the one above: - SCSI core, scsi_transport_fc - FCoE HBA driver(s) Agreed. My FCoE initiator design would be something like: scsi-ml fcoe initiator driver libfcoe fc_transport_class (inclusing fcoe support) And FCoE HBA LLDs work like: scsi-ml FCoE HBA LLDs (some of them might use libfcoe) fc_transport_class (inclusing fcoe support) That's the way that other transport classes do, I think. For me, the current code tries to invent another fc class. For example, the code newly defines: struct fc_remote_port { struct list_head rp_list; /* list under fc_virt_fab */ struct fc_virt_fab *rp_vf; /* virtual fabric */ fc_wwn_trp_port_wwn;/* remote port world wide name */ fc_wwn_trp_node_wwn;/* remote node world wide name */ fc_fid_trp_fid; /* F_ID for remote_port if known */ atomic_trp_refcnt; /* reference count */ u_int rp_disc_ver;/* discovery instance */ u_int rp_io_limit;/* limit on outstanding I/Os */ u_int rp_io_count;/* count of outstanding I/Os */ u_int rp_fcp_parm;/* remote FCP service parameters */ u_int rp_local_fcp_parm; /* local FCP service parameters */ void*rp_client_priv; /* HBA driver private data */ void*rp_fcs_priv; /* FCS driver private data */ struct sa_event_list *rp_events; /* event list */ struct sa_hash_link rp_fid_hash_link; struct sa_hash_link rp_wwpn_hash_link; /* * For now, there's just one session per remote port. * Eventually, for multipathing, there will be more. */ u_char rp_sess_ready; /* session ready to be used */ struct fc_sess *rp_sess; /* session */ void*dns_lookup;/* private dns lookup */ int dns_lookup_count; /* number of attempted lookups */ }; /* * remote ports are created and looked up by WWPN. */ struct fc_remote_port *fc_remote_port_create(struct fc_virt_fab *, fc_wwn_t); struct fc_remote_port *fc_remote_port_lookup(struct fc_virt_fab *, fc_fid_t, fc_wwn_t wwpn); struct fc_remote_port *fc_remote_port_lookup_create(struct fc_virt_fab *, fc_fid_t, fc_wwn_t wwpn, fc_wwn_t wwnn); The FCoE LLD needs to exploit the exsting struct fc_rport and APIs. - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 1/2] replace sizeof sense_buffer with SCSI_SENSE_BUFFERSIZE
On Thu, 3 Jan 2008 11:10:04 -0500 Salyzyn, Mark [EMAIL PROTECTED] wrote: ACK on aacraid/ips/dpt_i2o bits. Inspected others, this patch IS inert. Thanks! NitMeBeingStupidAndAddingARiderToTheBill: I know it was a grep/replace. If you need to respin because of Boaz and do not mind, do not hesitate to optimize (?) and instead do: diff --git a/drivers/scsi/dpt_i2o.c b/drivers/scsi/dpt_i2o.c index 70f48a1..c6380c0 100644 --- a/drivers/scsi/dpt_i2o.c +++ b/drivers/scsi/dpt_i2o.c @@ -2298,7 +2298,6 @@ static s32 adpt_i2o_to_scsi(void __iomem *reply, struct scsi_cmnd* cmd) // copy over the request sense data if it was a check // condition status - if(dev_status == 0x02 /*CHECK_CONDITION*/) { - u32 len = sizeof(cmd-sense_buffer); - len = (len 40) ? 40 : len; + if (dev_status == 0x02 /*CHECK_CONDITION*/) { + u32 len = (SCSI_SENSE_BUFFERSIZE 40) ? 40 : SCSI_SENSE_BUFFERSIZE; // Copy over the sense data memcpy_fromio(cmd-sense_buffer, (reply+28) , len); I see. I'll do if I need to send an updated patch. - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] bsg : Add support for io vectors in bsg
From: Deepak Colluru [EMAIL PROTECTED] Subject: [PATCH] bsg : Add support for io vectors in bsg Date: Fri, 4 Jan 2008 21:47:34 +0530 (IST) From: Deepak Colluru [EMAIL PROTECTED] Add support for io vectors in bsg. Signed-off-by: Deepak Colluru [EMAIL PROTECTED] --- bsg.c | 52 +--- 1 file changed, 49 insertions(+), 3 deletions(-) Thanks, but I have to NACK this. You can find the discussion about bsg io vector support and a similar patch in linux-scsi archive. I have no plan to support it since it needs the compat hack. - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] libata: eliminate the home grown dma padding in favour of that provided by the block layer
On Mon, 31 Dec 2007 15:56:08 -0600 James Bottomley [EMAIL PROTECTED] wrote: ATA requires that all DMA transfers begin and end on word boundaries. Because of this, a large amount of machinery grew up in ide to adjust scatterlists on this basis. However, as of 2.5, the block layer has a dma_alignment variable which ensures both the beginning and length of a DMA transfer are aligned on the dma_alignment boundary. Although the block layer does adjust the beginning of the transfer to ensure this happens, it doesn't actually adjust the length, it merely makes sure that space is allocated for transfers beyond the declared length. The upshot of this is that scatterlists may be padded to any size between the actual length and the length adjusted to the dma_alignment safely knowing that memory is allocated in this region. Great! diff --git a/include/linux/libata.h b/include/linux/libata.h index 124033c..2f40d57 100644 --- a/include/linux/libata.h +++ b/include/linux/libata.h @@ -282,7 +282,7 @@ enum { /* size of buffer to pad xfers ending on unaligned boundaries */ ATA_DMA_PAD_SZ = 4, - ATA_DMA_PAD_BUF_SZ = ATA_DMA_PAD_SZ * ATA_MAX_QUEUE, + ATA_DMA_PAD_MASK= ATA_DMA_PAD_SZ - 1, /* ering size */ ATA_ERING_SIZE = 32, @@ -446,12 +446,9 @@ struct ata_queued_cmd { unsigned long flags; /* ATA_QCFLAG_xxx */ unsigned inttag; unsigned intn_elem; - unsigned intn_iter; - unsigned intorig_n_elem; int dma_dir; - unsigned intpad_len; unsigned intsect_size; unsigned intnbytes; @@ -461,7 +458,6 @@ struct ata_queued_cmd { unsigned intcursg_ofs; struct scatterlist sgent; - struct scatterlist pad_sgent; void*buf_virt; /* DO NOT iterate over __sg manually, use ata_for_each_sg() */ @@ -606,9 +602,6 @@ struct ata_port { struct ata_prd *prd;/* our SG list */ dma_addr_t prd_dma; /* and its DMA mapping */ - void*pad; /* array of DMA pad buffers */ - dma_addr_t pad_dma; - struct ata_ioports ioaddr; /* ATA cmd/ctl/dma register blocks */ u8 ctl;/* cache of ATA control register */ @@ -1080,24 +1073,15 @@ extern void ata_port_pbar_desc(struct ata_port *ap, int bar, ssize_t offset, static inline struct scatterlist * ata_qc_first_sg(struct ata_queued_cmd *qc) { - qc-n_iter = 0; if (qc-n_elem) return qc-__sg; - if (qc-pad_len) - return qc-pad_sgent; return NULL; } static inline struct scatterlist * ata_qc_next_sg(struct scatterlist *sg, struct ata_queued_cmd *qc) { - if (sg == qc-pad_sgent) - return NULL; - if (++qc-n_iter qc-n_elem) - return sg_next(sg); - if (qc-pad_len) - return qc-pad_sgent; - return NULL; + return sg_next(sg); } #define ata_for_each_sg(sg, qc) \ How about removing ata_qc_first_sg and ata_qc_next_sg completely? Now we can just replace ata_qc_next_sg with sg_next. qc-__sg seems to be always initialized to NULL so we can remove ata_qc_first_sg too. diff --git a/include/linux/libata.h b/include/linux/libata.h index 4f6404c..2774882 100644 --- a/include/linux/libata.h +++ b/include/linux/libata.h @@ -1054,25 +1054,8 @@ extern void ata_port_pbar_desc(struct ata_port *ap, int bar, ssize_t offset, const char *name); #endif -/* - * qc helpers - */ -static inline struct scatterlist * -ata_qc_first_sg(struct ata_queued_cmd *qc) -{ - if (qc-n_elem) - return qc-__sg; - return NULL; -} - -static inline struct scatterlist * -ata_qc_next_sg(struct scatterlist *sg, struct ata_queued_cmd *qc) -{ - return sg_next(sg); -} - #define ata_for_each_sg(sg, qc) \ - for (sg = ata_qc_first_sg(qc); sg; sg = ata_qc_next_sg(sg, qc)) + for (sg = qc-__sg; sg; sg = sg_next(sg)) static inline unsigned int ata_tag_valid(unsigned int tag) { - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: Open-FCoE on linux-scsi
From: Love, Robert W [EMAIL PROTECTED] Subject: RE: Open-FCoE on linux-scsi Date: Mon, 31 Dec 2007 08:34:38 -0800 Hello SCSI mailing list, I'd just like to introduce ourselves a bit before we get started. My name is Robert Love and I'm joined by a team of engineers including Vasu Dev, Chris Leech and Yi Zou. We are committed to maintaining the Open-FCoE project. Aside from Intel engineers we expect engineers from other companies to contribute to Open-FCoE. Our goal is to get the initiator code upstream. We have a lot of working code but recognize that we're early in this project's development. We're looking for direction from you, the experts, on what this project should grow into. I've just added a new fcoe target driver to tgt: http://stgt.berlios.de/ That's great; we'll check it out as soon as everyone is back from the holidays. It's still an experiment. Patches are welcome. The driver runs in user space unlike your target mode driver (I just modified your FCoE code to run it in user space). There seems to be a trend to move non-data-path code userspace, however, Implementing FCoE target drive in user space has no connection with a trend to move non-data-path code user space. It does all the data-path in user space. The examples of the trend to move non-data-path code userspace are open-iscsi, multi-path, etc, I think. I don't like having so much duplicate code. We were going to investigate if we could redesign the target code to have less of a profile and just depend on the initiator modules instead of recompiling openfc as openfc_tgt. What's the general opinion on this? Duplicate code vs. more kernel code? I can see that you're already starting to clean up the code that you ported. Does that mean the duplicate code isn't an issue to you? When we fix bugs in the initiator they're not going to make it into your tree unless you're diligent about watching the list. It's hard to convince the kernel maintainers to merge something into mainline that which can be implemented in user space. I failed twice (with two iSCSI target implementations). Yeah, duplication is not good but the user space code has some great advantages. Both approaches have the pros and cons. The initiator driver succeeded to log in a target, see logical units, and perform some I/Os. It's still very unstable but it would be useful for FCoE developers. I would like to help you push the Open-FCoE initiator to mainline too. What are on your todo list and what you guys working on now? We would really appreciate the help! The best way I could come up with to coordinate this effort was through the BZ- http://open-fcoe.org/bugzilla. I was going to write a BZ wiki entry to help new contributors, but since I haven't yet, here's the bottom line. Sign-up to the BZ, assign bugs to yourself from my name (I'm the default assignee now) and also file bugs as you find them. I don't want to impose much process, but this will allow all of us to know what everyone else is working on. The main things that I think need to be fixed are (in no particular order)- 1) Stability- Just straight up bug fixing. This is ongoing and everyone is looking at bugs. Talking about stability is a bit premature, I think. The first thing to do is finding a design that can be accepted into mainline. 2) Abstractions- We consider libsa a big bug, which we're trying to strip down piece by piece. Vasu took out the LOG_SA code and I'm looking into changing the ASSERTs to BUG_ON/WARN_ONs. That isn't all of it, but that's how we're breaking it down. Agreed, libsa (and libcrc) should be removed. 3) Target- The duplicate code of the target is too much. I want to integrate the target into our -upstream tree. Without doing that, fixes to the -upstream tree won't benefit the target and it will get into worse shape than it already is, unless someone is porting those patches to the target too. I think that ideally we'd want to reduce the target's profile and move it to userspace under tgt. 4) Userspace/Kernel interaction- It's our belief that netlink is the preferred mechanism for kernel/userspace interaction. Yi has converted the FCoE ioctl code to netlink and is looking into openfc next. There are other options and I'm not sure that netlink is the best. I think that there is no general consensus about the best mechanism for kernel/userspace interaction. Even ioctl is still accepted into mainline (e.g. kvm). I expect you get an idea to use netlink from open-iscsi, but unlike open-iscsi, for now the FCoE code does just configuration with kernel/userspace interaction. open-iscsi has non-data path in user space so the kernel need to send variable-length data (PDUs, event, etc) to user space via netlink. So open-iscsi really needs netlink. If you have the FCoE non-data path in user space, netlink would work well for you. I would add one TODO item, better integration with