RE: [PATCH v2 0/8] scsi: scsi transport for ufs devices
> On Sun, Aug 05, 2018 at 02:39:48PM +0300, Avri Altman wrote: > > Here is a proposal to use the scsi transport subsystem to manage > > ufs devices. > > This should superceed previous patches to provide provisioning > support through sysfs or configfs, right? Yes. Thanks, Avri
[PATCH] qla2xxx: Fix issue reported by static checker for qla2x00_els_dcmd2_sp_done()
From: Quinn Tran This patch fixes following Smatch complaint: drivers/scsi/qla2xxx/qla_iocb.c:2647 qla2x00_els_dcmd2_sp_done() error: we previously assumed 'e' could be null (see line 2631) Fixes: 8777e4314d39 ("scsi: qla2xxx: Migrate NVME N2N handling into state machine") Reported-by: Dan Carpenter Signed-off-by: Quinn Tran Signed-off-by: Himanshu Madhani --- drivers/scsi/qla2xxx/qla_iocb.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/scsi/qla2xxx/qla_iocb.c b/drivers/scsi/qla2xxx/qla_iocb.c index 059f2c9dc192..213321295753 100644 --- a/drivers/scsi/qla2xxx/qla_iocb.c +++ b/drivers/scsi/qla2xxx/qla_iocb.c @@ -2642,6 +2642,7 @@ qla2x00_els_dcmd2_sp_done(void *ptr, int res) elsio->u.els_plogi.els_resp_pyld, elsio->u.els_plogi.els_resp_pyld_dma); sp->free(sp); + return; } e->u.iosb.sp = sp; qla2x00_post_work(vha, e); -- 2.17.0.rc1.35.g90bbd502d
Re: [PATCH RESEND v2] target: move the rx hdr buffer out of the stack
On 08/03/2018 03:48 AM, Maurizio Lombardi wrote: > When HeaderDigest is enabled on aarch64 machines, target triggers > a lot of failed crc checksum errors: > > example of output in dmesg: > [ 154.495031] HeaderDigest CRC32C failed, received 0x60571c8d, computed > 0x288c3ab9 > [ 154.583821] Got unknown iSCSI OpCode: 0xff > [ 162.979857] HeaderDigest CRC32C failed, received 0x0712e57c, computed > 0x288c3ab9 > ... > > The problem is that the iscsit_get_rx_pdu() function uses > a stack buffer as input for the scatterlist crypto library. > This should be avoided on kernels >= 4.9 because stack buffers > may not be directly mappable to struct page when the > CONFIG_VMAP_STACK option is enabled. > > This patch modifies the code so the buffer will be allocated on the heap > and adds a pointer to it in the iscsi_conn structure. > > v2: allocate conn_rx_buf in iscsi_target_login_thread() > and fix a memory leak > > Signed-off-by: Maurizio Lombardi > --- > drivers/target/iscsi/iscsi_target.c | 6 +- > drivers/target/iscsi/iscsi_target_login.c | 15 +++ > include/target/iscsi/iscsi_target_core.h | 2 ++ > 3 files changed, 22 insertions(+), 1 deletion(-) > > diff --git a/drivers/target/iscsi/iscsi_target.c > b/drivers/target/iscsi/iscsi_target.c > index 8e22379..149fa91 100644 > --- a/drivers/target/iscsi/iscsi_target.c > +++ b/drivers/target/iscsi/iscsi_target.c > @@ -3913,10 +3913,12 @@ static bool iscsi_target_check_conn_state(struct > iscsi_conn *conn) > static void iscsit_get_rx_pdu(struct iscsi_conn *conn) > { > int ret; > - u8 buffer[ISCSI_HDR_LEN], opcode; > + u8 *buffer, opcode; > u32 checksum = 0, digest = 0; > struct kvec iov; > > + buffer = conn->conn_rx_buf; If the buffer is only used in the loop below why does it need to be allocated in __iscsi_target_login_thread? It just seems like the error handling and freeing of that buffer would be more simple if done here.
Re: [PATCH v3] mpt3sas: correct reset of smid while clearing scsi tracker
On Tue, 2018-08-07 at 21:46 +0530, Sreekanth Reddy wrote: > [Sreekanth] In the patch description I mentioned that I have done a > manual mistake and I am correcting with this patch. > > Hope I have answered all of your quires. Not yet unfortunately. Why do you consider the current implementation a wrong? Why is the patch at the start of this e-mail thread considered a fix? Which behavior changes due to this patch? If this patch is a bug fix, how can the bug be triggered and why is this patch considered the right way to fix that bug? Thanks, Bart.
Re: [PATCH] aha1542: convert to DMA mapping API
On Tuesday 07 August 2018 09:29:59 Christoph Hellwig wrote: > Looks like the dma allocation is too late. Updated version below: > > ... Crashes a bit later now: [ 89.270529] scsi host2: Adaptec AHA-1542 (SCSI-ID 7) at IO 0x330, IRQ 11, DMA 7 [ 89.346535] scsi host2: Adaptec 1542 [ 89.358411] bounce: isa pool size: 16 pages [ 89.627133] WARNING: CPU: 0 PID: 0 at ./include/linux/dma-mapping.h:541 dma_free_attrs.constprop.8+0x29/0x76 [aha1542] [ 89.630753] Modules linked in: aha1542 i2c_dev nouveau psmouse serio_raw sg 8139cp wmi hwmon ttm parport_pc parport intel_agp [ 89.630753] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.17.0+ #263 [ 89.630753] Hardware name: /i440ZX-W977TF, BIOS 4.51 PG 07/12/00 [ 89.630753] EIP: dma_free_attrs.constprop.8+0x29/0x76 [aha1542] [ 89.630753] EFLAGS: 00010086 CPU: 0 [ 89.630753] EAX: cfbda200 EBX: 0086 ECX: c008d000 EDX: 0006 [ 89.630753] ESI: EDI: cf514aa0 EBP: cf80df64 ESP: cf80df54 [ 89.630753] DS: 007b ES: 007b FS: 00d8 GS: SS: 0068 [ 89.630753] CR0: 80050033 CR2: d0d3b000 CR3: 0fbad000 CR4: 0290 [ 89.630753] Call Trace: [ 89.630753] [ 89.630753] aha1542_free_cmd+0x29/0x40 [aha1542] [ 89.630753] aha1542_interrupt+0x1be/0x267 [aha1542] [ 89.630753] ? scsi_dev_queue_ready+0x7f/0x7f [ 89.630753] __handle_irq_event_percpu+0x2f/0xb4 [ 89.630753] ? irq_set_chained_handler_and_data+0x4a/0x4a [ 89.630753] handle_irq_event_percpu+0x17/0x3d [ 89.630753] handle_irq_event+0x22/0x3b [ 89.630753] handle_level_irq+0x55/0x7a [ 89.630753] handle_irq+0x6c/0x81 [ 89.630753] [ 89.630753] do_IRQ+0x35/0x95 [ 89.630753] ? acpi_idle_enter_s2idle+0x41/0x41 [ 89.630753] common_interrupt+0x34/0x3c [ 89.630753] EIP: cpuidle_enter_state+0xca/0x125 [ 89.630753] EFLAGS: 0246 CPU: 0 [ 89.630753] EAX: de2b83d5 EBX: ECX: 0014 EDX: cfde46e0 [ 89.630753] ESI: 0002 EDI: cf213c80 EBP: cf661f48 ESP: cf661f28 [ 89.630753] DS: 007b ES: 007b FS: 00d8 GS: SS: 0068 [ 89.630753] ? acpi_idle_enter_s2idle+0x41/0x41 [ 89.630753] cpuidle_enter+0xf/0x12 [ 89.630753] do_idle+0x162/0x1b1 [ 89.630753] cpu_startup_entry+0x19/0x1b [ 89.630753] rest_init+0x8c/0x8e [ 89.630753] start_kernel+0x306/0x30b [ 89.630753] i386_start_kernel+0x95/0x99 [ 89.630753] startup_32_smp+0x15f/0x164 [ 89.630753] Code: 5d c3 55 85 c0 89 e5 57 56 53 53 74 0a 8b b8 20 01 00 00 85 ff 75 0c 8b 3d 08 aa 66 cf 85 ff 75 02 0f 0b 9c 5b 0f ba e3 09 72 02 <0f> 0b 89 45 f0 8d 42 ff 89 cb c1 e8 0c 83 c9 ff 89 d6 0f bd d0 [ 89.630753] ---[ end trace 3524d2ff63a5e038 ]--- [ 90.240799] scsi host2: Unexpected interrupt [ 90.244724] scsi host2: tarstat=0, hastat=11 idlun=0 ccb#=0 [ 100.293508] systemd-logind[1782]: Failed to start user service, ignoring: Unknown unit: user@0.service [ 100.416746] systemd-logind[1782]: New session c2 of user root. [ 111.096323] scsi 2:0:1:0: tag#0 Trying device reset for target [ 111.105632] scsi 2:0:1:0: tag#0 Trying device reset for target [ 111.114020] scsi host2: Sent BUS RESET to scsi host 2 [ 111.114020] WARNING: CPU: 0 PID: 1821 at ./include/linux/dma-mapping.h:541 dma_free_attrs.constprop.8+0x29/0x76 [aha1542] [ 111.114020] Modules linked in: aha1542 i2c_dev nouveau psmouse serio_raw sg 8139cp wmi hwmon ttm parport_pc parport intel_agp [ 111.114020] CPU: 0 PID: 1821 Comm: scsi_eh_2 Tainted: GW 4.17.0+ #263 [ 111.114020] Hardware name: /i440ZX-W977TF, BIOS 4.51 PG 07/12/00 [ 111.114020] EIP: dma_free_attrs.constprop.8+0x29/0x76 [aha1542] [ 111.114020] EFLAGS: 00010086 CPU: 0 [ 111.114020] EAX: cfbda200 EBX: 0086 ECX: EDX: 0006 [ 111.114020] ESI: c009 EDI: cf514aa0 EBP: cbe89ebc ESP: cbe89eac [ 111.114020] DS: 007b ES: 007b FS: 00d8 GS: SS: 0068 [ 111.114020] CR0: 80050033 CR2: 0817ae44 CR3: 0bc2c000 CR4: 0290 [ 111.114020] Call Trace: [ 111.114020] aha1542_free_cmd+0x29/0x40 [aha1542] [ 111.114020] aha1542_reset+0xb0/0xe5 [aha1542] [ 111.114020] aha1542_bus_reset+0xd/0xf [aha1542] [ 111.114020] scsi_try_bus_reset+0x4b/0x94 [ 111.114020] scsi_eh_ready_devs+0x5a4/0x781 [ 111.114020] scsi_error_handler+0x268/0x4b0 [ 111.114020] ? __schedule+0x38f/0x3dd [ 111.114020] kthread+0xcc/0xce [ 111.114020] ? scsi_eh_get_sense+0x16e/0x16e [ 111.114020] ? kthread_cancel_delayed_work_sync+0xf/0xf [ 111.114020] ret_from_fork+0x2e/0x38 [ 111.114020] Code: 5d c3 55 85 c0 89 e5 57 56 53 53 74 0a 8b b8 20 01 00 00 85 ff 75 0c 8b 3d 08 aa 66 cf 85 ff 75 02 0f 0b 9c 5b 0f ba e3 09 72 02 <0f> 0b 89 45 f0 8d 42 ff 89 cb c1 e8 0c 83 c9 ff 89 d6 0f bd d0 [ 111.114020] ---[ end trace 3524d2ff63a5e039 ]--- [ 111.114020] WARNING: CPU: 0 PID: 1821 at ./include/linux/dma-mapping.h:541 dma_free_attrs.constprop.8+0x29/0x76 [aha1542] [ 111.114020] Modules linked in: aha1542 i2c_dev nouveau psmouse serio_raw sg 8139cp wmi hwmon ttm parport_pc parport intel_agp [ 111.114020] CPU: 0
Re: [PATCH v3] mpt3sas: correct reset of smid while clearing scsi tracker
On Tue, Aug 7, 2018 at 8:27 PM, Bart Van Assche wrote: > On Tue, 2018-08-07 at 20:03 +0530, Sreekanth Reddy wrote: >> The main intention of this patch to reset the smid to zero after >> resetting the corresponding smid entry's chain_offset to zero. While >> posting "mpt3sas: Fix calltrace observed while running IO & reset" >> patch I have done manual mistake that I reset the smid to zero before >> resetting the chain_offset which is wrong. Due to which driver >> always's reset the chain_offset of zero th entry of chain_lookup[] >> table which is wrong and hence I am correcting it with this patch. >> >> After that you asked me to add memory barriers and hence I have added >> memory barriers in subsequent versions of the patch. > > Hello Sreekanth, > > Please answer the questions I already asked you twice instead of sidestepping > these. Here I am copying your quires from your previous mails and I am replaying inline.. In mpt3sas_base_get_smid_scsiio() I found the following code: struct scsiio_tracker *request = scsi_cmd_priv(scmd); u16 smid = scmd->request->tag + 1; request->smid = smid; That confirms what you wrote about smid being unique for each I/O request. However, this also raises new questions. As far as I can see all code in the I/O path that accesses the chain_lookup[] array uses index smid - 1. [Sreekanth] Our IOC firmware always requires smid to be >=1, zero is illegal smid for the IOC firmware. Hence while framing MPT SCSI IO request driver always make sure that smid to be >=1, so it uses smid as tag + 1. where as chain_lookup[] starts with index zero, So driver is accessing this chain_lookup[] with smid-1 index. The patch at the start of this e-mail thread modifies two functions, namely mpt3sas_remove_dead_ioc_func() and mpt3sas_base_clear_st(). Since the chain_lookup[] array is indexed with the smid and hence contains request- private data, how is it even possible that these functions race against each other? [Sreekanth] This patch modifies in two functions namely '_base_get_chain_buffer_tracker()' & 'mpt3sas_base_clear_st()' not in the mpt3sas_remove_dead_ioc_func(). Their is no race between mpt3sas_remove_dead_ioc_func() & mpt3sas_base_clear_st(). Also their is no race between _base_get_chain_buffer_tracker() & mpt3sas_base_clear_st(), since for a particular IO _base_get_chain_buffer_tracker() is called during IO submission time for getting required chain buffer and mpt3sas_base_clear_st() is called to free the used chain buffers in the IO completion path. Also we have pre-allocated required chain buffers for each IO request and hence no need to main any list and their is no race between these two functions. Is there perhaps code that accesses the chain_lookup[] array and that is called from another context than the I/O completion path? [Sreekanth] No, it is called only from IO submission time. Is the patch at the start of this e-mail thread the result of a theoretical concern or of a real failure? In the latter case, which sequence of actions triggered the failure? The answer to this question should already have been mentioned in the description of v1 of this patch [Sreekanth] The main intention of this patch to reset the smid to zero after resetting the corresponding smid entry's chain_offset to zero in chain_lookup[]. While posting "mpt3sas: Fix calltrace observed while running IO & reset" patch I have done manual mistake that I reset the smid to zero before resetting the chain_offset which is wrong. Due to which driver always's reset the chain_offset of zero th entry of chain_lookup[] table which is wrong and we may see that scmd's will be returned from scsih_qcmd() with "host busy" status due to unavailability of chain buffers (as driver is resetting the wrong smid's i.e. zero th smid entry's chain_offset instead of correct smid entry's chain_offset) and IO's will be in hung state . So, I am correcting it with this patch. After that you asked me to add memory barriers and hence I have added memory barriers in subsequent versions of the patch. How can this patch be necessary if there are no races between I/O submission and I/O completion? [Sreekanth] As I explained above in this patch I am resetting the smid variable to zero only after resetting the smid's chain_offset in chain_lookup[] table. Due to manual mistake earlier I am resetting the smid variable to zero before resetting the correct smid entry's chain_offset in the chain_lookup[] table, which is wrong and I am correcting it with this patch. Changing the order of two memory writes only makes sense if there is a race condition involved. Since the block layer allocates a request tag before scsih_qcmd() is called and since mpt3sas_base_free_smid() is called before scmd->scsi_done(scmd) in _scsih_io_done(), the request tag is freed after the smid has been freed when I/O completes in a normal way. So there shouldn't be a race condition in that scenario. [Sreekanth] yes
Re: [PATCH v3] mpt3sas: correct reset of smid while clearing scsi tracker
On Tue, 2018-08-07 at 20:03 +0530, Sreekanth Reddy wrote: > The main intention of this patch to reset the smid to zero after > resetting the corresponding smid entry's chain_offset to zero. While > posting "mpt3sas: Fix calltrace observed while running IO & reset" > patch I have done manual mistake that I reset the smid to zero before > resetting the chain_offset which is wrong. Due to which driver > always's reset the chain_offset of zero th entry of chain_lookup[] > table which is wrong and hence I am correcting it with this patch. > > After that you asked me to add memory barriers and hence I have added > memory barriers in subsequent versions of the patch. Hello Sreekanth, Please answer the questions I already asked you twice instead of sidestepping these. Thanks, Bart.
Re: [PATCH v3] mpt3sas: correct reset of smid while clearing scsi tracker
On Tue, Aug 7, 2018 at 7:29 PM, Bart Van Assche wrote: > On Tue, 2018-08-07 at 12:10 +0530, Sreekanth Reddy wrote: >> This function _base_get_chain_buffer_tracker() is called only during >> the IO submission time and mpt3sas_base_clear_st() is called during IO >> completion time and hence I don't see any race condition here. >> >> Currently this patch is very much needed, so please consider this >> patch. May be we can start new email thread to discuss on the possible >> race conditions you are observing and I can clear your quires or I can >> fix them as on we can find the real race conditions. > > Hello Sreekanth, > > How can this patch be necessary if there are no races between I/O submission > and I/O completion? Changing the order of two memory writes only makes sense > if there is a race condition involved. Since the block layer allocates a > request tag before scsih_qcmd() is called and since mpt3sas_base_free_smid() > is called before scmd->scsi_done(scmd) in _scsih_io_done(), the request tag > is freed after the smid has been freed when I/O completes in a normal way. > So there shouldn't be a race condition in that scenario. > > By the way, you have not answered my question about why development of this > patch started. Does this patch address a theoretical concern or a real bug? > In the latter case, which scenario triggers the addressed bug? If you want > this patch to go upstream you should be able to explain why you think it is > useful, and that description should be included in the patch description. Bart, The main intention of this patch to reset the smid to zero after resetting the corresponding smid entry's chain_offset to zero. While posting "mpt3sas: Fix calltrace observed while running IO & reset" patch I have done manual mistake that I reset the smid to zero before resetting the chain_offset which is wrong. Due to which driver always's reset the chain_offset of zero th entry of chain_lookup[] table which is wrong and hence I am correcting it with this patch. After that you asked me to add memory barriers and hence I have added memory barriers in subsequent versions of the patch. Thanks, Sreekanth > > Thanks, > > Bart. >
Re: [PATCH v3] mpt3sas: correct reset of smid while clearing scsi tracker
On Tue, 2018-08-07 at 12:10 +0530, Sreekanth Reddy wrote: > This function _base_get_chain_buffer_tracker() is called only during > the IO submission time and mpt3sas_base_clear_st() is called during IO > completion time and hence I don't see any race condition here. > > Currently this patch is very much needed, so please consider this > patch. May be we can start new email thread to discuss on the possible > race conditions you are observing and I can clear your quires or I can > fix them as on we can find the real race conditions. Hello Sreekanth, How can this patch be necessary if there are no races between I/O submission and I/O completion? Changing the order of two memory writes only makes sense if there is a race condition involved. Since the block layer allocates a request tag before scsih_qcmd() is called and since mpt3sas_base_free_smid() is called before scmd->scsi_done(scmd) in _scsih_io_done(), the request tag is freed after the smid has been freed when I/O completes in a normal way. So there shouldn't be a race condition in that scenario. By the way, you have not answered my question about why development of this patch started. Does this patch address a theoretical concern or a real bug? In the latter case, which scenario triggers the addressed bug? If you want this patch to go upstream you should be able to explain why you think it is useful, and that description should be included in the patch description. Thanks, Bart.
Re: [PATCH v2 0/8] scsi: scsi transport for ufs devices
On Sun, Aug 05, 2018 at 02:39:48PM +0300, Avri Altman wrote: > Here is a proposal to use the scsi transport subsystem to manage > ufs devices. This should superceed previous patches to provide provisioning support through sysfs or configfs, right?
[bug report] scsi: qla2xxx: Migrate NVME N2N handling into state machine
Hello Quinn Tran, This is a semi-automatic email about new static checker warnings. The patch 8777e4314d39: "scsi: qla2xxx: Migrate NVME N2N handling into state machine" from Aug 2, 2018, leads to the following Smatch complaint: drivers/scsi/qla2xxx/qla_iocb.c:2647 qla2x00_els_dcmd2_sp_done() error: we previously assumed 'e' could be null (see line 2631) drivers/scsi/qla2xxx/qla_iocb.c 2630 e = qla2x00_alloc_work(vha, QLA_EVT_UNMAP); 2631 if (!e) { ^ New check for NULL 2632 struct srb_iocb *elsio = >u.iocb_cmd; 2633 2634 if (elsio->u.els_plogi.els_plogi_pyld) 2635 dma_free_coherent(>vha->hw->pdev->dev, 2636 elsio->u.els_plogi.tx_size, 2637 elsio->u.els_plogi.els_plogi_pyld, 2638 elsio->u.els_plogi.els_plogi_pyld_dma); 2639 2640 if (elsio->u.els_plogi.els_resp_pyld) 2641 dma_free_coherent(>vha->hw->pdev->dev, 2642 elsio->u.els_plogi.rx_size, 2643 elsio->u.els_plogi.els_resp_pyld, 2644 elsio->u.els_plogi.els_resp_pyld_dma); 2645 sp->free(sp); 2646 } 2647 e->u.iosb.sp = sp; Dereference without checking 2648 qla2x00_post_work(vha, e); 2649 } regards, dan carpenter
Re: [PATCH] aha1542: convert to DMA mapping API
Looks like the dma allocation is too late. Updated version below: --- >From 9dd7a3eb89d1cc72c001daeb80e44de7c5fb5b00 Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Thu, 2 Aug 2018 14:22:46 +0200 Subject: aha1542: convert to DMA mapping API aha1542 is one of the last users of the legacy isa_*_to_bus APIs, which also isn't portable enough. Convert it to the proper DMA mapping API. Signed-off-by: Christoph Hellwig --- drivers/scsi/aha1542.c | 112 - 1 file changed, 77 insertions(+), 35 deletions(-) diff --git a/drivers/scsi/aha1542.c b/drivers/scsi/aha1542.c index 41add33e3f1f..fb9c8fdb5a09 100644 --- a/drivers/scsi/aha1542.c +++ b/drivers/scsi/aha1542.c @@ -58,8 +58,15 @@ struct aha1542_hostdata { int aha1542_last_mbi_used; int aha1542_last_mbo_used; struct scsi_cmnd *int_cmds[AHA1542_MAILBOXES]; - struct mailbox mb[2 * AHA1542_MAILBOXES]; - struct ccb ccb[AHA1542_MAILBOXES]; + struct mailbox *mb; + dma_addr_t mb_handle; + struct ccb *ccb; + dma_addr_t ccb_handle; +}; + +struct aha1542_cmd { + struct chain *chain; + dma_addr_t chain_handle; }; static inline void aha1542_intr_reset(u16 base) @@ -233,6 +240,17 @@ static int aha1542_test_port(struct Scsi_Host *sh) return 1; } +static void aha1542_free_cmd(struct scsi_cmnd *cmd) +{ + struct aha1542_cmd *acmd = scsi_cmd_priv(cmd); + struct device *dev = cmd->device->host->dma_dev; + + dma_free_coherent(dev, scsi_sg_count(cmd) * sizeof(struct chain), + acmd->chain, acmd->chain_handle); + acmd->chain = NULL; + scsi_dma_unmap(cmd); +} + static irqreturn_t aha1542_interrupt(int irq, void *dev_id) { struct Scsi_Host *sh = dev_id; @@ -303,7 +321,7 @@ static irqreturn_t aha1542_interrupt(int irq, void *dev_id) return IRQ_HANDLED; }; - mbo = (scsi2int(mb[mbi].ccbptr) - (isa_virt_to_bus([0]))) / sizeof(struct ccb); + mbo = (scsi2int(mb[mbi].ccbptr) - aha1542->ccb_handle) / sizeof(struct ccb); mbistatus = mb[mbi].status; mb[mbi].status = 0; aha1542->aha1542_last_mbi_used = mbi; @@ -331,8 +349,7 @@ static irqreturn_t aha1542_interrupt(int irq, void *dev_id) return IRQ_HANDLED; } my_done = tmp_cmd->scsi_done; - kfree(tmp_cmd->host_scribble); - tmp_cmd->host_scribble = NULL; + aha1542_free_cmd(tmp_cmd); /* Fetch the sense data, and tuck it away, in the required slot. The Adaptec automatically fetches it, and there is no guarantee that we will still have it in the cdb when we come back */ @@ -369,6 +386,7 @@ static irqreturn_t aha1542_interrupt(int irq, void *dev_id) static int aha1542_queuecommand(struct Scsi_Host *sh, struct scsi_cmnd *cmd) { + struct aha1542_cmd *acmd = scsi_cmd_priv(cmd); struct aha1542_hostdata *aha1542 = shost_priv(sh); u8 direction; u8 target = cmd->device->id; @@ -378,7 +396,6 @@ static int aha1542_queuecommand(struct Scsi_Host *sh, struct scsi_cmnd *cmd) int mbo, sg_count; struct mailbox *mb = aha1542->mb; struct ccb *ccb = aha1542->ccb; - struct chain *cptr; if (*cmd->cmnd == REQUEST_SENSE) { /* Don't do the command - we have the sense data already */ @@ -398,15 +415,17 @@ static int aha1542_queuecommand(struct Scsi_Host *sh, struct scsi_cmnd *cmd) print_hex_dump_bytes("command: ", DUMP_PREFIX_NONE, cmd->cmnd, cmd->cmd_len); } #endif - if (bufflen) { /* allocate memory before taking host_lock */ - sg_count = scsi_sg_count(cmd); - cptr = kmalloc_array(sg_count, sizeof(*cptr), -GFP_KERNEL | GFP_DMA); - if (!cptr) + sg_count = scsi_dma_map(cmd); + if (sg_count) { + acmd->chain = dma_alloc_coherent(sh->dma_dev, + sg_count * sizeof(struct chain), + >chain_handle, GFP_DMA); + if (!acmd->chain) { + scsi_dma_unmap(cmd); return SCSI_MLQUEUE_HOST_BUSY; + } } else { - sg_count = 0; - cptr = NULL; + acmd->chain = NULL; } /* Use the outgoing mailboxes in a round-robin fashion, because this @@ -437,7 +456,8 @@ static int aha1542_queuecommand(struct Scsi_Host *sh, struct scsi_cmnd *cmd) shost_printk(KERN_DEBUG, sh, "Sending command (%d %p)...", mbo, cmd->scsi_done); #endif - any2scsi(mb[mbo].ccbptr, isa_virt_to_bus([mbo])); /* This gets trashed for some reason */ + /* This gets trashed for some reason */ + any2scsi(mb[mbo].ccbptr,
Re: [PATCH v3] mpt3sas: correct reset of smid while clearing scsi tracker
On Mon, Aug 6, 2018 at 11:41 PM, Bart Van Assche wrote: > On Sat, 2018-08-04 at 18:56 +0530, Sreekanth Reddy wrote: >> No Bart, their is no race condition here. Since chain lookup table >> entry is uniquely accessed using smid value. And this smid (which is >> scmd->request->tag +1) is unique for each IO request. And >> _base_get_chain_buffer_tracker() function is called only during IO >> submission time (i.e. before submitting the IO to the IOC firmware) >> and mpt3sas_base_clear_st() function is called in the ISR (i.e during >> IO completion) path. So for a particular IO these two functions called >> at two different instances of time. > > Hello Sreekanth, > > In mpt3sas_base_get_smid_scsiio() I found the following code: > > struct scsiio_tracker *request = scsi_cmd_priv(scmd); > u16 smid = scmd->request->tag + 1; > request->smid = smid; > > That confirms what you wrote about smid being unique for each I/O request. > However, this also raises new questions. As far as I can see all code in > the I/O path that accesses the chain_lookup[] array uses index smid - 1. Our IOC firmware always requires smid to be >=1, zero is illegal smid for the IOC firmware. Hence while framing MPT SCSI IO request driver always make sure that smid to be >=1, so it uses smid as tag + 1. where as chain_lookup[] starts with index zero, So driver is accessing this chain_lookup[] with smid-1 index. > The patch at the start of this e-mail thread modifies two functions, namely > mpt3sas_remove_dead_ioc_func() and mpt3sas_base_clear_st(). This patch modifies in two functions namely '_base_get_chain_buffer_tracker()' & 'mpt3sas_base_clear_st()' not in the mpt3sas_remove_dead_ioc_func(), not sure why 'git format-patch' is showing 'mpt3sas_remove_dead_ioc_func' function name in the patch. This function _base_get_chain_buffer_tracker() is called only during the IO submission time and mpt3sas_base_clear_st() is called during IO completion time and hence I don't see any race condition here. Currently this patch is very much needed, so please consider this patch. May be we can start new email thread to discuss on the possible race conditions you are observing and I can clear your quires or I can fix them as on we can find the real race conditions. Thanks, Sreekanth Since the > chain_lookup[] array is indexed with the smid and hence contains request- > private data, how is it even possible that these functions race against > each other? Is there perhaps code that accesses the chain_lookup[] array > and that is called from another context than the I/O completion path? > > Is the patch at the start of this e-mail thread the result of a theoretical > concern or of a real failure? In the latter case, which sequence of actions > triggered the failure? The answer to this question should already have been > mentioned in the description of v1 of this patch. > > Thanks, > > Bart.