On Sat, 2018-08-04 at 18:56 +0530, Sreekanth Reddy wrote:
> No Bart, their is no race condition here. Since chain lookup table
> entry is uniquely accessed using smid value. And this smid (which is
> scmd->request->tag +1) is unique for each IO request. And
> _base_get_chain_buffer_tracker() function is called only during IO
> submission time (i.e. before submitting the IO to the IOC firmware)
> and mpt3sas_base_clear_st() function is called in the ISR (i.e during
> IO completion) path. So for a particular IO these two functions called
> at two different instances of time.

Hello Sreekanth,

In mpt3sas_base_get_smid_scsiio() I found the following code:

        struct scsiio_tracker *request = scsi_cmd_priv(scmd);
        u16 smid = scmd->request->tag + 1;
        request->smid = smid;

That confirms what you wrote about smid being unique for each I/O request.
However, this also raises new questions. As far as I can see all code in
the I/O path that accesses the chain_lookup[] array uses index smid - 1.
The patch at the start of this e-mail thread modifies two functions, namely
mpt3sas_remove_dead_ioc_func() and mpt3sas_base_clear_st(). Since the
chain_lookup[] array is indexed with the smid and hence contains request-
private data, how is it even possible that these functions race against
each other? Is there perhaps code that accesses the chain_lookup[] array
and that is called from another context than the I/O completion path?

Is the patch at the start of this e-mail thread the result of a theoretical
concern or of a real failure? In the latter case, which sequence of actions
triggered the failure? The answer to this question should already have been
mentioned in the description of v1 of this patch.

Thanks,

Bart.

Reply via email to