On Thu, 2019-01-17 at 16:05 -0500, Douglas Gilbert wrote: > On 2019-01-16 8:06 p.m., Bart Van Assche wrote: > > On Wed, 2019-01-16 at 19:54 -0500, Douglas Gilbert wrote: > > > On 2019-01-16 6:56 p.m., Bart Van Assche wrote: > > > > On Wed, 2019-01-16 at 10:57 -0500, Douglas Gilbert wrote: > > > > > The block layer assumes scsi_request:sense is always a valid > > > > > pointer. This is set up once in scsi_mq_init_request() and the > > > > > containing scsi_cmnd object is used often, being re-initialized > > > > > by scsi_init_command(). That works unless some code re-purposes > > > > > part of the scsi_cmnd object for something else. And that is > > > > > what bidi handling does in scsi_mq_prep_fn(). The result is an > > > > > oops at some later time when the partly overwritten object is > > > > > re-used. The overwrite is from d285203cf647d but 'git blame' > > > > > does not show removed code, so that commit may not be the > > > > > culprit. > > > > > > > > > > Signed-off-by: Douglas Gilbert <dgilb...@interlog.com> > > > > > --- > > > > > > > > > > This was found while injecting errors (thus generating sense data) > > > > > into a sequence of bidi commands. At some later time the block > > > > > layer blew up with a scsi_request::sense NULL dereference in > > > > > sg_rq_end_io(). Without testing I'm confident the bsg driver, > > > > > the osd ULD and exofs are exposed to this bug. > > > > > > > > > > drivers/scsi/scsi_lib.c | 1 + > > > > > 1 file changed, 1 insertion(+) > > > > > > > > > > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c > > > > > index b13cc9288ba0..71259bd4040a 100644 > > > > > --- a/drivers/scsi/scsi_lib.c > > > > > +++ b/drivers/scsi/scsi_lib.c > > > > > @@ -1175,6 +1175,7 @@ void scsi_init_command(struct scsi_device *dev, > > > > > struct scsi_cmnd *cmd) > > > > > > > > > > cmd->device = dev; > > > > > cmd->sense_buffer = buf; > > > > > + cmd->req.sense = buf; > > > > > cmd->prot_sdb = prot; > > > > > cmd->flags = flags; > > > > > INIT_DELAYED_WORK(&cmd->abort_work, scmd_eh_abort_handler); > > > > > > > > Hi Doug, > > > > > > > > The description of this patch does not look correct to me. > > > > scsi_init_command() > > > > does not overwrite the sense pointer. From the body of that function: > > > > > > > > /* zero out the cmd, except for the embedded scsi_request */ > > > > memset((char *)cmd + sizeof(cmd->req), 0, > > > > sizeof(*cmd) - sizeof(cmd->req) + > > > > dev->host->hostt->cmd_size); > > > > > > > > It is not clear to me which code overwrites the sense pointer. I think > > > > that > > > > needs to be figured out before discussion of this patch can continue. > > > > > > Bart, > > > Please re-read the explanation. The two sentences in the middle should > > > tell > > > you that you are looking at the wrong memset(). > > > > > > And I'm waiting for the person who wrote the questionable code to comment. > > > > > > > > > If you don't believe me, try sending a device reset to a scsi_debug > > > device. > > > Then use sg_raw *** to send a XDWRITREAD(10) bidi command to the same > > > scsi_debug > > > device, you should get a Unit Attention concerning that device reset. If > > > you do, keep sending that bidi command. Make sure you have no important > > > files open because that machine will lock solid. Basically what happens > > > when a rq_end_io() callback de-references a NULL pointer. It looks like > > > it has been there since 2014 and took me 2 days to find. Sorry that I > > > can't > > > explain it in one simple sentence. > > > > Hi Doug, > > > > Is this perhaps the memset you are referring to? > > > > memset(bidi_sdb, 0, sizeof(struct scsi_data_buffer)); > > Yes.
Hi Doug, I think it's worse than what you explained in your patch description: some code interprets blk_mq_rq_to_pdu(cmd->request->next_rq) as a struct scsi_data_buffer, e.g. scsi_mq_prep_fn(). Other code in the SCSI core interprets the same data structure as a struct scsi_request, e.g. scsi_io_completion(). So I don't think that your patch is sufficient. I have already started working on a patch series that clears up this confusion. Bart.