Jay, et al,

        I've added the memory barriers, and I'm trying the patch as I
type this.  I'll get back to you with the results.

--Phil

Compaq:  High Performance Server Division/Benchmark Performance Engineering
---------------- Alpha, The Fastest Processor on Earth --------------------
[EMAIL PROTECTED]        |C|O|M|P|A|Q|        [EMAIL PROTECTED]
------------------- See the results at www.spec.org -----------------------

On Thu, 5 Apr 2001, Jay Estabrook wrote:

> On Thu, Apr 05, 2001 at 11:03:44AM -0400, Phillip Ezolt wrote:
> >
> > The qlogic-isp driver causes a kernel crash on my Alpha/Linux machine.
> >
> > Unfortunately, it only happens under very heavy load, so it is not
> > easily reproducible.
> >
> > The machine is running kernel 2.4.2 with TUX-W2 applied. (This is
> > basically 2.4.2-ac24 with the tux server added.)
> >
> > The machine is a single-CPU DS10 with the following:
> >
> > SCSI subsystem driver Revision: 1.00
> > qlogicisp: new isp1020 revision ID (5)
> > scsi0: QLogic ISP1020 SCSI on PCI bus 00 device 78 irq 39 I/O base 0x8000
> >
> > 00:0f.0 SCSI storage controller: Q Logic ISP1020 (rev 05)
> >
> > I think that I have the line, and the reason for the crash.
> >
> > The line that causes it is qlogicisp.c (1054):
> > ...
> > void isp1020_intr_handler()
> > ....
> >
> >                 if (sts->hdr.entry_type == ENTRY_STATUS)
> >                         Cmnd->result = isp1020_return_status(sts);
> >                 else
> >                         Cmnd->result = DID_ERROR << 16;
> >                 if (Cmnd->use_sg)
> >
> > From the values in v0 and s0, I believe that Cmnd is NULL.
> >
> > The crash occurs at the first place Cmnd is dereferenced within the function.
> >
> > Is there a fix for this?
>
> I've seen that before, but thought it had gone away... :-\
>
> Essentially, IIRC, it gets the Cmnd value via a table entry, the entry
> being determined by read of memory-mapped status info set by the
> controller. If the status info is not valid for any reason, then odds
> are that the Cmnd value will not be valid either.
>
> Suggestion:
>
> Change the following code in the interrupt handler:
>
>                 cmd_slot = sts->handle;
>                 Cmnd = hostdata->cmd_slots[cmd_slot];
>                 hostdata->cmd_slots[cmd_slot] = NULL;
>
> to:
>
>                 mb();
>                 cmd_slot = sts->handle;
>                 Cmnd = hostdata->cmd_slots[cmd_slot];
>                 mb();
>                 hostdata->cmd_slots[cmd_slot] = NULL;
>
> and see if that makes a difference.
>
> If it doesn't, we've more thinking to do... :-\
>
> One could retry the sts->handle fetch a second time and compare values.
>
> It would be interesting (but harder, I suspect) to know if we ever get
> a bad value for cmd_slot but *not* end up with Cmnd == NULL...
>
> --Jay++
>
> -----------------------------------------------------------------------------
> Jay A Estabrook                            Alpha Engineering - LINUX Project
> Compaq Computer Corp. - MRO1-2/K20         (508) 467-2080
> 200 Forest Street, Marlboro MA 01752       [EMAIL PROTECTED]
> -----------------------------------------------------------------------------
>
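
For reference, the ideas in the quoted message (barriers around the handle
read, re-reading sts->handle and comparing, and noticing a NULL Cmnd before
it is dereferenced) could be combined roughly as in the sketch below. This is
only an illustration, not the driver's code and not a tested fix: the
sketch_* types, SKETCH_QUEUE_LEN, and sketch_mb() are hypothetical stand-ins,
and a C11 fence is used in place of the kernel's mb() so that the fragment
compiles outside the kernel.

```c
#include <stddef.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for the driver's structures. Only the member
 * names handle and cmd_slots come from the thread; everything else,
 * including the table size, is assumed for illustration. */
#define SKETCH_QUEUE_LEN 64

struct sketch_cmnd {
    int result;
};

struct sketch_status {
    volatile unsigned int handle;   /* written by the controller */
};

struct sketch_hostdata {
    struct sketch_cmnd *cmd_slots[SKETCH_QUEUE_LEN];
};

/* User-space placeholder for the kernel's mb(). */
#define sketch_mb() atomic_thread_fence(memory_order_seq_cst)

/* Look up and claim the command for a status entry.
 * Returns NULL instead of a bogus pointer when the handle looks
 * unstable, is out of range, or points at an empty slot, so the
 * caller can skip the entry rather than dereference NULL. */
struct sketch_cmnd *sketch_claim_cmnd(struct sketch_hostdata *hostdata,
                                      const struct sketch_status *sts)
{
    unsigned int first, second;
    struct sketch_cmnd *cmnd;

    sketch_mb();
    first = sts->handle;
    sketch_mb();
    second = sts->handle;           /* second fetch, compared below */

    if (first != second)
        return NULL;                /* handle changed between reads */
    if (first >= SKETCH_QUEUE_LEN)
        return NULL;                /* index outside the slot table */

    cmnd = hostdata->cmd_slots[first];
    sketch_mb();
    if (cmnd == NULL)
        return NULL;                /* slot already empty */

    hostdata->cmd_slots[first] = NULL;
    return cmnd;
}
```

In an interrupt handler, a NULL return would be the point to log the handle
value and move on to the next status entry, which would also help answer the
question of whether a bad cmd_slot ever yields a non-NULL Cmnd.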
