Robert Hancock wrote:
> Tejun Heo wrote:
>> Tejun Heo wrote:
>>> Robert Hancock wrote:
>>>>> Okay, just succeeded on the current #upstream-fixes, attaching the
>>>>> log.
>>>>>  The machine is a brick after the crash.
>>>> I assume the cable got reconnected at 325 seconds? It looks like that
>>>> was during error handling for the previous unplug?
>>> I don't remember too well (the console was more than two meters away and
>>> I just kept disconnecting and reconnecting).  I noticed the
>>> machine was frozen after I came back to the console, so...
>>>
>>>> [  314.987885] ata3: timeout waiting for ADMA IDLE, stat=0x400
>>>> [  314.993556] ata3: timeout waiting for ADMA LEGACY, stat=0x400
>>>> [  315.009915] ata3.00: exception Emask 0x10 SAct 0x1 SErr 0x1910000 action 0xa frozen
>>>> [  315.017708] ata3.00: ADMA status 0x00000402: , hot unplug
>>>> [  315.017714] ata3: SError: { PHYRdyChg Dispar LinkSeq TrStaTrns }
>>>> [  315.029239] ata3.00: cmd 60/01:00:92:d7:12/00:00:05:00:00/40 tag 0 ncq 512 in
>>>> [  315.029240]          res 40/00:04:92:d7:12/00:04:92:d7:12/40 Emask 0x10 (ATA bus error)
>>>> [  315.029243] ata3.00: status: { DRDY }
>>>> [  315.048236] ata3: hard resetting link
>>>> [  315.774982] ata3: SATA link down (SStatus 0 SControl 300)
>>>> [  315.780498] ata3: failed to recover some devices, retrying in 5 secs
>>>> [  320.788427] ata3: hard resetting link
>>>> [  325.242220] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
>>>>
>>>> Not sure if the port would be frozen at this point or not?
>>>>
>>>> It would be useful to add some printks to narrow down at what point the
>>>> lockup happens. If it's a loop, interrupt storm or something then we can
>>>> likely fix it, but if the controller's just locking up then we may be
>>>> out of luck..
>>> I think it's machine hard lock up.  NMI watchdog doesn't get triggered.
> 
> Is NMI watchdog actually working on this machine?
> 
> [   34.466899] testing NMI watchdog ... <4>WARNING: CPU#0: NMI appears to be stuck (0->0)!
> [   34.555056] WARNING: CPU#1: NMI appears to be stuck (0->0)!

Oops, missed that.  I'll check whether there's an IRQ storm going on.
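
One crude way to check for a storm from userspace, while the box is
still alive, is to diff two snapshots of /proc/interrupts and look for a
line whose count jumps far more than the rest.  The sketch below is only
an illustration: the helper names (parse_interrupts, irq_deltas) are
made up here, and it assumes the usual layout of a CPU header row
followed by "label: per-CPU counts ..." rows.

```python
def parse_interrupts(text):
    """Return {irq_label: count summed over all CPUs} from /proc/interrupts text."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    counts = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        total = 0
        # Rows such as "ERR:" may carry fewer counters than there are CPUs.
        for field in rest.split()[:ncpus]:
            if not field.isdigit():
                break
            total += int(field)
        counts[label.strip()] = total
    return counts


def irq_deltas(before, after):
    """Return [(delta, irq_label), ...] sorted with the busiest IRQ first."""
    old = parse_interrupts(before)
    new = parse_interrupts(after)
    return sorted(((new[k] - old.get(k, 0), k) for k in new), reverse=True)
```

Fed two reads of /proc/interrupts taken a second or so apart, this
should show whether the sata_nv line is running away before the machine
stops responding.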

>> Ah.. another thing.  Sometimes when I swap two drives, sata_nv fails to
>> detect the new drive.  If I pull out the plug and replug it, it then
>> recognizes the new drive.
> 
> No output in that case, I assume?

It seems that sata_nv EH loses hotplug events that arrive while a
hardreset is in progress.  This is a bit tricky.  I'm not sure whether it's
sata_nv's fault or whether other drivers only work by dumb luck.  I'll
reproduce the problem and post the log when I get some time.
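
For reading logs like the one quoted above, it can help to decode the
SError and SStatus values by hand.  A minimal sketch follows, assuming
the SATA-spec bit layout (SError diagnostic bits 16-26; SStatus
DET/SPD/IPM in bits 0-3/4-7/8-11) and that both values are printed in
hex.  The function names are hypothetical, the labels for bits that do
not appear in the log above are best-effort, and treating PHYRdyChg and
DevExch as the hotplug-relevant bits is an assumption of this sketch,
not something taken from sata_nv.

```python
# SError diagnostic bits (assumed SATA-spec positions).
SERR_DIAG_BITS = {
    16: "PHYRdyChg", 17: "PHYInt", 18: "CommWake", 19: "10B8B",
    20: "Dispar", 21: "BadCRC", 22: "Handshk", 23: "LinkSeq",
    24: "TrStaTrns", 25: "UnrecFIS", 26: "DevExch",
}

# Assumption: these two bits are the ones that hint at a hotplug event.
HOTPLUG_HINTS = {"PHYRdyChg", "DevExch"}


def decode_serror(serr):
    """Return the names of the diagnostic bits set in an SError value."""
    return [name for bit, name in sorted(SERR_DIAG_BITS.items())
            if serr & (1 << bit)]


def decode_sstatus(sstatus):
    """Split an SStatus value into its DET, SPD and IPM fields."""
    return {
        "det": sstatus & 0xF,          # device detection
        "spd": (sstatus >> 4) & 0xF,   # negotiated speed (1 = 1.5 Gbps)
        "ipm": (sstatus >> 8) & 0xF,   # interface power management state
    }


def link_up(sstatus):
    """DET == 3 means a device is present and PHY communication is established."""
    return decode_sstatus(sstatus)["det"] == 3


def looks_like_hotplug(serr):
    """True if any of the assumed hotplug-hint bits is set."""
    return bool(HOTPLUG_HINTS & set(decode_serror(serr)))
```

With the values from the log, decode_serror(0x1910000) gives PHYRdyChg,
Dispar, LinkSeq and TrStaTrns, and link_up() is false for SStatus 0 and
true for SStatus 113 (read as hex).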

Thanks.

-- 
tejun