On Wed, Feb 19, 2003 at 10:20:12AM +1100, Bruce Evans wrote:
> On Tue, 18 Feb 2003, Ruslan Ermilov wrote:
>
> > On Fri, Feb 14, 2003 at 05:10:40AM -0800, Alfred Perlstein wrote:
> > > alfred      2003/02/14 05:10:40 PST
> > >
> > >   Modified files:
> > >     sys/kern             kern_intr.c
> > >     sys/dev/ata          ata-all.c
> > >   Log:
> > >   Fix crash dumps on ata and scsi.
> > >
> > [...]
> > >   To fix ata, use what appears to be a polling method if we're dumping,
> > >   I stole this from tmm but added code to ensure that this change is
> > >   only in effect while dumping.
> > >
> > >   Tested by: des
> > >
> > FWIW, if I propagate this change to the !dumping case, it also
> > fixes the ``resume stucks in "ata1: resetting devices .."'' bug
> > I was having with my ThinkPad 600X:
> >
> > %%%
> > Index: ata-all.c
> > ===================================================================
> > RCS file: /home/ncvs/src/sys/dev/ata/ata-all.c,v
> > retrieving revision 1.165
> > diff -u -p -r1.165 ata-all.c
> > --- ata-all.c   14 Feb 2003 13:10:40 -0000      1.165
> > +++ ata-all.c   18 Feb 2003 10:08:22 -0000
> > @@ -486,8 +486,7 @@ ata_getparam(struct ata_device *atadev,
> >
> >      /* apparently some devices needs this repeated */
> >      do {
> > -       if (ata_command(atadev, command, 0, 0, 0,
> > -               dumping ? ATA_WAIT_READY : ATA_WAIT_INTR)) {
> > +       if (ata_command(atadev, command, 0, 0, 0, ATA_WAIT_READY)) {
> >             ata_prtdev(atadev, "%s identify failed\n",
> >                        command == ATA_C_ATAPI_IDENTIFY ? "ATAPI" : "ATA");
> >             free(ata_parm, M_ATA);
> > %%%
>
> There is, or was, something near here that made the whole system go
> unresponsive (as seen by nfs clients) for several seconds.  I guess
> the main problem was just using polled mode in all cases here.  In
> RELENG_4, polling is done at splbio() so normally only disk devices
> are blocked, but under -current almost everything is blocked by Giant.
>
The symptoms were as follows.  The console is blocked, and if I type
something, I don't see it until I drop into DDB; only then is what
I typed displayed.

> > The resume session (with apm(4)) now looks like this:
> >
> > : cbb0: PCI Memory allocated: 50103000
> > : cbb1: PCI Memory allocated: 50102000
> > : pcm0: detached
> > : csa: card is Thinkpad 600X/A20/T20
> > : pcm0: <CS461x PCM Audio> on csa0
> > : pcm0: <Cirrus Logic CS4297A ac97 codec>
> > : wakeup from sleeping state (slept 00:00:10)
> > : ata0: resetting devices ..
> > : done
> > : ata1: resetting devices ..
> > : ata1-slave: timeout waiting for cmd=ec s=01 e=24
> > : ata1-slave: ATA identify failed
> > : done
>
> Apparently the timeout is too short or the interrupt got lost.  The
> timeout seems to be too short.  It is 10 seconds, but IIRC the spec
> says 30 seconds for reset of the master and a bit more for the
> slave.  Since things work with polling, we know that the device state
> changed properly.  We could test for this state change instead of
> always aborting after the timeout, and do finer grained and more sleeps
> to determine the precise timeout required.
>
I recall seeing the ``stray irq 15'' too, so yes, a lost interrupt
may well be the case here.  I will try bumping up the ATA_WAIT_INTR
timeout later today and let you know the results.

Cheers,
-- 
Ruslan Ermilov          Sysadmin and DBA,
[EMAIL PROTECTED]       Sunbay Software AG,
[EMAIL PROTECTED]       FreeBSD committer,
+380.652.512.251        Simferopol, Ukraine

http://www.FreeBSD.org  The Power To Serve
http://www.oracle.com   Enabling The Information Age
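
For reference, the suggestion quoted above (test for the device state change in finer-grained sleeps rather than aborting at one fixed timeout, and note how long the device really took) could be sketched roughly as below.  This is a userland simulation only, not code from the ata(4) driver: struct sim_dev, sim_status(), sim_sleep() and wait_ready() are hypothetical names invented for illustration.  The bit values are the standard ATA status register bits, and the 10 second and 30 second figures used in the note afterwards are the ones mentioned in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Standard ATA status register bits. */
#define ST_BUSY   0x80
#define ST_READY  0x40
#define ST_ERROR  0x01

/* Hypothetical simulated device: busy until ready_at_ms, then ready. */
struct sim_dev {
    int now_ms;        /* simulated clock, in milliseconds */
    int ready_at_ms;   /* simulated time at which BSY clears */
};

/* Stand-in for reading the status register of a real device. */
static uint8_t sim_status(const struct sim_dev *d)
{
    return (d->now_ms >= d->ready_at_ms) ? ST_READY : ST_BUSY;
}

/* Stand-in for a short sleep; it only advances the simulated clock. */
static void sim_sleep(struct sim_dev *d, int ms)
{
    d->now_ms += ms;
}

/*
 * Check the device state every step_ms until it is ready or until
 * timeout_ms has passed.  Returns the number of milliseconds waited
 * on success, or -1 on timeout or device error.
 */
static int wait_ready(struct sim_dev *d, int timeout_ms, int step_ms)
{
    int start = d->now_ms;

    assert(step_ms > 0);
    for (;;) {
        uint8_t st = sim_status(d);
        int waited = d->now_ms - start;

        if (!(st & ST_BUSY)) {
            if (st & ST_ERROR)
                return -1;          /* device reported an error */
            if (st & ST_READY)
                return waited;      /* state changed: report how long */
        }
        if (waited >= timeout_ms)
            return -1;              /* gave up */
        sim_sleep(d, step_ms);
    }
}
```

With a simulated device that needs 12 seconds, wait_ready(&dev, 10000, 10) gives up, while wait_ready(&dev, 31000, 10) succeeds and returns the time actually needed, which is the kind of measurement one would want before picking a new timeout.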