James wrote:
> Juergen Keil wrote:
> >
> > Can anyone please provide a few more details on this bug:
> >
> > Bug ID: 6398361
> > Synopsis: Solaris root disk corrupted by suspend/resumed
> > http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6398361
> >
> >
> > Public description text is just "Please see comments."
> >
> > Was this a software corruption? That is, the root filesystem became
> > corrupted?
> > Or was it a hardware corruption / defect? That is, something like
> > "the ATA HDD became unusable, after using the new ata/dadk/cmdk code
> > that implements ata power management"?
>
> Hi Juergen,
> there are two other bugs referred to in the report:
>
> 6397774 marrakesh often hangs in suspend-state 33
> 6455736 ata/dadk/cmdk should support DDI_SUSPEND/DDI_RESUME
>
>
> and the evaluation indicates that 6398361 was tentatively
> assumed to have been resolved along the way with 6455736.
>
> This is an excerpt from the comments:
>
> My guess is that this corruption is due to a disk driver problem
> (also the likely cause of 6397774). If the driver doesn't properly
> suspend, or disk activity is attempted _after_ the driver is suspended,
> I believe that corruption can occur.
>
>
>
> > I'm asking because a "Western Digital WD1000BB" ATA HDD appears to have died
> > after the box was bfu'ed to current onnv-gate bits, which contain
> > the changes for "6455736 ata/dadk/cmdk should support
> > DDI_SUSPEND/DDI_RESUME" as part of the "PSARC/2005/469 X86 Energy Star
> > compliance" putback.
> > (The "Western Digital WD1000BB" ATA HDD has been frequently spinning down,
> > and the BIOS doesn't detect the HDD any more)
> > And a replacement HDD is making similar strange noises (HDD stops and
> > restarts) during S-x86 kernel boot time, which I've now traced to the
> > ATC_STANDBY_IM 0xe0 command in the ata_power() entry point.
>
> If you bfu to the night before Randy's putback does the issue
> go away? Can you bfu to the night before?

I haven't tried to bfu backwards, but the issue goes away if I change
the ata driver to use the ata command "IDLE IMMEDIATE" 0xe1 instead of
"STANDBY IMMEDIATE" 0xe0 when ata is told to enter fully powered up mode
by an ata_power(PM_LEVEL_D0) call.

I think I'm going to file the following bug:
==============================================================================
Software: snv_75, bfu'ed to onnv-gate ~ 2007-10-23
HW: ATA HDD: "IBM DTLA 307030"

When an x86 machine with ata hdds boots, I hear that around the time
the ata hdds attach, the disk spins down, the heads are parked,
and immediately afterwards the disk is spun up again.

That didn't happen before the changes for "PSARC/2005/469 X86 Energy Star
compliance" were put back.

The new behaviour:
- slows down boot (by a few seconds)
- causes wear on the hdd hardware

What I have found so far is that when ata_power(PM_LEVEL_D0) is called
to fully power up the ata disk, an ATC_STANDBY_IM 0xe0 command is sent to
the ata device. The hdd will immediately enter standby mode.

This seems counterintuitive. Why do we force the disk into standby mode
when we're supposed to power the disk up? I would expect the
driver to try to get the disk into "active" mode.

Reading the T13 power management specs, an ata device can be in one
of these power states (power consumption decreases from "active" to "sleep"):
- active
- idle
- standby
- sleep
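
For reference, here is a rough sketch of how those four modes line up with
the ATA commands that enter them directly. The enum, macro, and function
names are made up for this illustration and are not taken from the Solaris
ata driver headers; only the opcode values and the CHECK POWER MODE result
codes are meant to reflect the ATA command set.

```c
#include <assert.h>
#include <stdint.h>

/* ATA power modes, ordered from highest to lowest power consumption. */
enum ata_power_mode {
	ATA_MODE_ACTIVE,
	ATA_MODE_IDLE,
	ATA_MODE_STANDBY,
	ATA_MODE_SLEEP
};

/* ATA opcodes (macro names are illustrative, not the driver's). */
#define	CMD_STANDBY_IMMEDIATE	0xe0
#define	CMD_IDLE_IMMEDIATE	0xe1
#define	CMD_CHECK_POWER_MODE	0xe5
#define	CMD_SLEEP		0xe6

/*
 * Return the opcode that puts the drive straight into the given mode.
 * There is no such command for "active"; the drive only gets there
 * through a command that needs media access, so 0 is returned.
 */
static uint8_t
ata_cmd_for_mode(enum ata_power_mode mode)
{
	switch (mode) {
	case ATA_MODE_IDLE:
		return (CMD_IDLE_IMMEDIATE);
	case ATA_MODE_STANDBY:
		return (CMD_STANDBY_IMMEDIATE);
	case ATA_MODE_SLEEP:
		return (CMD_SLEEP);
	default:
		assert(mode == ATA_MODE_ACTIVE);
		return (0);
	}
}

/*
 * Decode the sector count value returned by CHECK POWER MODE:
 * 0x00 = standby, 0x80 = idle, 0xff = active or idle.
 * Returns 0 on success, -1 for a value this sketch does not know.
 */
static int
ata_decode_power_mode(uint8_t sector_count, enum ata_power_mode *mode)
{
	switch (sector_count) {
	case 0x00:
		*mode = ATA_MODE_STANDBY;
		return (0);
	case 0x80:
		*mode = ATA_MODE_IDLE;
		return (0);
	case 0xff:
		*mode = ATA_MODE_ACTIVE;	/* drive reports "active or idle" */
		return (0);
	default:
		return (-1);
	}
}
```

Issuing CHECK POWER MODE right after ata_power() returns would be one way
to confirm which mode the drive actually ended up in.
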
Shouldn't ata_power(PM_LEVEL_D0) use ATC_IDLE_IMMED 0xe1 to at least bring
the device up to "idle" mode (given that there's no way to change the power
mode to "active" using an ata command, other than sending a command forcing a
media access)?
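
To make the difference concrete, here is a small stand-alone model of the
two choices. It is not driver code: every sk_ / SK_ name is hypothetical,
the drive is reduced to three modes, and the command used for D3 is only a
placeholder, since the D3 branch is not what is being questioned here.

```c
#include <stdint.h>

#define	SK_STANDBY_IMMEDIATE	0xe0	/* same value as ATC_STANDBY_IM */
#define	SK_IDLE_IMMEDIATE	0xe1	/* same value as ATC_IDLE_IMMED */

/* Stand-ins for PM_LEVEL_D0 / PM_LEVEL_D3. */
enum sk_level { SK_LEVEL_D0, SK_LEVEL_D3 };

/* Simplified drive power modes. */
enum sk_mode { SK_ACTIVE, SK_IDLE, SK_STANDBY };

/* Model how the drive changes mode when it receives a command. */
static enum sk_mode
sk_drive_apply(enum sk_mode cur, uint8_t cmd)
{
	switch (cmd) {
	case SK_IDLE_IMMEDIATE:
		return (SK_IDLE);
	case SK_STANDBY_IMMEDIATE:
		return (SK_STANDBY);
	default:
		return (cur);	/* other commands are ignored in this model */
	}
}

/*
 * Pick the command for a power level. With proposed == 0 this mirrors
 * the behaviour described above (STANDBY IMMEDIATE on D0); with
 * proposed != 0 it uses IDLE IMMEDIATE on D0 instead.
 */
static uint8_t
sk_cmd_for_level(enum sk_level level, int proposed)
{
	if (level == SK_LEVEL_D0)
		return (proposed ? SK_IDLE_IMMEDIATE : SK_STANDBY_IMMEDIATE);
	return (SK_STANDBY_IMMEDIATE);	/* placeholder assumption for D3 */
}

/* The platters spin in active and idle mode, but not in standby. */
static int
sk_is_spinning(enum sk_mode mode)
{
	return (mode != SK_STANDBY);
}
```

In this model, a drive that is spinning at attach time stops spinning after
a D0 request with the current command choice, and keeps spinning with the
proposed one, which matches what I hear during boot.
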
Suggested fix:
==============
diff -r 33cb98223b2d usr/src/uts/intel/io/dktp/controller/ata/ata_common.c
--- a/usr/src/uts/intel/io/dktp/controller/ata/ata_common.c Wed Oct 24 20:00:39 2007 -0700
+++ b/usr/src/uts/intel/io/dktp/controller/ata/ata_common.c Fri Oct 26 15:11:23 2007 +0200
@@ -3607,7 +3610,7 @@ ata_power(dev_info_t *dip, int component
 		if (ata_save_pci_config)
 			(void) pci_restore_config_regs(dip);
 		ata_ctlp->ac_pm_level = PM_LEVEL_D0;
-		cmd = ATC_STANDBY_IM;
+		cmd = ATC_IDLE_IMMED;
 		break;
 	case PM_LEVEL_D3:
 		if (ata_save_pci_config)