It looks like the IntelliPark feature on a Western Digital Caviar Green
HDD can cause issues with OpenBSD, which can be fixed/mitigated by
disabling IntelliPark.

About 6 months ago, I built myself a new amd64 machine.  I decided to
optimize for low wattage--reducing power costs and waste heat,
increasing UPS runtime--and so I chose a single Western Digital Caviar
Green HDD.  Although these drives are intended/marketed for something
more like nearline storage, according to bonnie++, the drive performed
roughly as well as the 7200RPM PATA-100 2-drive mirror in my old
machine.

The machine I built, initially running 4.7/amd64, then 4.8/amd64 (both
unmodified -RELEASE), was never stable for more than a couple of days at
a time.  The machine would freeze hard, sometimes with the HDD light lit
solid, usually not.  I worked around a number of bugs, trying a patched
kernel with http://marc.info/?l=openbsd-misc&m=128897915014154&w=2, and
installing an fxp(4) so I could disable the onboard re(4).  I
wrote scripts to monitor hw.sensors, SMART, and various stats from
systat(1), and graph them using rrdtool.  What I noticed was that my
machine would generally crash right before an IO-intensive cronjob
started.

I also noticed that SMART stat 193 (Load/Unload Cycle Count) was very
high, and climbing rapidly.  Doing some research on this stat, I found
out that WD Caviar Green drives have a feature called IntelliPark that
parks the HDD heads after 8 seconds of inactivity.  This is supposed to
make the HDD more efficient, but has been reported not to play well with
Linux, and WD provides a workaround: the WDIDLE3 utility, which would
allow me to change/disable the IntelliPark 8-second timeout.  I ran
WDIDLE3 on my WD Caviar Green HDD, setting the timeout to the maximum
allowed (300 seconds).  I have a monitoring process running that writes
to disk roughly every 60 seconds, so IntelliPark is effectively disabled
for me.  As of now, the system has been up a record 19.5 days without
issue.
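
For anyone wanting to check their own drive, here is a rough sketch of
how the climb in attribute 193 could be tracked.  This is illustrative
Python, not the monitoring scripts mentioned above.  It assumes the
usual smartmontools `smartctl -A` attribute table, where the raw value
is the tenth whitespace-separated field; the function names and the
sample line are made up for the example.

```python
SECONDS_PER_DAY = 86400


def parse_load_cycle_count(smartctl_output):
    """Return the raw value of SMART attribute 193, or None if absent.

    Assumes rows shaped like the `smartctl -A` table:
    ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    """
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with the numeric ID; raw value is field 10.
        if len(fields) >= 10 and fields[0] == "193":
            return int(fields[9])
    return None


def cycles_per_hour(count_then, count_now, seconds_elapsed):
    """Average load/unload cycles per hour between two samples."""
    return (count_now - count_then) * 3600.0 / seconds_elapsed


def max_parks_per_day(idle_timeout_seconds):
    """Loose upper bound: at most one park per idle-timeout period."""
    return SECONDS_PER_DAY // idle_timeout_seconds


def parks_between_writes(write_interval_seconds, idle_timeout_seconds):
    """True if the heads can park before the next periodic write."""
    return write_interval_seconds >= idle_timeout_seconds


# Hypothetical sample row, not taken from my drive.
SAMPLE = ("193 Load_Cycle_Count        0x0032   180   180   000"
          "    Old_age   Always       -       61234")

if __name__ == "__main__":
    print(parse_load_cycle_count(SAMPLE))
    print(max_parks_per_day(8), max_parks_per_day(300))
    print(parks_between_writes(60, 8), parks_between_writes(60, 300))
```

The last two helpers just restate the arithmetic: an 8-second timeout
allows up to 10800 parks a day, a 300-second timeout at most 288, and a
write every 60 seconds never lets a 300-second timer expire.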

Disabling IntelliPark fixed the major freeze issue I was having.  I
don't know exactly what was going on, but it seems like the drive would
get stuck in a state in which the head reload had failed, or had not
completed within a certain timespan, and the OS and the drive controller
would become deadlocked.  Attempting to reproduce the problem is
painful, both in terms of how long it can take to cause a freeze and in
terms of the wear it puts on the drive.  I'm not sure if I should file
this as a PR, or
consider this a design flaw in the drive (or a consequence of
"off-label" use) and just be content with the fix/workaround that I've
found.

If anyone has any recommendations, or any experiences with the Caviar
Green drives, I'd like to hear them.
