Hi Dave,
I'm unclear about the autoreplace behavior with one spare that is
connected to two pools. I don't see how it could work if the autoreplace
property is enabled on both pools, since autoreplace formats and swaps
in a spare disk that might already be in use in the other pool.
Maybe I misunderstand.
1. I think autoreplace behavior might be inconsistent when a device is
removed. CR 6935332 was filed recently but is not available yet through
our public bug database.
2. The current issue with adding a spare disk to a ZFS root pool is that
if a root pool mirror disk fails and the spare kicks in, the bootblock
is not applied automatically. We're working on improving this
experience.
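Until that improves, the bootblock can be applied to the spare by hand
after it kicks in. A sketch, assuming the spare's root slice is
c0t4d0s0 as in the configuration below:

```shell
# SPARC: install the ZFS bootblock on the spare's root slice
installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk \
    /dev/rdsk/c0t4d0s0

# x86: install GRUB stage1/stage2 on the spare instead
installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c0t4d0s0
```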
My advice would be to create a 3-way mirrored root pool until we have a
better solution for root pool spares.
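A third side can be attached to the existing root mirror with
zpool attach. A sketch using the device names from the thread below;
note the new disk still needs a bootblock:

```shell
# attach c0t4d0s0 as a third side of the existing root mirror
# (c0t1d0s0 is an existing member; wait for the resilver to finish)
zpool attach rpool c0t1d0s0 c0t4d0s0

# then install the bootblock on the new disk (SPARC shown)
installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk \
    /dev/rdsk/c0t4d0s0
```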
3. For simplicity and ease of recovery, consider using your disks as
whole disks, even though you must use slices for the root pool.
If one disk is part of two pools and it fails, two pools are impacted.
The beauty of ZFS is no longer having to deal with slice administration,
except for the root pool.
I like your mirror pool configurations, but I would simplify them by
converting store1 to whole disks and keeping separate spare disks:
One for the store1 pool, and either create a 3-way mirrored root pool
or keep a spare disk connected to the system but unconfigured.
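One hypothetical end state for that suggestion, assuming store1 can be
rebuilt from backup onto its own dedicated disks (the device names here
are illustrative only, not from the configuration below):

```shell
# recreate store1 on whole disks with its own spare;
# ZFS labels whole disks itself, so no slice administration is needed
zpool create store1 \
    mirror c1t0d0 c1t1d0 \
    mirror c1t2d0 c1t3d0 \
    spare  c1t4d0
```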
Thanks,
Cindy
On 03/17/10 10:25, Dave Johnson wrote:
From pages 29, 83, 86, 90, and 284 of the 10/09 Solaris ZFS Administration
guide, it sounds like a disk designated as a hot spare will:
1. Automatically take the place of a bad drive when needed
2. Automatically be detached back to the spare pool when a new
device is inserted and brought up to replace the original
compromised one.
Should this work the same way for slices?
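One thing worth checking first: the autoreplace property defaults to
off, and it governs automatic replacement when a new disk appears in
the failed disk's physical slot; spare activation itself is handled by
the fault-management agent. A sketch of checking and enabling it for
the pools in this thread:

```shell
# autoreplace defaults to off; enable it per pool if a newly
# inserted disk should be formatted and swapped in automatically
zpool set autoreplace=on rpool
zpool set autoreplace=on store1

# confirm the setting on both pools
zpool get autoreplace rpool store1
```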
I have four active disks in a RAID 10 configuration
for a storage pool, and the same disks are used
for mirrored root configurations, but only
one of the possible mirrored root slice
pairs is currently active.
I wanted to designate slices on a 5th disk as
hot spares for the two existing pools, so
after partitioning the 5th disk (#4) identical
to the four existing disks, I ran:
# zpool add rpool spare c0t4d0s0
# zpool add store1 spare c0t4d0s7
# zpool status
pool: rpool
state: ONLINE
scrub: none requested
config:
        NAME          STATE     READ WRITE CKSUM
        rpool         ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c0t0d0s0  ONLINE       0     0     0
            c0t1d0s0  ONLINE       0     0     0
        spares
          c0t4d0s0    AVAIL
errors: No known data errors
pool: store1
state: ONLINE
scrub: none requested
config:
        NAME          STATE     READ WRITE CKSUM
        store1        ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c0t0d0s7  ONLINE       0     0     0
            c0t1d0s7  ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c0t2d0s7  ONLINE       0     0     0
            c0t3d0s7  ONLINE       0     0     0
        spares
          c0t4d0s7    AVAIL
errors: No known data errors
--
So it looked like everything was set up how I was
hoping until I emulated a disk failure by pulling
one of the online disks. The root pool responded
how I expected, but the storage pool, on slice 7,
did not appear to perform the autoreplace:
Not too long after pulling one of the online disks:
# zpool status
pool: rpool
state: DEGRADED
status: One or more devices has experienced an unrecoverable error. An
attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
see: http://www.sun.com/msg/ZFS-8000-9P
scrub: resilver in progress for 0h0m, 10.02% done, 0h5m to go
config:
        NAME            STATE     READ WRITE CKSUM
        rpool           DEGRADED     0     0     0
          mirror        DEGRADED     0     0     0
            c0t0d0s0    ONLINE       0     0     0
            spare       DEGRADED    84     0     0
              c0t1d0s0  REMOVED      0     0     0
              c0t4d0s0  ONLINE       0     0    84  329M resilvered
        spares
          c0t4d0s0      INUSE     currently in use
errors: No known data errors
pool: store1
state: ONLINE
scrub: none requested
config:
        NAME          STATE     READ WRITE CKSUM
        store1        ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c0t0d0s7  ONLINE       0     0     0
            c0t1d0s7  ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c0t2d0s7  ONLINE       0     0     0
            c0t3d0s7  ONLINE       0     0     0
        spares
          c0t4d0s7    AVAIL
errors: No known data errors
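If the spare does not kick in on its own, it can be activated by hand.
A sketch using the device names from the status output above:

```shell
# bring the spare in manually for the removed store1 disk
zpool replace store1 c0t1d0s7 c0t4d0s7

# after the original disk is replaced and resilvered,
# detach the spare to return it to the AVAIL state
zpool detach store1 c0t4d0s7
```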