A couple of days after I updated to oi_151a5, the first of my two boot
drives stopped being able to do I/O and zfs removed it from the pool.
What makes this interesting is that I only learned about it after
seeing someone post on the list that the first of his two boot drives
was removed from the pool not long after updating to oi_151a5. That
prompted me to run 'zpool status rpool', and my pool was in the same
condition as his. He later posted that after downgrading to oi_151a4
the OS could see the drive and do I/O with it.

This evening I replaced the failed drive with a completely different
one. The OS is able to query the drive info but is still completely
unable to perform I/O on it.

# zpool status rpool
  pool: rpool
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
        repaired.
  scan: scrub repaired 0 in 0h6m with 0 errors on Tue Jul 10 20:45:50 2012
config:

        NAME          STATE     READ WRITE CKSUM
        rpool         DEGRADED     0     0     0
          mirror-0    DEGRADED     0     0     0
            c3t0d0s0  FAULTED      0     0     0  too many errors
            c3t1d0s0  ONLINE       0     0     0

errors: No known data errors
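
For anyone who wants to check their own pool quickly, here is a minimal
sketch that picks the non-ONLINE rows out of 'zpool status' output. The
function name unhealthy_vdevs is my own invention, not part of ZFS, and
the sketch assumes the config table has the layout shown above.

```shell
# unhealthy_vdevs (hypothetical helper): read 'zpool status' output on stdin
# and print the name and state of every config row that is not ONLINE.
unhealthy_vdevs() {
    awk '
        # The header row marks the start of the config table.
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }
        # The table ends at the "errors:" line.
        /^errors:/ { in_config = 0 }
        # Table rows have at least NAME STATE READ WRITE CKSUM.
        in_config && NF >= 5 && $2 != "ONLINE" { print $1, $2 }
    '
}

# Assumed usage on a live system:
#   zpool status rpool | unhealthy_vdevs
```

Run against the output above, this would list rpool, mirror-0, and
c3t0d0s0, since the degraded state propagates up from the faulted slice.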
# dd if=/dev/rdsk/c3t0d0s0 of=/dev/null bs=64k count=1024
dd: opening `/dev/rdsk/c3t0d0s0': I/O error
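
The same raw read can be repeated against both halves of the mirror to
confirm that only one side fails. This is only a sketch: read_check is
a made-up helper name, and the device paths in the usage comment are
the two slices from the zpool output above.

```shell
# read_check (hypothetical helper): read up to 64 MiB from the given path
# with dd and report whether the read succeeded.
read_check() {
    if dd if="$1" of=/dev/null bs=64k count=1024 2>/dev/null; then
        echo "$1: read ok"
    else
        echo "$1: read FAILED"
    fi
}

# Assumed usage:
#   for dev in /dev/rdsk/c3t0d0s0 /dev/rdsk/c3t1d0s0; do read_check "$dev"; done
```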

Both drives still show up in the disk list from format:

AVAILABLE DISK SELECTIONS:
       0. c3t0d0 <ATA-ST1000NM0011-SN02 cyl 60798 alt 2 hd 255 sec 126>
          /pci@0,0/pci15d9,62c@1f,2/disk@0,0
       1. c3t1d0 <ATA-WDCWD5003ABYX-0-1S02 cyl 60798 alt 2 hd 255 sec 63>
          /pci@0,0/pci15d9,62c@1f,2/disk@1,0

Iostat does not show any errors logged against my new drive:

# iostat -xe
                            extended device statistics       ---- errors ---
device    r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b s/w h/w trn tot
sd1       0.2    0.0    3.6    0.0  0.0  0.0    0.1   0   0   0   0   0   0
sd2       3.9    2.4  122.5   23.2  0.0  0.0    6.4   1   1   0   0   0   0

# iostat -E
sd1 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA Product: ST1000NM0011 Revision: SN02 Serial No: Z1N21SQN
Size: 1000.20GB <1000204886016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 108 Predictive Failure Analysis: 0
sd2 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA Product: WDC WD5003ABYX-0 Revision: 1S02 Serial No: WD-WMAYP3661514
Size: 500.11GB <500107862016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 33 Predictive Failure Analysis: 0
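
To compare the counters across drives at a glance, the 'iostat -E'
output can be condensed to one line per device. This is a minimal
sketch that assumes the field layout shown above; the function name
iostat_error_summary is hypothetical.

```shell
# iostat_error_summary (hypothetical helper): read 'iostat -E' output on
# stdin and print one summary line per device.
iostat_error_summary() {
    awk '
        # First line of each device record: name plus the three error totals.
        $2 == "Soft" && $3 == "Errors:" {
            dev = $1; soft = $4; hard = $7; trn = $10
        }
        # Last line of each record carries the Illegal Request count.
        $1 == "Illegal" && $2 == "Request:" {
            print dev, "soft=" soft, "hard=" hard, "transport=" trn, "illegal_request=" $3
        }
    '
}

# Assumed usage:
#   iostat -E | iostat_error_summary
```

For the output above, the only nonzero values are the Illegal Request
counts (108 for sd1 and 33 for sd2).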
# cfgadm -f -c configure sata0/0::dsk/c3t0d0
cfgadm: Library error: Cannot determine sata port number for ap_id:
/devices/pci@0,0/pci15d9,62c@1f,2:0::dsk/c3t0d0

The above seems really strange since it sounds like the OS has become
confused about the device.

Is there a known kernel configuration or driver issue which might
cause the OS to forget how to do I/O with SATA drives, and
particularly the first boot drive?

Bob
--
Bob Friesenhahn
[email protected], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/