I once ran into a very severe AHCI timeout problem. After months of trying to
figure it out, and insane Hardware_ECC_Recovered values, I found that
the fault was in the power connector plug / SATA interface of the HDD. All errors
disappeared after replacing that cable. Since you have errors on more than one
HDD, I suggest:
1. Check the smartctl output for each and every HDD.
2. Check whether your power supply unit is still healthy or whether it is
supplying inconsistent power.
3. Check the mains power line for voltage fluctuations, and whether a new
heavy consumer of amps is on the same power line that the server is
plugged into.
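
For step 1, a minimal sketch of how the attribute table printed by
`smartctl -A` could be filtered for a few counters worth comparing across
disks. The function name, the set of watched attributes, and the sample
table are assumptions made up for illustration; they are not taken from any
of the machines discussed here.

```python
# Attributes that can hint at cabling or power trouble. This selection is
# an illustrative assumption, not an exhaustive or authoritative list.
WATCHED = {
    "Hardware_ECC_Recovered",
    "UDMA_CRC_Error_Count",
    "Reallocated_Sector_Ct",
}


def watched_attributes(smartctl_output, watched=WATCHED):
    """Return {attribute_name: raw_value} for watched rows of `smartctl -A` text."""
    found = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the header row starts with "ID#" and is skipped by this check.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name, raw = fields[1], fields[9]
        if name in watched and raw.isdigit():
            found[name] = int(raw)
    return found


# Made-up sample in the usual `smartctl -A` column layout.
SMART_SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
195 Hardware_ECC_Recovered  0x001a   100   100   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x003e   100   100   000    Old_age   Always       -       12
"""

if __name__ == "__main__":
    print(watched_attributes(SMART_SAMPLE))
```

Feeding it the output of `smartctl -A /dev/adaN` for every disk makes it
easier to spot one drive (or one cable) whose counters stand out.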
I deliberately chose a different server, one with a different chipset and
no previous problems with its HDDs.
I added AHCI support to the kernel:
device ahci # AHCI-compatible SATA controllers
And now, after 2.5 days, one HDD has dropped out.
[3:14]beastie:root-/root# zpool status
  pool: tank
 state: DEGRADED
status: One or more devices has been removed by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: none requested
config:

        NAME                      STATE     READ WRITE CKSUM
        tank                      DEGRADED     0     0     0
          mirror-0                ONLINE       0     0     0
            gpt/disk0             ONLINE       0     0     0
            gpt/disk2             ONLINE       0     0     0
          mirror-1                DEGRADED     0     0     0
            gpt/disk1             ONLINE       0     0     0
            4931885954389536913   REMOVED      0     0     0  was /dev/gpt/disk3

errors: No known data errors
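
For watching several pools or machines, here is a rough sketch of how the
config table above could be scanned for anything that is not ONLINE. The
function name and the list of state keywords are my own assumptions; the
sample text is a shortened copy of the output above.

```python
# State keywords this sketch recognises; assumed for illustration and
# possibly incomplete.
KNOWN_STATES = {"ONLINE", "DEGRADED", "FAULTED", "OFFLINE", "REMOVED", "UNAVAIL"}


def unhealthy_devices(zpool_status):
    """Return (name, state) pairs for config rows whose state is not ONLINE."""
    bad = []
    in_config = False
    for line in zpool_status.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "config:":
            in_config = True
            continue
        if fields[0] == "errors:":
            in_config = False
        if not in_config or len(fields) < 2:
            continue
        name, state = fields[0], fields[1]
        # The "NAME STATE ..." header row is skipped because "STATE" is not
        # a known state keyword.
        if state in KNOWN_STATES and state != "ONLINE":
            bad.append((name, state))
    return bad


ZPOOL_SAMPLE = """\
  pool: tank
 state: DEGRADED
config:

        NAME                      STATE     READ WRITE CKSUM
        tank                      DEGRADED     0     0     0
          mirror-0                ONLINE       0     0     0
            gpt/disk0             ONLINE       0     0     0
            gpt/disk2             ONLINE       0     0     0
          mirror-1                DEGRADED     0     0     0
            gpt/disk1             ONLINE       0     0     0
            4931885954389536913   REMOVED      0     0     0  was /dev/gpt/disk3

errors: No known data errors
"""

if __name__ == "__main__":
    for name, state in unhealthy_devices(ZPOOL_SAMPLE):
        print(name, state)
```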
Jan 30 09:49:28 beastie kernel: ahcich3: Timeout on slot 29 port 0
Jan 30 09:49:28 beastie kernel: ahcich3: is cs 2000 ss rs 2000 tfd c0 serr cmd 0004dd17
Jan 30 09:49:28 beastie kernel: (ada3:ahcich3:0:0:0): FLUSHCACHE48. ACB: ea 00 00 00 00 40 00 00 00 00 00 00
Jan 30 09:49:28 beastie kernel: (ada3:ahcich3:0:0:0): CAM status: Command timeout
Jan 30 09:49:28 beastie kernel: (ada3:ahcich3:0:0:0): Retrying command
Jan 30 09:51:31 beastie kernel: ahcich3: AHCI reset: device not ready after 31000ms (tfd = 0080)
Jan 30 09:51:31 beastie kernel: ahcich3: Timeout on slot 29 port 0
Jan 30 09:51:31 beastie kernel: ahcich3: is cs 2000 ss rs 2000 tfd 80 serr cmd 0004dd17
Jan 30 09:51:31 beastie kernel: (aprobe0:ahcich3:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
Jan 30 09:51:31 beastie kernel: (aprobe0:ahcich3:0:0:0): CAM status: Command timeout
Jan 30 09:51:31 beastie kernel: (aprobe0:ahcich3:0:0:0): Error 5, Retry was blocked
Jan 30 09:51:31 beastie kernel: ahcich3: AHCI reset: device not ready after 31000ms (tfd = 0080)
Jan 30 09:51:31 beastie kernel: ahcich3: Timeout on slot 29 port 0
Jan 30 09:51:31 beastie kernel: ahcich3: is cs ss rs 2000 tfd 58 serr cmd 0004dd17
Jan 30 09:51:31 beastie kernel: (aprobe0:ahcich3:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
Jan 30 09:51:31 beastie kernel: (aprobe0:ahcich3:0:0:0): CAM status: Command timeout
Jan 30 09:51:31 beastie kernel: (aprobe0:ahcich3:0:0:0): Error 5, Retry was blocked
Jan 30 09:51:31 beastie kernel: (ada3:ahcich3:0:0:0): lost device
Jan 30 09:51:31 beastie kernel: (pass3:ahcich3:0:0:0): passdevgonecb: devfs entry is gone
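
To see whether the timeouts stay on one channel or spread across several
(which would point more towards power than towards a single disk or cable),
something like the following could tally them from /var/log/messages. This
is only a sketch: the regular expression is fitted to the log lines above,
and the function name is made up.

```python
import re
from collections import Counter

# Matches lines such as "... kernel: ahcich3: Timeout on slot 29 port 0".
TIMEOUT_RE = re.compile(r"kernel: (ahcich\d+): Timeout on slot \d+")


def timeouts_per_channel(log_text):
    """Count 'Timeout on slot' messages per AHCI channel name."""
    counts = Counter()
    for match in TIMEOUT_RE.finditer(log_text):
        counts[match.group(1)] += 1
    return counts


# Three of the lines from the log above.
LOG_SAMPLE = """\
Jan 30 09:49:28 beastie kernel: ahcich3: Timeout on slot 29 port 0
Jan 30 09:51:31 beastie kernel: ahcich3: Timeout on slot 29 port 0
Jan 30 09:51:31 beastie kernel: ahcich3: Timeout on slot 29 port 0
"""

if __name__ == "__main__":
    for channel, count in timeouts_per_channel(LOG_SAMPLE).items():
        print(channel, count)
```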
--
Vladislav V. Prodan
System Network Administrator
http://support.od.ua
+380 67 4584408, +380 99 4060508
VVP88-RIPE