Hello storage-discuss,
First, I'm aware of Eric Schrock's "Proposal: ZFS hotplug support and
autoconfiguration".
I have presented each physical disk from an EMC CX3-40 as a separate
LUN and then created a RAID-10 pool using ZFS. All devices are under
MPxIO; the system is S10U3 plus patches (x64).
I then physically removed two disks from the array.
Mar 29 12:02:10 XXXXXXX.srv scsi: [ID 243001 kern.info] /scsi_vhci (scsi_vhci0):
Mar 29 12:02:10 XXXXXXX.srv /scsi_vhci/[EMAIL PROTECTED] (sd81): path
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL
PROTECTED],1/[EMAIL PROTECTED],
0 (fp1) target address 5006016841e03566,b is now STANDBY because of an
externally initiated failover
Mar 29 12:02:10 XXXXXXX.srv scsi: [ID 243001 kern.info] /scsi_vhci (scsi_vhci0):
Mar 29 12:02:10 XXXXXXX.srv /scsi_vhci/[EMAIL PROTECTED] (sd81): path
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL PROTECTED]/[EMAIL
PROTECTED],0
(fp0) target address 5006016041e03566,b is now ONLINE because of an externally
initiated failover
Mar 29 12:02:15 XXXXXXX.srv scsi: [ID 243001 kern.info] /scsi_vhci (scsi_vhci0):
Mar 29 12:02:15 XXXXXXX.srv /scsi_vhci/[EMAIL PROTECTED] (sd80): path
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL PROTECTED]/[EMAIL
PROTECTED],0
(fp0) target address 5006016041e03566,c is now STANDBY because of an externally
initiated failover
Mar 29 12:02:15 XXXXXXX.srv scsi: [ID 243001 kern.info] /scsi_vhci (scsi_vhci0):
Mar 29 12:02:15 XXXXXXX.srv /scsi_vhci/[EMAIL PROTECTED] (sd80): path
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL
PROTECTED],1/[EMAIL PROTECTED],
0 (fp1) target address 5006016841e03566,c is now ONLINE because of an
externally initiated failover
Mar 29 12:02:20 XXXXXXX.srv scsi: [ID 243001 kern.info] /scsi_vhci (scsi_vhci0):
Mar 29 12:02:20 XXXXXXX.srv /scsi_vhci/[EMAIL PROTECTED] (sd78): path
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL PROTECTED]/[EMAIL
PROTECTED],0
(fp0) target address 5006016041e03566,e is now STANDBY because of an externally
initiated failover
Mar 29 12:02:20 XXXXXXX.srv scsi: [ID 243001 kern.info] /scsi_vhci (scsi_vhci0):
Mar 29 12:02:20 XXXXXXX.srv /scsi_vhci/[EMAIL PROTECTED] (sd78): path
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL
PROTECTED],1/[EMAIL PROTECTED],
0 (fp1) target address 5006016841e03566,e is now ONLINE because of an
externally initiated failover
Mar 29 12:03:24 XXXXXXX.srv scsi: [ID 243001 kern.info] /scsi_vhci (scsi_vhci0):
Mar 29 12:03:24 XXXXXXX.srv /scsi_vhci/[EMAIL PROTECTED] (sd64): path
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL
PROTECTED],1/[EMAIL PROTECTED],
0 (fp1) target address 5006016841e03566,1c is now STANDBY because of an
externally initiated failover
Mar 29 12:03:29 XXXXXXX.srv scsi: [ID 243001 kern.info] /scsi_vhci (scsi_vhci0):
Mar 29 12:03:29 XXXXXXX.srv Initiating failover for device disk (GUID
6006016062231b00ba53c25a19d9db11)
Mar 29 12:03:31 XXXXXXX.srv scsi: [ID 243001 kern.info] /scsi_vhci (scsi_vhci0):
Mar 29 12:03:31 XXXXXXX.srv Initiating failover for device disk (GUID
6006016062231b00ba53c25a19d9db11)
Mar 29 12:03:33 XXXXXXX.srv scsi: [ID 243001 kern.info] /scsi_vhci (scsi_vhci0):
Mar 29 12:03:33 XXXXXXX.srv Initiating failover for device disk (GUID
6006016062231b00ba53c25a19d9db11)
Mar 29 12:03:35 XXXXXXX.srv scsi: [ID 243001 kern.info] /scsi_vhci (scsi_vhci0):
Mar 29 12:03:35 XXXXXXX.srv Initiating failover for device disk (GUID
6006016062231b00ba53c25a19d9db11)
Mar 29 12:03:36 XXXXXXX.srv scsi: [ID 243001 kern.info] /scsi_vhci (scsi_vhci0):
Mar 29 12:03:36 XXXXXXX.srv Initiating failover for device disk (GUID
6006016062231b00ba53c25a19d9db11)
[...]
The last two lines are constantly repeating (several entries per
second).
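To put a number on the repetition described above, the failover messages can be counted per device GUID. The following is a minimal sketch, assuming the log text has been saved to a file or string; the function names and the threshold of 5 are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Matches the scsi_vhci message quoted above. \s+ tolerates the line wrap
# that mail clients introduce between "GUID" and the identifier.
FAILOVER_RE = re.compile(
    r"Initiating failover for device disk \(GUID\s+([0-9a-fA-F]+)\)"
)


def count_failovers(log_text):
    """Return a Counter mapping device GUID -> number of failover messages."""
    return Counter(m.group(1).lower() for m in FAILOVER_RE.finditer(log_text))


def looping_guids(log_text, threshold=5):
    """Return GUIDs seen at least `threshold` times (threshold is arbitrary)."""
    counts = count_failovers(log_text)
    return sorted(guid for guid, n in counts.items() if n >= threshold)
```

Calling `looping_guids(open("/var/adm/messages").read())` would list the devices that MPxIO keeps retrying; the path is the usual Solaris default and may differ on a given system.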
bash-3.00# iostat -xnz 1|egrep " c6|devic"
[skipping first output]
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0  3.0  0.0    0.0    0.0 100   0 c6t6006016062231B00A07323E419D9DB11d0
    0.0    0.0    0.0    0.0 34.0  0.0    0.0    0.0 100   0 c6t6006016062231B00BA53C25A19D9DB11d0
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0  3.0  0.0    0.0    0.0 100   0 c6t6006016062231B00A07323E419D9DB11d0
    0.0    0.0    0.0    0.0 34.0  0.0    0.0    0.0 100   0 c6t6006016062231B00BA53C25A19D9DB11d0
It has been like that for over 30 minutes (and it still is).
ZFS hasn't noticed either, of course.
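The pattern in the iostat output above (requests sitting in the wait queue, nothing active, no throughput) can be picked out mechanically. This is only a sketch under stated assumptions: the column order is the one shown by `iostat -xnz` above, the "stalled" rule (wait > 0, actv == 0, r/s == w/s == 0) is a heuristic made up for this example, and `stalled_devices` is a hypothetical name.

```python
import re

# Ten numeric columns (r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b)
# followed by a cXtY... device name. \s+ also accepts a device name that
# was wrapped onto the next line.
ROW_RE = re.compile(r"(?<![\w.])((?:\d+(?:\.\d+)?\s+){10})(c\d+t\S+)")


def stalled_devices(iostat_text):
    """Return sorted device names with queued I/O but no activity."""
    stalled = set()
    for match in ROW_RE.finditer(iostat_text):
        fields = [float(value) for value in match.group(1).split()]
        reads, writes, wait, actv = fields[0], fields[1], fields[4], fields[5]
        # Heuristic: something is waiting, nothing is being serviced.
        if wait > 0 and actv == 0 and reads == 0 and writes == 0:
            stalled.add(match.group(2))
    return sorted(stalled)
```

Fed the two samples above, this would report both c6t...d0 devices; a device that is merely busy (non-zero actv or throughput) would not be flagged.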
1. zpool status
bash-3.00# zpool status
pool: f4-1
state: ONLINE
scrub: scrub stopped with 0 errors on Wed Mar 28 12:11:50 2007
^C^C^C^C^C
I can't stop it and I can't get any more output (zpool list and zfs
list still work).
2. MPxIO - it tries to fail the disk over to the second SP, but it
looks like it keeps trying forever (or for a very long time). After
some time it should have generated a disk I/O failure...
3. I guess that in such a case Eric's proposal probably won't help and
the real problem is with MPxIO - right?
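The behaviour expected in point 2, retrying the failover for a bounded time and then returning an I/O error to the caller, can be sketched as follows. This is purely illustrative and is not how scsi_vhci is implemented; `failover_with_deadline`, its parameters, and the default timings are all hypothetical.

```python
import errno
import time


def failover_with_deadline(try_failover, deadline_s=60.0, interval_s=2.0,
                           clock=time.monotonic, sleep=time.sleep):
    """Retry try_failover() until it succeeds or the deadline would pass.

    Returns the number of attempts made on success; raises OSError(EIO)
    once another retry would exceed deadline_s, so the layer above
    (here, ZFS) sees a failed device instead of an I/O that never returns.
    """
    start = clock()
    attempts = 0
    while True:
        attempts += 1
        if try_failover():
            return attempts
        if clock() - start + interval_s > deadline_s:
            raise OSError(errno.EIO,
                          "failover did not complete before the deadline")
        sleep(interval_s)
```

The `clock` and `sleep` parameters exist only so the loop can be exercised without real delays; the point is that the retry loop has an exit that surfaces an error.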
--
Best regards,
Robert mailto:[EMAIL PROTECTED]
http://milek.blogspot.com