Hello storage-discuss,

First: I'm aware of the proposal "ZFS hotplug support and
autoconfiguration" by Eric Schrock.

I have presented each physical disk from an EMC CX3-40 as a LUN and
then created a RAID-10 pool using ZFS. All devices are under MPxIO;
the system is S10U3 + patches (x64).

I then physically removed two disks from the array.

Mar 29 12:02:10 XXXXXXX.srv scsi: [ID 243001 kern.info] /scsi_vhci (scsi_vhci0):
Mar 29 12:02:10 XXXXXXX.srv    /scsi_vhci/[EMAIL PROTECTED] (sd81): path 
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL 
PROTECTED],1/[EMAIL PROTECTED],
0 (fp1) target address 5006016841e03566,b is now STANDBY because of an 
externally initiated failover
Mar 29 12:02:10 XXXXXXX.srv scsi: [ID 243001 kern.info] /scsi_vhci (scsi_vhci0):
Mar 29 12:02:10 XXXXXXX.srv    /scsi_vhci/[EMAIL PROTECTED] (sd81): path 
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0
(fp0) target address 5006016041e03566,b is now ONLINE because of an externally 
initiated failover
Mar 29 12:02:15 XXXXXXX.srv scsi: [ID 243001 kern.info] /scsi_vhci (scsi_vhci0):
Mar 29 12:02:15 XXXXXXX.srv    /scsi_vhci/[EMAIL PROTECTED] (sd80): path 
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0
(fp0) target address 5006016041e03566,c is now STANDBY because of an externally 
initiated failover
Mar 29 12:02:15 XXXXXXX.srv scsi: [ID 243001 kern.info] /scsi_vhci (scsi_vhci0):
Mar 29 12:02:15 XXXXXXX.srv    /scsi_vhci/[EMAIL PROTECTED] (sd80): path 
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL 
PROTECTED],1/[EMAIL PROTECTED],
0 (fp1) target address 5006016841e03566,c is now ONLINE because of an 
externally initiated failover
Mar 29 12:02:20 XXXXXXX.srv scsi: [ID 243001 kern.info] /scsi_vhci (scsi_vhci0):
Mar 29 12:02:20 XXXXXXX.srv    /scsi_vhci/[EMAIL PROTECTED] (sd78): path 
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0
(fp0) target address 5006016041e03566,e is now STANDBY because of an externally 
initiated failover
Mar 29 12:02:20 XXXXXXX.srv scsi: [ID 243001 kern.info] /scsi_vhci (scsi_vhci0):
Mar 29 12:02:20 XXXXXXX.srv    /scsi_vhci/[EMAIL PROTECTED] (sd78): path 
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL 
PROTECTED],1/[EMAIL PROTECTED],
0 (fp1) target address 5006016841e03566,e is now ONLINE because of an 
externally initiated failover
Mar 29 12:03:24 XXXXXXX.srv scsi: [ID 243001 kern.info] /scsi_vhci (scsi_vhci0):
Mar 29 12:03:24 XXXXXXX.srv    /scsi_vhci/[EMAIL PROTECTED] (sd64): path 
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL 
PROTECTED],1/[EMAIL PROTECTED],
0 (fp1) target address 5006016841e03566,1c is now STANDBY because of an 
externally initiated failover
Mar 29 12:03:29 XXXXXXX.srv scsi: [ID 243001 kern.info] /scsi_vhci (scsi_vhci0):
Mar 29 12:03:29 XXXXXXX.srv    Initiating failover for device disk (GUID 
6006016062231b00ba53c25a19d9db11)
Mar 29 12:03:31 XXXXXXX.srv scsi: [ID 243001 kern.info] /scsi_vhci (scsi_vhci0):
Mar 29 12:03:31 XXXXXXX.srv    Initiating failover for device disk (GUID 
6006016062231b00ba53c25a19d9db11)
Mar 29 12:03:33 XXXXXXX.srv scsi: [ID 243001 kern.info] /scsi_vhci (scsi_vhci0):
Mar 29 12:03:33 XXXXXXX.srv    Initiating failover for device disk (GUID 
6006016062231b00ba53c25a19d9db11)
Mar 29 12:03:35 XXXXXXX.srv scsi: [ID 243001 kern.info] /scsi_vhci (scsi_vhci0):
Mar 29 12:03:35 XXXXXXX.srv    Initiating failover for device disk (GUID 
6006016062231b00ba53c25a19d9db11)
Mar 29 12:03:36 XXXXXXX.srv scsi: [ID 243001 kern.info] /scsi_vhci (scsi_vhci0):
Mar 29 12:03:36 XXXXXXX.srv    Initiating failover for device disk (GUID 
6006016062231b00ba53c25a19d9db11)
[...]

The last two lines keep repeating (several entries per second).

bash-3.00# iostat -xnz 1|egrep " c6|devic"
[skipping first output]
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0  3.0  0.0    0.0    0.0 100   0 c6t6006016062231B00A07323E419D9DB11d0
    0.0    0.0    0.0    0.0 34.0  0.0    0.0    0.0 100   0 c6t6006016062231B00BA53C25A19D9DB11d0
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0  3.0  0.0    0.0    0.0 100   0 c6t6006016062231B00A07323E419D9DB11d0
    0.0    0.0    0.0    0.0 34.0  0.0    0.0    0.0 100   0 c6t6006016062231B00BA53C25A19D9DB11d0


It's been like that for over 30 minutes (and it still is).
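
The pattern in the iostat output above (requests sitting in the wait
queue, 100 %w, but zero reads and writes completing) can be picked out
mechanically. The following is only a minimal sketch: the function name
find_stalled is hypothetical, and it assumes the column order shown
above (r/s, w/s, kr/s, kw/s, wait, ...), with the device name either on
the same line or wrapped onto the next one.

```python
def find_stalled(iostat_text):
    """Return devices that have queued I/O but no completed reads or writes."""
    stalled = set()
    lines = [ln.strip() for ln in iostat_text.splitlines() if ln.strip()]
    i = 0
    while i < len(lines):
        fields = lines[i].split()
        i += 1
        # A data row starts with ten numeric columns; anything else is a header.
        try:
            nums = [float(f) for f in fields[:10]]
        except ValueError:
            continue
        if len(nums) < 10:
            continue
        if len(fields) >= 11:
            device = fields[10]      # device name on the same line
        elif i < len(lines):
            device = lines[i]        # device name wrapped onto the next line
            i += 1
        else:
            continue
        reads, writes, wait = nums[0], nums[1], nums[4]
        if reads == 0 and writes == 0 and wait > 0:
            stalled.add(device)
    return sorted(stalled)
```

Fed the text captured from `iostat -xnz 1`, it returns the sorted names
of the devices that are queueing requests without servicing any.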

ZFS hasn't noticed either, of course.

1. zpool status

bash-3.00# zpool status
  pool: f4-1
 state: ONLINE
 scrub: scrub stopped with 0 errors on Wed Mar 28 12:11:50 2007
^C^C^C^C^C

I can't stop it and I can't get any further output (zpool list and
zfs list are working).


2. MPxIO - it tries to fail the disk over to the second SP, but it
   looks like it keeps trying forever (or for a very long time). After
   some time it should have given up and generated a disk I/O failure...

3. I guess that in such a case Eric's proposal probably won't help and
   the real problem is with MPxIO - right?
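
To illustrate the behaviour expected in point 2, here is a minimal
sketch of a failover retry loop that gives up after a deadline and
raises an error instead of retrying indefinitely. It is not MPxIO code
and says nothing about how scsi_vhci is implemented; every name in it
(failover_with_deadline, FailoverTimeout, try_failover) is hypothetical.

```python
import time


class FailoverTimeout(Exception):
    """Raised when failover has not succeeded within the deadline."""


def failover_with_deadline(try_failover, deadline_s, interval_s,
                           clock=time.monotonic, sleep=time.sleep):
    """Call try_failover() until it returns True or deadline_s has elapsed.

    Returns the number of attempts on success; raises FailoverTimeout
    otherwise, so the caller can surface an I/O error to upper layers.
    """
    start = clock()
    attempts = 0
    while True:
        attempts += 1
        if try_failover():
            return attempts
        if clock() - start >= deadline_s:
            raise FailoverTimeout(
                "failover did not complete after %d attempts" % attempts)
        sleep(interval_s)
```

With a bound like this, the caller gets an explicit error it can report
upward once the deadline passes, rather than an I/O that never returns.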


-- 
Best regards,
 Robert                          mailto:[EMAIL PROTECTED]
                                     http://milek.blogspot.com

