Hi Venku,

Thank you for the information. Yes, when doing the switch in sequence it works fine; only the bulk import hits the error. What actually happens if one node fails? Does it do a sequential import then? I didn't notice any problems with the import after doing an immediate failfast with cmm_ctl.
Aleks

On 22/07/08 12:48 +0530, Venkateswarlu Tella wrote:
> Hi Aleks,
>
> The CR I am mentioning is 6689452:
> 6689452 zfs often fails to import a zpool if several zfs import commands
> are running at the same time.
>
> This problem was also seen as part of evacuating RGs.
>
> It is fixed in Nevada 92 and will be backported in s10u6.
>
> Thanks
> -Venku
>
> Venkateswarlu Tella wrote:
>> Hi Aleks,
>> I am wondering whether you can test by switching over the RGs with zpools
>> sequentially, and see whether the problem still exists.
>> I remember there are some issues in ZFS when simultaneous import/export
>> happens on multiple zpools.
>>
>> Thanks
>> -Venku
>>
>> Aleks Feltin wrote:
>>> Marty,
>>>
>>> Thanks for your help. I took a look at the timeout parameter. I can use
>>> it, no problem.
>>> I am afraid the problem can still occur if a node suddenly goes down:
>>> similarly, the pool will not be imported because of "Invalid argument",
>>> not allowing the other resources in the RG to execute their startup
>>> methods.
>>>
>>> Aleks F.
>>>
>>> On 21/07/08 11:57 -0700, Martin Rattner wrote:
>>>> Aleks,
>>>>
>>>> By default, resource groups are prevented from failing back onto an
>>>> evacuated node for 60 seconds after the last RG goes offline. To
>>>> lengthen this time interval, add the option "-T <seconds>" to the
>>>> "clnode evacuate" command.
>>>>
>>>> For example,
>>>>     clnode evacuate -n node1 -T 1200
>>>>
>>>> will prevent failbacks onto node1 for twenty minutes after the
>>>> evacuation has completed. The maximum allowed value for the -T
>>>> argument is 65535.
>>>>
>>>> This might be exposing some additional problems with ZFS on
>>>> HAStoragePlus; however, I don't have the expertise to comment on that.
>>>>
>>>> Regards,
>>>> --Marty
>>>>
>>>> Aleks Feltin wrote:
>>>>> Hello,
>>>>>
>>>>> I have SC 3.2u1.
>>>>> While trying to evacuate all resource groups (clnode
>>>>> evacuate -n node1), I ran into an unpleasant situation with HA-ZFS:
>>>>> some groups did a failback, and in the debug output the message
>>>>> "cannot discover pools : Invalid argument" appeared for the pool.
>>>>> I can reproduce it easily with the same evacuate command. The error
>>>>> is not bound to a particular pool. I have 24 pools in total.
>>>>>
>>>>> Jul 16 10:49:03 node1
>>>>> SC[SUNW.HAStoragePlus:6,NFSrg,NFSds,hastorageplus_prenet_start]: [ID
>>>>> 471757 daemon.error] cannot discover pools : Invalid argument
>>>>> Jul 16 10:49:03 node1 Cluster.RGM.rgmd: [ID 938318 daemon.error]
>>>>> Method <hastorageplus_prenet_start> failed on resource <NFSds> in
>>>>> resource group <NFSrg> [exit code <1>, time used: 12% of timeout
>>>>> <1800 seconds>]
>>>>>
>>>>> Solaris Cluster 3.2u1 for Solaris 10 SPARC
>>>>> 5.10 Generic_137111-01
>>>>>
>>>>> Please assist in understanding this.
>>>>>
>>>>> Aleks F.
>>>>>
>>>>> _______________________________________________
>>>>> ha-clusters-discuss mailing list
>>>>> ha-clusters-discuss at opensolaris.org
>>>>> http://mail.opensolaris.org/mailman/listinfo/ha-clusters-discuss
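For reference, the sequential-switchover workaround Venku suggests could be scripted along these lines until the CR 6689452 fix lands. This is only a rough sketch: the resource group names (NFSrg1, NFSrg2, ...) and the target node name (node2) are hypothetical placeholders, not taken from the thread; substitute your own RG list and node.

```shell
#!/bin/sh
# Workaround sketch for CR 6689452 (concurrent zpool imports failing):
# instead of "clnode evacuate", which offlines all RGs at once and can
# trigger parallel zpool imports on the target node, switch each HA-ZFS
# resource group over one at a time so imports happen sequentially.
#
# NOTE: RG names and node name below are hypothetical examples.
for rg in NFSrg1 NFSrg2 NFSrg3; do
    # clrg switch moves one resource group to the named node and
    # returns nonzero on failure, so we stop rather than pile on.
    clrg switch -n node2 "$rg" || { echo "switch of $rg failed" >&2; break; }
done
```

After the last group comes online, "clrg status" can confirm that every RG is mastered by the intended node before bringing node1 down for maintenance.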
