> Hi Tundra,
>
> I didn't see much of an explanatory message about the reason for the
> gds_svc_start timeout, or what it is waiting for during the timeout period.
>
> > root@mltproc1:~# zfs set mountpoint=/common_pool0/common_zone/root/personal_pool0/personal personal_pool0/personal
>
> Don't set the mountpoint to "/common_pool0/..."; that hardcodes it. Leave it
> at the default value. The reason is that when HASP imports the pool for a
> zone, all the file systems are mounted relative to the zone root, and a
> hardcoded mountpoint will cause issues.
>
> I suggest you keep the dependencies as we discussed earlier and remove the
> lofs part as well.
>
> Now bring all the resource groups online and check the whole file system
> hierarchy to ensure that everything is mounted in the expected order at the
> expected mountpoints. If that looks fine, try offlining the RGs.
>
> If you have issues, you can send the information on the dependencies, the
> pools involved, and the file system hierarchy (i.e. zpool list, zfs list,
> df -F zfs) during RG online.
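Before I report back on that: just so I'm sure I understand the suggested fix, I believe undoing the hardcoded mountpoint comes down to a zfs inherit, roughly as below. This is only a sketch of my understanding; the dataset name is taken from the quoted command, and whether anything else in that pool also needs resetting is an assumption on my part.

# Clear the locally set mountpoint so the dataset reverts to its inherited
# default, which stays relative to the zone root when HASP imports the pool.
zfs inherit mountpoint personal_pool0/personal

# Verify that nothing under the pool still has a locally set mountpoint;
# this should come back essentially empty if the reset worked.
zfs get -s local -r mountpoint personal_pool0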
I haven't been able to deal with this yet because I've been dealing with another issue within the cluster (a zpool got corrupted because it was imported on multiple nodes at once, and after that any attempt to bring it back online would eventually crash the importing node [either when doing zpool list/scrub or eventually out of the blue]). That issue is now resolved; however, something I saw while working on it leads me to think that this issue of 'second zpool in the zone' is a red herring.

I have a number of resource groups which exist to serve as a container for a zone, along with the backing resources that zone needs. In a number of these cases (backup1_rg, watchdog1_rg, adv1_rg, news1_rg, common_shares), this includes a large second zpool for 'user files'; in a different category, however, I have a number of resource groups containing zones which have just one zpool (smb1_rg, smb2_rg, gate1_rg).

Amid all the back and forth of nodes crashing (or shutting down) due to the above zpool issue, and my shifting resource groups around attempting to minimize the impact of the node shutdowns, I have seen a number of times that the resource groups with just 'logical hostname, zone zpool, zone gds' (and no extra zpools) have also hit the 'cannot unmount... device busy' error that I started out with originally. This makes me think that, however right or wrong I am about my configuration/setup of the second zpool within the named zone, I've got a different root problem causing my 'cannot unmount' error.

To this end, I present the following for the node which is presently running smb1_rg:

root@mltproc1:~# clrg show -v smb1_rg | grep smb1
Resource Group:                       smb1_rg
--- Resources for Group smb1_rg ---
  Resource:                           smb1_lhname
    Group:                            smb1_rg
    HostnameList:                     smb1
  Resource:                           smb1_zpool
    Group:                            smb1_rg
    Zpools:                           smb1_pool0
  Resource:                           smb1_zone
    Group:                            smb1_rg
    Resource_dependencies:            smb1_lhname smb1_zpool
    Start_command:                    /opt/SUNWsczone/sczbt/bin/start_sczbt -R smb1_zone -G smb1_rg -P /smb1_pool0/smb1_zone/parameters
    Stop_command:                     /opt/SUNWsczone/sczbt/bin/stop_sczbt -R smb1_zone -G smb1_rg -P /smb1_pool0/smb1_zone/parameters
    Probe_command:                    /opt/SUNWsczone/sczbt/bin/probe_sczbt -R smb1_zone -G smb1_rg -P /smb1_pool0/smb1_zone/parameters
    Network_resources_used:           smb1_lhname

root@mltproc1:~# zpool list | grep smb1
smb1_pool0                      1.98G   926M   1.08G   45%  ONLINE  /

root@mltproc1:~# zfs list | grep smb1
smb1_pool0                       926M  1.05G     21K  /smb1_pool0
smb1_pool0/smb1_zone             925M  1.05G   27.5K  /smb1_pool0/smb1_zone
smb1_pool0/smb1_zone/ROOT        925M  1.05G     19K  legacy
smb1_pool0/smb1_zone/ROOT/zbe    925M  1.05G    925M  legacy

root@mltproc1:~# df -F zfs
smb1_pool0                     1099635      21  1099614    1%  /smb1_pool0
smb1_pool0/smb1_zone           1099641      28  1099614    1%  /smb1_pool0/smb1_zone
smb1_pool0/smb1_zone/ROOT/zbe  2047250  947637  1099614   47%  /smb1_pool0/smb1_zone/root
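Since I expect to hit the 'cannot unmount... device busy' error again the next time I offline smb1_rg, here is roughly what I plan to run on the failing node to see what is actually holding the file system busy. This is just a sketch: the commands are standard Solaris, but which mount is actually the busy one (I'm guessing the zone root under /smb1_pool0/smb1_zone/root) is my assumption.

# Identify processes with files open on the file system that refuses to
# unmount (assuming the busy mount is the zone root; adjust the path if the
# error names a different mountpoint).
fuser -c /smb1_pool0/smb1_zone/root

# Check whether the zone is still in a transitional state (e.g. shutting down)
# at the moment the unmount is attempted.
zoneadm list -cv | grep smb1

# Look for anything else still mounted beneath the pool that could keep it busy.
df -h | grep smb1_pool0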