> Hi Tundra,
>
> I see two problems in your configuration.
>
> 1. Keeping the dependencies. And you have answered it already.
>
>    common_zone -> personal pool -> common pool
>
>    This ensures proper start and stop, and does the mounting and
>    unmounting of the file system in the ZFS pool.
>
> > <zone name="common" zonepath="/common_pool0/common_zone" autoboot="false"
> >  brand="ipkg" limitpriv="default,sys_smb">
> >   <dataset name="personal_pool0/personal"/>
> > </zone>
>
> In general, it is not recommended to use zonecfg(1M) to add a ZFS pool
> dataset to a zone when that dataset is being controlled by
> HAStoragePlus. The reason is that when a pool is imported on another
> physical cluster node as part of a failover/switchover, the booting of
> the zone on the current node will have a problem, as the dataset is not
> available.
>
> I suggest removing the dataset from zonecfg(1M) and also turning off the
> zoned property of that file system.
>
> Thanks
> -Venku

Venku, thanks for your help so far; however, I am still doing something wrong. This is what I've done:

First, on all nodes, in order to remove the personal_pool0/personal dataset from the zone named 'common', I issued:

    zonecfg -z common remove dataset

Which leaves the following in /etc/zones/common.xml:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE zone PUBLIC "-//Sun Microsystems Inc//DTD Zones//EN" "file:///usr/share/lib/xml/dtd/zonecfg.dtd.1">
    <!--
        DO NOT EDIT THIS FILE.  Use zonecfg(1M) instead.
    -->
    <zone name="common" zonepath="/common_pool0/common_zone" autoboot="false" brand="ipkg" limitpriv="default,sys_smb"/>

Then I turned off the zoned property, changed the mountpoint, and attempted to set the dependencies with:

    root@mltproc1:~# zfs get zoned,mountpoint personal_pool0/personal
    NAME                     PROPERTY    VALUE                     SOURCE
    personal_pool0/personal  zoned       on                        local
    personal_pool0/personal  mountpoint  /personal_pool0/personal  default
    root@mltproc1:~# zfs set zoned=off personal_pool0/personal
    root@mltproc1:~# zfs set mountpoint=/common_pool0/common_zone/root/personal_pool0/personal personal_pool0/personal
    root@mltproc1:~# zfs get zoned,mountpoint personal_pool0/personal
    NAME                     PROPERTY    VALUE                                                   SOURCE
    personal_pool0/personal  zoned       off                                                     local
    personal_pool0/personal  mountpoint  /common_pool0/common_zone/root/personal_pool0/personal  local

    root@mltstore0:~# clrs set -p Resource_dependencies+=common_zpool personal_pool
    root@mltstore0:~# clrs set -p Resource_dependencies+=personal_pool common_zone

I didn't get any errors in this process.
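
For reference, the start order that those two clrs set commands should imply can be sketched in a few lines of Python. This is only an illustration and not something the cluster itself runs: the dictionary is my own transcription of the two Resource_dependencies settings above (each resource mapped to what it depends on), and graphlib (Python 3.9+) is used purely to order them.

```python
from graphlib import TopologicalSorter

# Assumed transcription of the two "clrs set" commands above:
# each key is a resource, each value is the set of resources it depends on.
dependencies = {
    "personal_pool": {"common_zpool"},   # personal_pool depends on common_zpool
    "common_zone": {"personal_pool"},    # common_zone depends on personal_pool
}

# A resource may only start once everything it depends on is online,
# so a topological order of this graph is the expected start order.
start_order = list(TopologicalSorter(dependencies).static_order())

# The stop order is simply the reverse of the start order.
stop_order = list(reversed(start_order))

print("start:", " -> ".join(start_order))
print("stop: ", " -> ".join(stop_order))
```

If the dependencies were entered as intended, the zone resource should come last in the start order and first in the stop order, which matches the chain Venku described.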

Now when I attempt to switch or start the 'common_shares' resource group, it just keeps migrating from node to node, never getting out of 'Pending online'. On the logging host for the cluster I see the following, which looks to me like the zone just doesn't start (timeout at 11:08:40; it looks like 192.168.11.21 and 192.168.11.22 are 2 sec off in time sync), but I don't see any details of why:

Dec 7 11:03:37 [192.168.11.21.214.62] SC[,SUNW.gds:6,common_shares,common_zone,gds_svc_start]: [ID 661560 daemon.info] All the SUNW.HAStoragePlus resources that this resource depends on are online on the local node. Proceeding with the checks for the existence and permissions of the start/stop/probe commands.
Dec 7 11:03:37 [192.168.11.21.214.62] SC[,SUNW.gds:6,common_shares,common_zone,gds_svc_start]: [ID 268646 daemon.info] Extension property <network_aware> has a value of <1>
Dec 7 11:03:37 [192.168.11.21.214.62] SC[,SUNW.LogicalHostname:3,common_shares,common_lhname,hafoip_monitor_start]: [ID 211198 daemon.info] Completed successfully.
Dec 7 11:03:37 [192.168.11.21.214.62] Cluster.RGM.global.rgmd: [ID 515159 daemon.notice] method <hafoip_monitor_start> completed successfully for resource <common_lhname>, resource group <common_shares>, node <mltstore1>, time used: 0% of timeout <300 seconds>
Dec 7 11:03:35 [192.168.11.22.236.224] Cluster.RGM.global.rgmd: [ID 443746 daemon.notice] resource common_lhname state on node mltstore1 change to R_ONLINE
Dec 7 11:03:37 [192.168.11.21.214.62] SC[,SUNW.gds:6,common_shares,common_zone,gds_svc_start]: [ID 887138 daemon.info] Extension property <Child_mon_level> has a value of <-1>
Dec 7 11:03:37 [192.168.11.21.214.62] SC[,SUNW.gds:6,common_shares,common_zone,gds_svc_start]: [ID 833212 daemon.info] Attempting to start the data service under process monitor facility.
Dec 7 11:03:37 [192.168.11.21.214.62] SC[,SUNW.gds:6,common_shares,common_zone,gds_svc_start]: [ID 569559 daemon.info] Start of /opt/SUNWsczone/sczbt/bin/start_sczbt -R common_zone -G common_shares -P /common_pool0/common_zone/parameters completed successfully.
Dec 7 11:03:37 [192.168.11.21.214.62] SC[,SUNW.gds:6,common_shares,common_zone,gds_svc_start]: [ID 268646 daemon.info] Extension property <network_aware> has a value of <1>
Dec 7 11:03:38 [192.168.11.21.214.62] genunix: [ID 408114 kern.info] /pseudo/zconsnex@1/zcons@1 (zcons1) online
Dec 7 11:08:40 [192.168.11.21.214.62] Cluster.RGM.global.rgmd: [ID 764140 daemon.error] Method <gds_svc_start> on resource <common_zone>, resource group <common_shares>, node <mltstore1>: Timeout.
Dec 7 11:08:38 [192.168.11.22.236.224] Cluster.RGM.global.rgmd: [ID 443746 daemon.error] resource common_zone state on node mltstore1 change to R_START_FAILED
Dec 7 11:08:38 [192.168.11.22.236.224] Cluster.RGM.global.rgmd: [ID 784560 daemon.notice] resource common_zone status on node mltstore1 change to R_FM_FAULTED
Dec 7 11:08:38 [192.168.11.22.236.224] Cluster.RGM.global.rgmd: [ID 922363 daemon.notice] resource common_zone status msg on node mltstore1 change to <>
Dec 7 11:08:38 [192.168.11.22.236.224] Cluster.RGM.global.rgmd: [ID 529407 daemon.error] resource group common_shares state on node mltstore1 change to RG_PENDING_OFF_START_FAILED
Dec 7 11:08:38 [192.168.11.22.236.224] Cluster.RGM.global.rgmd: [ID 784560 daemon.notice] resource common_zone status on node mltstore1 change to R_FM_UNKNOWN
Dec 7 11:08:38 [192.168.11.22.236.224] Cluster.RGM.global.rgmd: [ID 922363 daemon.notice] resource common_zone status msg on node mltstore1 change to <Stopping>
Dec 7 11:08:38 [192.168.11.22.236.224] Cluster.RGM.global.rgmd: [ID 443746 daemon.notice] resource common_zone state on node mltstore1 change to R_STOPPING
Dec 7 11:08:40 [192.168.11.21.214.62] Cluster.RGM.global.rgmd: [ID 224900 daemon.notice] launching method <hastorageplus_monitor_stop> for resource <personal_pool>, resource group <common_shares>, node <mltstore1>, timeout <90> seconds
Dec 7 11:08:40 [192.168.11.21.214.62] Cluster.RGM.global.rgmd: [ID 224900 daemon.notice] launching method <hafoip_monitor_stop> for resource <common_lhname>, resource group <common_shares>, node <mltstore1>, timeout <300> seconds
Dec 7 11:08:40 [192.168.11.21.214.62] Cluster.RGM.global.rgmd: [ID 224900 daemon.notice] launching method <hastorageplus_monitor_stop> for resource <common_zpool>, resource group <common_shares>, node <mltstore1>, timeout <90> seconds
Dec 7 11:08:40 [192.168.11.21.214.62] Cluster.RGM.global.rgmd: [ID 224900 daemon.notice] launching method <gds_svc_stop> for resource <common_zone>, resource group <common_shares>, node <mltstore1>, timeout <300> seconds
Dec 7 11:08:40 [192.168.11.21.214.62] Cluster.RGM.global.rgmd: [ID 669833 daemon.debug] 68 fe_rpc_command: cmd_type(enum):<1>:cmd=</usr/cluster/lib/rgm/rt/hastorageplus/hastorageplus_monitor_stop>:tag=<common_shares.personal_pool.8>: Calling security_clnt_connect(..., host=<mltstore1>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
Dec 7 11:08:40 [192.168.11.21.214.62] Cluster.RGM.global.rgmd: [ID 653003 daemon.debug] 73 fe_rpc_command: cmd_type(enum):<1>:cmd=</usr/cluster/lib/rgm/rt/hastorageplus/hastorageplus_monitor_stop>:tag=<common_shares.common_zpool.8>: Calling security_clnt_connect(..., host=<mltstore1>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
Dec 7 11:08:40 [192.168.11.21.214.62] Cluster.RGM.global.rgmd: [ID 846460 daemon.debug] 65 fe_rpc_command: cmd_type(enum):<1>:cmd=</usr/cluster/lib/rgm/rt/hafoip/hafoip_monitor_stop>:tag=<common_shares.common_lhname.8>: Calling security_clnt_connect(..., host=<mltstore1>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
Dec 7 11:08:40 [192.168.11.21.214.62] Cluster.RGM.global.rgmd: [ID 170767 daemon.debug] 71 fe_rpc_command: cmd_type(enum):<1>:cmd=</opt/SUNWscgds/bin/gds_svc_stop>:tag=<common_shares.common_zone.1>: Calling security_clnt_connect(..., host=<mltstore1>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
Dec 7 11:08:40 [192.168.11.21.214.62] Cluster.RGM.global.rgmd: [ID 515159 daemon.notice] method <hastorageplus_monitor_stop> completed successfully for resource <personal_pool>, resource group <common_shares>, node <mltstore1>, time used: 0% of timeout <90 seconds>

--
This message posted from opensolaris.org