Hi,

RHEL 5.4: cluster2 (I think).

I expected to be able to freeze a service on a node and restart rgmanager on that node without interrupting the service. In practice, starting rgmanager causes the service to be stopped.

Is this what is supposed to happen? I thought the whole point of freezing services was to allow maintenance (including restarting cluster software). Are there any options to prevent the services from being stopped when rgmanager is started?

One effect of rgmanager stopping the service is that the cluster reaches an inconsistent state. Once rgmanager has restarted, the cluster believes that the services are still frozen, whereas in reality they are stopped. Any attempt to unfreeze the service causes the service to fail over to a standby node.

regards,
Martin

sudo /usr/sbin/clustat
Cluster Status for EDISV1DBM @ Mon Jun 21 16:27:05 2010
Member Status: Quorate

 Member Name        ID   Status
 ------ ----        ---- ------
 svXprdclu001          1 Online, rgmanager
 svXprdclu002          2 Online, Local, rgmanager
 svXprdclu003          3 Online, rgmanager
 svXprdclu004          4 Online, rgmanager
 svXprdclu005          5 Online, rgmanager

 Service Name          Owner (Last)      State
 ------- ----          ----- ------      -----
 service:ACTIVESITE    svXprdclu002      started
 service:MASTERVIP     svXprdclu002      started

[mar...@cp1edidbm002 ~]$ sudo /usr/sbin/clusvcadm -Z ACTIVESITE
Local machine freezing service:ACTIVESITE...Success
[mar...@cp1edidbm002 ~]$ sudo /usr/sbin/clusvcadm -Z MASTERVIP
Local machine freezing service:MASTERVIP...Success

[mar...@cp1edidbm002 ~]$ sudo /usr/sbin/clustat
Cluster Status for EDISV1DBM @ Mon Jun 21 16:34:02 2010
Member Status: Quorate

 Member Name        ID   Status
 ------ ----        ---- ------
 svXprdclu001          1 Online, rgmanager
 svXprdclu002          2 Online, Local, rgmanager
 svXprdclu003          3 Online, rgmanager
 svXprdclu004          4 Online, rgmanager
 svXprdclu005          5 Online, rgmanager

 Service Name          Owner (Last)      State
 ------- ----          ----- ------      -----
 service:ACTIVESITE    svXprdclu002      started [Z]
 service:MASTERVIP     svXprdclu002      started [Z]

[mar...@cp1edidbm002 ~]$ sudo /etc/init.d/rgmanager stop
Shutting down Cluster Service Manager...
Waiting for services to stop: [ OK ]
Cluster Service Manager is stopped.
[mar...@cp1edidbm002 ~]$ sudo /etc/init.d/rgmanager start
Starting Cluster Service Manager: [ OK ]

#
# the services are stopped by rgmanager start. Ugh!
#

[mar...@cp1edidbm002 ~]$ sudo /usr/sbin/clustat
Cluster Status for EDISV1DBM @ Mon Jun 21 16:35:34 2010
Member Status: Quorate

 Member Name        ID   Status
 ------ ----        ---- ------
 svXprdclu001          1 Online, rgmanager
 svXprdclu002          2 Online, Local, rgmanager
 svXprdclu003          3 Online, rgmanager
 svXprdclu004          4 Online, rgmanager
 svXprdclu005          5 Online, rgmanager

 Service Name          Owner (Last)      State
 ------- ----          ----- ------      -----
 service:ACTIVESITE    svXprdclu002      started [Z]
 service:MASTERVIP     svXprdclu002      started [Z]

=========================================

The logs show that the service is stopped as rgmanager is started on svXprdclu002.

Jun 21 16:31:19 cp1edidbm002 clurgmgrd: [14256]: <info> Executing /home/martin/dc-dsm status
Jun 21 16:34:58 cp1edidbm002 rgmanager: [15526]: <notice> Shutting down Cluster Service Manager...
Jun 21 16:34:58 cp1edidbm002 clurgmgrd[14256]: <notice> Shutting down
Jun 21 16:35:08 cp1edidbm002 clurgmgrd[14256]: <notice> Shutdown complete, exiting
Jun 21 16:35:08 cp1edidbm002 rgmanager: [15526]: <notice> Cluster Service Manager is stopped.
Jun 21 16:35:16 cp1edidbm002 kernel: dlm: Using TCP for communications
Jun 21 16:35:16 cp1edidbm002 kernel: dlm: got connection from 4
Jun 21 16:35:16 cp1edidbm002 kernel: dlm: got connection from 5
Jun 21 16:35:16 cp1edidbm002 kernel: dlm: got connection from 1
Jun 21 16:35:16 cp1edidbm002 kernel: dlm: got connection from 3
Jun 21 16:35:17 cp1edidbm002 clurgmgrd[15574]: <notice> Resource Group Manager Starting
Jun 21 16:35:17 cp1edidbm002 clurgmgrd[15574]: <info> Loading Service Data
Jun 21 16:35:17 cp1edidbm002 clurgmgrd[15574]: <info> Initializing Services
Jun 21 16:35:17 cp1edidbm002 clurgmgrd: [15574]: <info> Executing /bin/true stop
Jun 21 16:35:17 cp1edidbm002 clurgmgrd: [15574]: <info> Removing IPv4 address 10.3.17.20/24 from bond0
Jun 21 16:35:27 cp1edidbm002 clurgmgrd: [15574]: <info> Executing /home/martin/dc-dsm stop
Jun 21 16:35:27 cp1edidbm002 clurgmgrd[15574]: <info> Services Initialized
Jun 21 16:35:27 cp1edidbm002 clurgmgrd[15574]: <info> State change: Local UP
Jun 21 16:35:27 cp1edidbm002 clurgmgrd[15574]: <info> State change: svXprdclu001 UP
Jun 21 16:35:27 cp1edidbm002 clurgmgrd[15574]: <info> State change: svXprdclu003 UP
Jun 21 16:35:27 cp1edidbm002 clurgmgrd[15574]: <info> State change: svXprdclu004 UP
Jun 21 16:35:27 cp1edidbm002 clurgmgrd[15574]: <info> State change: svXprdclu005 UP
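
In case anyone wants to check their own syslog for the same behaviour, here is a rough Python sketch. It is not part of rgmanager. The function name stops_during_init and the message patterns are my own assumptions, taken only from the log lines above, so the wording may differ on other releases. It lists every resource that was stopped between "Initializing Services" and "Services Initialized".

```python
import re

# Markers copied from the clurgmgrd messages in the excerpt above.
# Assumption: other rgmanager versions may word these differently.
INIT_BEGIN = "Initializing Services"
INIT_END = "Services Initialized"

# A script resource being stopped, or a floating IP address being removed.
STOP_ACTION = re.compile(r"Executing (\S+) stop$|Removing IPv4 address (\S+)")


def stops_during_init(lines):
    """Return the resources stopped while rgmanager was initializing services."""
    stopped = []
    in_init = False
    for raw in lines:
        line = raw.rstrip()
        if INIT_BEGIN in line:
            in_init = True
        elif INIT_END in line:
            in_init = False
        elif in_init:
            match = STOP_ACTION.search(line)
            if match:
                # Exactly one of the two groups is set for any match.
                stopped.append(match.group(1) or match.group(2))
    return stopped


# Sample taken from the log excerpt in this post.
SAMPLE_LOG = """\
Jun 21 16:31:19 cp1edidbm002 clurgmgrd: [14256]: <info> Executing /home/martin/dc-dsm status
Jun 21 16:35:17 cp1edidbm002 clurgmgrd[15574]: <notice> Resource Group Manager Starting
Jun 21 16:35:17 cp1edidbm002 clurgmgrd[15574]: <info> Initializing Services
Jun 21 16:35:17 cp1edidbm002 clurgmgrd: [15574]: <info> Executing /bin/true stop
Jun 21 16:35:17 cp1edidbm002 clurgmgrd: [15574]: <info> Removing IPv4 address 10.3.17.20/24 from bond0
Jun 21 16:35:27 cp1edidbm002 clurgmgrd: [15574]: <info> Executing /home/martin/dc-dsm stop
Jun 21 16:35:27 cp1edidbm002 clurgmgrd[15574]: <info> Services Initialized
"""

if __name__ == "__main__":
    for resource in stops_during_init(SAMPLE_LOG.splitlines()):
        print("stopped during init:", resource)
```

To run it against a real log, pass the open file instead of the sample, e.g. stops_during_init(open("/var/log/messages")), assuming that is where syslog writes on your node. Any output for a frozen service means clustat's "started [Z]" no longer matches reality.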
