Relocate does not work from either of the nodes, so is the problem with both nodes? They are in good health and stay in quorum perfectly fine. I can also bring up the services separately on both nodes. It is just the relocate that seems to fail when tried manually. When the service is online on node2, if I halt it, it simply stops but does not auto-relocate to node1; it does not even attempt to. How do I debug this with more information to find out where the problem lies?
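One way to narrow this down, offered only as a rough sketch: filter the rgmanager syslog lines for the failure messages so the order of events stands out. The Python below is an illustration, not a cluster tool; `problem_lines`, `PROBLEM_MARKERS`, and `SAMPLE` are made-up names, and the sample lines are copied from the log quoted further down in this thread.

```python
import re

# Matches an rgmanager syslog line such as:
#   Aug 27 20:58:38 node2 rgmanager[1771]: stop on apache "WEB" returned 1 (generic error)
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s[\d:]{8})\s(?P<host>\S+)\srgmanager\[(?P<pid>\d+)\]:\s(?P<msg>.*)$"
)

# Assumption: these substrings are enough to flag the problem messages
# seen in this thread; other rgmanager messages may need more markers.
PROBLEM_MARKERS = ("Failed", "failed", "returned", "intervention")


def problem_lines(log_text):
    """Return (timestamp, message) pairs for rgmanager lines that look like errors."""
    hits = []
    for raw in log_text.splitlines():
        m = LINE_RE.match(raw.strip())
        if m and any(marker in m.group("msg") for marker in PROBLEM_MARKERS):
            hits.append((m.group("ts"), m.group("msg")))
    return hits


# Lines taken from the node2 log quoted in this thread.
SAMPLE = """\
Aug 27 20:58:35 node2 rgmanager[1771]: Stopping service service:WEB
Aug 27 20:58:37 node2 rgmanager[24579]: Stopping Service apache:WEB > Failed - Application Is Still Running
Aug 27 20:58:38 node2 rgmanager[1771]: stop on apache "WEB" returned 1 (generic error)
Aug 27 20:58:48 node2 rgmanager[1771]: #12: RG service:WEB failed to stop; intervention required
Aug 27 20:58:48 node2 rgmanager[1771]: #70: Failed to relocate service:WEB; restarting locally
"""

if __name__ == "__main__":
    for ts, msg in problem_lines(SAMPLE):
        print(ts, "|", msg)
```

On the quoted log, the first flagged line is the apache stop step ("Application Is Still Running"), which comes before rgmanager reports the relocate failure, so the stop path of the apache resource looks like the first place to investigate.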
Param

On Mon, Aug 27, 2012 at 5:44 PM, emmanuel segura <emi2f...@gmail.com> wrote:

> if relocate doesn't work maybe you have a problem in one node
>
> 2012/8/27 PARAM KRISH <mkpa...@gmail.com>
>
>> Nope, that did not help. It is still the same. enable/disable works fine on
>> both nodes but relocate, nope.
>>
>> On Mon, Aug 27, 2012 at 3:45 PM, emmanuel segura <emi2f...@gmail.com> wrote:
>>
>>> Fix your config before
>>>
>>> 2012/8/27 PARAM KRISH <mkpa...@gmail.com>
>>>
>>>> Just noticed that disable and enable of that service work fine
>>>> (clusvcadm -d WEB and clusvcadm -e WEB) on both the nodes but the
>>>> relocate does not.
>>>>
>>>> Aug 27 20:57:18 node2 rgmanager[1771]: Stopping service service:WEB
>>>> Aug 27 20:57:19 node2 rgmanager[23541]: Stopping Service apache:WEB
>>>> Aug 27 20:57:19 node2 rgmanager[23561]: Checking Existence Of File /var/run/cluster/apache/apache:WEB.pid [apache:WEB] > Failed - File Doesn't Exist
>>>> Aug 27 20:57:19 node2 rgmanager[23581]: Stopping Service apache:WEB > Succeed
>>>> Aug 27 20:57:19 node2 rgmanager[1771]: Service service:WEB is disabled
>>>> Aug 27 20:57:32 node2 rgmanager[1771]: Starting disabled service service:WEB
>>>> Aug 27 20:57:33 node2 rgmanager[23716]: Adding IPv4 address 192.168.18.50/24 to eth0
>>>> Aug 27 20:57:37 node2 rgmanager[23882]: Starting Service apache:WEB
>>>> Aug 27 20:57:37 node2 rgmanager[23937]: Query failed: Invalid argument (/cluster/rm/service[@name="WEB"]/ip[2]/@address)
>>>> Aug 27 20:57:39 node2 rgmanager[1771]: Service service:WEB started
>>>> Aug 27 20:58:35 node2 rgmanager[1771]: Stopping service service:WEB
>>>> Aug 27 20:58:36 node2 rgmanager[24554]: Stopping Service apache:WEB
>>>> Aug 27 20:58:37 node2 rgmanager[24579]: Stopping Service apache:WEB > Failed - Application Is Still Running
>>>> Aug 27 20:58:38 node2 rgmanager[24599]: Stopping Service apache:WEB > Failed
>>>> Aug 27 20:58:38 node2 rgmanager[1771]: stop on apache "WEB" returned 1 (generic error)
>>>> Aug 27 20:58:38 node2 rgmanager[24648]: Removing IPv4 address 192.168.18.50/24 from eth0
>>>> Aug 27 20:58:48 node2 rgmanager[1771]: #12: RG service:WEB failed to stop; intervention required
>>>> Aug 27 20:58:48 node2 rgmanager[1771]: Service service:WEB is failed
>>>> Aug 27 20:58:48 node2 rgmanager[1771]: #70: Failed to relocate service:WEB; restarting locally
>>>> Aug 27 20:58:48 node2 rgmanager[1771]: #43: Service service:WEB has failed; can not start.
>>>> Aug 27 20:58:48 node2 rgmanager[1771]: #2: Service service:WEB returned failure code. Last Owner: node2.localdomain
>>>> Aug 27 20:58:48 node2 rgmanager[1771]: #4: Administrator intervention required.
>>>> Aug 27 20:59:13 node2 rgmanager[1771]: Stopping service service:WEB
>>>> Aug 27 20:59:13 node2 rgmanager[24841]: Stopping Service apache:WEB
>>>> Aug 27 20:59:14 node2 rgmanager[24861]: Checking Existence Of File /var/run/cluster/apache/apache:WEB.pid [apache:WEB] > Failed - File Doesn't Exist
>>>> Aug 27 20:59:14 node2 rgmanager[24881]: Stopping Service apache:WEB > Succeed
>>>> Aug 27 20:59:14 node2 rgmanager[1771]: Service service:WEB is disabled
>>>> Aug 27 21:01:06 node2 rgmanager[1771]: #43: Service service:WEB has failed; can not start.
>>>>
>>>> What could be missing?
>>>>
>>>> On Mon, Aug 27, 2012 at 3:01 PM, PARAM KRISH <mkpa...@gmail.com> wrote:
>>>>
>>>>> Hi
>>>>>
>>>>> I think I am almost there. I have started using RHEL6, hoping it would
>>>>> not give me any nightmare this time, to set up a 2-node cluster for an
>>>>> Apache cluster service, and I think I have done pretty much everything.
>>>>>
>>>>> In short,
>>>>>
>>>>> 1. Two nodes with private IPs on eth0, configured as 192.168.18.10
>>>>>    and 192.168.18.11
>>>>> 2. Nodes are named node1.localdomain and node2.localdomain; /etc/hosts
>>>>>    is taken care of
>>>>> 3. I created the cluster, added the two nodes, and added the service WEB
>>>>>    (added the children :IP and :apache to it)
>>>>> 4. Cluster is in quorum and detects the other node going offline
>>>>>    fantastically
>>>>> 5. Tested the start/stop of this resource WEB using "rg_test"; it
>>>>>    worked just fine on both the nodes.
>>>>> 6. But, for some reason, it is not starting or failing over to the other
>>>>>    node when I test manually (using clusvcadm -e WEB) or do a reboot or
>>>>>    whatever.
>>>>>
>>>>> 7. Please let me know how I can verify the cluster startup and failover
>>>>>    manually to make sure everything works
>>>>> 8. What am I missing that makes this not work now? Please assist.
>>>>>
>>>>> Please go through the output of all the commands attached herewith.
>>>>>
>>>>> Let me know if anything else is required.
>>>>>
>>>>> Param
>>>
>>> --
>>> this is my life and I live it for as long as God wills
>
> --
> this is my life and I live it for as long as God wills

--
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster