On Fri, Feb 19, 2010 at 4:58 PM, Ruiyuan Jiang <[email protected]> wrote:
> Hi, Andreas
>
> I am not sure what you mean by monitoring it. Can you give me more detail?

In the apache OCF script, there is a function called monitor.
Check what it does and see whether that works for you.
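
One way to check this by hand is to call the resource agent outside the
cluster with the same configfile the WebSite primitive uses. This is only a
sketch: the agent path is taken from the log lines quoted below,
OCF_ROOT=/usr/lib/ocf is assumed from that path, and the helper names
ocf_rc_name and run_apache_ra are made up for illustration.

```shell
#!/bin/sh
# Sketch: run the apache resource agent by hand and decode its exit code.
# Assumptions: OCF_ROOT is /usr/lib/ocf (inferred from the agent path in the
# quoted log); ocf_rc_name and run_apache_ra are hypothetical helper names.

# Map a standard OCF exit code to its symbolic name.
ocf_rc_name() {
    case "$1" in
        0) echo OCF_SUCCESS ;;
        1) echo OCF_ERR_GENERIC ;;
        2) echo OCF_ERR_ARGS ;;
        3) echo OCF_ERR_UNIMPLEMENTED ;;
        4) echo OCF_ERR_PERM ;;
        5) echo OCF_ERR_INSTALLED ;;
        6) echo OCF_ERR_CONFIGURED ;;
        7) echo OCF_NOT_RUNNING ;;
        *) echo "UNKNOWN($1)" ;;
    esac
}

# Run one action (start, monitor, stop) of the agent with the cluster's
# config file. Returns 5 (OCF_ERR_INSTALLED) if the agent is not there.
run_apache_ra() {
    action="$1"
    agent="${OCF_ROOT:-/usr/lib/ocf}/resource.d/heartbeat/apache"
    if [ ! -x "$agent" ]; then
        echo "agent not found: $agent" >&2
        return 5
    fi
    OCF_ROOT="${OCF_ROOT:-/usr/lib/ocf}" \
    OCF_RESKEY_configfile="/etc/httpd/conf/httpd.conf" \
        "$agent" "$action"
}

# Example (run as root on a cluster node):
#   run_apache_ra start;   echo "start:   $(ocf_rc_name $?)"
#   run_apache_ra monitor; echo "monitor: $(ocf_rc_name $?)"
```

The rc=1 in the crm_mon output is OCF_ERR_GENERIC, which is why it is shown
as "unknown error". Running the start and monitor actions this way should
print whatever the agent complains about straight to the terminal.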

> I rebooted both nodes and found that Apache did not start automatically. 
> After the reboot I started Apache manually; it came up and I could access 
> the web site. Thanks.
>
> Ryan
>
> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of Andrew Beekhof
> Sent: Friday, February 19, 2010 8:18 AM
> To: General Linux-HA mailing list
> Subject: Re: [Linux-HA] Setup cluster
>
> On Thu, Feb 18, 2010 at 11:51 PM, Ruiyuan Jiang <[email protected]> wrote:
>> Thanks, Andreas. That is what I suspected too. Once stonith was disabled, 
>> the cluster started.
>> I have not tried to set quorum yet. I will try that next.
>> Now I have another problem: Apache does not start, but the virtual IP 
>> address is bound to the NIC.
>>
>> [r...@usnbrl51 ~]# crm configure show
>> node $id="01428973-0d27-48ee-9142-2da9cb5c1e4b" usnbrl50.liz.com
>> node $id="0988f7f3-c858-4c12-af3b-b6a8bf83a0ae" usnbrl51.liz.com
>> primitive ClusterIP ocf:heartbeat:IPaddr2 \
>>        params ip="156.146.22.48" cidr_netmask="32" \
>>        op monitor interval="30s"
>> primitive WebSite ocf:heartbeat:apache \
>>        params configfile="/etc/httpd/conf/httpd.conf" \
>>        op monitor interval="1min"
>> location prefer-usnbrl50 WebSite 50: usnbrl50
>> colocation website-with-ip inf: WebSite ClusterIP
>> order apache-after-ip inf: ClusterIP WebSite
>> property $id="cib-bootstrap-options" \
>>        dc-version="1.0.6-fdba003eafa6af1b8d81b017aa535a949606ca0d" \
>>        cluster-infrastructure="Heartbeat" \
>>        no-quorum-policy="ignore" \
>>        stonith-enabled="false"
>> rsc_defaults $id="rsc-options" \
>>        resource-stickiness="100"
>> [r...@usnbrl51 ~]#
>>
>> Last updated: Thu Feb 18 17:28:14 2010
>> Stack: Heartbeat
>> Current DC: usnbrl51.liz.com (0988f7f3-c858-4c12-af3b-b6a8bf83a0ae) - partition WITHOUT quorum
>> Version: 1.0.6-fdba003eafa6af1b8d81b017aa535a949606ca0d
>> 2 Nodes configured, unknown expected votes
>> 2 Resources configured.
>> Online: [ usnbrl51.liz.com usnbrl50.liz.com ]
>>
>> ClusterIP       (ocf::heartbeat:IPaddr2):       Started usnbrl50.liz.com
>>    WebSite_start_0 (node=usnbrl51.liz.com, call=6, rc=1, status=complete): unknown error
>>    WebSite_start_0 (node=usnbrl50.liz.com, call=6, rc=1, status=complete): unknown error
>>
>> Apache's error log shows "caught SIGTERM, shutting down".
>>
>> /var/log/messages does not say why Apache can't start either. I can start 
>> Apache manually with no problem.
>
> But can you monitor it?
> There's some option that needs to be passed to Apache to enable
> monitoring; I'm guessing it's not set.
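
The option is probably the status page. As far as I know, the agent's monitor
fetches a status URL (normally http://localhost/server-status unless the
primitive sets statusurl), and that only answers when mod_status is enabled in
httpd.conf. A quick check with Apache started by hand, assuming curl is
installed; check_status_url is a made-up helper name and the default URL is an
assumption to adjust for your port:

```shell
#!/bin/sh
# Sketch: check whether Apache answers on the status URL the agent would poll.
# Assumptions: curl is installed; the default URL below may differ on your
# system; check_status_url is a hypothetical helper name.

check_status_url() {
    url="${1:-http://localhost/server-status}"
    # -f: treat HTTP errors as failure, -s: quiet, --max-time: do not hang
    if curl -fs --max-time 5 "$url" >/dev/null 2>&1; then
        echo "status page reachable: $url"
    else
        echo "status page NOT reachable: $url"
        return 1
    fi
}

# Example:
#   check_status_url
#   check_status_url "http://localhost:80/server-status"
```

If the page is not reachable while Apache itself is serving the site, the
server-status handler in httpd.conf is the first thing I would look at.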
>
>>
>>
>> Feb 18 16:42:51 usnbrl50 crmd: [3610]: info: process_lrm_event: LRM operation ClusterIP_start_0 (call=4, rc=0, cib-update=15, confirmed=true) ok
>> Feb 18 16:42:53 usnbrl50 crmd: [3610]: info: do_lrm_rsc_op: Performing key=9:4:0:ea164eb4-9fea-4d79-83f6-0ad29f8521a5 op=ClusterIP_monitor_30000 )
>> Feb 18 16:42:53 usnbrl50 lrmd: [3607]: info: rsc:ClusterIP:5: monitor
>> Feb 18 16:42:53 usnbrl50 crmd: [3610]: info: do_lrm_rsc_op: Performing key=10:4:0:ea164eb4-9fea-4d79-83f6-0ad29f8521a5 op=WebSite_start_0 )
>> Feb 18 16:42:53 usnbrl50 lrmd: [3607]: info: rsc:WebSite:6: start
>> Feb 18 16:42:53 usnbrl50 crmd: [3610]: info: process_lrm_event: LRM operation ClusterIP_monitor_30000 (call=5, rc=0, cib-update=16, confirmed=false) ok
>> Feb 18 16:42:53 usnbrl50 apache[3754]: [3818]: INFO: apache not running
>> Feb 18 16:42:53 usnbrl50 apache[3754]: [3820]: INFO: waiting for apache /etc/httpd/conf/httpd.conf to come up
>> Feb 18 16:42:54 usnbrl50 crmd: [3610]: info: process_lrm_event: LRM operation WebSite_start_0 (call=6, rc=1, cib-update=17, confirmed=true) unknown error
>> Feb 18 16:42:55 usnbrl50 attrd: [3609]: info: attrd_ha_callback: Update relayed from usnbrl51.liz.com
>> Feb 18 16:42:55 usnbrl50 attrd: [3609]: info: attrd_trigger_update: Sending flush op to all hosts for: fail-count-WebSite (INFINITY)
>> Feb 18 16:42:55 usnbrl50 crmd: [3610]: info: do_lrm_rsc_op: Performing key=2:5:0:ea164eb4-9fea-4d79-83f6-0ad29f8521a5 op=WebSite_stop_0 )
>> Feb 18 16:42:55 usnbrl50 lrmd: [3607]: info: rsc:WebSite:7: stop
>> Feb 18 16:42:55 usnbrl50 attrd: [3609]: info: attrd_perform_update: Sent update 16: fail-count-WebSite=INFINITY
>> Feb 18 16:42:55 usnbrl50 attrd: [3609]: info: attrd_ha_callback: Update relayed from usnbrl51.liz.com
>> Feb 18 16:42:55 usnbrl50 attrd: [3609]: info: attrd_trigger_update: Sending flush op to all hosts for: last-failure-WebSite (1266529375)
>> Feb 18 16:42:55 usnbrl50 attrd: [3609]: info: attrd_perform_update: Sent update 19: last-failure-WebSite=1266529375
>> Feb 18 16:42:55 usnbrl50 lrmd: [3607]: info: RA output: (ClusterIP:start:stderr)
>> ARPING 192.168.9.101 from 192.168.9.101 eth0 Sent 5 probes (5 broadcast(s))
>> Received 0 response(s)
>> Feb 18 16:42:56 usnbrl50 lrmd: [3607]: info: RA output: (WebSite:stop:stderr) /usr/lib/ocf/resource.d//heartbeat/apache: line 437: kill: (3816) - No such process
>> Feb 18 16:42:56 usnbrl50 apache[3842]: [3876]: INFO: Killing apache PID 3816
>> Feb 18 16:42:56 usnbrl50 apache[3842]: [3878]: INFO: apache stopped.
>> Feb 18 16:42:56 usnbrl50 crmd: [3610]: info: process_lrm_event: LRM operation WebSite_stop_0 (call=7, rc=0, cib-update=18, confirmed=true) ok
>> Feb 18 16:44:08 usnbrl50 pengine: [4004]: info: crm_log_init: Changed active directory to /usr/local/var/lib/heartbeat/cores/root
>> Feb 18 16:44:08 usnbrl50 pengine: [4004]: info: Invoked: /usr/local/lib64/heartbeat/pengine metadata
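
The lines that matter in a log like this are the process_lrm_event entries
with a non-zero rc. A rough filter to pull just those out of
/var/log/messages, assuming each syslog entry sits on one line as it does in
the real file; failed_ops is a made-up helper name:

```shell
#!/bin/sh
# Sketch: list LRM operations that finished with a non-zero return code.
# Assumption: one syslog entry per line; failed_ops is a hypothetical helper.

failed_ops() {
    awk '/process_lrm_event: LRM operation/ {
        op = ""; rc = ""
        for (i = 1; i <= NF; i++) {
            # the operation name is the word right after "operation"
            if ($i == "operation") op = $(i + 1)
            # pick up the "rc=N," field and strip it down to N
            if ($i ~ /^rc=[0-9]+,?$/) {
                rc = $i
                sub(/^rc=/, "", rc)
                sub(/,$/, "", rc)
            }
        }
        if (rc + 0 != 0) print op, "rc=" rc
    }'
}

# Example:
#   failed_ops < /var/log/messages
```

Here it would report only WebSite_start_0 with rc=1. Since that failure set
fail-count-WebSite to INFINITY, "crm resource cleanup WebSite" should be
needed after fixing the cause, before the cluster will try to start it again.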
>>
>>
>> Thanks.
>> Ryan
>>
>
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>
