[Linux-HA] Linux-HA Digest, Vol 80, Issue 20

me,apporc Fri, 09 Jul 2010 20:44:34 -0700

help

On Sat, Jul 10, 2010 at 12:18 AM, <[email protected]> wrote:


> Send Linux-HA mailing list submissions to
>        [email protected]
>
> To subscribe or unsubscribe via the World Wide Web, visit
>        http://lists.linux-ha.org/mailman/listinfo/linux-ha
> or, via email, send a message with subject or body 'help' to
>        [email protected]
>
> You can reach the person managing the list at
>        [email protected]
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Linux-HA digest..."
>
>
> Today's Topics:
>
>   1. Re: Two curious problems with heartbeat and ldirector
>      (Dejan Muhamedagic)
>   2. Re: Two curious problems with heartbeat and ldirector
>      (Schaefer, Dirk Alexander)
>   3. Re: Tomcat Resource Agent always leaves dead process on stop
>      or restart (Dejan Muhamedagic)
>   4. Re: Two curious problems with heartbeat and ldirector
>      (Dejan Muhamedagic)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Fri, 9 Jul 2010 17:06:54 +0200
> From: Dejan Muhamedagic <[email protected]>
> Subject: Re: [Linux-HA] Two curious problems with heartbeat and
>        ldirector
> To: General Linux-HA mailing list <[email protected]>
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=iso-8859-1
>
> Hi,
>
> On Fri, Jul 09, 2010 at 04:25:29PM +0200, Schaefer, Dirk Alexander wrote:
> > Hi,
> >
> > Well, that's good to know ;) It's the latest version
> > offered by Gentoo's package manager. I have already played with
> > version 3.0.3. What I haven't understood so far is how to
> > configure the part ldirectord plays in my current setup with
> > heartbeat 3.0.3. I cannot find it anymore once version 3.0.3
> > is installed. Is there any substitute for it available?
>
> It is probably a separate package (ldirectord). Depends on the
> distribution.
>
> Thanks,
>
> Dejan
>
> >
> > Thanks and...
> >
> > Mit freundlichen Gruessen / With kind regards
> >
> > Dirk Alexander Schaefer
> >
> > -----Original Message-----
> > From: [email protected] [mailto:[email protected]] On behalf of Michael Schwartzkopff
> > Sent: Friday, 9 July 2010 12:15
> > To: General Linux-HA mailing list
> > Subject: Re: [Linux-HA] Two curious problems with heartbeat and ldirector
> >
> > On Friday, 09.07.2010, at 11:40 +0200, Schaefer, Dirk Alexander wrote:
> > > Hello,
> > >
> > > I'm trying to set up a two-node HA load balancer for two web servers, and
> > > I've got two curious problems with heartbeat/ldirectord.
> > >
> > > The first is that I cannot activate the broadcast mechanism. If I uncomment
> > > 'bcast ethXY' in the ha.cf file, the following errors are reported in the
> > > logs during startup:
> > >
> > > heartbeat[31484]: 2010/07/09_11:10:04 ERROR: glib: Error setting socket option SO_BINDTODEVICE: Protocol not available
> > > heartbeat[31484]: 2010/07/09_11:10:04 ERROR: cannot open bcast eth0
> > >
> > > and the service is stopped.
> > >
> > > The second problem is that, although everything seems to work fine, I cannot
> > > access the web servers through the VIP. ipvsadm shows that the VIP and the
> > > real servers are registered correctly. It also shows that requests to the
> > > VIP are being received. Moreover, failover to the second
> > > heartbeat/ldirectord machine works once I shut down the current master. But
> > > whatever I do, the requests to the VIP do not seem to get routed/NATed
> > > (I've tried gate and masq definitions in the ldirectord conf) to the real
> > > servers.
> > >
> > >
> > >
> > > My setup:
> > >
> > > OS: gentoo 10.1 (amd64) (XEN DomU)
> > > Heartbeat: 2.0.8
> > >
> > > ha.cf:
> > >
> > > debugfile /var/log/ha-debug
> > > logfile       /var/log/ha-log
> > > logfacility   local0
> > > mcast eth0 225.0.0.1 694 1 0
> > > auto_failback off
> > > node   lb1 lb2
> > > respawn root /usr/lib64/heartbeat/ipfail
> > > apiauth ipfail gid=cluster uid=cluster
> > >
> > >
> > >
> > >
> > >
> > > haresources:
> > >
> > > lb1 \
> > >         ldirectord::/etc/ha.d/ldirectord.www.cf \
> > >         LVSSyncDaemonSwap::master \
> > >         IPaddr2::172.22.40.50/20/eth1/172.22.47.255
> > >
> > > ldirectord.www.cf:
> > >
> > > checktimeout=10
> > > checkinterval=2
> > > autoreload=yes
> > > logfile="local0"
> > > logfile="/var/log/ldirectord.log"
> > > quiescent=yes
> > >
> > > virtual=172.22.40.50:80
> > >         real=172.22.40.110:80 masq
> > >         real=172.22.40.111:80 masq
> > >         service=http
> > >         request="lbProbe.html"
> > >         receive="OK"
> > >         scheduler=rr
> > >
> > >
> > >
> > >
> > >
> > > kernel config:
> > >
> > > CONFIG_IP_VS=m
> > > # CONFIG_IP_VS_IPV6 is not set
> > > CONFIG_IP_VS_DEBUG=y
> > > CONFIG_IP_VS_TAB_BITS=12
> > > # IPVS transport protocol load balancing support
> > > CONFIG_IP_VS_PROTO_TCP=y
> > > CONFIG_IP_VS_PROTO_UDP=y
> > > CONFIG_IP_VS_PROTO_AH_ESP=y
> > > CONFIG_IP_VS_PROTO_ESP=y
> > > CONFIG_IP_VS_PROTO_AH=y
> > > CONFIG_IP_VS_PROTO_SCTP=y
> > > # IPVS scheduler
> > > CONFIG_IP_VS_RR=m
> > > CONFIG_IP_VS_WRR=m
> > > CONFIG_IP_VS_LC=m
> > > CONFIG_IP_VS_WLC=m
> > > CONFIG_IP_VS_LBLC=m
> > > CONFIG_IP_VS_LBLCR=m
> > > CONFIG_IP_VS_DH=m
> > > CONFIG_IP_VS_SH=m
> > > CONFIG_IP_VS_SED=m
> > > CONFIG_IP_VS_NQ=m
> > > # IPVS application helper
> > > CONFIG_IP_VS_FTP=m
> > > # CONFIG_SCSI_MVSAS is not set
> > >
> > >
> > >
> > > I have no idea what else to check. Can anyone imagine what the problem
> > > could be?
> > >
> > > Thanks a lot and...
> > >
> > > Mit freundlichen Gruessen / With kind regards
> > >
> > > Dirk Alexander Schaefer
> > >
> > >
> > > _______________________________________________
> > > Linux-HA mailing list
> > > [email protected]
> > > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> > > See also: http://linux-ha.org/ReportingProblems
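
As a side check on the addresses in the configuration quoted above, here is a minimal sketch (not something proposed in the thread) that uses Python's standard ipaddress module to confirm that the broadcast address passed to IPaddr2 matches the /20 prefix, and to see whether the real servers sit in the same subnet as the VIP. The variable names are made up for illustration. With masq forwarding, replies from the real servers generally have to travel back through the director, so the subnet layout may be worth knowing; whether it has anything to do with the reported problem is an assumption that the thread does not confirm.

```python
import ipaddress

# Values copied from the haresources and ldirectord configuration quoted above.
vip = ipaddress.ip_interface("172.22.40.50/20")
configured_broadcast = ipaddress.ip_address("172.22.47.255")
real_servers = [
    ipaddress.ip_address("172.22.40.110"),
    ipaddress.ip_address("172.22.40.111"),
]

# Network implied by the VIP and its prefix length.
network = vip.network

# Does the broadcast address in haresources agree with that network?
broadcast_matches = network.broadcast_address == configured_broadcast

# Are the real servers inside the same network as the VIP?
same_subnet = {str(rs): rs in network for rs in real_servers}

print("network:", network)
print("broadcast matches:", broadcast_matches)
print("real servers in VIP subnet:", same_subnet)
```

If the real servers and the clients share a subnet with the VIP, it may be useful to look at how reply packets leave the real servers, for example with tcpdump on the director.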
> >
> >
> > First of all, do NOT use 2.0.8 any more. It has been out of date for
> > about 5 years; it is buggy and, well, bad.
> >
> >
> > --
> > Dr. Michael Schwartzkopff
> > Guardinistr. 63
> > 81375 München
> >
> > mob: 0163 172 50 98
> >
>
>
> ------------------------------
>
> Message: 2
> Date: Fri, 9 Jul 2010 17:26:53 +0200
> From: "Schaefer, Dirk Alexander" <[email protected]>
> Subject: Re: [Linux-HA] Two curious problems with heartbeat and
>        ldirector
> To: "'General Linux-HA mailing list'" <[email protected]>
> Message-ID: <[email protected]@bluewin.ch>
> Content-Type: text/plain;       charset="iso-8859-1"
>
> Hi,
>
> Unfortunately it isn't. Is there similar functionality in the latest
> heartbeat version?
>
> Mit freundlichen Gruessen / With kind regards
>
> Dirk Alexander Schaefer
>
>
>
>
> ------------------------------
>
> Message: 3
> Date: Fri, 9 Jul 2010 18:17:05 +0200
> From: Dejan Muhamedagic <[email protected]>
> Subject: Re: [Linux-HA] Tomcat Resource Agent always leaves dead
>        process on stop or restart
> To: General Linux-HA mailing list <[email protected]>
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=us-ascii
>
> Hi,
>
> On Fri, Jul 09, 2010 at 04:01:14PM +0100, Brett Delle Grazie wrote:
> > Hi,
> >
> > Yes I meant parent of the process is init.
> >
> > Timeout for start operation is: 120 seconds - process is still
> > around (so is tomcat) after this.
>
> According to these log messages:
>
> > Jul 09 15:39:47 fmp-dun-tapp1 lrmd: [4264]: info: rsc:tomcat_tc1:0:184: start
> > Jul 09 15:39:51 fmp-dun-tapp1 crmd: [4267]: info: process_lrm_event: LRM operation tomcat_tc1:0_start_0 (call=184, rc=0, cib-update=208, confirmed=true) ok
> > Jul 09 15:47:43 fmp-dun-tapp1 lrmd: [4264]: info: rsc:tomcat_tc1:0:186: stop
> > Jul 09 15:47:49 fmp-dun-tapp1 crmd: [4267]: info: process_lrm_event: LRM operation tomcat_tc1:0_stop_0 (call=186, rc=0, cib-update=210, confirmed=true) ok
> > Jul 09 15:47:50 fmp-dun-tapp1 crmd: [4267]: info: do_lrm_rsc_op: Performing key=23:227:0:2c2c0209-48a9-40d0-b4f9-53ea1adcd584 op=tomcat_tc1:0_start_0 )
> > Jul 09 15:47:50 fmp-dun-tapp1 lrmd: [4264]: info: rsc:tomcat_tc1:0:187: start
> > Jul 09 15:47:54 fmp-dun-tapp1 crmd: [4267]: info: process_lrm_event: LRM operation tomcat_tc1:0_start_0 (call=187, rc=0, cib-update=211, confirmed=true) ok
>
> everything seems to run fine. The start operation takes around 4
> seconds and seems to exit just fine. I really don't know what's
> happening. Can you run this again with debug turned on and then
> produce an hb_report? If it's too big to send to the list, you can
> open a bugzilla.
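
The "around 4 seconds" reading can be reproduced from the log excerpt itself. The following is a minimal sketch, assuming the syslog-style format shown above and the year 2010 (the log lines carry no year); op_durations and the two regular expressions are hypothetical helpers written for this illustration, not part of heartbeat or pacemaker. It pairs each lrmd "rsc:...:<call>: <op>" line with the crmd "(call=<call>, rc=...)" result line and prints the elapsed time.

```python
import re
from datetime import datetime

# Log lines copied from the excerpt above.
LOG = """\
Jul 09 15:39:47 fmp-dun-tapp1 lrmd: [4264]: info: rsc:tomcat_tc1:0:184: start
Jul 09 15:39:51 fmp-dun-tapp1 crmd: [4267]: info: process_lrm_event: LRM operation tomcat_tc1:0_start_0 (call=184, rc=0, cib-update=208, confirmed=true) ok
Jul 09 15:47:43 fmp-dun-tapp1 lrmd: [4264]: info: rsc:tomcat_tc1:0:186: stop
Jul 09 15:47:49 fmp-dun-tapp1 crmd: [4267]: info: process_lrm_event: LRM operation tomcat_tc1:0_stop_0 (call=186, rc=0, cib-update=210, confirmed=true) ok
Jul 09 15:47:50 fmp-dun-tapp1 lrmd: [4264]: info: rsc:tomcat_tc1:0:187: start
Jul 09 15:47:54 fmp-dun-tapp1 crmd: [4267]: info: process_lrm_event: LRM operation tomcat_tc1:0_start_0 (call=187, rc=0, cib-update=211, confirmed=true) ok
"""

# lrmd announces an operation as "rsc:<resource>:<call id>: <operation>".
START = re.compile(r"lrmd: \[\d+\]: info: rsc:.*:(\d+): (\w+)$")
# crmd reports the result with "(call=<call id>, rc=<return code>, ...".
DONE = re.compile(r"crmd: .*\(call=(\d+), rc=(\d+)")


def op_durations(log_text, year=2010):
    """Return {(call_id, operation): seconds} for matched start/result pairs."""
    started = {}
    durations = {}
    for line in log_text.splitlines():
        if not line.strip():
            continue
        # The first 15 characters hold the timestamp, e.g. "Jul 09 15:39:47".
        stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        match = START.search(line)
        if match:
            started[match.group(1)] = (match.group(2), stamp)
            continue
        match = DONE.search(line)
        if match and match.group(1) in started:
            operation, begin = started[match.group(1)]
            durations[(match.group(1), operation)] = (stamp - begin).total_seconds()
    return durations


durations = op_durations(LOG)
for (call_id, operation), seconds in durations.items():
    print(f"call {call_id} ({operation}): {seconds:.0f}s")
```

On these lines the two start operations take 4 seconds each and the stop takes 6 seconds, all well inside the 120-second start timeout mentioned in the thread.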
>
> > heartbeat-libs - taken from linbit repo packages where we have
> > a support contract as we use DRBD in other stuff -
>
> OK. Perhaps you can upgrade from linbit? Though I can't recall
> ever seeing anything like this.
>
> Thanks,
>
> Dejan
>
> > I didn't realise they were renamed to cluster-libs on
> > clusterlabs.org.
> > Hmm. I can update but can the packages on clusterlabs still use heartbeat
> > or will I need to switch to corosync?
> >
> > The log from pacemaker / heartbeat is:
> > /var/log/ha-debug
> > Jul 09 15:39:47 fmp-dun-tapp1 lrmd: [4264]: info: rsc:tomcat_tc1:0:184: start
> > Jul 09 15:39:47 fmp-dun-tapp1 attrd: [4266]: info: attrd_ha_callback: flush message from fmp-dun-tapp2
> > Jul 09 15:39:47 fmp-dun-tapp1 attrd: [4266]: info: attrd_ha_callback: flush message from fmp-dun-tapp2
> > Jul 09 15:39:47 fmp-dun-tapp1 attrd: [4266]: info: attrd_ha_callback: flush message from fmp-dun-tapp2
> > Jul 09 15:39:47 fmp-dun-tapp1 attrd: [4266]: info: attrd_ha_callback: flush message from fmp-dun-tapp2
> > Jul 09 15:39:47 fmp-dun-tapp1 attrd: [4266]: info: attrd_ha_callback: flush message from fmp-dun-tapp2
> > Jul 09 15:39:47 fmp-dun-tapp1 attrd: [4266]: info: attrd_ha_callback: flush message from fmp-dun-tapp2
> > Jul 09 15:39:47 fmp-dun-tapp1 attrd: [4266]: info: attrd_ha_callback: flush message from fmp-dun-tapp2
> > Jul 09 15:39:47 fmp-dun-tapp1 attrd: [4266]: info: attrd_ha_callback: flush message from fmp-dun-tapp2
> > Jul 09 15:39:48 fmp-dun-tapp1 attrd: [4266]: info: attrd_ha_callback: flush message from fmp-dun-tapp2
> > Jul 09 15:39:48 fmp-dun-tapp1 attrd: [4266]: info: attrd_ha_callback: flush message from fmp-dun-tapp2
> > Jul 09 15:39:48 fmp-dun-tapp1 attrd: [4266]: info: attrd_ha_callback: flush message from fmp-dun-tapp2
> > Jul 09 15:39:51 fmp-dun-tapp1 crmd: [4267]: info: process_lrm_event: LRM operation tomcat_tc1:0_start_0 (call=184, rc=0, cib-update=208, confirmed=true) ok
> > Jul 09 15:39:51 fmp-dun-tapp1 crmd: [4267]: info: do_lrm_rsc_op: Performing key=24:225:0:2c2c0209-48a9-40d0-b4f9-53ea1adcd584 op=tomcat_tc1:0_monitor_10000 )
> > Jul 09 15:39:51 fmp-dun-tapp1 lrmd: [4264]: info: rsc:tomcat_tc1:0:185: monitor
> > Jul 09 15:39:51 fmp-dun-tapp1 crmd: [4267]: info: process_lrm_event: LRM operation tomcat_tc1:0_monitor_10000 (call=185, rc=0, cib-update=209, confirmed=false) ok
> > Jul 09 15:40:16 fmp-dun-tapp1 lrmd: [4264]: info: rsc:cmon_html0:0:174: monitor
> > Jul 09 15:42:12 fmp-dun-tapp1 cib: [4263]: info: cib_stats: Processed 198 operations (909.00us average, 0% utilization) in the last 10min
> > Jul 09 15:47:43 fmp-dun-tapp1 lrmd: [4264]: info: cancel_op: operation monitor[185] on ocf::tomcat::tomcat_tc1:0 for client 4267, its parameters: CRM_meta_interval=[10000] catalina_home=[/opt/tomcat] catalina_base=[/home/tomcat/tc-1] tomcat_user=[tomcat] catalina_pid=[/home/tomcat/tc-1/temp/tomcat.pid] catalina_rotate_log=[YES] CRM_meta_timeout=[30000] CRM_meta_clone_max=[2] crm_feature_set=[3.0.1] java_home=[/usr/lib/jvm/java] CRM_meta_globally_unique=[false] CRM_meta_name=[monitor] script_log=[/home/tomcat/tc-1/logs/tc-1.log] statusurl=[http://127.0.0.1:10305/exam cancelled
> > Jul 09 15:47:43 fmp-dun-tapp1 crmd: [4267]: info: do_lrm_rsc_op: Performing key=24:226:0:2c2c0209-48a9-40d0-b4f9-53ea1adcd584 op=tomcat_tc1:0_stop_0 )
> > Jul 09 15:47:43 fmp-dun-tapp1 lrmd: [4264]: info: rsc:tomcat_tc1:0:186: stop
> > Jul 09 15:47:43 fmp-dun-tapp1 crmd: [4267]: info: process_lrm_event: LRM operation tomcat_tc1:0_monitor_10000 (call=185, status=1, cib-update=0, confirmed=true) Cancelled
> > Jul 09 15:47:43 fmp-dun-tapp1 cib: [31183]: info: write_cib_contents: Archived previous version as /var/lib/heartbeat/crm/cib-11.raw
> > Jul 09 15:47:43 fmp-dun-tapp1 cib: [31183]: info: write_cib_contents: Wrote version 0.259.0 of the CIB to disk (digest: e028d9e440a93208328ecb4eada8fdf6)
> > Jul 09 15:47:43 fmp-dun-tapp1 cib: [31183]: info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.8JcqTH (digest: /var/lib/heartbeat/crm/cib.V73jBK)
> > Jul 09 15:47:49 fmp-dun-tapp1 crmd: [4267]: info: process_lrm_event: LRM operation tomcat_tc1:0_stop_0 (call=186, rc=0, cib-update=210, confirmed=true) ok
> > Jul 09 15:47:50 fmp-dun-tapp1 crmd: [4267]: info: do_lrm_rsc_op: Performing key=23:227:0:2c2c0209-48a9-40d0-b4f9-53ea1adcd584 op=tomcat_tc1:0_start_0 )
> > Jul 09 15:47:50 fmp-dun-tapp1 lrmd: [4264]: info: rsc:tomcat_tc1:0:187: start
> > Jul 09 15:47:54 fmp-dun-tapp1 crmd: [4267]: info: process_lrm_event: LRM operation tomcat_tc1:0_start_0 (call=187, rc=0, cib-update=211, confirmed=true) ok
> > Jul 09 15:47:54 fmp-dun-tapp1 crmd: [4267]: info: do_lrm_rsc_op: Performing key=24:227:0:2c2c0209-48a9-40d0-b4f9-53ea1adcd584 op=tomcat_tc1:0_monitor_10000 )
> > Jul 09 15:47:54 fmp-dun-tapp1 lrmd: [4264]: info: rsc:tomcat_tc1:0:188: monitor
> > Jul 09 15:47:54 fmp-dun-tapp1 crmd: [4267]: info: process_lrm_event: LRM operation tomcat_tc1:0_monitor_10000 (call=188, rc=0, cib-update=212, confirmed=false) ok
> >
> > Process list is:
> > root     23162     1  0 15:39 ?        00:00:00   /bin/sh /usr/lib/ocf/resource.d//intact/tomcat start
> > root     31372     1  0 15:47 ?        00:00:00   /bin/sh /usr/lib/ocf/resource.d//intact/tomcat start
> > tomcat   31408     1  0 15:47 ?        00:00:03 /usr/lib/jvm/java/bin/java -Djava.util.logging.config.file=/home/tomcat/tc-1/conf/logging.properties
> > ... snip ..
> > org.apache.catalina.startup.Bootstrap start
> >
> > There was a 'crm resource restart cl_tomcat_tc1' issued at 15:47.
> >
> > In the above ha-debug log you can clearly see the lrmd starting
> > tomcat at both points in time (15:39 and 15:47) and receiving a
> > successful start ok response.
> >
> >
> >
> > -----Original Message-----
> > From: Dejan Muhamedagic [mailto:[email protected]]
> > Sent: Fri 09/07/2010 14:46
> > To: General Linux-HA mailing list
> > Subject: Re: [Linux-HA] Tomcat Resource Agent always leaves dead process on stop or restart
> >
> > Hi,
> >
> > On Fri, Jul 09, 2010 at 01:04:09PM +0100, Brett Delle Grazie wrote:
> > > Hi,
> > >
> > > Yes checked both logs:
> > >
> > > Catalina.out specifies normal (successful) Tomcat startup.
> > >
> > > tc-1.log (log from backgrounded start/stop operations):
> > >
> > > Doesn't give anything unusual:
> > > 2010/07/09 09:42:13: start ===========================
> > > 2010/07/09 10:20:46: stop  ###########################
> > > 2010/07/09 10:27:35: start ===========================
> > > 2010/07/09 12:50:20: stop  ###########################
> > > 2010/07/09 12:50:26: start ===========================
> > >
> > > Yes, I realise these are from later runs but the same thing is still
> > > occurring.
> > >
> > > Is it possible that the start operation doesn't close one of
> > > the file descriptors and is left 'hanging', even though
> > > it exits (at least from the perspective of pacemaker)?
> > >
> > > Would this explain why the 'tomcat start' process is owned
> > > by 'init' instead of by pacemaker?
> >
> > No. lrmd kills the process if it doesn't exit within the timeout.
> > By "ownership" I guess you mean the parent process. The RA
> > process (/usr/lib/ocf/.../tomcat start) is a child of the lrmd.
> > init can become its parent only if lrmd exits.
> >
> > What is the timeout for that start operation set to? Does the
> > process remain even after that timeout? What happens to lrmd?
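
The behaviour described here (a parent that kills its child when the child does not exit within a timeout) can be illustrated generically. The following is only a sketch of that general pattern using Python's standard subprocess module; it is not how lrmd is implemented, and run_with_timeout is a hypothetical helper name chosen for this example.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd as a child process; report whether it exited or was killed."""
    try:
        result = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child and waits for it before raising,
        # so no process is left behind by this call.
        return "timed out"
    return f"exit {result.returncode}"


# A child that exits immediately, and one that would sleep past the timeout.
quick = run_with_timeout([sys.executable, "-c", "pass"], 10)
slow = run_with_timeout([sys.executable, "-c", "import time; time.sleep(30)"], 1)
print(quick, "/", slow)
```

A process that survives such a timeout with init as its parent, as in the process list above, would not fit this pattern, which is why the fate of the parent (here lrmd) is the next thing to ask about.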
> >
> > > > > > heartbeat-libs-3.0.3-1
> >
> > Where does that come from? Normally, you should have
> > cluster-libs. Perhaps you need to update.
> >
> > Thanks,
> >
> > Dejan
>
>
> ------------------------------
>
> Message: 4
> Date: Fri, 9 Jul 2010 18:18:52 +0200
> From: Dejan Muhamedagic <[email protected]>
> Subject: Re: [Linux-HA] Two curious problems with heartbeat and
>        ldirector
> To: General Linux-HA mailing list <[email protected]>
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=iso-8859-1
>
> Hi,
>
> On Fri, Jul 09, 2010 at 05:26:53PM +0200, Schaefer, Dirk Alexander wrote:
> > Hi,
> >
> > Unfortunately it isn't. Is there similar functionality in the latest
> > heartbeat version?
>
> There is, but I can't say how it's packaged in your distribution.
> It could also be in a package called resource-agents or
> similar. Best to ask the distribution maintainer.
>
> Thanks,
>
> Dejan
>
>
> ------------------------------
>
>
> End of Linux-HA Digest, Vol 80, Issue 20
> ****************************************
>



-- 
E-mail: [email protected]
MSN: [email protected]
OICQ:453446376
Name:Voyager