help On Sat, Jul 10, 2010 at 12:18 AM, <[email protected]> wrote:
> Send Linux-HA mailing list submissions to > [email protected] > > To subscribe or unsubscribe via the World Wide Web, visit > http://lists.linux-ha.org/mailman/listinfo/linux-ha > or, via email, send a message with subject or body 'help' to > [email protected] > > You can reach the person managing the list at > [email protected] > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of Linux-HA digest..." > > > Today's Topics: > > 1. Re: Two curious problems with heartbeat and ldirector > (Dejan Muhamedagic) > 2. Re: Two curious problems with heartbeat and ldirector > (Schaefer, Dirk Alexander) > 3. Re: Tomcat Resource Agent always leaves dead process on stop > or restart (Dejan Muhamedagic) > 4. Re: Two curious problems with heartbeat and ldirector > (Dejan Muhamedagic) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Fri, 9 Jul 2010 17:06:54 +0200 > From: Dejan Muhamedagic <[email protected]> > Subject: Re: [Linux-HA] Two curious problems with heartbeat and > ldirector > To: General Linux-HA mailing list <[email protected]> > Message-ID: <[email protected]> > Content-Type: text/plain; charset=iso-8859-1 > > Hi, > > On Fri, Jul 09, 2010 at 04:25:29PM +0200, Schaefer, Dirk Alexander wrote: > > Hi, > > > > Well, that's good to know ;) It's the latest version > > offered by Gentoo's package manager. I already played with > > version 3.0.3. Something I haven't understood so far is how to > > configure the part ldirectord plays in my current setup with > > heartbeat 3.0.3. I cannot find it anymore once version 3.0.3 is > > installed. Is there any substitute for it available? > > It is probably a separate package (ldirectord). Depends on the > distribution. > > Thanks, > > Dejan > > > > > Thanks and... 
> > > > Mit freundlichen Gruessen / With kind regards > > > > Dirk Alexander Schaefer > > > > -----Original Message----- > > From: [email protected] [mailto:[email protected]] On Behalf Of Michael Schwartzkopff > > Sent: Friday, 9 July 2010 12:15 > > To: General Linux-HA mailing list > > Subject: Re: [Linux-HA] Two curious problems with heartbeat and ldirector > > > > On Friday, 09.07.2010 at 11:40 +0200, Schaefer, Dirk Alexander wrote: > > > Hello, > > > > > > I'm trying to set up a two-node HA load balancer for two web servers, and I've got two curious problems with heartbeat/ldirectord. > > > > > > The first is that I cannot activate the broadcast mechanism. If I uncomment 'bcast ethXY' in the ha.cf file, the following error is reported in the logs during startup: > > > > > > heartbeat[31484]: 2010/07/09_11:10:04 ERROR: glib: Error setting socket option SO_BINDTODEVICE: Protocol not available > > > > > > heartbeat[31484]: 2010/07/09_11:10:04 ERROR: cannot open bcast eth0 > > > > > > and the service is stopped. > > > > > > The second problem is that, although everything seems to work fine, I cannot access the web servers through the VIP. ipvsadm shows that the VIP and the real servers are registered correctly. It also shows that requests to the VIP are being received. Moreover, failover to the second heartbeat/ldirectord machine works once I shut down the current master. But whatever I do, requests to the VIP do not seem to get routed/NATed to the real servers (I've tried gate and masq definitions in the ldirectord conf). 
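For the first problem above, the log shows heartbeat failing while setting the SO_BINDTODEVICE socket option on eth0. A minimal sketch in Python that attempts the same socket option outside heartbeat may help narrow down whether the kernel rejects it in general. The function name `probe_bind_to_device` is made up for this example, it assumes Linux and a Python build that exposes `socket.SO_BINDTODEVICE`, and it says nothing about why the option fails on this particular system.

```python
import errno
import socket

# SO_BINDTODEVICE is Linux-specific; it is None here if this Python build lacks it.
SO_BINDTODEVICE = getattr(socket, "SO_BINDTODEVICE", None)


def probe_bind_to_device(ifname):
    """Try to bind a UDP socket to an interface; return (ok, detail).

    Hypothetical helper for illustration only, not part of heartbeat.
    """
    if SO_BINDTODEVICE is None:
        return False, "SO_BINDTODEVICE is not defined on this platform"
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            # Same socket option that the heartbeat log reports as failing.
            sock.setsockopt(socket.SOL_SOCKET, SO_BINDTODEVICE, ifname.encode())
        return True, "SO_BINDTODEVICE accepted"
    except OSError as exc:
        # ENOPROTOOPT is the errno behind "Protocol not available".
        name = errno.errorcode.get(exc.errno, "unknown")
        return False, f"{name}: {exc.strerror}"


if __name__ == "__main__":
    print(probe_bind_to_device("eth0"))
```

Run as root on the affected node, a result starting with ENOPROTOOPT would match the "Protocol not available" message in the log; a permission error only means the probe lacked privileges.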
> > > > > > My setup: > > > > > > OS: Gentoo 10.1 (amd64) (Xen DomU) > > > Heartbeat: 2.0.8 > > > > > > ha.cf: > > > debugfile /var/log/ha-debug > > > logfile /var/log/ha-log > > > logfacility local0 > > > mcast eth0 225.0.0.1 694 1 0 > > > auto_failback off > > > node lb1 lb2 > > > respawn root /usr/lib64/heartbeat/ipfail > > > apiauth ipfail gid=cluster uid=cluster > > > > > > haresources: > > > lb1 \ > > > ldirectord::/etc/ha.d/ldirectord.www.cf \ > > > LVSSyncDaemonSwap::master \ > > > IPaddr2::172.22.40.50/20/eth1/172.22.47.255 > > > > > > ldirectord.www.cf: > > > checktimeout=10 > > > checkinterval=2 > > > autoreload=yes > > > logfile="local0" > > > logfile="/var/log/ldirectord.log" > > > quiescent=yes > > > > > > virtual=172.22.40.50:80 > > > real=172.22.40.110:80 masq > > > real=172.22.40.111:80 masq > > > service=http > > > request="lbProbe.html" > > > receive="OK" > > > scheduler=rr > > > > > > kernel config: > > > CONFIG_IP_VS=m > > > # CONFIG_IP_VS_IPV6 is not set > > > CONFIG_IP_VS_DEBUG=y > > > CONFIG_IP_VS_TAB_BITS=12 > > > # IPVS transport protocol load balancing support > > > CONFIG_IP_VS_PROTO_TCP=y > > > CONFIG_IP_VS_PROTO_UDP=y > > > CONFIG_IP_VS_PROTO_AH_ESP=y > > > CONFIG_IP_VS_PROTO_ESP=y > > > CONFIG_IP_VS_PROTO_AH=y > > > CONFIG_IP_VS_PROTO_SCTP=y > > > # IPVS scheduler > > > CONFIG_IP_VS_RR=m > > > CONFIG_IP_VS_WRR=m > > > CONFIG_IP_VS_LC=m > > > CONFIG_IP_VS_WLC=m > > > CONFIG_IP_VS_LBLC=m > > > CONFIG_IP_VS_LBLCR=m > > > CONFIG_IP_VS_DH=m > > > CONFIG_IP_VS_SH=m > > > CONFIG_IP_VS_SED=m > > > CONFIG_IP_VS_NQ=m > > > # IPVS application 
helper > > > > > > CONFIG_IP_VS_FTP=m > > > > > > # CONFIG_SCSI_MVSAS is not set > > > > > > > > > > > > I have no idea what to check else. Can someone imagine what the problem > > > could be? > > > > > > > > > > > > Thanks a lot and . > > > > > > > > > > > > Mit freundlichen Gruessen / With kind regards > > > > > > > > > > > > Dirk Alexander Schaefer > > > > > > > > > > > > _______________________________________________ > > > Linux-HA mailing list > > > [email protected] > > > http://lists.linux-ha.org/mailman/listinfo/linux-ha > > > See also: http://linux-ha.org/ReportingProblems > > > > > > First of all, do NOT use 2.0.8 any more. It is for about 5 years out of > > date, buggy and, well, bad. > > > > > > -- > > Dr. Michael Schwartzkopff > > Guardinistr. 63 > > 81375 München > > > > mob: 0163 172 50 98 > > > ------------------------------ > > Message: 2 > Date: Fri, 9 Jul 2010 17:26:53 +0200 > From: "Schaefer, Dirk Alexander" <[email protected]> > Subject: Re: [Linux-HA] Two curious problems with heartbeat and > ldirector > To: "'General Linux-HA mailing list'" <[email protected]> > Message-ID: <[email protected]@bluewin.ch> > Content-Type: text/plain; charset="iso-8859-1" > > Hi, > > unfortunately it isn't. is there a similar functionality in the latest > heartbeat version? > > Mit freundlichen Gruessen / With kind regards > > Dirk Alexander Schaefer > > > ------------------------------ > > Message: 3 > Date: Fri, 9 Jul 2010 18:17:05 +0200 > From: Dejan Muhamedagic <[email protected]> > Subject: Re: [Linux-HA] Tomcat Resource Agent always leaves dead > process on stop or restart > To: General Linux-HA mailing list <[email protected]> > Message-ID: <[email protected]> > Content-Type: text/plain; charset=us-ascii > > Hi, > > On Fri, Jul 09, 2010 at 04:01:14PM +0100, Brett Delle Grazie wrote: > > Hi, > > > > Yes I meant parent of the process is init. > > > > Timeout for start operation is: 120 seconds - process is still > > around (so is tomcat) after this. 
> > According to these log messages: > > > Jul 09 15:39:47 fmp-dun-tapp1 lrmd: [4264]: info: rsc:tomcat_tc1:0:184: > start > > Jul 09 15:39:51 fmp-dun-tapp1 crmd: [4267]: info: process_lrm_event: LRM > operation tomcat_tc1:0_start_0 (call=184, rc=0, cib-update=208, > confirmed=true) ok > > Jul 09 15:47:43 fmp-dun-tapp1 lrmd: [4264]: info: rsc:tomcat_tc1:0:186: > stop > > Jul 09 15:47:49 fmp-dun-tapp1 crmd: [4267]: info: process_lrm_event: LRM > operation tomcat_tc1:0_stop_0 (call=186, rc=0, cib-update=210, > confirmed=true) ok > > Jul 09 15:47:50 fmp-dun-tapp1 crmd: [4267]: info: do_lrm_rsc_op: > Performing key=23:227:0:2c2c0209-48a9-40d0-b4f9-53ea1adcd584 > op=tomcat_tc1:0_start_0 ) > > Jul 09 15:47:50 fmp-dun-tapp1 lrmd: [4264]: info: rsc:tomcat_tc1:0:187: > start > > Jul 09 15:47:54 fmp-dun-tapp1 crmd: [4267]: info: process_lrm_event: LRM > operation tomcat_tc1:0_start_0 (call=187, rc=0, cib-update=211, > confirmed=true) ok > > everything seems to run fine. The start operation takes around 4 > seconds. It seems to exit just fine. I really don't know what's > happening. Can you run this again with debug turned on. Then > produce a hb_report. If it's too big to send to the list, you can > open a bugzilla. > > > heartbeat-libs - taken from linbit repo packages where we have > > a support contract as we use DRBD in other stuff - > > OK. Perhaps you can upgrade from linbit? Though I can't recall > ever seeing anything like this. > > Thanks, > > Dejan > > > I didn't realise they were renamed to cluster-libs on > > clusterlabs.org. > > Hmm. I can update but can the packages on clusterlabs still use heartbeat > or will I need to switch to corosync? 
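The "around 4 seconds" figure in Dejan's reply above can be read off the timestamps by pairing each lrmd "rsc:...:<call id>: <operation>" line with the crmd "process_lrm_event" line carrying the same call id. A minimal sketch in Python, using only the six log lines quoted in that reply; the regular expressions and the `operation_durations` helper are assumptions made for this example, not part of any heartbeat or pacemaker tooling, and the year 2010 is taken from the message dates.

```python
import re
from datetime import datetime

# The six lines quoted in the reply, with the mail wrapping undone.
LOG = """\
Jul 09 15:39:47 fmp-dun-tapp1 lrmd: [4264]: info: rsc:tomcat_tc1:0:184: start
Jul 09 15:39:51 fmp-dun-tapp1 crmd: [4267]: info: process_lrm_event: LRM operation tomcat_tc1:0_start_0 (call=184, rc=0, cib-update=208, confirmed=true) ok
Jul 09 15:47:43 fmp-dun-tapp1 lrmd: [4264]: info: rsc:tomcat_tc1:0:186: stop
Jul 09 15:47:49 fmp-dun-tapp1 crmd: [4267]: info: process_lrm_event: LRM operation tomcat_tc1:0_stop_0 (call=186, rc=0, cib-update=210, confirmed=true) ok
Jul 09 15:47:50 fmp-dun-tapp1 lrmd: [4264]: info: rsc:tomcat_tc1:0:187: start
Jul 09 15:47:54 fmp-dun-tapp1 crmd: [4267]: info: process_lrm_event: LRM operation tomcat_tc1:0_start_0 (call=187, rc=0, cib-update=211, confirmed=true) ok
"""

STAMP = r"(\w{3} \d{2} \d{2}:\d{2}:\d{2})"
# lrmd line: "<stamp> <host> lrmd: ... rsc:<resource>:<call id>: <operation>"
BEGIN = re.compile(rf"^{STAMP} \S+ lrmd: .* rsc:\S+:(\d+): (\w+)$")
# crmd line: "<stamp> <host> crmd: ... (call=<call id>, rc=<rc>, ...)"
END = re.compile(rf"^{STAMP} \S+ crmd: .*\(call=(\d+), rc=(\d+),")


def parse_stamp(stamp):
    # The log has no year; 2010 is assumed from the mail headers.
    return datetime.strptime("2010 " + stamp, "%Y %b %d %H:%M:%S")


def operation_durations(log_text):
    """Return (call id, operation, rc, seconds) for each completed operation."""
    started = {}
    results = []
    for line in log_text.splitlines():
        begin = BEGIN.match(line)
        if begin:
            started[begin.group(2)] = (parse_stamp(begin.group(1)), begin.group(3))
            continue
        end = END.match(line)
        if end and end.group(2) in started:
            t0, operation = started.pop(end.group(2))
            seconds = int((parse_stamp(end.group(1)) - t0).total_seconds())
            results.append((int(end.group(2)), operation, int(end.group(3)), seconds))
    return results


if __name__ == "__main__":
    for call, operation, rc, seconds in operation_durations(LOG):
        print(f"call={call} op={operation} rc={rc} took {seconds}s")
```

On these lines both start operations take 4 seconds and the stop takes 6, all with rc=0, which is consistent with the reading that every operation returned well inside the 120-second start timeout.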
> > > > The log from pacemaker / heartbeat is: > > /var/log/ha-debug > > Jul 09 15:39:47 fmp-dun-tapp1 lrmd: [4264]: info: rsc:tomcat_tc1:0:184: > start > > Jul 09 15:39:47 fmp-dun-tapp1 attrd: [4266]: info: attrd_ha_callback: > flush message from fmp-dun-tapp2 > > Jul 09 15:39:47 fmp-dun-tapp1 attrd: [4266]: info: attrd_ha_callback: > flush message from fmp-dun-tapp2 > > Jul 09 15:39:47 fmp-dun-tapp1 attrd: [4266]: info: attrd_ha_callback: > flush message from fmp-dun-tapp2 > > Jul 09 15:39:47 fmp-dun-tapp1 attrd: [4266]: info: attrd_ha_callback: > flush message from fmp-dun-tapp2 > > Jul 09 15:39:47 fmp-dun-tapp1 attrd: [4266]: info: attrd_ha_callback: > flush message from fmp-dun-tapp2 > > Jul 09 15:39:47 fmp-dun-tapp1 attrd: [4266]: info: attrd_ha_callback: > flush message from fmp-dun-tapp2 > > Jul 09 15:39:47 fmp-dun-tapp1 attrd: [4266]: info: attrd_ha_callback: > flush message from fmp-dun-tapp2 > > Jul 09 15:39:47 fmp-dun-tapp1 attrd: [4266]: info: attrd_ha_callback: > flush message from fmp-dun-tapp2 > > Jul 09 15:39:48 fmp-dun-tapp1 attrd: [4266]: info: attrd_ha_callback: > flush message from fmp-dun-tapp2 > > Jul 09 15:39:48 fmp-dun-tapp1 attrd: [4266]: info: attrd_ha_callback: > flush message from fmp-dun-tapp2 > > Jul 09 15:39:48 fmp-dun-tapp1 attrd: [4266]: info: attrd_ha_callback: > flush message from fmp-dun-tapp2 > > Jul 09 15:39:51 fmp-dun-tapp1 crmd: [4267]: info: process_lrm_event: LRM > operation tomcat_tc1:0_start_0 (call=184, rc=0, cib-update=208, > confirmed=true) ok > > Jul 09 15:39:51 fmp-dun-tapp1 crmd: [4267]: info: do_lrm_rsc_op: > Performing key=24:225:0:2c2c0209-48a9-40d0-b4f9-53ea1adcd584 > op=tomcat_tc1:0_monitor_10000 ) > > Jul 09 15:39:51 fmp-dun-tapp1 lrmd: [4264]: info: rsc:tomcat_tc1:0:185: > monitor > > Jul 09 15:39:51 fmp-dun-tapp1 crmd: [4267]: info: process_lrm_event: LRM > operation tomcat_tc1:0_monitor_10000 (call=185, rc=0, cib-update=209, > confirmed=false) ok > > Jul 09 15:40:16 fmp-dun-tapp1 lrmd: [4264]: info: 
rsc:cmon_html0:0:174: > monitor > > Jul 09 15:42:12 fmp-dun-tapp1 cib: [4263]: info: cib_stats: Processed 198 > operations (909.00us average, 0% utilization) in the last 10min > > Jul 09 15:47:43 fmp-dun-tapp1 lrmd: [4264]: info: cancel_op: operation > monitor[185] on ocf::tomcat::tomcat_tc1:0 for client 4267, its parameters: > CRM_meta_interval=[10000] catalina_home=[/opt/tomcat] > catalina_base=[/home/tomcat/tc-1] tomcat_user=[tomcat] > catalina_pid=[/home/tomcat/tc-1/temp/tomcat.pid] catalina_rotate_log=[YES] > CRM_meta_timeout=[30000] CRM_meta_clone_max=[2] crm_feature_set=[3.0.1] > java_home=[/usr/lib/jvm/java] CRM_meta_globally_unique=[false] > CRM_meta_name=[monitor] script_log=[/home/tomcat/tc-1/logs/tc-1.log] > statusurl=[http://127.0.0.1:10305/exam cancelled > > Jul 09 15:47:43 fmp-dun-tapp1 crmd: [4267]: info: do_lrm_rsc_op: > Performing key=24:226:0:2c2c0209-48a9-40d0-b4f9-53ea1adcd584 > op=tomcat_tc1:0_stop_0 ) > > Jul 09 15:47:43 fmp-dun-tapp1 lrmd: [4264]: info: rsc:tomcat_tc1:0:186: > stop > > Jul 09 15:47:43 fmp-dun-tapp1 crmd: [4267]: info: process_lrm_event: LRM > operation tomcat_tc1:0_monitor_10000 (call=185, status=1, cib-update=0, > confirmed=true) Cancelled > > Jul 09 15:47:43 fmp-dun-tapp1 cib: [31183]: info: write_cib_contents: > Archived previous version as /var/lib/heartbeat/crm/cib-11.raw > > Jul 09 15:47:43 fmp-dun-tapp1 cib: [31183]: info: write_cib_contents: > Wrote version 0.259.0 of the CIB to disk (digest: > e028d9e440a93208328ecb4eada8fdf6) > > Jul 09 15:47:43 fmp-dun-tapp1 cib: [31183]: info: retrieveCib: Reading > cluster configuration from: /var/lib/heartbeat/crm/cib.8JcqTH (digest: > /var/lib/heartbeat/crm/cib.V73jBK) > > Jul 09 15:47:49 fmp-dun-tapp1 crmd: [4267]: info: process_lrm_event: LRM > operation tomcat_tc1:0_stop_0 (call=186, rc=0, cib-update=210, > confirmed=true) ok > > Jul 09 15:47:50 fmp-dun-tapp1 crmd: [4267]: info: do_lrm_rsc_op: > Performing key=23:227:0:2c2c0209-48a9-40d0-b4f9-53ea1adcd584 > 
op=tomcat_tc1:0_start_0 ) > > Jul 09 15:47:50 fmp-dun-tapp1 lrmd: [4264]: info: rsc:tomcat_tc1:0:187: > start > > Jul 09 15:47:54 fmp-dun-tapp1 crmd: [4267]: info: process_lrm_event: LRM > operation tomcat_tc1:0_start_0 (call=187, rc=0, cib-update=211, > confirmed=true) ok > > Jul 09 15:47:54 fmp-dun-tapp1 crmd: [4267]: info: do_lrm_rsc_op: > Performing key=24:227:0:2c2c0209-48a9-40d0-b4f9-53ea1adcd584 > op=tomcat_tc1:0_monitor_10000 ) > > Jul 09 15:47:54 fmp-dun-tapp1 lrmd: [4264]: info: rsc:tomcat_tc1:0:188: > monitor > > Jul 09 15:47:54 fmp-dun-tapp1 crmd: [4267]: info: process_lrm_event: LRM > operation tomcat_tc1:0_monitor_10000 (call=188, rc=0, cib-update=212, > confirmed=false) ok > > > > Process list is: > > root 23162 1 0 15:39 ? 00:00:00 /bin/sh > /usr/lib/ocf/resource.d//intact/tomcat start > > root 31372 1 0 15:47 ? 00:00:00 /bin/sh > /usr/lib/ocf/resource.d//intact/tomcat start > > tomcat 31408 1 0 15:47 ? 00:00:03 > /usr/lib/jvm/java/bin/java > -Djava.util.logging.config.file=/home/tomcat/tc-1/conf/logging.properties > > ... snip .. > > org.apache.catalina.startup.Bootstrap start > > > > There was a 'crm resource restart cl_tomcat_tc1' issued at 15:47. > > > > In the above ha-debug log you can clearly see the lrmd starting > > tomcat at both points in time (15:39 and 15:47) and receiving a > > successful start ok response. > > > > > > > > -----Original Message----- > > From: Dejan Muhamedagic [mailto:[email protected]] > > Sent: Fri 09/07/2010 14:46 > > To: General Linux-HA mailing list > > Subject: Re: [Linux-HA] Tomcat Resource Agent always leaves dead process > on stop or restart > > > > Hi, > > > > On Fri, Jul 09, 2010 at 01:04:09PM +0100, Brett Delle Grazie wrote: > > > Hi, > > > > > > Yes checked both logs: > > > > > > Catalina.out specifies normal (successful) Tomcat startup. 
> > > > > > tc-1.log (log from backgrounded start/stop operations): > > > > > > Doesn't give anything unusual: > > > 2010/07/09 09:42:13: start =========================== > > > 2010/07/09 10:20:46: stop ########################### > > > 2010/07/09 10:27:35: start =========================== > > > 2010/07/09 12:50:20: stop ########################### > > > 2010/07/09 12:50:26: start =========================== > > > > > > Yes, I realise these are from later runs but the same thing is still > occurring. > > > > > > Is it possible that the start operation doesn't close of one of > > > the file descriptors and is left 'hanging' - even though > > > it exits (at least from the perspective of pacemaker)? > > > > > > Would this explain the ownership of 'init' by the 'tomcat > > > start' process instead of by pacemaker? > > > > No. lrmd kills the process if it doesn't exit within the timeout. > > By "ownership" I guess you mean the parent process. The RA > > process (/usr/lib/ocf/.../tomcat start) is a child of the lrmd. > > init can become its parent only if lrmd exits. > > > > What is the timeout for that start operation set to? Does the > > process remain even after that timeout? What happens to lrmd? > > > > > > > > heartbeat-libs-3.0.3-1 > > > > Where does that come from? Normally, you should have > > cluster-libs. Perhaps you need to update. > > > > Thanks, > > > > Dejan > > > ------------------------------ > > Message: 4 > Date: Fri, 9 Jul 2010 18:18:52 +0200 > From: Dejan Muhamedagic <[email protected]> > Subject: Re: [Linux-HA] Two curious problems with heartbeat and > ldirector > To: General Linux-HA mailing list <[email protected]> > Message-ID: <[email protected]> > Content-Type: text/plain; charset=iso-8859-1 > > Hi, > > On Fri, Jul 09, 2010 at 05:26:53PM +0200, Schaefer, Dirk Alexander wrote: > > Hi, > > > > unfortunately it isn't. is there a similar functionality in the latest > > heartbeat version? > > There is, but I can't say how it's packaged in your distribution. > It could be that it's in a package called resource-agents or > similar too. Best to ask to the distribution maintainer. > > Thanks, > > Dejan > > > ------------------------------ > > End of Linux-HA Digest, Vol 80, Issue 20 > **************************************** > -- E-mail:[email protected] MSN:[email protected] OICQ:453446376 Name:Voyager
