On Thu, Sep 04, 2008 at 10:45:30AM -0500, Matt Zagrabelny wrote:
> On Wed, 2008-09-03 at 16:18 +0200, Dejan Muhamedagic wrote:
> > Hi,
> >
> > On Tue, Sep 02, 2008 at 07:29:45PM -0500, Matt Zagrabelny wrote:
> > > Greetings,
> > >
> > > I have an IPaddr2 resource that is timing out.
> > >
> > > Logs:
> > >
> > > lrmd[29423]: 2008/09/02_18:41:08 WARN: internal_VIP:monitor process (PID
> > > 24658) timed out (try 1). Killing with signal SIGTERM (15).
> > > lrmd[29423]: 2008/09/02_18:41:08 WARN: operation monitor[25] on
> > > ocf::IPaddr2::internal_VIP for client 29426, its parameters:
> > > CRM_meta_interval=[5000] ip=[192.168.115.25]
> > > CRM_meta_id=[internal_VIP_mon] CRM_meta_timeout=[5000]
> > > crm_feature_set=[2.0] CRM_meta_name=[monitor] : pid [24658] timed out
> > > crmd[29426]: 2008/09/02_18:41:08 ERROR: process_lrm_event: LRM operation
> > > internal_VIP_monitor_5000 (25) Timed Out (timeout=5000ms)
> > >
> > > cib.xml:
> > >
> > > <primitive id="internal_VIP" class="ocf" provider="heartbeat"
> > >            type="IPaddr2">
> > >   <operations>
> > >     <op id="internal_VIP_mon" name="monitor" interval="5s" timeout="5s"/>
> > >   </operations>
> > >   <instance_attributes id="internal_VIP_inst_attr">
> > >     <attributes>
> > >       <nvpair id="internal_VIP_ip_assignment" name="ip"
> > >               value="192.168.115.25"/>
> > >     </attributes>
> > >   </instance_attributes>
> > > </primitive>
> > >
> > > I have looked through the IPaddr2 RA and cannot find anyplace
> > > in the monitor code that would be taking anywhere near 5 seconds
> > > to complete.
> >
> > Busy host? Something somewhere with name resolution or network?
> > This seems to be the only external program invoked:
> >
> > ip -o -f inet addr show
> >
> > Can't think of anything else.
> >
> > Yes, five seconds may be too excessive, but you should still
> > allow for higher timeouts.
>
> Suppose I want to see what the average time is for completion of monitor
> events, is that possible?

Afraid not, though there's an enhancement request for that.

> I am curious what the completion time is under high load. Also, it would
> be good to determine the timeout value that is larger than 99.9999% of
> the monitors. I know I could set it to 600 seconds, but that seems
> excessive, I would rather have educated values.

Of course. Well, something like 15 or 20 seconds should fit the
bill. If it goes higher than that, then perhaps it should be
considered an outage. That depends on your setup/users/services.
Of course, there's no hard rule on what to do under high host load.

Thanks,

Dejan

> Thanks,
>
> --
> Matt Zagrabelny - [EMAIL PROTECTED] - (218) 726 8844
> University of Minnesota Duluth
> Information Technology Systems & Services
> PGP key 1024D/84E22DA2 2005-11-07
> Fingerprint: 78F9 18B3 EF58 56F5 FC85 C5CA 53E7 887F 84E2 2DA2
>
> He is not a fool who gives up what he cannot keep to gain what he cannot
> lose.
> -Jim Elliot
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
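
On the question of educated timeout values: since the cluster stack does not report monitor completion times, one rough workaround is to time the external command the monitor appears to depend on (ip -o -f inet addr show) while the host is under load, and look at the mean and a high percentile. The sketch below is only that, a sketch: the helper names (time_command, percentile, summarize) are made up for illustration, it measures the ip command alone rather than the full IPaddr2 monitor action or any lrmd overhead, and the number of runs is an arbitrary assumption.

```python
import math
import shutil
import statistics
import subprocess
import time


def time_command(cmd, runs):
    """Run cmd `runs` times; return each run's wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded; only the elapsed time matters here.
        subprocess.run(cmd, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        durations.append(time.perf_counter() - start)
    return durations


def percentile(durations, pct):
    """Nearest-rank percentile of a non-empty list of durations."""
    ordered = sorted(durations)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


def summarize(durations):
    """Collect the figures that matter when picking a timeout."""
    return {
        "mean": statistics.mean(durations),
        "p99": percentile(durations, 99),
        "max": max(durations),
    }


if __name__ == "__main__":
    # The command named in the thread as the only external program invoked.
    cmd = ["ip", "-o", "-f", "inet", "addr", "show"]
    if shutil.which(cmd[0]) is None:
        print("ip command not found; nothing to measure")
    else:
        stats = summarize(time_command(cmd, runs=20))  # 20 runs is arbitrary
        for name, seconds in stats.items():
            print(f"{name}: {seconds * 1000:.1f} ms")
```

Running this repeatedly during busy periods gives a lower bound on what the monitor needs; the timeout in cib.xml should still leave generous headroom above the worst value observed.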
