Re: [Linux-HA] clock_t wrapped around causing false resource start failure

Dejan Muhamedagic Tue, 30 Jun 2009 02:25:56 -0700

Hi,

On Mon, Jun 29, 2009 at 02:53:33PM -0400, Tavanyar, Simon wrote:
> I'm running heartbeat 2.1.4
> 
> I'm getting a false failure on a start of my ClusterAddr resource
> because in the same second that the resource starts, the clock_t wraps
> around. 
> Has anyone else seen this behavior?


Can't recall. And that shouldn't have happened. The time wrap is
recognized (as the log message shows) and a wrap counter is added
to the high bits so that the time is still greater than the
previous timestamp.
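
For illustration, the wrap handling described above can be sketched
roughly as follows. This is not the actual heartbeat code: the struct
and function names are made up for the example, and a 32-bit tick
counter is assumed.

```c
#include <stdint.h>

/* Hypothetical state for extending a 32-bit tick counter to 64 bits. */
struct longclock_state {
    uint32_t last_ticks;  /* most recent raw reading */
    uint32_t wrap_count;  /* number of wraparounds seen so far */
};

/*
 * Fold a raw 32-bit tick reading into a 64-bit value.
 * If the raw value is smaller than the previous one, assume the
 * counter wrapped once and bump the wrap counter, which occupies
 * the high 32 bits of the result.
 */
static uint64_t extend_ticks(struct longclock_state *st, uint32_t raw)
{
    if (raw < st->last_ticks) {
        st->wrap_count++;
    }
    st->last_ticks = raw;
    return ((uint64_t)st->wrap_count << 32) | raw;
}
```

With this scheme a reading taken just after the wrap still compares
greater than one taken just before it, so a timeout computed from the
difference should not fire early. It only works if the counter is
sampled at least once per wrap period.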

Do you have any more information about this: Was it a one-off
occurrence? Did your system really have a long uptime? How long?

Thanks,

Dejan

> Jun 22 10:04:49 node0 crmd: [14913]: info: do_lrm_rsc_op: Performing
> op=ClusterAddr_start_0 key=8:14:0:59f9d23b-effd-4ec4-a766-17ed34a92b34)
> Jun 22 10:04:49 node0 lrmd: [14910]: info: rsc:ClusterAddr: start
> Jun 22 10:04:49 node0 SpineFilesystem: running
>                         !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
> Jun 22 10:04:49 node0 lrmd: [14910]: info: time_longclock: clock_t
> wrapped around (uptime).
> Jun 22 10:04:49 node0 lrmd: [14910]: WARN: ClusterAddr:start process
> (PID 17282) timed out (try 1).  Killing with signal SIGTERM (15).
>                         !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
> Jun 22 10:04:49 node0 crmd: [14913]: info: process_lrm_event: LRM
> operation SharedFs_monitor_30000 (call=17, rc=0) complete
> Jun 22 10:04:49 node0 lrmd: [14910]: WARN: operation start[18] on
> ocf::IPaddr2::ClusterAddr for client 14913, its parameters:
> ip=[134.111.29.140] cidr_netmask=[21] broadcast=[134.111.31.255]
> CRM_meta_timeout=[20000] crm_feature_set=[2.0] nic=[biz0] : pid [17282]
> timed out
> Jun 22 10:04:49 node0 crmd: [14913]: ERROR: process_lrm_event: LRM
> operation ClusterAddr_start_0 (18) Timed Out (timeout=20000ms)
> Jun 22 10:04:50 node0 lrmd: [14910]: info: rsc:ClusterAddr: stop
> Jun 22 10:04:50 node0 crmd: [14913]: info: do_lrm_rsc_op: Performing
> op=ClusterAddr_stop_0 key=2:15:0:59f9d23b-effd-4ec4-a766-17ed34a92b34)
> Jun 22 10:04:50 node0 crmd: [14913]: info: process_lrm_event: LRM
> operation ClusterAddr_stop_0 (call=19, rc=0) complete
> 
> 
> Thanks
> Simon 
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
