Hi all,
new user here.
We have been testing an older version of the Heartbeat / Pacemaker combination
compiled for illumos (an OpenSolaris successor).
Versions:
Heartbeat-3-0-STABLE-3.0.5
Pacemaker-1-0-Pacemaker-1.0.11
It has all worked fine during testing (several months now), but I have noticed
that every so often (and sometimes quite frequently) the following message
appears on the console:
crmd: [ID 996084 daemon.crit] [12637]: CRIT: time_longclock: old value was
298671305, new value is 298671304, diff is 1, callcount 141814
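To make the numbers in that message easier to read, here is a minimal Python sketch that pulls out the fields and checks whether the clock value stepped backwards. The pattern is based only on the single example above; the function name `parse_longclock` and the returned keys are made up for illustration, and other builds may format the message differently.

```python
import re

# Pattern for the message text as it appears in the console example above.
# The field layout is inferred from that one example only.
LONGCLOCK_RE = re.compile(
    r"time_longclock: old value was\s+(\d+), new value is\s+(\d+), "
    r"diff is\s+(\d+), callcount\s+(\d+)"
)

def parse_longclock(message):
    """Return the numeric fields of a time_longclock message, or None."""
    # Console output wraps long lines, so collapse all whitespace first.
    flat = " ".join(message.split())
    match = LONGCLOCK_RE.search(flat)
    if match is None:
        return None
    old, new, diff, callcount = (int(g) for g in match.groups())
    return {
        "old": old,
        "new": new,
        "diff": diff,
        "callcount": callcount,
        "went_backwards": new < old,
    }

sample = """crmd: [ID 996084 daemon.crit] [12637]: CRIT: time_longclock: old value was
298671305, new value is 298671304, diff is 1, callcount 141814"""

info = parse_longclock(sample)
print(info)
```

For the message above, the new value is one lower than the old value, so the counter appears to have gone back by a single unit.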
From what I have been able to find about this, this type of occurrence should
have been fixed in heartbeat versions after 2.1.4. At that time it could make
a cluster start behaving erratically.
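As a rough illustration of why a small backward step in a tick counter can matter, here is a minimal Python sketch of a counter that extends a wrapping raw value into a longer one. This is not heartbeat's code: the class name `LongClock`, the 32-bit width and the `JITTER_LIMIT` threshold are all assumptions chosen for the example. The idea is that a naive implementation would treat every decrease as a counter wrap and jump far ahead, whereas a tolerant one records the small step and carries on.

```python
WRAP = 2 ** 32        # assumed width of the raw tick counter
JITTER_LIMIT = 1000   # hypothetical threshold: smaller backward steps are jitter

class LongClock:
    """Extend a wrapping raw tick counter into a non-decreasing value."""

    def __init__(self):
        self.last_raw = None
        self.wraps = 0
        self.warnings = []   # (old, new, diff) for each small backward step

    def update(self, raw):
        if self.last_raw is not None and raw < self.last_raw:
            diff = self.last_raw - raw
            if diff < JITTER_LIMIT:
                # Small backward step: note it and keep the previous value,
                # rather than mistaking it for a counter wrap.
                self.warnings.append((self.last_raw, raw, diff))
                raw = self.last_raw
            else:
                # Large decrease: assume the raw counter wrapped around.
                self.wraps += 1
        self.last_raw = raw
        return self.wraps * WRAP + raw

clock = LongClock()
first = clock.update(298671305)
second = clock.update(298671304)   # one tick backwards, as in the message
third = clock.update(5)            # a genuine wrap of the raw counter
print(first, second, third, clock.warnings)
```

With the values from the console message, the one-tick step is recorded as a warning and the extended value does not move, while a large decrease is counted as a wrap.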
We have two test implementations of a cluster, one in VMware and one on
standard hardware. Both are just for testing.
We have made sure that time is synchronized via NTP against internet servers.
The hardware implementation doesn't show this message as often as the VMware
implementation, but it still appears (sometimes about three times per 24 hours).
We haven't seen any strange behaviour in the cluster yet, but my questions are
as follows:
Should we worry about this 'time_longclock' CRIT error even though it should
have been fixed in versions after HA 3?
Is there something (simple) that can be done to prevent this type of error, or
should we expect normal cluster behaviour since NTP is used?
The above message should make clear that I'm not a programmer, nor am I a
heartbeat specialist .... ;-)
Regards,
S.
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems