Re: [ClusterLabs] Different Times in the Corosync Log?

2018-08-27 Thread Jan Pokorný

On 22/08/18 03:58 +, Eric Robinson wrote:
>> -Original Message-
>> From: Users  On Behalf Of Jan Pokorný
>> Sent: Tuesday, August 21, 2018 2:45 AM
>> To: users@clusterlabs.org
>> Subject: Re: [ClusterLabs] Different Times in the Corosync Log?
>> 
>> On 21/08/18 08:43 +, Eric Robinson wrote:
>>>> I could guess that the processes run with different timezone settings
>>>> (for whatever reason).
>>> 
>>> That would be my guess, too, but I cannot imagine how they ended up in
>>> that condition.
>> 
>> Hard to guess, the PIDs indicate the expected state of covering a very short
>> interval sequentially (i.e. no intermittent failure recovered with a restart
>> of lrmd, AFAICT).  In case it can have any bearing, how do you start
>> pacemaker -- systemd, initscript, as a corosync plugin, something else?
> 
> Depends on how new the cluster is. With these, I start it with 'pcs
> cluster start'.

OK, assuming you don't use anything like the pam_env PAM module (you've
already checked that, right?), my blind guess is that this could be
due to incorrect permissions on /etc/localtime, or on its target in case
it's a symlink.  That way, root-privileged processes (lrmd in your
example) can format the timestamps correctly in localtime_r(3), whereas
non-privileged ones (like cib and crmd) cannot read the zone data and
are doomed to assume UTC.
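
A rough way to test this guess, as a sketch only: the helper below
(readable_by_others is a name made up here, and it assumes GNU stat from
coreutils) reports whether the file a path resolves to has the read bit
set for "others".  It does not check the directories along the path;
`namei -l /etc/localtime` from util-linux lists those.

```shell
# Return 0 if "others" may read the file the path resolves to (symlinks
# are followed), 1 if they may not, 2 if the file cannot be examined.
# Assumes GNU stat, whose %A format prints a mode string like -rw-r--r--.
readable_by_others() {
    mode=$(stat -L -c '%A' "$1") || return 2
    case $mode in
        # character 8 of the mode string is the read bit for "others"
        ???????r??*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Run as root, `readable_by_others /etc/localtime || echo "not world-readable"`
would flag the suspected condition.  Comparing the output of `date` with
that of `runuser -u hacluster -- date` (assuming the non-root daemons
run as the hacluster user, as in a default Pacemaker setup) should show
the same effect more directly.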

In case it might help, try setting this in your
/etc/{default,sysconfig}/pacemaker file:

> TZ=:[filespec]

Quick and dirty fix leveraging existing /etc/localtime file (can be
a copy, not necessarily a symlink), assuming /etc/sysconfig/pacemaker
is the correct environment file and that the command is run as root:

 echo "TZ=:$(find /usr/share/zoneinfo -type d -o -exec sha256sum {} \; \
  | grep -F "$(sha256sum /etc/localtime | sed 's| .*||g')" \
  | sed 's|.*/usr/share/zoneinfo/||;q')" >>/etc/sysconfig/pacemaker
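
A more cautious variant, offered as a sketch and not as a tested
replacement for the one-liner above: it compares file contents with cmp
instead of checksums and only prints the matching zone name, so the
result can be inspected before anything is appended to the environment
file.  The function name zone_for_localtime is made up here, and the
zoneinfo directory is expected without a trailing slash.

```shell
# Print the name (relative to the zoneinfo directory) of the first zone
# file, in sorted order, that is byte-identical to the given localtime
# file.  Prints nothing and returns non-zero if no zone file matches.
zone_for_localtime() {
    zoneinfo_dir=$1
    localtime_file=$2
    match=$(find "$zoneinfo_dir" -type f \
                -exec cmp -s {} "$localtime_file" \; -print \
            | sort | head -n 1)
    [ -n "$match" ] && printf '%s\n' "${match#"$zoneinfo_dir"/}"
}
```

Something like `zone_for_localtime /usr/share/zoneinfo /etc/localtime`
should then print a single zone name, which can be put after `TZ=:` by
hand.  Several zone files can be identical, so the name printed may be
an alias of the zone that was originally configured.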

Let us know your progress.

-- 
Nazdar,
Jan (Poki)


___
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Different Times in the Corosync Log?

2018-08-21 Thread Eric Robinson
> 
> Whoa, I think you win some sort of fubar prize. :-)

It's always nice to feel special. 😉

> 
> AFAIK, any OS-level time or timezone change affects all processes equally. (I
> occasionally deal with cluster logs where the OS time jumped backward or
> forward, and all logs system-wide are equally affected.)
> 

Except when you're visiting insane-world, which I seem to be. 

> Some applications have their own timezone setting that can override the
> system default, but pacemaker isn't one of them. It's even more bizarre when
> you consider that the daemons here are the children of the same process
> (pacemakerd), and thus have an identical set of environment variables and so
> forth. (And as Jan pointed out, they appear to have been started within a
> fraction of a second of each other.)
> 
> Apparently there is a dateshift kernel module that can put particular
> processes in different apparent times, but I assume you'd know if you did
> that on purpose. :-) It does occur to me that the module would be a great
> prank to play on someone (especially combined with a cron job that randomly
> altered the configuration).
> 
> If you figure this out, I'd love to hear what it was. Gremlins ...

You'll be the second to know after me!

> 
> On Tue, 2018-08-21 at 11:45 +0200, Jan Pokorný wrote:
> > On 21/08/18 08:43 +, Eric Robinson wrote:
> > > > I could guess that the processes run with different timezone
> > > > settings (for whatever reason).
> > >
> > > That would be my guess, too, but I cannot imagine how they ended up
> > > in that condition.
> >
> > Hard to guess, the PIDs indicate the expected state of covering a very
> > short interval sequentially (i.e. no intermittent failure recovered
> > with a restart of lrmd, AFAICT).  In case it can have any bearing, how
> > do you start pacemaker -- systemd, initscript, as a corosync plugin,
> > something else?
> --
> Ken Gaillot 


Re: [ClusterLabs] Different Times in the Corosync Log?

2018-08-21 Thread Eric Robinson
> -Original Message-
> From: Users  On Behalf Of Jan Pokorný
> Sent: Tuesday, August 21, 2018 2:45 AM
> To: users@clusterlabs.org
> Subject: Re: [ClusterLabs] Different Times in the Corosync Log?
> 
> On 21/08/18 08:43 +, Eric Robinson wrote:
> >> I could guess that the processes run with different timezone settings
> >> (for whatever reason).
> >
> > That would be my guess, too, but I cannot imagine how they ended up in
> > that condition.
> 
> Hard to guess, the PIDs indicate the expected state of covering a very short
> interval sequentially (i.e. no intermittent failure recovered with a restart
> of lrmd, AFAICT).  In case it can have any bearing, how do you start
> pacemaker -- systemd, initscript, as a corosync plugin, something else?

Depends on how new the cluster is. With these, I start it with
'pcs cluster start'.

> 
> --
> Nazdar,
> Jan (Poki)


Re: [ClusterLabs] Different Times in the Corosync Log?

2018-08-21 Thread Ken Gaillot
Whoa, I think you win some sort of fubar prize. :-)

AFAIK, any OS-level time or timezone change affects all processes
equally. (I occasionally deal with cluster logs where the OS time
jumped backward or forward, and all logs system-wide are equally
affected.)

Some applications have their own timezone setting that can override the
system default, but pacemaker isn't one of them. It's even more bizarre
when you consider that the daemons here are the children of the same
process (pacemakerd), and thus have an identical set of environment
variables and so forth. (And as Jan pointed out, they appear to have
been started within a fraction of a second of each other.)
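
One way to check that assumption on a live node, sketched here for a
Linux /proc filesystem: each process's initial environment is readable
as a NUL-separated file, so the TZ entry (if any) can be pulled out and
compared between daemons.  The helper name tz_from_environ is made up
for illustration.

```shell
# Print the TZ=... entry from a NUL-separated environment file such as
# /proc/<pid>/environ.  Returns non-zero if TZ is not set in that file.
tz_from_environ() {
    tr '\0' '\n' < "$1" | grep '^TZ='
}
```

For example, `tz_from_environ /proc/<pid>/environ`, run as root with the
PID of lrmd and then of crmd, would show whether the two really started
with the same TZ value.  Note that this file reflects the environment at
exec time only, not any later setenv() calls inside the process.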

Apparently there is a dateshift kernel module that can put particular
processes in different apparent times, but I assume you'd know if you
did that on purpose. :-) It does occur to me that the module would be a
great prank to play on someone (especially combined with a cron job
that randomly altered the configuration).

If you figure this out, I'd love to hear what it was. Gremlins ...

On Tue, 2018-08-21 at 11:45 +0200, Jan Pokorný wrote:
> On 21/08/18 08:43 +, Eric Robinson wrote:
> > > I could guess that the processes run with different timezone
> > > settings (for whatever reason).
> > 
> > That would be my guess, too, but I cannot imagine how they ended up
> > in that condition. 
> 
> Hard to guess, the PIDs indicate the expected state of covering a very
> short interval sequentially (i.e. no intermittent failure recovered
> with a restart of lrmd, AFAICT).  In case it can have any bearing, how
> do you start pacemaker -- systemd, initscript, as a corosync plugin,
> something else?
-- 
Ken Gaillot 


Re: [ClusterLabs] Different Times in the Corosync Log?

2018-08-21 Thread Jan Pokorný
On 21/08/18 08:43 +, Eric Robinson wrote:
>> I could guess that the processes run with different timezone
>> settings (for whatever reason).
> 
> That would be my guess, too, but I cannot imagine how they ended up
> in that condition. 

Hard to guess, the PIDs indicate the expected state of covering a very
short interval sequentially (i.e. no intermittent failure recovered with
a restart of lrmd, AFAICT).  In case it can have any bearing, how do
you start pacemaker -- systemd, initscript, as a corosync plugin,
something else?

-- 
Nazdar,
Jan (Poki)

