Re: [ceph-users] ceph falsely reports clock skew?

2015-03-26 Thread Sage Weil
On Thu, 26 Mar 2015, Gregory Farnum wrote:
 On Thu, Mar 26, 2015 at 7:44 AM, Lee Revell rlrev...@gmail.com wrote:
  I have a virtual test environment of an admin node and 3 mon + osd nodes,
  built by just following the quick start guide.  It seems to work OK but ceph
  is constantly complaining about clock skew much greater than reality.
  Clocksource on the virtuals is kvm-clock and they also run ntpd.
 
  ceph-admin-node
  26 Mar 10:35:29 ntpdate[2647]: adjust time server 91.189.94.4 offset
  0.000802 sec
 
  ceph-node-1
  26 Mar 10:35:35 ntpdate[4250]: adjust time server 91.189.94.4 offset
  0.002537 sec
 
  ceph-node-2
  26 Mar 10:35:42 ntpdate[1708]: adjust time server 91.189.94.4 offset
  -0.000214 sec
 
  ceph-node-3
  26 Mar 10:35:49 ntpdate[1964]: adjust time server 91.189.94.4 offset
  0.001490 sec
 
  ceph@ceph-admin-node:~/my-cluster$ ceph -w
  cluster db460aa2-5129-4aaa-8b2e-43eac727124e
   health HEALTH_WARN clock skew detected on mon.ceph-node-2
   monmap e3: 3 mons at
  {ceph-node-1=192.168.122.121:6789/0,ceph-node-2=192.168.122.131:6789/0,ceph-node-3=192.168.122.141:6789/0},
  election epoch 140, quorum 0,1,2 ceph-node-1,ceph-node-2,ceph-node-3
   mdsmap e54: 1/1/1 up {0=ceph-node-1=up:active}
   osdmap e182: 3 osds: 3 up, 3 in
    pgmap v3594: 840 pgs, 8 pools, 7163 MB data, 958 objects
  29850 MB used, 27118 MB / 60088 MB avail
   840 active+clean
 
 What clock skews is it reporting? I don't remember the defaults, but
 if ntp is consistently adjusting your clocks by a couple of
 milliseconds then I don't think Ceph is going to be very happy about
 it.

IIRC the mons re-check sync every 5 minutes.  Does the warning persist?  
Does it go away if you restart the mons?

sage
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph falsely reports clock skew?

2015-03-26 Thread Gregory Farnum
On Thu, Mar 26, 2015 at 7:44 AM, Lee Revell rlrev...@gmail.com wrote:
 I have a virtual test environment of an admin node and 3 mon + osd nodes,
 built by just following the quick start guide.  It seems to work OK but ceph
 is constantly complaining about clock skew much greater than reality.
 Clocksource on the virtuals is kvm-clock and they also run ntpd.

 ceph-admin-node
 26 Mar 10:35:29 ntpdate[2647]: adjust time server 91.189.94.4 offset
 0.000802 sec

 ceph-node-1
 26 Mar 10:35:35 ntpdate[4250]: adjust time server 91.189.94.4 offset
 0.002537 sec

 ceph-node-2
 26 Mar 10:35:42 ntpdate[1708]: adjust time server 91.189.94.4 offset
 -0.000214 sec

 ceph-node-3
 26 Mar 10:35:49 ntpdate[1964]: adjust time server 91.189.94.4 offset
 0.001490 sec

 ceph@ceph-admin-node:~/my-cluster$ ceph -w
 cluster db460aa2-5129-4aaa-8b2e-43eac727124e
  health HEALTH_WARN clock skew detected on mon.ceph-node-2
  monmap e3: 3 mons at
 {ceph-node-1=192.168.122.121:6789/0,ceph-node-2=192.168.122.131:6789/0,ceph-node-3=192.168.122.141:6789/0},
 election epoch 140, quorum 0,1,2 ceph-node-1,ceph-node-2,ceph-node-3
  mdsmap e54: 1/1/1 up {0=ceph-node-1=up:active}
  osdmap e182: 3 osds: 3 up, 3 in
   pgmap v3594: 840 pgs, 8 pools, 7163 MB data, 958 objects
 29850 MB used, 27118 MB / 60088 MB avail
  840 active+clean

What clock skews is it reporting? I don't remember the defaults, but
if ntp is consistently adjusting your clocks by a couple of
milliseconds then I don't think Ceph is going to be very happy about
it.
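
For reference, the offsets quoted above can be compared against the monitor threshold directly. The sketch below is only an illustration: it parses the ntpdate lines from this thread (with the wrapped lines rejoined) and computes the largest spread between nodes. The 0.05 s limit is an assumption, based on my recollection of the default for "mon clock drift allowed"; check the documentation for your release. The helper names parse_offsets and max_spread are made up for this example.

```python
import re

# ntpdate output as quoted in this thread, with the wrapped lines rejoined.
NTPDATE_LOG = """\
ceph-admin-node
26 Mar 10:35:29 ntpdate[2647]: adjust time server 91.189.94.4 offset 0.000802 sec
ceph-node-1
26 Mar 10:35:35 ntpdate[4250]: adjust time server 91.189.94.4 offset 0.002537 sec
ceph-node-2
26 Mar 10:35:42 ntpdate[1708]: adjust time server 91.189.94.4 offset -0.000214 sec
ceph-node-3
26 Mar 10:35:49 ntpdate[1964]: adjust time server 91.189.94.4 offset 0.001490 sec
"""

# ASSUMPTION: believed default of "mon clock drift allowed", in seconds.
ASSUMED_MAX_DRIFT_SEC = 0.05

OFFSET_RE = re.compile(r"offset\s+(-?\d+\.\d+)\s+sec")


def parse_offsets(log):
    """Map each host name to the offset on the ntpdate line that follows it."""
    offsets = {}
    host = None
    for raw in log.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = OFFSET_RE.search(line)
        if match is None:
            host = line  # a line without an offset is treated as a host name
        elif host is not None:
            offsets[host] = float(match.group(1))
    return offsets


def max_spread(offsets):
    """Largest difference between any two measured offsets, in seconds."""
    values = list(offsets.values())
    return max(values) - min(values)


if __name__ == "__main__":
    offsets = parse_offsets(NTPDATE_LOG)
    spread = max_spread(offsets)
    print(f"max spread {spread:.6f} s, assumed limit {ASSUMED_MAX_DRIFT_SEC} s")
    print("within limit" if spread < ASSUMED_MAX_DRIFT_SEC else "over limit")
```

With these numbers the spread is under 3 ms, so offsets of this size alone would not be expected to trip a 50 ms limit, if that assumed default is right.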


[ceph-users] ceph falsely reports clock skew?

2015-03-26 Thread Lee Revell
I have a virtual test environment of an admin node and 3 mon + osd nodes,
built by just following the quick start guide.  It seems to work OK but
ceph is constantly complaining about clock skew much greater than reality.
Clocksource on the virtuals is kvm-clock and they also run ntpd.

ceph-admin-node
26 Mar 10:35:29 ntpdate[2647]: adjust time server 91.189.94.4 offset 0.000802 sec

ceph-node-1
26 Mar 10:35:35 ntpdate[4250]: adjust time server 91.189.94.4 offset 0.002537 sec

ceph-node-2
26 Mar 10:35:42 ntpdate[1708]: adjust time server 91.189.94.4 offset -0.000214 sec

ceph-node-3
26 Mar 10:35:49 ntpdate[1964]: adjust time server 91.189.94.4 offset 0.001490 sec

ceph@ceph-admin-node:~/my-cluster$ ceph -w
cluster db460aa2-5129-4aaa-8b2e-43eac727124e
 health HEALTH_WARN clock skew detected on mon.ceph-node-2
 monmap e3: 3 mons at
   {ceph-node-1=192.168.122.121:6789/0,ceph-node-2=192.168.122.131:6789/0,ceph-node-3=192.168.122.141:6789/0},
   election epoch 140, quorum 0,1,2 ceph-node-1,ceph-node-2,ceph-node-3
 mdsmap e54: 1/1/1 up {0=ceph-node-1=up:active}
 osdmap e182: 3 osds: 3 up, 3 in
  pgmap v3594: 840 pgs, 8 pools, 7163 MB data, 958 objects
   29850 MB used, 27118 MB / 60088 MB avail
   840 active+clean

Lee


Re: [ceph-users] ceph falsely reports clock skew?

2015-03-26 Thread Lee Revell
I think I solved the problem. The clock skew only appears after I restart a
node to simulate hardware failure. The VM comes up with a skewed clock and
the ceph services start before ntpd has had time to correct it. There is then
a delay before ceph rechecks the clock skew, so the warning lingers for a while.

Lee
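
One way to avoid the race described above is to hold back the ceph services until the clock offset is small. The following is a minimal sketch of that idea, not something taken from ceph or ntpd: wait_for_clock_sync is a hypothetical helper, read_offset is a caller-supplied function that returns the current offset in seconds (how it is measured is left open), and the 0.05 s limit is an assumed value for the monitors' allowed drift.

```python
import time

# ASSUMPTION: allowed drift between monitors, in seconds.
ASSUMED_MAX_DRIFT_SEC = 0.05


def wait_for_clock_sync(read_offset,
                        max_offset=ASSUMED_MAX_DRIFT_SEC,
                        timeout=120.0,
                        poll_interval=2.0,
                        sleep=time.sleep,
                        now=time.monotonic):
    """Poll read_offset() until |offset| <= max_offset or the timeout expires.

    read_offset is hypothetical: it should return the current clock offset
    in seconds, or None if no measurement is available yet.
    Returns True once the clock is within the limit, False on timeout.
    """
    deadline = now() + timeout
    while True:
        offset = read_offset()
        if offset is not None and abs(offset) <= max_offset:
            return True
        if now() >= deadline:
            return False
        sleep(poll_interval)


if __name__ == "__main__":
    # Simulated readings: the clock converges after a few polls.
    readings = iter([1.2, 0.4, 0.01])
    ok = wait_for_clock_sync(lambda: next(readings), sleep=lambda s: None)
    print("clock in sync, safe to start services" if ok else "gave up waiting")
```

A boot script could call this before starting the monitor and only proceed when it returns True. The sleep and now parameters exist only so the loop can be exercised without real delays.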

On Thu, Mar 26, 2015 at 11:21 AM, Sage Weil s...@newdream.net wrote:

 On Thu, 26 Mar 2015, Gregory Farnum wrote:
  On Thu, Mar 26, 2015 at 7:44 AM, Lee Revell rlrev...@gmail.com wrote:
   I have a virtual test environment of an admin node and 3 mon + osd nodes,
   built by just following the quick start guide.  It seems to work OK but ceph
   is constantly complaining about clock skew much greater than reality.
   Clocksource on the virtuals is kvm-clock and they also run ntpd.
  
   ceph-admin-node
   26 Mar 10:35:29 ntpdate[2647]: adjust time server 91.189.94.4 offset
   0.000802 sec
  
   ceph-node-1
   26 Mar 10:35:35 ntpdate[4250]: adjust time server 91.189.94.4 offset
   0.002537 sec
  
   ceph-node-2
   26 Mar 10:35:42 ntpdate[1708]: adjust time server 91.189.94.4 offset
   -0.000214 sec
  
   ceph-node-3
   26 Mar 10:35:49 ntpdate[1964]: adjust time server 91.189.94.4 offset
   0.001490 sec
  
   ceph@ceph-admin-node:~/my-cluster$ ceph -w
   cluster db460aa2-5129-4aaa-8b2e-43eac727124e
    health HEALTH_WARN clock skew detected on mon.ceph-node-2
    monmap e3: 3 mons at
   {ceph-node-1=192.168.122.121:6789/0,ceph-node-2=192.168.122.131:6789/0,ceph-node-3=192.168.122.141:6789/0},
   election epoch 140, quorum 0,1,2 ceph-node-1,ceph-node-2,ceph-node-3
    mdsmap e54: 1/1/1 up {0=ceph-node-1=up:active}
    osdmap e182: 3 osds: 3 up, 3 in
     pgmap v3594: 840 pgs, 8 pools, 7163 MB data, 958 objects
   29850 MB used, 27118 MB / 60088 MB avail
    840 active+clean
 
  What clock skews is it reporting? I don't remember the defaults, but
  if ntp is consistently adjusting your clocks by a couple of
  milliseconds then I don't think Ceph is going to be very happy about
  it.

 IIRC the mons re-check sync every 5 minutes.  Does the warning persist?
 Does it go away if you restart the mons?

 sage
