Make sure you've configured ntpd with 'iburst' too. It sends a burst of packets to the time server at startup, which shortens the time ntpd needs to get a valid time.
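
A quick way to check for this is sketched below. The helper name missing_iburst and the default path /etc/ntp.conf are assumptions for illustration; adjust them for your distribution. It prints every server or pool line that does not carry the iburst option.

```shell
# Hypothetical helper: list server/pool lines in an ntp.conf-style file
# that are missing the "iburst" option. The default path is an assumption.
missing_iburst() {
    conf="${1:-/etc/ntp.conf}"
    # Select server/pool directives, then keep only those without "iburst".
    grep -E '^[[:space:]]*(server|pool)[[:space:]]' "$conf" | grep -vw 'iburst' || true
}

# A correctly configured line would look something like this
# (the hostname is only a placeholder):
#   server 0.pool.ntp.org iburst
```

Running missing_iburst with no arguments on a node should print nothing once every time source has iburst set.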

On 9/26/2014 6:00 PM, Craig Lewis wrote:
First, make sure you're running ntpd on all of the nodes. I prefer to configure ntp to set the time on boot, then track the time. Just running ntpd will help; setting the time on boot isn't required.
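
One way to get the "set on boot, then track" behaviour is sketched here. The function name sync_then_track is made up for this example, and the service name is an assumption (typically "ntp" on Debian/Ubuntu and "ntpd" on RHEL/CentOS). It uses ntpd's -g flag (allow a large initial correction) and -q flag (set the clock once and exit) before starting the daemon.

```shell
# Hypothetical helper: step the clock once, then start the ntpd service.
# Must run as root, and ntpd must not already be running.
sync_then_track() {
    # -g: accept a large initial offset; -q: set the time once and quit.
    # Start the daemon only if the one-shot sync succeeded.
    ntpd -gq && service "${1:-ntp}" start
}
```

Many init scripts already start ntpd with -g, so check what your distribution does before adding a step like this to the boot sequence.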

The few times I've gotten a PG stuck in peering for any length of time, I restarted the primary OSD for those PGs. That solved my problems.
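
Finding the primary OSD for a stuck PG can be sketched roughly as follows. The helper name primary_from_pg_map is hypothetical, and it assumes `ceph pg map` prints a line like "osdmap e13 pg 1.6c (1.6c) -> up [1,0] acting [1,0]", where the first OSD in the acting set is the primary. The exact output may differ between releases.

```shell
# Hypothetical helper: read `ceph pg map <pgid>` output on stdin and print
# the first OSD id in the acting set (assumed to be the primary).
primary_from_pg_map() {
    sed -n 's/.*acting \[\([0-9]*\).*/\1/p'
}

# Typical use on a node with admin access (PG id 1.6c is only an example):
#   ceph pg dump_stuck inactive          # list PGs that are not active
#   ceph pg map 1.6c | primary_from_pg_map
# Then restart that OSD on its host. The command depends on the init system,
# for example:
#   service ceph restart osd.1           # sysvinit
#   systemctl restart ceph-osd@1         # systemd
```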

Just make sure the monitors have quorum and time sync before you start doing this. It probably won't help if the monitors aren't happy.
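
A minimal pre-check along these lines might look like the sketch below. The function name mons_ready is made up for this example, and it assumes that a clock problem shows up in `ceph health detail` as a "clock skew" warning.

```shell
# Hypothetical helper: succeed only if the monitors answer a quorum query
# and the health output carries no clock skew warning.
mons_ready() {
    # quorum_status can block or fail when the monitors have no quorum.
    ceph quorum_status >/dev/null || return 1
    # Fail if any monitor is reported with clock skew.
    ! ceph health detail | grep -qi 'clock skew'
}

# Example:
#   mons_ready && echo "monitors look OK, safe to restart OSDs"
```

Because the quorum query can hang rather than fail when there is no quorum, it may be worth running it with a timeout.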


On Wed, Sep 24, 2014 at 5:04 AM, Pavel V. Kaygorodov <[email protected]> wrote:

    Hi!

    We have had some problems with our power supply, and our whole
    Ceph cluster was rebooted several times.
    After a reboot, the clocks on the different monitor nodes become
    slightly desynchronized, and Ceph won't come up until the time is
    synced.
    But even after a time sync, the cluster shows about half of the
    PGs (typically; sometimes more, sometimes less) in the peering
    state for several hours, and Ceph clients have no access to the
    data.
    I have tried to speed up the process by manually restarting
    monitors and OSDs, sometimes with success, sometimes without.

    Is there a way to speed up cluster recovery after a global reboot?

    Thanks in advance,
      Pavel.

    _______________________________________________
    ceph-users mailing list
    [email protected]
    http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



