On 4/16/19 2:27 PM, M Ranga Swami Reddy wrote:
> It's the Smart Storage battery, which was disabled due to high ambient
> temperature.
> All OSD processes/daemons were still running, but those OSDs were not
> responding to other OSDs due to high CPU utilization.
> We didn't observe any clock skew issue.
> 

As the I/O was stalling on those OSDs, you needed to wait for the OSDs to
commit suicide, which can take up to 600 seconds.

The storage subsystem underneath the OSDs was causing problems and those
are very hard to work around.

If OSDs commit suicide too quickly you could get a snowball effect if
you have disks which are sluggish or overloaded.
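
To make the snowball effect concrete, here is a rough toy model in
Python. It is not Ceph code: the function name, the load numbers and the
rule that a dead OSD's load is spread evenly over the survivors are all
assumptions made for illustration, and "time stuck on I/O" is simply
taken as proportional to load.

```python
def surviving_osds(loads, timeout):
    """Return how many OSDs are still alive once the cascade settles.

    loads:   per-OSD load in arbitrary units (hypothetical numbers).
    timeout: an OSD "commits suicide" when its load exceeds this value,
             using load as a stand-in for how long its I/O is stuck.
    """
    alive = list(loads)
    while alive:
        dead = [load for load in alive if load > timeout]
        if not dead:
            break  # nobody exceeded the timeout, the cluster is stable
        alive = [load for load in alive if load <= timeout]
        if alive:
            # Assumption: work of the dead OSDs moves evenly to survivors.
            extra = sum(dead) / len(alive)
            alive = [load + extra for load in alive]
    return len(alive)


loads = [10, 10, 10, 18]  # three normal disks and one overloaded disk
print("short timeout:", surviving_osds(loads, timeout=15), "OSDs left")
print("long timeout: ", surviving_osds(loads, timeout=60), "OSDs left")
```

In this sketch the short timeout lets one overloaded disk take the other
three down with it, while the longer timeout keeps all four OSDs up.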

Ceph can handle this just fine once these OSDs have committed suicide
or you stop them manually. But by nature Ceph is a synchronous system:
when a write comes in, it waits for all the OSDs involved to commit it.
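
A minimal sketch of why one stalled OSD is enough to stop client I/O,
again as a toy Python model and not Ceph code. The function name and the
millisecond figures are made up for illustration; the 600 seconds is the
worst case mentioned earlier in this mail.

```python
def write_ack_latency(commit_latencies):
    """Return the client-visible latency of one replicated write.

    commit_latencies: seconds each replica OSD needs to commit the write.
    The client is acknowledged only after the slowest replica commits,
    so the result is the maximum of the individual latencies.
    """
    return max(commit_latencies)


# Three healthy replicas: the write completes in a few milliseconds.
healthy = write_ack_latency([0.004, 0.005, 0.006])

# One replica sits on a stalled storage subsystem and only lets go
# when the OSD finally kills itself.
SUICIDE_TIMEOUT_S = 600.0  # assumed worst case, taken from this thread
stalled = write_ack_latency([0.004, 0.005, SUICIDE_TIMEOUT_S])

print(f"healthy write: {healthy:.3f} s")
print(f"write with one stalled OSD: up to {stalled:.0f} s")
```

The model ignores the point where the stalled OSD is marked down and the
write proceeds on the remaining replicas. It only shows that, until then,
the slowest replica sets the latency the client sees.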

The behavior in this case was fully expected and normal.

Wido

> On Tue, Apr 16, 2019 at 12:49 PM Marco Gaiarin <g...@sv.lnf.it> wrote:
> 
>     Hello! M Ranga Swami Reddy
>       On that day it was said...
> 
>     > Hello - Recently we had an issue with a storage node's battery
>     > failure, which caused Ceph client IO to drop to '0' bytes. This
>     > means the Ceph cluster couldn't perform IO operations until the
>     > node was taken out. This is not expected from Ceph: when some HW
>     > fails, the respective OSDs should be marked out/down and IO
>     > should continue as is.
>     > Please let me know if anyone has seen similar behavior, and is
>     > this issue resolved?
> 
>     Does 'battery' mean 'CMOS battery'?
> 
> 
>     OSDs and MONs need accurate clock sync between them. So, if a node
>     reboots with a clock skew of more than (if I remember correctly)
>     5 seconds, the OSD does not start.
> 
>     Provide a stable NTP server for all your OSDs and MONs, and restart
>     the OSDs after the clocks are in sync.
> 
>     -- 
>     dott. Marco Gaiarin                     GNUPG Key ID: 240A3D66
>       Associazione ``La Nostra Famiglia''   http://www.lanostrafamiglia.it/
>       Polo FVG - Via della Bontà, 7 - 33078 - San Vito al Tagliamento (PN)
>       marco.gaiarin(at)lanostrafamiglia.it
>       t +39-0434-842711   f +39-0434-842797
> 
>     _______________________________________________
>     ceph-users mailing list
>     ceph-users@lists.ceph.com
>     http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 