Re: [ceph-users] osd down detection broken in jewel?

2016-12-12 Thread Gregory Farnum
On Wed, Nov 30, 2016 at 8:31 AM, Manuel Lausch wrote:

> Yes. This parameter is used in the condition described there:
> http://docs.ceph.com/docs/jewel/rados/configuration/mon-osd-interaction/#osds-report-their-status
> and works. I think the default timeout of 900s is quite a bit too large.
>
>
> The documentation also describes another mechanism which checks the health
> of OSDs and reports them down:
> http://docs.ceph.com/docs/jewel/rados/configuration/mon-osd-interaction/#osds-report-down-osds
>
> As far as I can see in the source code, this documentation is not valid
> anymore! I found this commit ->
> https://github.com/ceph/ceph/commit/bcb8f362ec6ac47c4908118e7860dec7971d001f#diff-0a5db46a44ae9900e226289a810f10e8
>
> "mon_osd_min_down_reporters" now is the threshold how many "
> mon_osd_reporter_subtree_level" has to report a down OSD. in Hammer this
> was how many other OSDs had to report. And in Hammer there was also the
> parameter "mon_osd_min_down_reports" which sets how often a other OSD has
> to report a other OSD. In Jewel the parameter doesn't exists anymore.
>
> With this "knowlege" I adjusted my configuration.  And will now test it.
>
>
> BTW:
> While reading the source code I may have found another bug. Can you confirm
> this?
> In the function "OSDMonitor::check_failure" in src/mon/OSDMonitor.cc, the
> code which counts the "reporters_by_subtree" is inside the if block "if
> (g_conf->mon_osd_adjust_heartbeat_grace) {". So if I disable
> adjust_heartbeat_grace, the reporters_by_subtree functionality will not
> work at all.
>
>
Yes, I think you're correct and that's a (fairly nasty, to somebody
someday) bug. Can you create a ticket at tracker.ceph.com? :)
-Greg


Re: [ceph-users] osd down detection broken in jewel?

2016-11-30 Thread Manuel Lausch
Yes. This parameter is used in the condition described there:
http://docs.ceph.com/docs/jewel/rados/configuration/mon-osd-interaction/#osds-report-their-status
and works. I think the default timeout of 900s is quite a bit too large.


The documentation also describes another mechanism which checks the health of
OSDs and reports them down:
http://docs.ceph.com/docs/jewel/rados/configuration/mon-osd-interaction/#osds-report-down-osds


As far as I can see in the source code, this documentation is not valid anymore!
I found this commit ->
https://github.com/ceph/ceph/commit/bcb8f362ec6ac47c4908118e7860dec7971d001f#diff-0a5db46a44ae9900e226289a810f10e8


"mon_osd_min_down_reporters" now is the threshold how many 
"mon_osd_reporter_subtree_level" has to report a down OSD. in Hammer 
this was how many other OSDs had to report. And in Hammer there was also 
the parameter "mon_osd_min_down_reports" which sets how often a other 
OSD has to report a other OSD. In Jewel the parameter doesn't exists 
anymore.


With this "knowlege" I adjusted my configuration.  And will now test it.


BTW:
While reading the source code I may have found another bug. Can you confirm this?
In the function "OSDMonitor::check_failure" in src/mon/OSDMonitor.cc, the code
which counts the "reporters_by_subtree" is inside the if block "if
(g_conf->mon_osd_adjust_heartbeat_grace) {". So if I disable
adjust_heartbeat_grace, the reporters_by_subtree functionality will not work
at all.
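
In simplified form, the control flow in question looks something like this (a
paraphrase for illustration only, not the actual src/mon/OSDMonitor.cc code;
names and types are simplified):

    #include <set>
    #include <string>
    #include <vector>

    struct Conf {
      bool mon_osd_adjust_heartbeat_grace;
      size_t mon_osd_min_down_reporters;
    };

    bool should_mark_down(const Conf& conf,
                          const std::vector<std::string>& reporter_subtrees) {
      std::set<std::string> reporters_by_subtree;
      if (conf.mon_osd_adjust_heartbeat_grace) {
        // the real code adjusts the heartbeat grace here -- and, per the
        // reported bug, it also only counts the reporting subtrees here:
        for (const auto& s : reporter_subtrees)
          reporters_by_subtree.insert(s);
      }
      // with mon_osd_adjust_heartbeat_grace = false the set stays empty, so
      // this threshold can never be reached and the OSD is never marked down
      // via peer reports:
      return reporters_by_subtree.size() >= conf.mon_osd_min_down_reporters;
    }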



Regards,
Manuel


On 30.11.2016 at 15:24, John Petrini wrote:

It's right there in your config.

mon osd report timeout = 900

See: 
http://docs.ceph.com/docs/jewel/rados/configuration/mon-osd-interaction/





On Wed, Nov 30, 2016 at 6:39 AM, Manuel Lausch wrote:


Hi,

In a test with Ceph Jewel we measured how long the cluster needs to
detect and mark down OSDs after they are killed (with kill -9).
The result: 900 seconds.

In Hammer this took about 20 - 30 seconds.

The log file of the leader monitor contains a lot of messages like
2016-11-30 11:32:20.966567 7f158f5ab700  0 log_channel(cluster)
log [DBG] : osd.7 10.78.43.141:8120/106673 reported failed by osd.272
10.78.43.145:8106/117053
Looking deeper at this: a lot of OSDs reported this exactly once.
In Hammer, the OSDs reported a down OSD a few more times.

Finally the following appears and the OSD is marked down.
2016-11-30 11:36:22.633253 7f158fdac700  0 log_channel(cluster)
log [INF] : osd.7 marked down after no pg stats for 900.982893seconds

In my ceph.conf I have the following lines in the global section
mon osd min down reporters = 10
mon osd min down reports = 3
mon osd report timeout = 900

It seems the parameter "mon osd min down reports" was removed in
Jewel but the documentation was not updated ->
http://docs.ceph.com/docs/jewel/rados/configuration/mon-osd-interaction/



Can someone tell me how Ceph Jewel detects down OSDs and marks them
down within a reasonable time?


The Cluster:
ceph version 10.2.3 (ecc23778eb545d8dd55e2e4735b53cc93f92e65b)
24 hosts á 60 OSDs -> 1440 OSDs
2 pools with replication factor 4
65536 PGs
5 Mons

-- 
Manuel Lausch



Re: [ceph-users] osd down detection broken in jewel?

2016-11-30 Thread Warren Wang - ISD
FYI - setting min down reporters to 10 is somewhat risky. Unless you have a
really large cluster, I would advise turning that down to 5 or lower. In a past
life we used to run that number higher on super-dense nodes, but we found that
it would result in some instances where legitimately down OSDs did not have
enough peers to exceed the min down reporters threshold.
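
In ceph.conf terms that would look something like this (illustrative value):

    [global]
    mon osd min down reporters = 5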

Warren Wang
Walmart ✻



Re: [ceph-users] osd down detection broken in jewel?

2016-11-30 Thread John Petrini
It's right there in your config.

mon osd report timeout = 900

See:
http://docs.ceph.com/docs/jewel/rados/configuration/mon-osd-interaction/
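
If failure detection is falling through to this timeout (as the "marked down
after no pg stats for 900.982893seconds" log line suggests), lowering it
bounds the worst-case detection time. A sketch with an illustrative value,
not a tuning recommendation; the peer-reporting path discussed elsewhere in
this thread is the intended fast path:

    [global]
    mon osd report timeout = 120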

