Hi,

We changed mon_osd_min_down_reporters to 69, but when the cluster network is down, reads and writes are still completely blocked and none of the OSDs are marked down in the mon status.

We have also set mon_osd_down_out_subtree_limit to host, which is our failure domain, instead of the default of rack. Could you please suggest any other options we can try?
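For reference, this is the [mon] section as it stands after the change; only mon_osd_min_down_reporters differs from the values quoted below, set to one more than the 68 OSDs in a single host (our failure domain), per David's suggestion:

[mon]
mon_compact_on_start = True
mon_osd_down_out_interval = 86400
mon_osd_down_out_subtree_limit = host
mon_osd_min_down_reporters = 69
mon_osd_reporter_subtree_level = host

A quick way we check that the running mons actually picked the value up (assuming the mon id matches the short hostname) is:

ceph daemon mon.$(hostname -s) config get mon_osd_min_down_reporters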
thanks,
Muthu

On Wed, May 23, 2018 at 4:51 PM, nokia ceph <[email protected]> wrote:
> yes it is 68 disks, and will this mon_osd_reporter_subtree_level = host
> have any impact on mon_osd_min_down_reporters?
>
> And related to min_size, yes there have been many suggestions for us to
> move to 2; due to storage efficiency concerns we are still staying with 1
> and trying to convince customers to go with 2 for better data integrity.
>
> thanks,
> Muthu
>
> On Wed, May 23, 2018 at 3:31 PM, David Turner <[email protected]> wrote:
>
>> How many disks in each node? 68? If yes, then change it to 69. Also,
>> running with EC 4+1 is bad for the same reason as running with size=2
>> min_size=1, which has been mentioned and discussed multiple times on the ML.
>>
>> On Wed, May 23, 2018, 3:39 AM nokia ceph <[email protected]> wrote:
>>
>>> Hi David Turner,
>>>
>>> This is our ceph config under the mon section; we have EC 4+1 and have set
>>> the failure domain to host and mon_osd_min_down_reporters to 4 (OSDs from
>>> 4 different hosts).
>>>
>>> [mon]
>>> mon_compact_on_start = True
>>> mon_osd_down_out_interval = 86400
>>> mon_osd_down_out_subtree_limit = host
>>> mon_osd_min_down_reporters = 4
>>> mon_osd_reporter_subtree_level = host
>>>
>>> We have 68 disks, can we increase mon_osd_min_down_reporters to 68?
>>>
>>> Thanks,
>>> Muthu
>>>
>>> On Tue, May 22, 2018 at 5:46 PM, David Turner <[email protected]> wrote:
>>>
>>>> What happens when a storage node loses its cluster network but not its
>>>> public network is that all other OSDs in the cluster see that it is down
>>>> and report that to the mons, but the node can still talk to the mons,
>>>> telling them that it is up and that, in fact, everything else is down.
>>>>
>>>> The setting osd_min_reporters (I think that's the name of it off the
>>>> top of my head) is designed to help with this scenario. Its default is 1,
>>>> which means any OSD on either side of the network problem will be trusted
>>>> by the mons to mark OSDs down. What you want to do with this setting is to
>>>> set it to at least 1 more than the number of OSDs in your failure domain.
>>>> If the failure domain is host and each node has 32 OSDs, then setting it
>>>> to 33 will prevent a full problematic node from being able to cause havoc.
>>>>
>>>> The OSDs will still try to mark themselves as up, and this will still
>>>> cause problems for reads until the OSD process stops or the network comes
>>>> back up. There might be a setting for how long an OSD will keep telling
>>>> the mons it's up, but this isn't really a situation I've come across after
>>>> initial testing and installation of nodes.
>>>>
>>>> On Tue, May 22, 2018, 1:47 AM nokia ceph <[email protected]> wrote:
>>>>
>>>>> Hi Ceph users,
>>>>>
>>>>> We have a cluster with 5 nodes (67 disks) and an EC 4+1 configuration
>>>>> with min_size set to 4.
>>>>> Ceph version: 12.2.5
>>>>> While executing one of our resilience use cases, bringing the private
>>>>> interface down on one of the nodes, up to Kraken we saw a shorter outage
>>>>> in RADOS (60s).
>>>>>
>>>>> Now with Luminous, we see a RADOS read/write outage of more than 200s.
>>>>> In the logs we can see that peer OSDs report that the node's OSDs are
>>>>> down, however those OSDs insist they were wrongly marked down and do not
>>>>> move to the down state for a long time.
>>>>>
>>>>> 2018-05-22 05:37:17.871049 7f6ac71e6700  0 log_channel(cluster) log [WRN] : Monitor daemon marked osd.1 down, but it is still running
>>>>> 2018-05-22 05:37:17.871072 7f6ac71e6700  0 log_channel(cluster) log [DBG] : map e35690 wrongly marked me down at e35689
>>>>> 2018-05-22 05:37:17.878347 7f6ac71e6700  0 osd.1 35690 crush map has features 1009107927421960192, adjusting msgr requires for osds
>>>>> 2018-05-22 05:37:18.296643 7f6ac71e6700  0 osd.1 35691 crush map has features 1009107927421960192, adjusting msgr requires for osds
>>>>>
>>>>> Only when all 67 OSDs have moved to the down state does the read/write
>>>>> traffic resume.
>>>>>
>>>>> Could you please help us in resolving this issue? If it is a bug, we
>>>>> will create a corresponding ticket.
>>>>>
>>>>> Thanks,
>>>>> Muthu
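P.S. As a stop-gap during this test we are considering forcing the mons' view by hand while the private interface is down: set the noup flag so the partitioned OSDs cannot re-assert themselves, then mark the affected host's OSDs down explicitly. A rough sketch (the host name cn1 is only an example, and we assume ceph osd ls-tree is available on 12.2.5):

# keep OSDs from marking themselves back up while the private interface is down
ceph osd set noup
# mark every OSD under the affected host down
ceph osd down $(ceph osd ls-tree cn1)
# once the cluster network is restored
ceph osd unset noup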
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
