Hi,

> My question is - do I understand correctly that I need to either update my
> CRUSH rule to select OSDs (which I know is bad) to place objects into PGs
> or have more OSD hosts available so when one of them is going down I would
> still have 3 active hosts and CEPH can re-distribute data between these 3
> hosts to maintain replica size of x3?
True, you could do either of those. But I think the best way would be to add a 
fourth OSD host.

> Or maybe I don't understand something?
I don't think so. In your CRUSH rule you request a failure domain at the host 
level, so if you only have 2 hosts up but 3 replicas, Ceph can't place the 
third copy anywhere.
I don't know of a way to bypass this while keeping the host failure domain.
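For reference, a host-level replicated rule typically looks something like this 
in the decompiled CRUSH map (rule name and id here are just placeholders, your 
actual rule may differ):

    rule replicated_rule {
        id 0
        type replicated
        step take default
        step chooseleaf firstn 0 type host
        step emit
    }

The "step chooseleaf firstn 0 type host" line is what forces every replica onto 
a different host, so with size 3 and only 2 hosts left there is nowhere to put 
the third copy. You can double-check your own rule with "ceph osd crush rule 
dump" and the pool size with "ceph osd pool get <pool> size".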

From my experience, if a host (or even a single OSD) is only temporarily down 
you don't want to trigger recovery.
It generates load during the recovery itself, and then again once the host is 
back in the cluster, to move the PGs back to their original OSDs.
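For a short planned maintenance the usual approach is to tell Ceph not to mark 
the OSDs out while the host is down, for example (assuming the downtime stays 
short):

    ceph osd set noout
    # ... do the maintenance and bring the host back ...
    ceph osd unset noout

With noout set the OSDs are only marked down, not out, so no re-replication is 
triggered; the PGs stay degraded until the host returns, and then only the 
writes that happened in the meantime have to be caught up.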

-
Etienne Menguy
[email protected]




> On 30 Oct 2021, at 11:05, Yury Kirsanov <[email protected]> wrote:
> 
> Hi everyone,
> I have a CEPH cluster with 3 MON/MGR/MDS nodes, 3 OSD nodes each hosting
> two OSDs (2 HDDs, 1 OSD per HDD). My pools are configured with a replica x
> 3 and my osd_pool_default_size is set to 2. So I have 6 total OSDs and 3
> hosts for OSDs.
> 
> My CRUSH map is plain simple - root, then 3 hosts each having two OSDs. And
> the CRUSH rule is set to choose HOST, not OSD in order to find data.
> 
> I was going to do maintenance and service on one of my OSD nodes so I tried
> to set it 'out' as per the CEPH manual, hoping that after that all the data
> would be redistributed among the 4 remaining active OSDs, as I thought that a
> replica size of 3 means data is replicated among OSDs, not hosts, even though
> the CRUSH rule has hosts in it.
> 
> After setting the two OSDs to 'out' nothing happened except that 33% of the
> data became degraded. So I followed the manual, put the OSDs back 'in' and
> re-weighted them to 0. Nothing happened again, the data stayed 33% degraded.
> 
> So I removed OSDs completely from the CEPH system and CRUSH map. Again - no
> migration even though I have 4 available OSDs active and up.
> 
> My question is - do I understand correctly that I need to either update my
> CRUSH rule to select OSDs (which I know is bad) to place objects into PGs
> or have more OSD hosts available so when one of them is going down I would
> still have 3 active hosts and CEPH can re-distribute data between these 3
> hosts to maintain replica size of x3? Or maybe I don't understand something?
> 
> Thanks!
> 
> Best regards,
> Yury.

_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
