[ceph-users] Off-Site monitor node over VPN

2024-05-06 Thread Stefan Pinter
Hi! I hope someone can help us out here :) We need to move from 3 datacenters to 2 datacenters (+ 1 small server room reachable via layer 3 VPN). Right now we have a ceph-mon in each datacenter, which is fine. But we have to move and will only have 2 datacenters in the future (that are connected, so
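A minimal sketch (not from the thread) of how one might check whether a monitor reachable only over a layer-3 VPN is keeping up with the quorum; no names or IPs below are taken from the post:
  # show the monitors and which of them are currently in quorum
  ceph mon stat
  # detailed quorum view: election epoch, mon ranks and addresses
  ceph quorum_status --format json-pretty
  # surfaces clock skew or slow-op warnings that VPN latency can cause
  ceph health detail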

[ceph-users] Re: Theory about min_size and its implications

2023-03-07 Thread stefan . pinter
Hi! Thanks to all of you, I appreciate this very much! I will have to go through all of your messages a few more times and do some research. So our rule from the initial post does make sure that, when 1 room goes down, it does NOT try to restore 3 replicas in the remaining room but it will only
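A minimal sketch (not from the thread; the rule ID is a placeholder) of how to test offline what a CRUSH rule will actually do with a given replica count, rather than reasoning about it:
  # export the cluster's compiled CRUSH map to a file
  ceph osd getcrushmap -o crushmap.bin
  # simulate placements for rule 1 with 3 replicas and print the resulting OSD sets
  crushtool -i crushmap.bin --test --rule 1 --num-rep 3 --show-mappings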

[ceph-users] Re: Theory about min_size and its implications

2023-03-03 Thread stefan . pinter
Thank you Robert! This sounds like split brain again... but we have a quorum system by using 3 monitor nodes, so only the room with the majority of the ceph-mons is available for I/O. If the room with the majority of the ceph-mons is the one that is cut off, I suppose we'd need to do this: go
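A rough sketch of the usual manual monmap edit for bringing up a surviving minority site; this is not quoted from the thread, and the mon IDs and path are placeholders:
  # on a surviving monitor host, with its ceph-mon daemon stopped:
  ceph-mon -i mon-a --extract-monmap /tmp/monmap
  # remove the unreachable monitor(s) from the map
  monmaptool /tmp/monmap --rm mon-c
  # inject the edited map, then start the daemon again
  ceph-mon -i mon-a --inject-monmap /tmp/monmap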

[ceph-users] Re: Theory about min_size and its implications

2023-03-03 Thread stefan . pinter
Great, thank you Anthony! :) So size 4 / min_size 2 would be a lot better (of course). We have to stay at 3/2 for now though, because our OSDs are about 60% full in total. Maybe someone can answer additional questions: - what is the best practice to avoid a full OSD scenario, where ceph tries to
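A minimal sketch (not an answer from the thread) of the knobs commonly used to avoid running OSDs full: the usage balancer plus the nearfull/backfillfull/full ratios; the ratio values here are illustrative, not recommendations from the post:
  # even out PG distribution so no single OSD fills up first
  ceph balancer mode upmap
  ceph balancer on
  # warn, stop backfill, and block writes earlier than the defaults
  ceph osd set-nearfull-ratio 0.80
  ceph osd set-backfillfull-ratio 0.85
  ceph osd set-full-ratio 0.90
  # watch per-OSD utilisation and variance
  ceph osd df tree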

[ceph-users] Theory about min_size and its implications

2023-03-02 Thread stefan . pinter
Hi! It is unclear to us what min_size means besides what it does. I hope someone can clear this up :) Scenario: size is 3 and min_size is 2, 2 rooms with 100 OSDs each, and this crush rule: "op": "take", "item": -10, "item_name": "default"
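A minimal sketch (not from the thread; rule and pool names are placeholders) of how a room-level replicated rule is typically created and inspected, for comparison with the rule fragment quoted above:
  # create a replicated rule under root "default" with failure domain "room"
  ceph osd crush rule create-replicated rule-by-room default room
  # inspect the generated steps ("take default", "chooseleaf ... type room")
  ceph osd crush rule dump rule-by-room
  # attach a pool to the rule
  ceph osd pool set mypool crush_rule rule-by-room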

[ceph-users] Re: Exit yolo mode by increasing size/min_size does not (really) work

2023-02-13 Thread Stefan Pinter
Hi! Thank you, your message was very helpful! The main reason why “ceph df“ went to “100% USAGE” was that the crush rule said this: "min_size": 2, "max_size": 2. And the new “size” was 3, so the rule did not want to work with the pool. After creating a new rule and setting the pools to
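A minimal sketch (not from the thread; pool and rule names are placeholders) of how to spot this kind of mismatch and move a pool to a different rule:
  # each pool's size, min_size and crush_rule
  ceph osd pool ls detail
  # older releases also list min_size/max_size per rule; a pool size outside
  # that range can lead to the kind of space-reporting problem described above
  ceph osd crush rule dump
  # point the affected pool at the newly created rule
  ceph osd pool set mypool crush_rule replicated-3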

[ceph-users] Re: Exit yolo mode by increasing size/min_size does not (really) work

2023-02-10 Thread stefan . pinter
Sure! ceph osd pool ls detail https://privatebin.net/?85105578dd50f65f#4oNunvNfLoNbnqJwuXoWXrB1idt4zMGnBXdQ8Lkwor8p I guess this needs some cleaning up regarding snapshots - could this be a problem? ceph osd crush rule dump

[ceph-users] Re: Exit yolo mode by increasing size/min_size does not (really) work

2023-02-09 Thread stefan . pinter
Hi, thank you Eugen for being interested in solving this ;) Certainly, here is some more info: ceph osd tree https://privatebin.net/?db7b93a623095879#AKJNy6pKNxa5XssjUpxxjMnggc3d4PirTH1pwHQFF3Qk ceph osd df https://privatebin.net/?0f7c3b091b683d65#8K4KQW5a2G2mFgcnTdUjQXJvcZCpAJGxcPRc1nUYiLXz

[ceph-users] Exit yolo mode by increasing size/min_size does not (really) work

2023-02-03 Thread Stefan Pinter
Hi!  It would be very kind of you to help us with that! We have pools in our ceph cluster that are set to replicated size 2 min_size 1. Obviously we want to go to size 3 / min_size 2 but we experience problems with that. USED goes to 100% instantly and MAX AVAIL goes to 0. Write operations