Hello all, 

I have set up a Ceph cluster consisting of one monitor, 32 OSD hosts (one OSD of
size 320GB per host), and 16 clients which are reading
and writing to the cluster. I have one erasure-coded pool (shec plugin) with
k=8, m=4, c=3 and pg_num=256. The failure domain is host.
I am able to reach a HEALTH_OK state and everything is working as expected. The
pool was populated with
114048 files of different sizes ranging from 1kB to 4GB. The total amount of
data in the pool was around 3TB, and the capacity of the
pool was around 10TB.
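For reference, those capacity figures can be sanity-checked with a bit of arithmetic; a minimal sketch, assuming the ~10TB pool capacity quoted above is raw capacity and that the on-disk overhead follows the usual EC factor (k+m)/k:

```python
# Sanity check of the cluster's capacity figures.
# Assumptions: the ~10TB figure is raw capacity, and overhead is the
# standard erasure-coding factor (k+m)/k; shec's c=3 affects recovery
# behaviour, not on-disk overhead.
n_osds = 32
osd_size_gb = 320
k, m = 8, 4  # shec profile from the post

raw_gb = n_osds * osd_size_gb   # 10240 GB, i.e. ~10 TB raw
overhead = (k + m) / k          # 1.5x on-disk overhead
usable_gb = raw_gb / overhead   # ~6827 GB usable

print(raw_gb, overhead, round(usable_gb))
```

So the ~3TB of data fits comfortably even after the 1.5x EC overhead.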

I want to evaluate how Ceph rebalances data when

1) I take two OSDs out, and
2) I rejoin these two OSDs.

For scenario 1) I am "killing" two OSDs via ceph osd out <osd-id>. Ceph notices
the failure and starts rebalancing data until I
reach HEALTH_OK again.

For scenario 2) I rejoin the previously killed OSDs via ceph osd in
<osd-id>. Again, Ceph notices the change and starts
rebalancing data until the HEALTH_OK state is reached.

I repeated this whole experiment four times. What I noticed is that the
rebalancing process in the event of two OSDs joining the
cluster takes more than three times as long as in the event of the loss of two
OSDs. This was consistent over all four runs.

I expected both recovery times to be roughly equal, since in both scenarios the
number of degraded objects was around 8% and the
number of missing objects around 2%. I attached a visualization of the recovery
process in terms of degraded and missing objects;
the first part is the scenario where the two OSDs "failed", the second is the
rejoining of these two OSDs. Note how it takes significantly longer
to recover in the second case.
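In case it helps anyone reproduce the measurement: the degraded/missing ratios can be sampled from ceph status. A minimal sketch, assuming the "missing" numbers above correspond to misplaced objects, and that the pgmap field names (num_objects, degraded_objects, misplaced_objects) match what `ceph -s --format json` reports on your release:

```python
import json
import subprocess

def recovery_ratios(status):
    """Given a parsed `ceph -s --format json` status dict, return the
    fraction of degraded and misplaced objects from the pgmap section.
    Field names are assumptions based on recent releases and may
    differ on older ones."""
    pg = status["pgmap"]
    total = pg.get("num_objects", 0) or 1  # avoid division by zero
    return {
        "degraded": pg.get("degraded_objects", 0) / total,
        "misplaced": pg.get("misplaced_objects", 0) / total,
    }

def sample_cluster():
    """Poll a live cluster; requires the ceph CLI and a reachable cluster."""
    out = subprocess.check_output(["ceph", "-s", "--format", "json"])
    return recovery_ratios(json.loads(out))
```

Calling sample_cluster() in a loop (with a timestamp) over both scenarios gives the same kind of curves as in the attached visualization.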

Now I want to understand why it takes longer. I would appreciate any hints!

Thanks!
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
