Hi ceph users,

I am using CephFS for file storage and I have noticed that the data gets 
distributed very unevenly across OSDs.

  *   I have about 90 OSDs across 8 hosts, and 4096 PGs for the cephfs_data 
pool with 2 replicas, which is in line with the total PG recommendation of 
"Total PGs = (OSDs * 100) / pool_size" from the docs (see the quick arithmetic 
sketch after this list).
  *   CephFS distributes the data pretty much evenly across the PGs as shown by 
'ceph pg dump'
  *   However, the number of PGs assigned to the various OSDs (per weight 
unit/terabyte) varies quite a lot.  The fullest OSD has as many as 44 PGs per 
terabyte (weight unit), while the emptiest ones have as few as 19 or 20.
  *   Even if I consider the total number of PGs across all pools per OSD, the 
number varies just as wildly as for the cephfs_data pool alone.
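
For reference, here is the sizing arithmetic behind the first point (a rough 
sketch in Python; the 100 PGs-per-OSD target is the figure from the docs 
formula, everything else is just my cluster's numbers):

    osds = 90
    replicas = 2
    target_pgs_per_osd = 100                       # target from the docs formula
    raw = osds * target_pgs_per_osd / replicas     # -> 4500
    pg_num = 4096                                  # nearest power of two, what I actually set
    pg_copies_per_osd = pg_num * replicas / osds   # -> ~91 PG copies per OSD on average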

As a result, when the CephFS file system as a whole is only 60% full, some of 
the OSDs already hit the 95% full condition, and no more data can be written to 
the system.
Is there any way to force a more even distribution of PGs to OSDs?  I am using 
the default crush map, with two levels (root/host).  Can any changes to the 
crush map help?  I would really like to get higher disk utilization than 60% 
without one of the 90 disks filling up so early.

Thanks,

Andras
