We are running Hammer 0.94.7 and have had very bad experiences with PG 
directories splitting into further subfolders: OSDs being marked out, hundreds 
of blocked requests, and so on.  We have modified our settings and watched the 
behavior match the Ceph documentation for splitting, but right now the 
subfolders are splitting earlier than the documentation says they should.

filestore_split_multiple * abs(filestore_merge_threshold) * 16

Our filestore_merge_threshold is set to 40.  When we had 
filestore_split_multiple set to 8, subfolders split once they held 
(8 * 40 * 16 = ) 5120 objects.  On a different cluster we had to push that 
threshold back further by raising the setting to 16, and those subfolders 
split when they held (16 * 40 * 16 = ) 10240 objects.
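
Just to make the arithmetic explicit, here is the documented threshold worked 
out in Python for those two settings (a throwaway sketch restating the formula 
above; the helper name is ours, not anything from the Ceph source):

    # Split happens at filestore_split_multiple * abs(filestore_merge_threshold) * 16
    def split_threshold(split_multiple, merge_threshold, leaf_factor=16):
        return split_multiple * abs(merge_threshold) * leaf_factor

    print(split_threshold(8, 40))    # 5120
    print(split_threshold(16, 40))   # 10240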

Another cluster we're working with is splitting at a value that looks like a 
hardcoded maximum.  With our settings it shouldn't split until a subfolder 
holds (32 * 40 * 16 = ) 20480 objects, but it seems to be splitting subfolders 
at 12800 objects.
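
Running the same formula backwards, 12800 works out to an effective multiple 
of 20 with our merge threshold of 40, rather than the 32 we configured (plain 
arithmetic, not anything pulled from the code):

    # 12800 / (abs(filestore_merge_threshold) * 16)
    print(12800 / (40 * 16))   # 20.0, not the 32 we set
    print(32 * 40 * 16)        # 20480, what we expected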

Normally I would expect such a limit to be a power of 2, but we recently ran 
into another hardcoded maximum where the object map only allows RBDs with at 
most 256,000,000 objects.  12800 fits the same pattern of a power of 2 
followed by a string of zeros, which makes it look like another hardcoded 
maximum.
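
For what it's worth, the pattern check itself (simple arithmetic only):

    print(12800 == 128 * 100)           # True: 2**7 followed by zeros
    print(256000000 == 256 * 1000000)   # True: 2**8 followed by zeros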

Has anyone else encountered what seems to be a hardcoded maximum here?  Are we 
missing a setting elsewhere that is capping us or reducing the effective 
value?  More to the point, though, is there any way to mitigate how painful it 
is to split subfolders in PGs?  So far the only way we've found is to raise 
the setting and later drop it back down, during a week we plan to have our 
cluster plagued with blocked requests, all while cranking up 
osd_heartbeat_grace so that we don't have flapping OSDs.
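
Roughly, that workaround amounts to temporarily juggling these in ceph.conf 
(values shown are illustrative, not a recommendation):

    [osd]
    filestore_merge_threshold = 40
    # pushed up to delay splitting, dropped back down later
    filestore_split_multiple = 32
    # raised while splits are in flight so OSDs don't flap (value illustrative)
    osd_heartbeat_grace = 60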

A little more about our setup: each storage node has 32x 4TB HGST drives with 
4x 200GB Intel DC S3710 journals (8 drives per journal SSD), dual 
hyper-threaded octa-core Xeons (32 virtual cores), 192GB of memory, and 
redundant 10Gb networking.

________________________________

David Turner | Cloud Operations Engineer | StorageCraft Technology 
Corporation <https://storagecraft.com>
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2760 | Mobile: 385.224.2943
