Re: [ceph-users] BlueFS spillover detected - 14.2.1

2019-06-19 Thread Nigel Williams
On Thu, 20 Jun 2019 at 09:12, Vitaliy Filippov wrote:
> All values except 4, 30 and 286 GB are currently useless in ceph with
> default rocksdb settings :)

However, several commenters have said that RocksDB needs extra space during compaction, and hence the DB partition needs to be sized larger than the level it is meant to hold.
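A rough sketch of where those 4 / 30 / 286 GB figures come from, assuming the default bluestore_rocksdb_options (max_bytes_for_level_base = 256 MB, max_bytes_for_level_multiplier = 10); exact numbers can differ per build:

    L1 = 256 MB
    L2 = L1 * 10 = ~2.5 GB
    L3 = L2 * 10 = ~25.6 GB
    L4 = L3 * 10 = ~256 GB

A level only lands on the fast (DB) device if it fits there entirely, so the useful partition sizes are roughly the cumulative sums plus WAL overhead: ~4 GB (WAL + L0-L2), ~30 GB (+ L3), ~286 GB (+ L4). Anything in between is simply unused.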

Re: [ceph-users] BlueFS spillover detected - 14.2.1

2019-06-19 Thread Vitaliy Filippov
All values except 4, 30 and 286 GB are currently useless in ceph with default rocksdb settings :) That's what you are seeing - all devices just use ~28 GB and everything else goes to HDDs.

--
With best regards,
Vitaliy Filippov
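One way to confirm how much BlueFS data actually sits on the fast device versus the slow one is the per-OSD BlueFS perf counters. A hedged example (osd.0 is a placeholder; run on the node hosting that OSD):

    # db_*/slow_* counters come from the "bluefs" section of the perf dump
    ceph daemon osd.0 perf dump | egrep '"(db|slow)_(total|used)_bytes"'

If slow_used_bytes is non-zero, that OSD has DB data spilled onto the slow device.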

Re: [ceph-users] BlueFS spillover detected - 14.2.1

2019-06-18 Thread Igor Fedotov
Yes, for now I'd personally prefer 60-64 GB per DB volume if one is unable to allocate 300+ GB. This is about twice what your DBs keep right now (and pretty much in line with the RocksDB level 3 max size).

Thanks,
Igor

On 6/18/2019 9:30 PM, Brett Chancellor wrote:
> Thanks Igor. I'm
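If redeploying every OSD is too costly, ceph-bluestore-tool can expand or migrate BlueFS volumes in place. A hedged sketch only - device paths and the OSD id are placeholders, and sub-command availability depends on the exact 14.2.x release, so check ceph-bluestore-tool --help first:

    systemctl stop ceph-osd@0
    # grow BlueFS into an enlarged block.db partition
    ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-0
    # or move spilled BlueFS data from the slow device onto the DB device
    ceph-bluestore-tool bluefs-bdev-migrate --path /var/lib/ceph/osd/ceph-0 \
        --devs-source /var/lib/ceph/osd/ceph-0/block \
        --dev-target /var/lib/ceph/osd/ceph-0/block.db
    systemctl start ceph-osd@0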

Re: [ceph-users] BlueFS spillover detected - 14.2.1

2019-06-18 Thread Brett Chancellor
Thanks Igor. I'm fine turning the warnings off, but it's curious that only this cluster is showing the alerts. Is there any value in rebuilding the OSDs with smaller SSD metadata volumes? Say 60GB or 30GB?

-Brett

On Tue, Jun 18, 2019 at 1:55 PM Igor Fedotov wrote:
> Hi Brett,
>
> this issue has been

Re: [ceph-users] BlueFS spillover detected - 14.2.1

2019-06-18 Thread Igor Fedotov
Hi Brett,

this issue has been with you long before the upgrade to 14.2.1. The upgrade just made the corresponding alert visible. You can turn the alert off by setting bluestore_warn_on_bluefs_spillover=false. But generally this warning indicates a DB data layout inefficiency - some data is kept at the slow device instead of the fast DB one.
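For reference, a hedged example of applying that setting cluster-wide from the monitors (you could equally put the option in ceph.conf under [global] or [osd]):

    # silence the BLUEFS_SPILLOVER health warning without changing the data layout
    ceph config set global bluestore_warn_on_bluefs_spillover false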

[ceph-users] BlueFS spillover detected - 14.2.1

2019-06-18 Thread Brett Chancellor
Does anybody have a fix for the "BlueFS spillover detected" warning? This started happening 2 days after an upgrade to 14.2.1 and has increased from 3 OSDs to 118 in the last 4 days. I read you could fix it by rebuilding the OSDs, but rebuilding the 264 OSDs on this cluster would take months of rebalancing.

$ s
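The command output above is cut off in the archive. As an illustrative way to see which OSDs are affected (not necessarily the command that was truncated here), the health detail output breaks the warning down per OSD:

    # the BLUEFS_SPILLOVER section lists each OSD and how much spilled to the slow device
    ceph health detail | grep -i spill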