Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-03-24 Thread Mazzystr
Hi Sage, Thanks for chiming in. I can't imagine how busy you are. Sorry guys. I reprovisioned the offending osd right after this email and a conversation on #ceph. I do have the output from '/usr/bin/ceph daemon osd.5 perf dump | /usr/bin/jq .' saved. I'll be happy to add it to the issue
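The perf dump mentioned above is where the spillover shows up: the thread repeatedly points at the `slow_used_bytes` / `slow_total_bytes` counters under the `bluefs` section. A minimal sketch of checking for spillover from that JSON (the sample values below are hypothetical, not from the reporter's cluster):

```python
import json

def check_spillover(perf_dump: dict) -> bool:
    """Return True if this OSD has spilled DB data onto the slow device.

    Inspects the 'bluefs' counters from `ceph daemon osd.N perf dump`;
    a non-zero slow_used_bytes means RocksDB data lives on the slow (HDD) device.
    """
    bluefs = perf_dump.get("bluefs", {})
    return bluefs.get("slow_used_bytes", 0) > 0

# Hypothetical sample resembling the counters discussed in the thread.
sample = json.loads(
    '{"bluefs": {"db_total_bytes": 30064771072, '
    '"db_used_bytes": 3221225472, '
    '"slow_total_bytes": 107374182400, '
    '"slow_used_bytes": 1073741824}}'
)
print(check_spillover(sample))  # True: 1 GiB has spilled onto slow storage
```

The same check can be done on the command line with the `jq` pipeline quoted above, filtered to `.bluefs.slow_used_bytes`.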

Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-03-22 Thread Konstantin Shalygin
On 3/23/19 12:20 AM, Mazzystr wrote: inline... On Fri, Mar 22, 2019 at 1:08 PM Konstantin Shalygin wrote: On 3/22/19 11:57 PM, Mazzystr wrote: > I am also seeing BlueFS spill since updating to Nautilus. I also see > high slow_used_bytes and

Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-03-22 Thread Mazzystr
inline... On Fri, Mar 22, 2019 at 1:08 PM Konstantin Shalygin wrote: > On 3/22/19 11:57 PM, Mazzystr wrote: > > I am also seeing BlueFS spill since updating to Nautilus. I also see > > high slow_used_bytes and slow_total_bytes metrics. It sure looks to > > me that the only solution is to zap

Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-03-22 Thread Konstantin Shalygin
On 3/22/19 11:57 PM, Mazzystr wrote: I am also seeing BlueFS spill since updating to Nautilus. I also see high slow_used_bytes and slow_total_bytes metrics. It sure looks to me that the only solution is to zap and rebuild the osd. I had to manually check 36 osds some of them traditional

Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-03-22 Thread Sage Weil
I have a ticket open for this: http://tracker.ceph.com/issues/38745 Please comment there with the health warning you're seeing and any other details so we can figure out why it's happening. I wouldn't reprovision those OSDs yet, until we know why it happens. Also, it's likely that

Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-03-22 Thread Mazzystr
I am also seeing BlueFS spill since updating to Nautilus. I also see high slow_used_bytes and slow_total_bytes metrics. It sure looks to me that the only solution is to zap and rebuild the osd. I had to manually check 36 osds some of them traditional processes and some containerized. The lack

Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-03-17 Thread Konstantin Shalygin
Yes, I was in a similar situation initially where I had deployed my OSD's with 25GB DB partitions and after 3GB DB used, everything else was going into slowDB on disk. From memory 29GB was just enough to make the DB fit on flash, but 30GB is a safe round figure to aim for. With a 30GB DB

Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-02-25 Thread Nick Fisk
> -Original Message- > From: Vitaliy Filippov > Sent: 23 February 2019 20:31 > To: n...@fisk.me.uk; Serkan Çoban > Cc: ceph-users > Subject: Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow > storage for db - why?

Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-02-25 Thread Nick Fisk
> -Original Message- > From: Konstantin Shalygin > Sent: 22 February 2019 14:23 > To: Nick Fisk > Cc: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow > storage for db - why? > > Bluestore/RocksDB will

Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-02-24 Thread Anthony D'Atri
> Date: Fri, 22 Feb 2019 16:26:34 -0800 > From: solarflow99 > > > Aren't you undersized at only 30GB? I thought you should have 4% of your > OSDs The 4% guidance is new. Until relatively recently the oft-suggested and default value was 1%. > From: "Vitaliy Filippov" > Numbers are easy to

Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-02-23 Thread Vitaliy Filippov
Numbers are easy to calculate from RocksDB parameters, however I also don't understand why it's 3 -> 30 -> 300... Default memtables are 256 MB, there are 4 of them, so L0 should be 1 GB, L1 should be 10 GB, and L2 should be 100 GB? These sizes are roughly 3GB,30GB,300GB. Anything
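The 3/30/300 figures quoted in the thread can be reproduced if one assumes the commonly cited BlueStore RocksDB defaults rather than stock RocksDB ones: `max_bytes_for_level_base` = 256 MB and `max_bytes_for_level_multiplier` = 10, so the levels are 0.25, 2.5, 25, 250 GB, and a DB volume is only fully used if it holds every level up to some Ln. A sketch of that arithmetic (the parameter values are an assumption, not confirmed by this thread):

```python
def cumulative_level_sizes(base_gb=0.25, multiplier=10, levels=4):
    """Cumulative DB size needed for RocksDB levels L1..Ln to all fit on fast storage.

    Assumes max_bytes_for_level_base = 256 MB and a level multiplier of 10,
    the commonly cited BlueStore defaults.
    """
    sizes, total, level = [], 0.0, base_gb
    for _ in range(levels):
        total += level
        sizes.append(total)
        level *= multiplier
    return sizes

print(cumulative_level_sizes())  # [0.25, 2.75, 27.75, 277.75]
```

Rounding those cumulative totals up gives the rough 3 GB / 30 GB / 300 GB tiers, and also explains the earlier observation that 29 GB was "just enough" while 30 GB is a safe round figure.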

Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-02-22 Thread solarflow99
Aren't you undersized at only 30GB? I thought you should have 4% of your OSDs On Fri, Feb 22, 2019 at 3:10 PM Nick Fisk wrote: > >On 2/16/19 12:33 AM, David Turner wrote: > >> The answer is probably going to be in how big your DB partition is vs > >> how big your HDD disk is. From your

Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-02-22 Thread Konstantin Shalygin
Bluestore/RocksDB will only put the next level up size of DB on flash if the whole size will fit. These sizes are roughly 3GB,30GB,300GB. Anything in-between those sizes is pointless. Only ~3GB of SSD will ever be used out of a 28GB partition. Likewise a 240GB partition is also pointless as
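The fit rule described above can be sketched as picking the largest tier that fully fits inside the DB partition; anything beyond that tier ends up on the slow device. A minimal illustration using the rough 3/30/300 GB thresholds from the thread (the exact cutoffs depend on the RocksDB options in use):

```python
# Approximate cumulative level totals (GB) at which one more RocksDB
# level fits entirely on flash -- the rough tiers quoted in the thread.
TIERS_GB = [3, 30, 300]

def usable_db_gb(partition_gb: float) -> float:
    """Largest tier that fully fits; DB data beyond it spills to slow storage."""
    fitting = [t for t in TIERS_GB if t <= partition_gb]
    return fitting[-1] if fitting else 0

print(usable_db_gb(28))   # 3  -> a 28 GB partition only ever uses ~3 GB
print(usable_db_gb(240))  # 30 -> likewise, only ~30 GB of a 240 GB partition
```

This is why the sizes in between the tiers buy nothing: a 28 GB partition behaves like a 3 GB one, and a 240 GB partition like a 30 GB one.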

Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-02-22 Thread Serkan Çoban
>Where did you get those numbers? I would like to read more if you can point to a link. Just found the link: https://github.com/facebook/rocksdb/wiki/Leveled-Compaction On Fri, Feb 22, 2019 at 4:22 PM Serkan Çoban wrote: > > >>These sizes are roughly 3GB,30GB,300GB. Anything in-between those

Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-02-22 Thread Serkan Çoban
>>These sizes are roughly 3GB,30GB,300GB. Anything in-between those sizes are >>pointless. Only ~3GB of SSD will ever be used out of a 28GB partition. Likewise a 240GB partition is also pointless as only ~30GB will be used. Where did you get those numbers? I would like to read more if you can

Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-02-22 Thread Nick Fisk
>On 2/16/19 12:33 AM, David Turner wrote: >> The answer is probably going to be in how big your DB partition is vs >> how big your HDD disk is. From your output it looks like you have a >> 6TB HDD with a 28GB Blocks.DB partition. Even though the DB used >> size isn't currently full, I would

Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-02-20 Thread Igor Fedotov
You're right - WAL/DB expansion capability is present in Luminous+ releases. But David meant volume migration stuff which appeared in Nautilus, see: https://github.com/ceph/ceph/pull/23103 Thanks, Igor On 2/20/2019 9:22 AM, Konstantin Shalygin wrote: On 2/19/19 11:46 PM, David Turner
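For reference, the Nautilus-era workflow Igor alludes to is driven by `ceph-bluestore-tool`. A hedged sketch, assuming the `bluefs-bdev-migrate` and `bluefs-bdev-expand` subcommands from the linked PR; the OSD id and paths are placeholders:

```shell
# Sketch only -- Nautilus-era ceph-bluestore-tool usage
# (see https://github.com/ceph/ceph/pull/23103); osd.5 and paths are placeholders.

# Stop the OSD, then move BlueFS data off the slow device onto the
# (new or enlarged) fast DB device:
systemctl stop ceph-osd@5
ceph-bluestore-tool bluefs-bdev-migrate \
    --path /var/lib/ceph/osd/ceph-5 \
    --devs-source /var/lib/ceph/osd/ceph-5/block \
    --dev-target /var/lib/ceph/osd/ceph-5/block.db

# After growing the underlying DB partition, tell BlueFS about the new size:
ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-5
systemctl start ceph-osd@5
```

On pre-Nautilus releases only the expand path exists, which is the distinction being made in this message.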

Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-02-19 Thread Konstantin Shalygin
On 2/19/19 11:46 PM, David Turner wrote: I don't know that there's anything that can be done to resolve this yet without rebuilding the OSD.  Based on a Nautilus tool being able to resize the DB device, I'm assuming that Nautilus is also capable of migrating the DB/WAL between devices.  That

Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-02-19 Thread David Turner
I don't know that there's anything that can be done to resolve this yet without rebuilding the OSD. Based on a Nautilus tool being able to resize the DB device, I'm assuming that Nautilus is also capable of migrating the DB/WAL between devices. That functionality would allow anyone to migrate

Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-02-18 Thread Konstantin Shalygin
On 2/18/19 9:43 PM, David Turner wrote: Do you have historical data from these OSDs to see when/if the DB used on osd.73 ever filled up?  To account for this OSD using the slow storage for DB, all we need to do is show that it filled up the fast DB at least once.  If that happened, then

Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-02-18 Thread David Turner
Do you have historical data from these OSDs to see when/if the DB used on osd.73 ever filled up? To account for this OSD using the slow storage for DB, all we need to do is show that it filled up the fast DB at least once. If that happened, then something spilled over to the slow storage and has

Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-02-15 Thread Konstantin Shalygin
On 2/16/19 12:33 AM, David Turner wrote: The answer is probably going to be in how big your DB partition is vs how big your HDD disk is.  From your output it looks like you have a 6TB HDD with a 28GB Blocks.DB partition.  Even though the DB used size isn't currently full, I would guess that at

Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-02-15 Thread David Turner
The answer is probably going to be in how big your DB partition is vs how big your HDD disk is. From your output it looks like you have a 6TB HDD with a 28GB Blocks.DB partition. Even though the DB used size isn't currently full, I would guess that at some point since this OSD was created that

Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-02-14 Thread Konstantin Shalygin
Wrong metadata paste of osd.73 in previous message.

{
    "id": 73,
    "arch": "x86_64",
    "back_addr": "10.10.10.6:6804/175338",
    "back_iface": "vlan3",
    "bluefs": "1",
    "bluefs_db_access_mode": "blk",
    "bluefs_db_block_size": "4096",
    "bluefs_db_dev": "259:22",