[ceph-users] Re: Space leak in Bluestore

2020-03-27 Thread vitalif
Update on my issue. It seems it was caused by broken compression, which one 
of the 14.2.x releases (Ubuntu builds) probably had.


My OSD versions were mixed: five OSDs were 14.2.7, one was 14.2.4, and the 
other six were 14.2.8.


I moved the same PG several more times. Space usage dropped when the PG 
was moved from 11,6,0 to 6,0,7, then rose again after moving the PG back.


Then I upgraded all OSDs and moved the PG again. Space usage dropped to 
the original point...


So now I probably have to move all PGs twice, because I don't know which 
ones are affected...



Hi,

The cluster is all-flash (NVMe), so the removal is fast and it's in
fact pretty noticeable, even on Prometheus graphs.

Also I've logged raw space usage from `ceph -f json df`:

1) before pg rebalance started the space usage was 32724002664448 bytes
2) just before the rebalance finished it was 32883513622528 bytes
(1920 of ~120k objects misplaced) = +100 GB
3) then it started to drop, not instantly but fast, and stopped at
32785906380800 = +58 GB over the original

I've repeated it several times. The behaviour is always the same.
First it copies the PG, then removes the old copy, but space usage
doesn't drop to the original point. It's obviously not client I/O either;
it always happens exactly during the rebalance.


Hi Vitaliy,

just as a guess to verify:

a while ago I observed a very long removal of a (pretty large) pool.
It took several days to complete. The DB was on a spinner, which was one
of the drivers of this slow behavior.

Another one is the PG removal design, which enumerates up to 30 entries
max to fill a single removal batch, then executes it, everything in a
single "thread". So the process is pretty slow for millions of objects...

During removal, the pool (read: PG) space stayed in use and decreased
slowly. Pretty high DB volume utilization was observed.

I assume rebalance performs PG removal as well - maybe that's the case?

Thanks,

Igor
On 3/26/2020 1:51 AM, Виталий Филиппов wrote:


Hi Igor,

I think so because
1) space usage increases after each rebalance. Even when the same pg
is moved twice (!)
2) I use 4k min_alloc_size from the beginning

One crazy hypothesis is that maybe ceph allocates space for
uncompressed objects, then compresses them and leaks
(uncompressed-compressed) space. Really crazy idea but who knows
o_O.

I already did a deep fsck, it didn't help... what else could I
check?...

On March 26, 2020, 1:40:52 GMT+03:00, Igor Fedotov wrote:

Bluestore fsck/repair detects and fixes leaks at the Bluestore level, but I
doubt your issue is here. To be honest, I don't understand from the overview
why you think that there are any leaks at all.

Not sure whether this is relevant, but from my experience space "leaks" are
sometimes caused by a 64K allocation unit and keeping tons of small files,
or by massive small EC overwrites. To verify whether this is applicable you
might want to inspect the bluestore performance counters (bluestore_stored
vs. bluestore_allocated) to estimate your losses due to high allocation
units. A significant difference at multiple OSDs might indicate that the
overhead is caused by high allocation granularity. Compression might make
this analysis not that simple though...

Thanks, Igor

On 3/26/2020 1:19 AM, vita...@yourcmc.ru wrote:

I have a question regarding this problem - is it possible to rebuild
bluestore allocation metadata? I could try it to test if it's an allocator
problem...

Hi. I'm experiencing some kind of a space leak in Bluestore. I use EC,
compression and snapshots. First I thought that the leak was caused by
"virtual clones" (issue #38184). However, then I got rid of most of the
snapshots, but continued to experience the problem. I suspected something
when I added a new disk to the cluster and free space in the cluster
didn't increase (!).

So to track down the issue I moved one PG (34.1a) using upmaps from
osd11,6,0 to osd6,0,7 and then back to osd11,6,0. It ate +59 GB after the
first move and +51 GB after the second. As I understand it, this proves
that it's not #38184. Devirtualization of virtual clones couldn't eat
additional space after the SECOND rebalance of the same PG.

The PG has ~39000 objects, it is EC 2+1 and compression is enabled. The
compression ratio is about 2.7 in my setup, so the PG should use ~90 GB of
raw space.

Before and after moving the PG I stopped osd0, mounted it with
ceph-objectstore-tool with debug bluestore = 20/20 and opened the
34.1a***/all directory. It seems to dump all object extents into the log
in that case. So now I have two logs with all allocated extents for osd0
(I hope all extents are there). I parsed both logs and added all compressed
blob sizes together ("get_ref Blob ... 0x2 -> 0x... compressed"). But they
add up to ~39 GB before the first rebalance (34.1as2), ~22 GB after it
(34.1as1) and ~41 GB again after the second move (34.1as2), which doesn't
indicate a leak. But the raw space usage still exceeds the initial value by
a lot. So it's clear that there's a leak somewhere.

What additional details can I provide for you to identify the bug? I posted
the same message in the issue tracker, https://tracker.ceph.com/issues/44731

[ceph-users] Re: Space leak in Bluestore

2020-03-26 Thread vitalif

Hi,

The cluster is all-flash (NVMe), so the removal is fast and it's in fact 
pretty noticeable, even on Prometheus graphs.


Also I've logged raw space usage from `ceph -f json df`:

1) before pg rebalance started the space usage was 32724002664448 bytes
2) just before the rebalance finished it was 32883513622528 bytes (1920 
of ~120k objects misplaced) = +100 GB
3) then it started to drop, not instantly but fast, and stopped at 
32785906380800 = +58 GB over the original
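
In case it's useful, the logging was basically a polling loop like the 
sketch below (field names follow the Nautilus-era `ceph -f json df` layout 
with a top-level "stats" section and may need adjusting on other releases):

#!/usr/bin/env python3
# Poll `ceph -f json df` and print the cluster-wide raw-usage counter over
# time, one line per sample, for later plotting.
import json
import subprocess
import time

# the counter name differs a bit between releases, so try a couple of names
CANDIDATE_KEYS = ("total_used_raw_bytes", "total_used_bytes")

def raw_used_bytes():
    stats = json.loads(subprocess.check_output(["ceph", "-f", "json", "df"]))["stats"]
    for key in CANDIDATE_KEYS:
        if key in stats:
            return stats[key]
    raise KeyError("no raw-usage counter found in 'ceph df' output")

while True:
    print(int(time.time()), raw_used_bytes(), flush=True)
    time.sleep(60)  # one sample per minute is plenty to see a rebalance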


I've repeated it several times. The behaviour is always the same. First 
it copies the PG, then removes the old copy, but space usage doesn't 
drop to the original point. It's obviously not client I/O either; it always 
happens exactly during the rebalance.



Hi Vitaliy,

just as a guess to verify:

a while ago I observed a very long removal of a (pretty large) pool.
It took several days to complete. The DB was on a spinner, which was one
of the drivers of this slow behavior.

Another one is the PG removal design, which enumerates up to 30 entries
max to fill a single removal batch, then executes it, everything in a
single "thread". So the process is pretty slow for millions of objects...

During removal, the pool (read: PG) space stayed in use and decreased
slowly. Pretty high DB volume utilization was observed.

I assume rebalance performs PG removal as well - maybe that's the case?

Thanks,

Igor
On 3/26/2020 1:51 AM, Виталий Филиппов wrote:


Hi Igor,

I think so because
1) space usage increases after each rebalance. Even when the same pg
is moved twice (!)
2) I use 4k min_alloc_size from the beginning

One crazy hypothesis is that maybe ceph allocates space for
uncompressed objects, then compresses them and leaks
(uncompressed-compressed) space. Really crazy idea but who knows
o_O.

I already did a deep fsck, it didn't help... what else could I
check?...

On March 26, 2020, 1:40:52 GMT+03:00, Igor Fedotov wrote:

Bluestore fsck/repair detects and fixes leaks at the Bluestore level, but I
doubt your issue is here. To be honest, I don't understand from the overview
why you think that there are any leaks at all.

Not sure whether this is relevant, but from my experience space "leaks" are
sometimes caused by a 64K allocation unit and keeping tons of small files,
or by massive small EC overwrites. To verify whether this is applicable you
might want to inspect the bluestore performance counters (bluestore_stored
vs. bluestore_allocated) to estimate your losses due to high allocation
units. A significant difference at multiple OSDs might indicate that the
overhead is caused by high allocation granularity. Compression might make
this analysis not that simple though...

Thanks, Igor

On 3/26/2020 1:19 AM, vita...@yourcmc.ru wrote:

I have a question regarding this problem - is it possible to rebuild
bluestore allocation metadata? I could try it to test if it's an allocator
problem...

Hi. I'm experiencing some kind of a space leak in Bluestore. I use EC,
compression and snapshots. First I thought that the leak was caused by
"virtual clones" (issue #38184). However, then I got rid of most of the
snapshots, but continued to experience the problem. I suspected something
when I added a new disk to the cluster and free space in the cluster
didn't increase (!).

So to track down the issue I moved one PG (34.1a) using upmaps from
osd11,6,0 to osd6,0,7 and then back to osd11,6,0. It ate +59 GB after the
first move and +51 GB after the second. As I understand it, this proves
that it's not #38184. Devirtualization of virtual clones couldn't eat
additional space after the SECOND rebalance of the same PG.

The PG has ~39000 objects, it is EC 2+1 and compression is enabled. The
compression ratio is about 2.7 in my setup, so the PG should use ~90 GB of
raw space.

Before and after moving the PG I stopped osd0, mounted it with
ceph-objectstore-tool with debug bluestore = 20/20 and opened the
34.1a***/all directory. It seems to dump all object extents into the log
in that case. So now I have two logs with all allocated extents for osd0
(I hope all extents are there). I parsed both logs and added all compressed
blob sizes together ("get_ref Blob ... 0x2 -> 0x... compressed"). But they
add up to ~39 GB before the first rebalance (34.1as2), ~22 GB after it
(34.1as1) and ~41 GB again after the second move (34.1as2), which doesn't
indicate a leak. But the raw space usage still exceeds the initial value by
a lot. So it's clear that there's a leak somewhere.

What additional details can I provide for you to identify the bug? I posted
the same message in the issue tracker, https://tracker.ceph.com/issues/44731


--
With best regards,
Vitaliy Filippov

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Space leak in Bluestore

2020-03-26 Thread Igor Fedotov

Hi Vitaliy,

just as a guess to verify:

a while ago I observed a very long removal of a (pretty large) pool. It 
took several days to complete. The DB was on a spinner, which was one of 
the drivers of this slow behavior.


Another one is the PG removal design, which enumerates up to 30 entries 
max to fill a single removal batch, then executes it, everything in a 
single "thread". So the process is pretty slow for millions of objects...


During removal, the pool (read: PG) space stayed in use and decreased 
slowly. Pretty high DB volume utilization was observed.


I assume rebalance performs PG removal as well - maybe that's the case?


Thanks,

Igor

On 3/26/2020 1:51 AM, Виталий Филиппов wrote:

Hi Igor,

I think so because
1) space usage increases after each rebalance. Even when the same pg 
is moved twice (!)

2) I use 4k min_alloc_size from the beginning

One crazy hypothesis is that maybe ceph allocates space for 
uncompressed objects, then compresses them and leaks 
(uncompressed-compressed) space. Really crazy idea but who knows o_O.


I already did a deep fsck, it didn't help... what else could I check?...

On March 26, 2020, 1:40:52 GMT+03:00, Igor Fedotov wrote:


Bluestore fsck/repair detects and fixes leaks at the Bluestore level, but I
doubt your issue is here.

To be honest, I don't understand from the overview why you think that
there are any leaks at all.

Not sure whether this is relevant, but from my experience space "leaks"
are sometimes caused by a 64K allocation unit and keeping tons of small
files, or by massive small EC overwrites.

To verify whether this is applicable you might want to inspect the bluestore
performance counters (bluestore_stored vs. bluestore_allocated) to
estimate your losses due to high allocation units.

A significant difference at multiple OSDs might indicate that the overhead
is caused by high allocation granularity. Compression might make this
analysis not that simple though...


Thanks,

Igor


On 3/26/2020 1:19 AM, vita...@yourcmc.ru wrote:

I have a question regarding this problem - is it possible to
rebuild bluestore allocation metadata? I could try it to test
if it's an allocator problem...

Hi. I'm experiencing some kind of a space leak in
Bluestore. I use EC, compression and snapshots. First I
thought that the leak was caused by "virtual clones"
(issue #38184). However, then I got rid of most of the
snapshots, but continued to experience the problem. I
suspected something when I added a new disk to the cluster
and free space in the cluster didn't increase (!). So to
track down the issue I moved one PG (34.1a) using upmaps
from osd11,6,0 to osd6,0,7 and then back to osd11,6,0. It
ate +59 GB after the first move and +51 GB after the
second. As I understand this proves that it's not #38184.
Devirtualization of virtual clones couldn't eat additional
space after SECOND rebalance of the same PG. The PG has
~39000 objects, it is EC 2+1 and the compression is
enabled. Compression ratio is about ~2.7 in my setup, so
the PG should use ~90 GB raw space. Before and after
moving the PG I stopped osd0, mounted it with
ceph-objectstore-tool with debug bluestore = 20/20 and
opened the 34.1a***/all directory. It seems to dump all
object extents into the log in that case. So now I have
two logs with all allocated extents for osd0 (I hope all
extents are there). I parsed both logs and added all
compressed blob sizes together ("get_ref Blob ... 0x2
-> 0x... compressed"). But they add up to ~39 GB before
first rebalance (34.1as2), ~22 GB after it (34.1as1) and
~41 GB again after the second move (34.1as2) which doesn't
indicate a leak. But the raw space usage still exceeds
initial by a lot. So it's clear that there's a leak
somewhere. What additional details can I provide for you
to identify the bug? I posted the same message in the
issue tracker, https://tracker.ceph.com/issues/44731 



--
With best regards,
Vitaliy Filippov 

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Space leak in Bluestore

2020-03-25 Thread Виталий Филиппов
Hi Igor,

I think so because
1) space usage increases after each rebalance. Even when the same pg is moved 
twice (!)
2) I use 4k min_alloc_size from the beginning

One crazy hypothesis is that maybe ceph allocates space for uncompressed 
objects, then compresses them and leaks (uncompressed-compressed) space. Really 
crazy idea but who knows o_O.

I already did a deep fsck, it didn't help... what else could I check?...
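
(For the record, the deep fsck was along these lines - quoting the flags
from memory, so treat it as a sketch rather than the exact command:)

systemctl stop ceph-osd@0
ceph-bluestore-tool fsck --deep --path /var/lib/ceph/osd/ceph-0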

On March 26, 2020, 1:40:52 GMT+03:00, Igor Fedotov wrote:
>Bluestore fsck/repair detects and fixes leaks at the Bluestore level, but I
>doubt your issue is here.
>
>To be honest, I don't understand from the overview why you think that
>there are any leaks at all.
>
>Not sure whether this is relevant, but from my experience space "leaks"
>are sometimes caused by a 64K allocation unit and keeping tons of small
>files, or by massive small EC overwrites.
>
>To verify whether this is applicable you might want to inspect the bluestore
>performance counters (bluestore_stored vs. bluestore_allocated) to
>estimate your losses due to high allocation units.
>
>A significant difference at multiple OSDs might indicate that the overhead
>is caused by high allocation granularity. Compression might make this
>analysis not that simple though...
>
>
>Thanks,
>
>Igor
>
>
>On 3/26/2020 1:19 AM, vita...@yourcmc.ru wrote:
>> I have a question regarding this problem - is it possible to rebuild 
>> bluestore allocation metadata? I could try it to test if it's an 
>> allocator problem...
>>
>>> Hi.
>>>
>>> I'm experiencing some kind of a space leak in Bluestore. I use EC,
>>> compression and snapshots. First I thought that the leak was caused
>by
>>> "virtual clones" (issue #38184). However, then I got rid of most of
>>> the snapshots, but continued to experience the problem.
>>>
>>> I suspected something when I added a new disk to the cluster and
>free
>>> space in the cluster didn't increase (!).
>>>
>>> So to track down the issue I moved one PG (34.1a) using upmaps from
>>> osd11,6,0 to osd6,0,7 and then back to osd11,6,0.
>>>
>>> It ate +59 GB after the first move and +51 GB after the second. As I
>>> understand this proves that it's not #38184. Devirtualization of
>>> virtual clones couldn't eat additional space after SECOND rebalance
>of
>>> the same PG.
>>>
>>> The PG has ~39000 objects, it is EC 2+1 and the compression is
>>> enabled. Compression ratio is about ~2.7 in my setup, so the PG
>should
>>> use ~90 GB raw space.
>>>
>>> Before and after moving the PG I stopped osd0, mounted it with
>>> ceph-objectstore-tool with debug bluestore = 20/20 and opened the
>>> 34.1a***/all directory. It seems to dump all object extents into the
>>> log in that case. So now I have two logs with all allocated extents
>>> for osd0 (I hope all extents are there). I parsed both logs and
>added
>>> all compressed blob sizes together ("get_ref Blob ... 0x2 ->
>0x...
>>> compressed"). But they add up to ~39 GB before first rebalance
>>> (34.1as2), ~22 GB after it (34.1as1) and ~41 GB again after the
>second
>>> move (34.1as2) which doesn't indicate a leak.
>>>
>>> But the raw space usage still exceeds initial by a lot. So it's
>clear
>>> that there's a leak somewhere.
>>>
>>> What additional details can I provide for you to identify the bug?
>>>
>>> I posted the same message in the issue tracker,
>>> https://tracker.ceph.com/issues/44731

-- 
With best regards,
  Vitaliy Filippov
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Space leak in Bluestore

2020-03-25 Thread Igor Fedotov
Bluestore fsck/repair detects and fixes leaks at the Bluestore level, but I 
doubt your issue is here.


To be honest, I don't understand from the overview why you think that 
there are any leaks at all.


Not sure whether this is relevant, but from my experience space "leaks" 
are sometimes caused by a 64K allocation unit and keeping tons of small 
files, or by massive small EC overwrites.


To verify whether this is applicable you might want to inspect the bluestore 
performance counters (bluestore_stored vs. bluestore_allocated) to 
estimate your losses due to high allocation units.


A significant difference at multiple OSDs might indicate that the overhead 
is caused by high allocation granularity. Compression might make this 
analysis not that simple though...
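
A quick sketch of that check (assuming the counters sit under the
"bluestore" section of `ceph daemon osd.N perf dump`, as in Nautilus-era
builds, and that it is run on the OSD's host):

#!/usr/bin/env python3
# Compare bluestore_allocated vs bluestore_stored for one OSD via the admin
# socket. A large allocated-minus-stored gap across many OSDs hints at
# allocation-granularity overhead; note that with compression enabled,
# allocated can legitimately be smaller than stored, so read it with care.
import json
import subprocess
import sys

osd_id = sys.argv[1]  # e.g. "0"
perf = json.loads(subprocess.check_output(
    ["ceph", "daemon", "osd." + osd_id, "perf", "dump"]))
bs = perf["bluestore"]
allocated = bs["bluestore_allocated"]
stored = bs["bluestore_stored"]
print("osd.%s: allocated=%d stored=%d diff=%.1f GiB"
      % (osd_id, allocated, stored, (allocated - stored) / 2**30))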



Thanks,

Igor


On 3/26/2020 1:19 AM, vita...@yourcmc.ru wrote:
I have a question regarding this problem - is it possible to rebuild 
bluestore allocation metadata? I could try it to test if it's an 
allocator problem...



Hi.

I'm experiencing some kind of a space leak in Bluestore. I use EC,
compression and snapshots. First I thought that the leak was caused by
"virtual clones" (issue #38184). However, then I got rid of most of
the snapshots, but continued to experience the problem.

I suspected something when I added a new disk to the cluster and free
space in the cluster didn't increase (!).

So to track down the issue I moved one PG (34.1a) using upmaps from
osd11,6,0 to osd6,0,7 and then back to osd11,6,0.

It ate +59 GB after the first move and +51 GB after the second. As I
understand this proves that it's not #38184. Devirtualization of
virtual clones couldn't eat additional space after SECOND rebalance of
the same PG.

The PG has ~39000 objects, it is EC 2+1 and the compression is
enabled. Compression ratio is about ~2.7 in my setup, so the PG should
use ~90 GB raw space.

Before and after moving the PG I stopped osd0, mounted it with
ceph-objectstore-tool with debug bluestore = 20/20 and opened the
34.1a***/all directory. It seems to dump all object extents into the
log in that case. So now I have two logs with all allocated extents
for osd0 (I hope all extents are there). I parsed both logs and added
all compressed blob sizes together ("get_ref Blob ... 0x2 -> 0x...
compressed"). But they add up to ~39 GB before first rebalance
(34.1as2), ~22 GB after it (34.1as1) and ~41 GB again after the second
move (34.1as2) which doesn't indicate a leak.

But the raw space usage still exceeds initial by a lot. So it's clear
that there's a leak somewhere.

What additional details can I provide for you to identify the bug?

I posted the same message in the issue tracker,
https://tracker.ceph.com/issues/44731

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Space leak in Bluestore

2020-03-25 Thread vitalif
I have a question regarding this problem - is it possible to rebuild 
bluestore allocation metadata? I could try it to test if it's an 
allocator problem...



Hi.

I'm experiencing some kind of a space leak in Bluestore. I use EC,
compression and snapshots. First I thought that the leak was caused by
"virtual clones" (issue #38184). However, then I got rid of most of
the snapshots, but continued to experience the problem.

I suspected something when I added a new disk to the cluster and free
space in the cluster didn't increase (!).

So to track down the issue I moved one PG (34.1a) using upmaps from
osd11,6,0 to osd6,0,7 and then back to osd11,6,0.
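
(For the record, the moves were done with upmap exceptions; a single remap
looks roughly like the commands below. I'm not reproducing the exact
per-shard pairs here, and for EC pools the shard order matters:)

# pin pg 34.1a so the shard currently on osd.11 goes to osd.7 instead
ceph osd pg-upmap-items 34.1a 11 7
# remove the exception again to let the PG map back
ceph osd rm-pg-upmap-items 34.1a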

It ate +59 GB after the first move and +51 GB after the second. As I
understand this proves that it's not #38184. Devirtualization of
virtual clones couldn't eat additional space after SECOND rebalance of
the same PG.

The PG has ~39000 objects, it is EC 2+1 and the compression is
enabled. Compression ratio is about ~2.7 in my setup, so the PG should
use ~90 GB raw space.
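
For reference, the accounting behind that estimate is roughly (a sketch
that assumes the whole PG compresses at that ratio):

    raw ~= logical * (k+m)/k / compression_ratio = logical * 3/2 / 2.7

so ~90 GB of raw space corresponds to roughly 160 GB of logical data in
this PG.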

Before and after moving the PG I stopped osd0, mounted it with
ceph-objectstore-tool with debug bluestore = 20/20 and opened the
34.1a***/all directory. It seems to dump all object extents into the
log in that case. So now I have two logs with all allocated extents
for osd0 (I hope all extents are there). I parsed both logs and added
all compressed blob sizes together ("get_ref Blob ... 0x2 -> 0x...
compressed"). But they add up to ~39 GB before first rebalance
(34.1as2), ~22 GB after it (34.1as1) and ~41 GB again after the second
move (34.1as2) which doesn't indicate a leak.
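
The parsing itself was nothing fancy, roughly like the sketch below. The
regex is an assumption about the exact debug-log wording (quoted above from
memory as "get_ref Blob ... 0x<x> -> 0x<y> ... compressed"), so adjust it,
and the group being summed, to whatever your build actually prints:

#!/usr/bin/env python3
# Sum the compressed blob lengths printed by the debug-bluestore log.
import re
import sys

BLOB_RE = re.compile(r'get_ref Blob.*?0x([0-9a-f]+)\s*->\s*0x([0-9a-f]+).*compressed')

total = 0
with open(sys.argv[1]) as log:
    for line in log:
        m = BLOB_RE.search(line)
        if m:
            total += int(m.group(2), 16)  # assume the second hex value is the compressed length

print(total, "bytes =", round(total / 2**30, 1), "GiB of compressed blobs")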

But the raw space usage still exceeds initial by a lot. So it's clear
that there's a leak somewhere.

What additional details can I provide for you to identify the bug?

I posted the same message in the issue tracker,
https://tracker.ceph.com/issues/44731

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Space leak in Bluestore

2020-03-24 Thread Mark Nelson
FWIW, Igor has been doing some great work on improving performance with a 
4K min_alloc_size.  He gave a presentation on it at a recent weekly 
performance meeting and it's looking really good.  On HDDs I think he was 
seeing up to 2x faster 8K-128K random writes, at the expense of up to a 
20% sequential-read hit when there is fragmentation, all while retaining 
the space-saving benefits of the 4K min_alloc_size.  I believe in a future 
point release we should be able to make 4K the default for both HDD and 
flash, as it's already arguably faster on NVMe at this point.



Mark

On 3/24/20 12:03 PM, Steven Pine wrote:

Hi Vitaliy,

You may be coming across the EC space amplification issue,
https://tracker.ceph.com/issues/44213

I am not aware of any recent updates to resolve this issue.

Sincerely,

On Tue, Mar 24, 2020 at 12:53 PM  wrote:


Hi.

I'm experiencing some kind of a space leak in Bluestore. I use EC,
compression and snapshots. First I thought that the leak was caused by
"virtual clones" (issue #38184). However, then I got rid of most of the
snapshots, but continued to experience the problem.

I suspected something when I added a new disk to the cluster and free
space in the cluster didn't increase (!).

So to track down the issue I moved one PG (34.1a) using upmaps from
osd11,6,0 to osd6,0,7 and then back to osd11,6,0.

It ate +59 GB after the first move and +51 GB after the second. As I
understand this proves that it's not #38184. Devirtualization of virtual
clones couldn't eat additional space after SECOND rebalance of the same
PG.

The PG has ~39000 objects, it is EC 2+1 and the compression is enabled.
Compression ratio is about ~2.7 in my setup, so the PG should use ~90 GB
raw space.

Before and after moving the PG I stopped osd0, mounted it with
ceph-objectstore-tool with debug bluestore = 20/20 and opened the
34.1a***/all directory. It seems to dump all object extents into the log
in that case. So now I have two logs with all allocated extents for osd0
(I hope all extents are there). I parsed both logs and added all
compressed blob sizes together ("get_ref Blob ... 0x2 -> 0x...
compressed"). But they add up to ~39 GB before first rebalance
(34.1as2), ~22 GB after it (34.1as1) and ~41 GB again after the second
move (34.1as2) which doesn't indicate a leak.

But the raw space usage still exceeds initial by a lot. So it's clear
that there's a leak somewhere.

What additional details can I provide for you to identify the bug?

I posted the same message in the issue tracker,
https://tracker.ceph.com/issues/44731

--
Vitaliy Filippov
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Space leak in Bluestore

2020-03-24 Thread vitalif

Hi Steve,

Thanks, it's an interesting discussion; however, I don't think it's the 
same problem, because in my case Bluestore eats additional space during 
rebalance. And it doesn't seem that Ceph does small overwrites during 
rebalance; as I understand it, it does the opposite: it reads and writes 
the whole object... Also, I have bluestore_min_alloc_size set to 4K from 
the beginning, and Igor says that works around that bug... bug-o-feature. :D
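
(For context, the 4K allocation size was set at OSD creation time via
ceph.conf; bluestore_min_alloc_size only takes effect when an OSD is
created, existing OSDs keep whatever value they were built with. Roughly:)

[osd]
# applied only at OSD creation (mkfs) time
bluestore_min_alloc_size = 4096
# per-media variants also exist: bluestore_min_alloc_size_hdd / _ssd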



Hi Vitaliy,

You may be coming across the EC space amplification issue,
https://tracker.ceph.com/issues/44213

I am not aware of any recent updates to resolve this issue.

Sincerely,

On Tue, Mar 24, 2020 at 12:53 PM  wrote:


Hi.

I'm experiencing some kind of a space leak in Bluestore. I use EC,
compression and snapshots. First I thought that the leak was caused
by
"virtual clones" (issue #38184). However, then I got rid of most of
the
snapshots, but continued to experience the problem.

I suspected something when I added a new disk to the cluster and
free
space in the cluster didn't increase (!).

So to track down the issue I moved one PG (34.1a) using upmaps from
osd11,6,0 to osd6,0,7 and then back to osd11,6,0.

It ate +59 GB after the first move and +51 GB after the second. As I

understand this proves that it's not #38184. Devirtualization of
virtual
clones couldn't eat additional space after SECOND rebalance of the
same
PG.

The PG has ~39000 objects, it is EC 2+1 and the compression is
enabled.
Compression ratio is about ~2.7 in my setup, so the PG should use
~90 GB
raw space.

Before and after moving the PG I stopped osd0, mounted it with
ceph-objectstore-tool with debug bluestore = 20/20 and opened the
34.1a***/all directory. It seems to dump all object extents into the
log
in that case. So now I have two logs with all allocated extents for
osd0
(I hope all extents are there). I parsed both logs and added all
compressed blob sizes together ("get_ref Blob ... 0x2 -> 0x...
compressed"). But they add up to ~39 GB before first rebalance
(34.1as2), ~22 GB after it (34.1as1) and ~41 GB again after the
second
move (34.1as2) which doesn't indicate a leak.

But the raw space usage still exceeds initial by a lot. So it's
clear
that there's a leak somewhere.

What additional details can I provide for you to identify the bug?

I posted the same message in the issue tracker,
https://tracker.ceph.com/issues/44731

--
Vitaliy Filippov
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


--

Steven Pine

webair.com

P  516.938.4100 x

 E  steven.p...@webair.com

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Space leak in Bluestore

2020-03-24 Thread Steven Pine
Hi Vitaliy,

You may be coming across the EC space amplification issue,
https://tracker.ceph.com/issues/44213

I am not aware of any recent updates to resolve this issue.

Sincerely,

On Tue, Mar 24, 2020 at 12:53 PM  wrote:

> Hi.
>
> I'm experiencing some kind of a space leak in Bluestore. I use EC,
> compression and snapshots. First I thought that the leak was caused by
> "virtual clones" (issue #38184). However, then I got rid of most of the
> snapshots, but continued to experience the problem.
>
> I suspected something when I added a new disk to the cluster and free
> space in the cluster didn't increase (!).
>
> So to track down the issue I moved one PG (34.1a) using upmaps from
> osd11,6,0 to osd6,0,7 and then back to osd11,6,0.
>
> It ate +59 GB after the first move and +51 GB after the second. As I
> understand this proves that it's not #38184. Devirtualization of virtual
> clones couldn't eat additional space after SECOND rebalance of the same
> PG.
>
> The PG has ~39000 objects, it is EC 2+1 and the compression is enabled.
> Compression ratio is about ~2.7 in my setup, so the PG should use ~90 GB
> raw space.
>
> Before and after moving the PG I stopped osd0, mounted it with
> ceph-objectstore-tool with debug bluestore = 20/20 and opened the
> 34.1a***/all directory. It seems to dump all object extents into the log
> in that case. So now I have two logs with all allocated extents for osd0
> (I hope all extents are there). I parsed both logs and added all
> compressed blob sizes together ("get_ref Blob ... 0x2 -> 0x...
> compressed"). But they add up to ~39 GB before first rebalance
> (34.1as2), ~22 GB after it (34.1as1) and ~41 GB again after the second
> move (34.1as2) which doesn't indicate a leak.
>
> But the raw space usage still exceeds initial by a lot. So it's clear
> that there's a leak somewhere.
>
> What additional details can I provide for you to identify the bug?
>
> I posted the same message in the issue tracker,
> https://tracker.ceph.com/issues/44731
>
> --
> Vitaliy Filippov
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 
Steven Pine
webair.com
*P*  516.938.4100 x
*E * steven.p...@webair.com


   

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io