Re: [ceph-users] 13.2.4 odd memory leak?

2019-03-08 Thread Mark Nelson


On 3/8/19 8:12 AM, Steffen Winther Sørensen wrote:

On 8 Mar 2019, at 14.30, Mark Nelson wrote:

On 3/8/19 5:56 AM, Steffen Winther Sørensen wrote:

On 5 Mar 2019, at 10.02, Paul Emmerich wrote:

Yeah, there's a bug in 13.2.4. You need to set it to at least ~1.2GB.

Yeap, thanks, setting it to 1G+256M worked :)
Hope this won't bloat memory during the coming weekend's VM backups through CephFS





FWIW, setting it to 1.2G will almost certainly result in the bluestore 
caches being stuck at cache_min, i.e. 128MB, and the autotuner may not 
be able to keep the OSD memory that low.  I typically recommend a bare 
minimum of 2GB per OSD, and on SSD/NVMe backed OSDs 3-4+ GB can improve 
performance significantly.

This is a smaller dev cluster with not much I/O: 4 nodes with 16GB RAM 
and 6x HDD OSDs each.

Just want to avoid consuming swap, which bloated after patching from 
13.2.2 to 13.2.4 and performing VM snapshots to CephFS. Otherwise the 
cluster has been fine for ages…

/Steffen



Understood.  We struggled with whether we should have separate HDD and 
SSD defaults for osd_memory_target, but we were seeing other users have 
problems setting the global default vs. the ssd/hdd default and not 
getting the expected behavior.  We decided on a single 
osd_memory_target to keep the whole thing simpler, with only one 
parameter to set.  The 4GB/OSD default is aggressive but can dramatically 
improve performance on NVMe, and we figured it communicates 
to users where we think the sweet spot is (and as devices and data sets 
get larger, this is going to be even more important).
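
For anyone wanting to try this, a minimal sketch of the two usual ways to 
apply it; the 4 GiB value is just the 13.2.3+ default, not a sizing 
recommendation:

    ; ceph.conf, picked up on OSD restart
    [osd]
        osd memory target = 4294967296

    # or at runtime for a single OSD, via its admin socket on the OSD host
    ceph daemon osd.12 config set osd_memory_target 4294967296

The admin-socket form should take effect without a restart, since 
osd_memory_target is meant to be adjustable at runtime.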



Mark








Re: [ceph-users] 13.2.4 odd memory leak?

2019-03-08 Thread Steffen Winther Sørensen


> On 8 Mar 2019, at 14.30, Mark Nelson wrote:
> 
> 
> On 3/8/19 5:56 AM, Steffen Winther Sørensen wrote:
>> 
>>> On 5 Mar 2019, at 10.02, Paul Emmerich wrote:
>>> 
>>> Yeah, there's a bug in 13.2.4. You need to set it to at least ~1.2GB.
>> Yeap, thanks, setting it to 1G+256M worked :)
>> Hope this won't bloat memory during the coming weekend's VM backups through CephFS
>> 
> 
> 
> FWIW, setting it to 1.2G will almost certainly result in the bluestore caches 
> being stuck at cache_min, i.e. 128MB, and the autotuner may not be able to keep 
> the OSD memory that low.  I typically recommend a bare minimum of 2GB per 
> OSD, and on SSD/NVMe backed OSDs 3-4+ GB can improve performance significantly.
This is a smaller dev cluster with not much I/O: 4 nodes with 16GB RAM and 
6x HDD OSDs each.

Just want to avoid consuming swap, which bloated after patching from 13.2.2 
to 13.2.4 and performing VM snapshots to CephFS. Otherwise the cluster has 
been fine for ages…
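
Rough arithmetic on those numbers: at Mark's 2GB-per-OSD floor, 6 OSDs x 
2 GiB = 12 GiB per node, leaving only about 4 GiB of the 16 GiB for the OS, 
page cache and whatever the OSDs use beyond their cache targets, so it is 
easy to see why a ~1 GiB target looked attractive here.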
/Steffen



Re: [ceph-users] 13.2.4 odd memory leak?

2019-03-08 Thread Mark Nelson


On 3/8/19 5:56 AM, Steffen Winther Sørensen wrote:

On 5 Mar 2019, at 10.02, Paul Emmerich wrote:

Yeah, there's a bug in 13.2.4. You need to set it to at least ~1.2GB.

Yeap, thanks, setting it to 1G+256M worked :)
Hope this won't bloat memory during the coming weekend's VM backups through CephFS

/Steffen



FWIW, setting it to 1.2G will almost certainly result in the bluestore 
caches being stuck at cache_min, i.e. 128MB, and the autotuner may not be 
able to keep the OSD memory that low.  I typically recommend a bare 
minimum of 2GB per OSD, and on SSD/NVMe backed OSDs 3-4+ GB can improve 
performance significantly.
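
For anyone wanting to verify what the autotuner is doing on a specific OSD, 
a quick sketch using the admin socket (osd.12 is taken from the logs later 
in the thread; run on that OSD's host):

    # bytes currently held in each of the OSD's memory pools,
    # including the bluestore cache buckets the autotuner resizes
    ceph daemon osd.12 dump_mempools

    # the memory target this OSD is currently honoring
    ceph daemon osd.12 config get osd_memory_target

Watching dump_mempools totals against the process RSS over time is a 
reasonable way to distinguish cache growth from a genuine leak.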



Mark




Re: [ceph-users] 13.2.4 odd memory leak?

2019-03-08 Thread Steffen Winther Sørensen


> On 5 Mar 2019, at 10.02, Paul Emmerich wrote:
> 
> Yeah, there's a bug in 13.2.4. You need to set it to at least ~1.2GB.
Yeap, thanks, setting it to 1G+256M worked :)
Hope this won't bloat memory during the coming weekend's VM backups through CephFS
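
Spelling out 1G+256M in bytes: 1073741824 + 268435456 = 1342177280, so the 
working line (comment mine) would be:

    [osd]
        osd memory target = 1342177280   ; 1.25 GiB, just above the ~1.2GB floor Paul mentioned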

/Steffen

Re: [ceph-users] 13.2.4 odd memory leak?

2019-03-05 Thread Paul Emmerich
Yeah, there's a bug in 13.2.4. You need to set it to at least ~1.2GB.


Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

Re: [ceph-users] 13.2.4 odd memory leak?

2019-03-05 Thread Steffen Winther Sørensen


> On 4 Mar 2019, at 16.09, Paul Emmerich wrote:
> 
> Bloated to ~4 GB per OSD and you are on HDDs?
Something like that, yes.

> 
> 13.2.3 backported the cache auto-tuning which targets 4 GB memory
> usage by default.
> 
> See https://ceph.com/releases/13-2-4-mimic-released/ 
> 
Right, thanks…

> 
> The bluestore_cache_* options are no longer needed. They are replaced
> by osd_memory_target, defaulting to 4GB. BlueStore will expand
> and contract its cache to attempt to stay within this
> limit. Users upgrading should note this is a higher default
> than the previous bluestore_cache_size of 1GB, so OSDs using
> BlueStore will use more memory by default.
> For more details, see the BlueStore docs.
Adding an 'osd memory target' value to our ceph.conf and restarting an OSD 
just makes the OSD dump like this:

[osd]
    ; this key makes 13.2.4 OSDs abort???
    osd memory target = 1073741824

    ; other OSD key settings
    osd pool default size = 2      # Write an object 2 times.
    osd pool default min size = 1  # Allow writing one copy in a degraded state.

    osd pool default pg num = 256
    osd pool default pgp num = 256

    client cache size = 131072
    osd client op priority = 40
    osd op threads = 8
    osd client message size cap = 512
    filestore min sync interval = 10
    filestore max sync interval = 60

    recovery max active = 2
    recovery op priority = 30
    osd max backfills = 2

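(Note: 1073741824 bytes is exactly 1 GiB, just below the ~1.2GB minimum 
Paul points to above, which would explain the abort.)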



osd log snippet:
 -472> 2019-03-05 08:36:02.233 7f2743a8c1c0  1 -- - start start
 -471> 2019-03-05 08:36:02.234 7f2743a8c1c0  2 osd.12 0 init /var/lib/ceph/osd/ceph-12 (looks like hdd)
 -470> 2019-03-05 08:36:02.234 7f2743a8c1c0  2 osd.12 0 journal /var/lib/ceph/osd/ceph-12/journal
 -469> 2019-03-05 08:36:02.234 7f2743a8c1c0  1 bluestore(/var/lib/ceph/osd/ceph-12) _mount path /var/lib/ceph/osd/ceph-12
 -468> 2019-03-05 08:36:02.235 7f2743a8c1c0  1 bdev create path /var/lib/ceph/osd/ceph-12/block type kernel
 -467> 2019-03-05 08:36:02.235 7f2743a8c1c0  1 bdev(0x55b31af4a000 /var/lib/ceph/osd/ceph-12/block) open path /var/lib/ceph/osd/ceph-12/block
 -466> 2019-03-05 08:36:02.236 7f2743a8c1c0  1 bdev(0x55b31af4a000 /var/lib/ceph/osd/ceph-12/block) open size 146775474176 (0x222c800000, 137 GiB) block_size 4096 (4 KiB) rotational
 -465> 2019-03-05 08:36:02.236 7f2743a8c1c0  1 bluestore(/var/lib/ceph/osd/ceph-12) _set_cache_sizes cache_size 1073741824 meta 0.4 kv 0.4 data 0.2
 -464> 2019-03-05 08:36:02.237 7f2743a8c1c0  1 bdev create path /var/lib/ceph/osd/ceph-12/block type kernel
 -463> 2019-03-05 08:36:02.237 7f2743a8c1c0  1 bdev(0x55b31af4aa80 /var/lib/ceph/osd/ceph-12/block) open path /var/lib/ceph/osd/ceph-12/block
 -462> 2019-03-05 08:36:02.238 7f2743a8c1c0  1 bdev(0x55b31af4aa80 /var/lib/ceph/osd/ceph-12/block) open size 146775474176 (0x222c800000, 137 GiB) block_size 4096 (4 KiB) rotational
 -461> 2019-03-05 08:36:02.238 7f2743a8c1c0  1 bluefs add_block_device bdev 1 path /var/lib/ceph/osd/ceph-12/block size 137 GiB
 -460> 2019-03-05 08:36:02.238 7f2743a8c1c0  1 bluefs mount
 -459> 2019-03-05 08:36:02.339 7f2743a8c1c0  0  set rocksdb option compaction_readahead_size = 2097152
 -458> 2019-03-05 08:36:02.339 7f2743a8c1c0  0  set rocksdb option compression = kNoCompression
 -457> 2019-03-05 08:36:02.339 7f2743a8c1c0  0  set rocksdb option max_write_buffer_number = 4
 -456> 2019-03-05 08:36:02.339 7f2743a8c1c0  0  set rocksdb option min_write_buffer_number_to_merge = 1
 -455> 2019-03-05 08:36:02.339 7f2743a8c1c0  0  set rocksdb option recycle_log_file_num = 4
 -454> 2019-03-05 08:36:02.339 7f2743a8c1c0  0  set rocksdb option writable_file_max_buffer_size = 0
 -453> 2019-03-05 08:36:02.339 7f2743a8c1c0  0  set rocksdb option write_buffer_size = 268435456
 -452> 2019-03-05 08:36:02.340 7f2743a8c1c0  0  set rocksdb option compaction_readahead_size = 2097152
 -451> 2019-03-05 08:36:02.340 7f2743a8c1c0  0  set rocksdb option compression = kNoCompression
 -450> 2019-03-05 08:36:02.340 7f2743a8c1c0  0  set rocksdb option max_write_buffer_number = 4
 -449> 2019-03-05 08:36:02.340 7f2743a8c1c0  0  set rocksdb option min_write_buffer_number_to_merge = 1
 -448> 2019-03-05 08:36:02.340 7f2743a8c1c0  0  set rocksdb option recycle_log_file_num = 4
 -447> 2019-03-05 08:36:02.340 7f2743a8c1c0  0  set rocksdb option writable_file_max_buffer_size = 0
 -446> 2019-03-05 08:36:02.340 7f2743a8c1c0  0  set rocksdb option write_buffer_size = 268435456
 -445> 2019-03-05 08:36:02.340 7f2743a8c1c0  1 rocksdb: do_open column families: [default]
 -444> 2019-03-05 08:36:02.341 7f2743a8c1c0  4 rocksdb: RocksDB version: 5.13.0
 -443> 2019-03-05 08:36:02.342 7f2743a8c1c0  4 rocksdb: Git sha rocksdb_build_git_sha:@0@
 -442> 2019-03-05 08:36:02.342 7f2743a8c1c0  4 rocksdb: Compile date Jan  4 2019
...
 -271> 2019-03-05 08:36:02.431 7f2743a8c1c0  1 freelist init
 -270> 2019-03-05 08:36:02.535 7f2743a8c1c0  1 bluestore(/var/lib/ceph/osd/ceph-12) 
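
Reading the -465 _set_cache_sizes line above: with cache_size 1073741824 
(1 GiB) and ratios meta 0.4 / kv 0.4 / data 0.2, the initial split works out 
to roughly 410 MiB for bluestore metadata, 410 MiB for rocksdb, and 205 MiB 
for object data, before the autotuner adjusts anything.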

Re: [ceph-users] 13.2.4 odd memory leak?

2019-03-04 Thread Paul Emmerich
Bloated to ~4 GB per OSD and you are on HDDs?

13.2.3 backported the cache auto-tuning which targets 4 GB memory
usage by default.

See https://ceph.com/releases/13-2-4-mimic-released/

The bluestore_cache_* options are no longer needed. They are replaced
by osd_memory_target, defaulting to 4GB. BlueStore will expand
and contract its cache to attempt to stay within this
limit. Users upgrading should note this is a higher default
than the previous bluestore_cache_size of 1GB, so OSDs using
BlueStore will use more memory by default.
For more details, see the BlueStore docs.
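
In ceph.conf terms, the migration the release note describes looks roughly 
like this (a sketch; the 1GB value mirrors the old cache default and the 
4GB one the new target default):

    ; before 13.2.3: fixed-size bluestore cache
    [osd]
        bluestore cache size = 1073741824

    ; 13.2.3 and later: one overall memory target per OSD,
    ; within which the cache grows and shrinks
    [osd]
        osd memory target = 4294967296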

Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Mon, Mar 4, 2019 at 3:55 PM Steffen Winther Sørensen wrote:
>
> List Members,
>
> Patched a CentOS 7 based cluster from 13.2.2 to 13.2.4 last Monday; 
> everything appeared to be working fine.
>
> Only this morning I found all OSDs in the cluster to be bloated in memory 
> footprint, possibly after the weekend backup through MDS.
>
> Anyone else seeing a possible memory leak in 13.2.4 OSDs, primarily when 
> using MDS?
>
> TIA
>
> /Steffen
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com