On 3/8/19 5:56 AM, Steffen Winther Sørensen wrote:

On 5 Mar 2019, at 10.02, Paul Emmerich <paul.emmer...@croit.io> wrote:

Yeah, there's a bug in 13.2.4. You need to set it to at least ~1.2GB.
Yep, thanks, setting it to 1G+256M worked :)
Hope this won't bloat memory during the coming weekend's VM backups through CephFS.
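I.e. something along these lines in ceph.conf (1342177280 is simply 1 GiB + 256 MiB written out in bytes; exact spelling assumed):

[osd]
    osd memory target = 1342177280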

/Steffen


FWIW, setting it to 1.2G will almost certainly result in the bluestore caches being stuck at cache_min, i.e. 128MB, and the autotuner may not be able to keep the OSD memory that low.  I typically recommend a bare minimum of 2GB per OSD, and on SSD/NVMe-backed OSDs 3-4+ GB can improve performance significantly.
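In ceph.conf terms that would be something like the following (the byte values are just 2 GiB and 4 GiB written out):

[osd]
    ; HDD-backed OSDs, bare minimum
    osd memory target = 2147483648
    ; SSD/NVMe-backed OSDs often benefit from 3-4+ GB, e.g.
    ; osd memory target = 4294967296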


Mark



On Tue, Mar 5, 2019 at 9:00 AM Steffen Winther Sørensen
<ste...@gmail.com> wrote:


On 4 Mar 2019, at 16.09, Paul Emmerich <paul.emmer...@croit.io> wrote:

Bloated to ~4 GB per OSD and you are on HDDs?

Something like that, yes.


13.2.3 backported the cache auto-tuning which targets 4 GB memory
usage by default.


See https://ceph.com/releases/13-2-4-mimic-released/

Right, thanks…


The bluestore_cache_* options are no longer needed. They are replaced
by osd_memory_target, defaulting to 4GB. BlueStore will expand
and contract its cache to attempt to stay within this
limit. Users upgrading should note this is a higher default
than the previous bluestore_cache_size of 1GB, so OSDs using
BlueStore will use more memory by default.
For more details, see the BlueStore docs.
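(In other words, where one might previously have set the old knob, e.g.

    bluestore cache size = 1073741824

the tunable to use now is apparently

    osd memory target = 4294967296

with 4294967296 simply being the new 4 GB default written out in bytes.)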

Adding an 'osd memory target' value to our ceph.conf and restarting an OSD just
makes the OSD abort and dump like this:

[osd]
   ; this key makes 13.2.4 OSDs abort???
   osd memory target = 1073741824

   ; other OSD key settings
   osd pool default size = 2  # Write an object 2 times.
   osd pool default min size = 1 # Allow writing one copy in a degraded state.

   osd pool default pg num = 256
   osd pool default pgp num = 256

   client cache size = 131072
   osd client op priority = 40
   osd op threads = 8
   osd client message size cap = 512
   filestore min sync interval = 10
   filestore max sync interval = 60

   recovery max active = 2
   recovery op priority = 30
   osd max backfills = 2
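(For what it's worth, the value a running OSD has actually picked up should be checkable via its admin socket, e.g.:

    ceph daemon osd.12 config get osd_memory_target

assuming the admin socket for osd.12 is reachable on that host.)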




osd log snippet:
  -472> 2019-03-05 08:36:02.233 7f2743a8c1c0  1 -- - start start
  -471> 2019-03-05 08:36:02.234 7f2743a8c1c0  2 osd.12 0 init 
/var/lib/ceph/osd/ceph-12 (looks like hdd)
  -470> 2019-03-05 08:36:02.234 7f2743a8c1c0  2 osd.12 0 journal 
/var/lib/ceph/osd/ceph-12/journal
  -469> 2019-03-05 08:36:02.234 7f2743a8c1c0  1 
bluestore(/var/lib/ceph/osd/ceph-12) _mount path /var/lib/ceph/osd/ceph-12
  -468> 2019-03-05 08:36:02.235 7f2743a8c1c0  1 bdev create path 
/var/lib/ceph/osd/ceph-12/block type kernel
  -467> 2019-03-05 08:36:02.235 7f2743a8c1c0  1 bdev(0x55b31af4a000 
/var/lib/ceph/osd/ceph-12/block) open path /var/lib/ceph/osd/ceph-12/block
  -466> 2019-03-05 08:36:02.236 7f2743a8c1c0  1 bdev(0x55b31af4a000 
/var/lib/ceph/osd/ceph-12/block) open size 146775474176 (0x222c800000, 137 GiB) 
block_size 4096 (4 KiB) rotational
  -465> 2019-03-05 08:36:02.236 7f2743a8c1c0  1 
bluestore(/var/lib/ceph/osd/ceph-12) _set_cache_sizes cache_size 1073741824 meta 
0.4 kv 0.4 data 0.2
  -464> 2019-03-05 08:36:02.237 7f2743a8c1c0  1 bdev create path 
/var/lib/ceph/osd/ceph-12/block type kernel
  -463> 2019-03-05 08:36:02.237 7f2743a8c1c0  1 bdev(0x55b31af4aa80 
/var/lib/ceph/osd/ceph-12/block) open path /var/lib/ceph/osd/ceph-12/block
  -462> 2019-03-05 08:36:02.238 7f2743a8c1c0  1 bdev(0x55b31af4aa80 
/var/lib/ceph/osd/ceph-12/block) open size 146775474176 (0x222c800000, 137 GiB) 
block_size 4096 (4 KiB) rotational
  -461> 2019-03-05 08:36:02.238 7f2743a8c1c0  1 bluefs add_block_device bdev 1 
path /var/lib/ceph/osd/ceph-12/block size 137 GiB
  -460> 2019-03-05 08:36:02.238 7f2743a8c1c0  1 bluefs mount
  -459> 2019-03-05 08:36:02.339 7f2743a8c1c0  0  set rocksdb option 
compaction_readahead_size = 2097152
  -458> 2019-03-05 08:36:02.339 7f2743a8c1c0  0  set rocksdb option compression 
= kNoCompression
  -457> 2019-03-05 08:36:02.339 7f2743a8c1c0  0  set rocksdb option 
max_write_buffer_number = 4
  -456> 2019-03-05 08:36:02.339 7f2743a8c1c0  0  set rocksdb option 
min_write_buffer_number_to_merge = 1
  -455> 2019-03-05 08:36:02.339 7f2743a8c1c0  0  set rocksdb option 
recycle_log_file_num = 4
  -454> 2019-03-05 08:36:02.339 7f2743a8c1c0  0  set rocksdb option 
writable_file_max_buffer_size = 0
  -453> 2019-03-05 08:36:02.339 7f2743a8c1c0  0  set rocksdb option 
write_buffer_size = 268435456
  -452> 2019-03-05 08:36:02.340 7f2743a8c1c0  0  set rocksdb option 
compaction_readahead_size = 2097152
  -451> 2019-03-05 08:36:02.340 7f2743a8c1c0  0  set rocksdb option compression 
= kNoCompression
  -450> 2019-03-05 08:36:02.340 7f2743a8c1c0  0  set rocksdb option 
max_write_buffer_number = 4
  -449> 2019-03-05 08:36:02.340 7f2743a8c1c0  0  set rocksdb option 
min_write_buffer_number_to_merge = 1
  -448> 2019-03-05 08:36:02.340 7f2743a8c1c0  0  set rocksdb option 
recycle_log_file_num = 4
  -447> 2019-03-05 08:36:02.340 7f2743a8c1c0  0  set rocksdb option 
writable_file_max_buffer_size = 0
  -446> 2019-03-05 08:36:02.340 7f2743a8c1c0  0  set rocksdb option 
write_buffer_size = 268435456
  -445> 2019-03-05 08:36:02.340 7f2743a8c1c0  1 rocksdb: do_open column 
families: [default]
  -444> 2019-03-05 08:36:02.341 7f2743a8c1c0  4 rocksdb: RocksDB version: 5.13.0
  -443> 2019-03-05 08:36:02.342 7f2743a8c1c0  4 rocksdb: Git sha 
rocksdb_build_git_sha:@0@
  -442> 2019-03-05 08:36:02.342 7f2743a8c1c0  4 rocksdb: Compile date Jan  4 
2019
...
  -271> 2019-03-05 08:36:02.431 7f2743a8c1c0  1 freelist init
  -270> 2019-03-05 08:36:02.535 7f2743a8c1c0  1 
bluestore(/var/lib/ceph/osd/ceph-12) _open_alloc opening allocation metadata
  -269> 2019-03-05 08:36:02.714 7f2743a8c1c0  1 
bluestore(/var/lib/ceph/osd/ceph-12) _open_alloc loaded 93 GiB in 31828 extents
  -268> 2019-03-05 08:36:02.722 7f2743a8c1c0  2 osd.12 0 journal looks like hdd
  -267> 2019-03-05 08:36:02.722 7f2743a8c1c0  2 osd.12 0 boot
  -266> 2019-03-05 08:36:02.723 7f272a0f3700  5 
bluestore.MempoolThread(0x55b31af46a30) _tune_cache_size target: 1073741824 heap: 
64675840 unmapped: 786432 mapped: 63889408 old cache_size: 134217728 new cache 
size: 17349132402135320576
  -265> 2019-03-05 08:36:02.723 7f272a0f3700  5 
bluestore.MempoolThread(0x55b31af46a30) _trim_shards cache_size: 
17349132402135320576 kv_alloc: 134217728 kv_used: 5099462 meta_alloc: 0 meta_used: 
21301 data_alloc: 0 data_used: 0
...
2019-03-05 08:36:40.166 7f03fc57f700  1 osd.12 pg_epoch: 7063 pg[2.93( v 6687'5 (0'0,6687'5] 
local-lis/les=7015/7016 n=1 ec=103/103 lis/c 7015/7015 les/c/f 7016/7016/0 7063/7063/7063) 
[12,19] r=0 lpr=7063 pi=[7015,7063)/1 crt=6687'5 lcod 0'0 mlcod 0'0 unknown NOTIFY mbc={}] 
start_peering_interval up [19] -> [12,19], acting [19] -> [12,19], acting_primary 19 
-> 12, up_primary 19 -> 12, role -1 -> 0, features acting 4611087854031142907 
upacting 4611087854031142907
2019-03-05 08:36:40.167 7f03fc57f700  1 osd.12 pg_epoch: 7063 pg[2.93( v 6687'5 
(0'0,6687'5] local-lis/les=7015/7016 n=1 ec=103/103 lis/c 7015/7015 les/c/f 
7016/7016/0 7063/7063/7063) [12,19] r=0 lpr=7063 pi=[7015,7063)/1 crt=6687'5 lcod 0'0 
mlcod 0'0 unknown mbc={}] state<Start>: transitioning to Primary
2019-03-05 08:36:40.167 7f03fb57d700  1 osd.12 pg_epoch: 7061 pg[2.40( v 6964'703 
(0'0,6964'703] local-lis/les=6999/7000 n=1 ec=103/103 lis/c 6999/6999 les/c/f 7000/7000/0 
7061/7061/6999) [8] r=-1 lpr=7061 pi=[6999,7061)/1 crt=6964'703 lcod 0'0 unknown mbc={}] 
start_peering_interval up [8,12] -> [8], acting [8,12] -> [8], acting_primary 8 -> 8, 
up_primary 8 -> 8, role 1 -> -1, features acting 4611087854031142907 upacting 
4611087854031142907
   1/ 5 heartbeatmap
   1/ 5 perfcounter
   1/ 5 rgw
   1/ 5 rgw_sync
   1/10 civetweb
   1/ 5 javaclient
   1/ 5 asok
   1/ 1 throttle
   0/ 0 refs
   1/ 5 xio
   1/ 5 compressor
   1/ 5 bluestore
   1/ 5 bluefs
   1/ 3 bdev
   1/ 5 kstore
   4/ 5 rocksdb
   4/ 5 leveldb
   4/ 5 memdb
   1/ 5 kinetic
   1/ 5 fuse
   1/ 5 mgr
   1/ 5 mgrc
   1/ 5 dpdk
   1/ 5 eventtrace
  -2/-2 (syslog threshold)
  -1/-1 (stderr threshold)
  max_recent     10000
  max_new         1000
  log_file /var/log/ceph/ceph-osd.12.log
--- end dump of recent events ---

2019-03-05 08:36:07.750 7f272a0f3700 -1 *** Caught signal (Aborted) **
in thread 7f272a0f3700 thread_name:bstore_mempool

ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)
1: (()+0x911e70) [0x55b318337e70]
2: (()+0xf5d0) [0x7f2737a4e5d0]
3: (gsignal()+0x37) [0x7f2736a6f207]
4: (abort()+0x148) [0x7f2736a708f8]
5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x242) 
[0x7f273aec62b2]
6: (()+0x25a337) [0x7f273aec6337]
7: (()+0x7a886e) [0x55b3181ce86e]
8: (BlueStore::MempoolThread::entry()+0x3b0) [0x55b3181d0060]
9: (()+0x7dd5) [0x7f2737a46dd5]
10: (clone()+0x6d) [0x7f2736b36ead]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to 
interpret this.
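The implausibly large "new cache size: 17349132402135320576" in the _tune_cache_size line above looks like an unsigned 64-bit wrap-around: with only a 1 GiB target, the tuner seems to end up computing a negative cache size, which wraps to a value around 1.7e19. A minimal sketch of that failure mode with made-up numbers (not the actual Ceph autotuner code):

    // Illustration only: subtracting more than you have from an unsigned
    // 64-bit value wraps around instead of going negative.
    #include <cstdint>
    #include <iostream>

    int main() {
      uint64_t target = 1073741824;           // 1 GiB osd_memory_target, as in the log
      uint64_t non_cache_usage = 2200000000;  // hypothetical: OSD already needs > 1 GiB
      uint64_t new_cache_size = target - non_cache_usage;  // wraps below zero
      std::cout << new_cache_size << std::endl;            // prints 18446744072583293440
      return 0;
    }

That would be consistent with the abort happening in the bstore_mempool thread right after.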


Even without the 'osd memory target' conf key, the OSD still reports on start:

bluestore(/var/lib/ceph/osd/ceph-12) _set_cache_sizes cache_size 1073741824
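(The defaults actually in effect can presumably be checked via the admin socket as well, e.g.:

    ceph daemon osd.12 config show | egrep 'osd_memory_target|bluestore_cache'

to see both the new target and the old bluestore cache settings.)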

Any hints appreciated!

/Steffen


Paul

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Mon, Mar 4, 2019 at 3:55 PM Steffen Winther Sørensen
<ste...@gmail.com> wrote:


List Members,

Patched a CentOS 7-based cluster from 13.2.2 to 13.2.4 last Monday; everything
appeared to be working fine.

Only this morning I found all OSDs in the cluster to be bloated in memory
footprint, possibly after the weekend backup through MDS.

Anyone else seeing a possible memory leak in 13.2.4 OSDs, primarily when
using MDS?
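(Per-OSD memory footprint can presumably be inspected with something like:

    ceph daemon osd.<id> dump_mempools
    ceph tell osd.<id> heap stats

the latter assuming the OSDs are built with tcmalloc.)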

TIA

/Steffen
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

