Hi Wido,

Thanks for the information; please let us know if this turns out to be a bug. As a workaround we will go with a small bluestore_cache_size of 100 MB.
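As an illustration, that workaround would be the following ceph.conf fragment. This is a sketch only: the [osd] section placement is an assumption, and the byte value (100 MB) is the one suggested later in this thread:

```
[osd]
# Workaround only: cap the BlueStore cache to limit ceph-osd memory growth.
# 104857600 bytes = 100 MB
bluestore_cache_size = 104857600
```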
Thanks,
Muthu

On 16 February 2017 at 14:04, Wido den Hollander <w...@42on.com> wrote:
>
> > On 16 February 2017 at 07:19, Muthusamy Muthiah <muthiah.muthus...@gmail.com> wrote:
> >
> > Thanks Ilya Letkowski for the information, we will change this value
> > accordingly.
>
> What I understand from yesterday's performance meeting is that this seems
> like a bug. Lowering this buffer reduces memory, but the root cause seems
> to be memory not being freed: a few bytes of a larger allocation are still
> allocated, causing this buffer not to be freed.
>
> Tried:
>
> debug_mempools = true
>
> $ ceph daemon osd.X dump_mempools
>
> You might want to view the YouTube video of yesterday's meeting when it's
> online: https://www.youtube.com/channel/UCno-Fry25FJ7B4RycCxOtfw/videos
>
> Wido
>
> > Thanks,
> > Muthu
> >
> > On 15 February 2017 at 17:03, Ilya Letkowski <mj12.svetz...@gmail.com>
> > wrote:
> >
> > > Hi, Muthusamy Muthiah
> > >
> > > I'm not totally sure that this is a memory leak.
> > > We had the same problems with bluestore on ceph v11.2.0.
> > > Reducing the bluestore cache helped us solve it and stabilize OSD
> > > memory consumption at the 3 GB level.
> > >
> > > Perhaps this will help you:
> > >
> > > bluestore_cache_size = 104857600
> > >
> > > On Tue, Feb 14, 2017 at 11:52 AM, Muthusamy Muthiah
> > > <muthiah.muthus...@gmail.com> wrote:
> > >
> > >> Hi All,
> > >>
> > >> On all nodes of our 5-node cluster with ceph 11.2.0 we encounter
> > >> memory leak issues.
> > >>
> > >> Cluster details: 5 nodes with 24/68 disks per node, EC 4+1, RHEL 7.2
> > >>
> > >> Some traces using sar are below; the memory utilisation graph is
> > >> attached.
> > >>
> > >> (16:54:42)[cn2.c1 sa] # sar -r
> > >> 07:50:01  kbmemfree kbmemused %memused kbbuffers kbcached kbcommit %commit kbactive kbinact kbdirty
> > >> 10:20:01  32077264 132754368 80.54 16176 3040244 77767024 47.18 51991692 2676468 260
> > >> 10:30:01  32208384 132623248 80.46 16176 3048536 77832312 47.22 51851512 2684552 12
> > >> 10:40:01  32067244 132764388 80.55 16176 3059076 77832316 47.22 51983332 2694708 264
> > >> 10:50:01  30626144 134205488 81.42 16176 3064340 78177232 47.43 53414144 2693712 4
> > >> 11:00:01  28927656 135903976 82.45 16176 3074064 78958568 47.90 55114284 2702892 12
> > >> 11:10:01  27158548 137673084 83.52 16176 3080600 80553936 48.87 56873664 2708904 12
> > >> 11:20:01  26455556 138376076 83.95 16176 3080436 81991036 49.74 57570280 2708500 8
> > >> 11:30:01  26002252 138829380 84.22 16176 3090556 82223840 49.88 58015048 2718036 16
> > >> 11:40:01  25965924 138865708 84.25 16176 3089708 83734584 50.80 58049980 2716740 12
> > >> 11:50:01  26142888 138688744 84.14 16176 3089544 83800100 50.84 57869628 2715400 16
> > >> ...
> > >>
> > >> In the attached graph there is a steady increase in memory utilisation
> > >> by ceph-osd during the soak test. When it reaches the system limit of
> > >> 128 GB RAM, we can see the dmesg logs below: osd.3 is killed due to
> > >> out of memory and started again.
> > >>
> > >> [Tue Feb 14 03:51:02 2017] tp_osd_tp invoked oom-killer: gfp_mask=0x280da, order=0, oom_score_adj=0
> > >> [Tue Feb 14 03:51:02 2017] tp_osd_tp cpuset=/ mems_allowed=0-1
> > >> [Tue Feb 14 03:51:02 2017] CPU: 20 PID: 11864 Comm: tp_osd_tp Not tainted 3.10.0-327.13.1.el7.x86_64 #1
> > >> [Tue Feb 14 03:51:02 2017] Hardware name: HP ProLiant XL420 Gen9/ProLiant XL420 Gen9, BIOS U19 09/12/2016
> > >> [Tue Feb 14 03:51:02 2017] ffff8819ccd7a280 0000000030e84036 ffff881fa58f7528 ffffffff816356f4
> > >> [Tue Feb 14 03:51:02 2017] ffff881fa58f75b8 ffffffff8163068f ffff881fa3478360 ffff881fa3478378
> > >> [Tue Feb 14 03:51:02 2017] ffff881fa58f75e8 ffff8819ccd7a280 0000000000000001 000000000001f65f
> > >> [Tue Feb 14 03:51:02 2017] Call Trace:
> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff816356f4>] dump_stack+0x19/0x1b
> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff8163068f>] dump_header+0x8e/0x214
> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff8116ce7e>] oom_kill_process+0x24e/0x3b0
> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff8116c9e6>] ? find_lock_task_mm+0x56/0xc0
> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff8116d6a6>] out_of_memory+0x4b6/0x4f0
> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff81173885>] __alloc_pages_nodemask+0xa95/0xb90
> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff811b792a>] alloc_pages_vma+0x9a/0x140
> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff811976c5>] handle_mm_fault+0xb85/0xf50
> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff811957fb>] ? follow_page_mask+0xbb/0x5c0
> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff81197c2b>] __get_user_pages+0x19b/0x640
> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff8119843d>] get_user_pages_unlocked+0x15d/0x1f0
> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff8106544f>] get_user_pages_fast+0x9f/0x1a0
> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff8121de78>] do_blockdev_direct_IO+0x1a78/0x2610
> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff81218c40>] ? I_BDEV+0x10/0x10
> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff8121ea65>] __blockdev_direct_IO+0x55/0x60
> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff81218c40>] ? I_BDEV+0x10/0x10
> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff81219297>] blkdev_direct_IO+0x57/0x60
> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff81218c40>] ? I_BDEV+0x10/0x10
> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff8116af63>] generic_file_aio_read+0x6d3/0x750
> > >> [Tue Feb 14 03:51:02 2017] [<ffffffffa038ad5c>] ? xfs_iunlock+0x11c/0x130 [xfs]
> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff811690db>] ? unlock_page+0x2b/0x30
> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff81192f21>] ? __do_fault+0x401/0x510
> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff8121970c>] blkdev_aio_read+0x4c/0x70
> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff811ddcfd>] do_sync_read+0x8d/0xd0
> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff811de45c>] vfs_read+0x9c/0x170
> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff811df182>] SyS_pread64+0x92/0xc0
> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff81645e89>] system_call_fastpath+0x16/0x1b
> > >>
> > >> Feb 14 03:51:40 fr-paris kernel: Out of memory: Kill process 7657 (ceph-osd) score 45 or sacrifice child
> > >> Feb 14 03:51:40 fr-paris kernel: Killed process 7657 (ceph-osd) total-vm:8650208kB, anon-rss:6124660kB, file-rss:1560kB
> > >> Feb 14 03:51:41 fr-paris systemd: ceph-osd@3.service: main process exited, code=killed, status=9/KILL
> > >> Feb 14 03:51:41 fr-paris systemd: Unit ceph-osd@3.service entered failed state.
> > >> Feb 14 03:51:41 fr-paris systemd: ceph-osd@3.service failed.
> > >> Feb 14 03:51:41 fr-paris systemd: cassandra.service: main process exited, code=killed, status=9/KILL
> > >> Feb 14 03:51:41 fr-paris systemd: Unit cassandra.service entered failed state.
> > >> Feb 14 03:51:41 fr-paris systemd: cassandra.service failed.
> > >> Feb 14 03:51:41 fr-paris ceph-mgr: 2017-02-14 03:51:41.978878 7f51a3154700 -1 mgr ms_dispatch osd_map(7517..7517 src has 6951..7517) v3
> > >> Feb 14 03:51:42 fr-paris systemd: Device dev-disk-by\x2dpartlabel-ceph\x5cx20block.device appeared twice with different sysfs paths /sys/devices/pci0000:00/0000:00:03.2/0000:03:00.0/host0/target0:0:0/0:0:0:9/block/sdj/sdj2 and /sys/devices/pci0000:00/0000:00:03.2/0000:03:00.0/host0/target0:0:0/0:0:0:4/block/sde/sde2
> > >> Feb 14 03:51:42 fr-paris ceph-mgr: 2017-02-14 03:51:42.992477 7f51a3154700 -1 mgr ms_dispatch osd_map(7518..7518 src has 6951..7518) v3
> > >> Feb 14 03:51:43 fr-paris ceph-mgr: 2017-02-14 03:51:43.508990 7f51a3154700 -1 mgr ms_dispatch mgrdigest v1
> > >> Feb 14 03:51:48 fr-paris ceph-mgr: 2017-02-14 03:51:48.508970 7f51a3154700 -1 mgr ms_dispatch mgrdigest v1
> > >> Feb 14 03:51:53 fr-paris ceph-mgr: 2017-02-14 03:51:53.509592 7f51a3154700 -1 mgr ms_dispatch mgrdigest v1
> > >> Feb 14 03:51:58 fr-paris ceph-mgr: 2017-02-14 03:51:58.509936 7f51a3154700 -1 mgr ms_dispatch mgrdigest v1
> > >> Feb 14 03:52:01 fr-paris systemd: ceph-osd@3.service holdoff time over, scheduling restart.
> > >> Feb 14 03:52:02 fr-paris systemd: Starting Ceph object storage daemon osd.3...
> > >> Feb 14 03:52:02 fr-paris systemd: Started Ceph object storage daemon osd.3.
> > >> Feb 14 03:52:02 fr-paris numactl: 2017-02-14 03:52:02.307106 7f1e499bb940 -1 WARNING: the following dangerous and experimental features are enabled: bluestore,rocksdb
> > >> Feb 14 03:52:02 fr-paris numactl: 2017-02-14 03:52:02.317687 7f1e499bb940 -1 WARNING: the following dangerous and experimental features are enabled: bluestore,rocksdb
> > >> Feb 14 03:52:02 fr-paris numactl: starting osd.3 at - osd_data /var/lib/ceph/osd/ceph-3 /var/lib/ceph/osd/ceph-3/journal
> > >> Feb 14 03:52:02 fr-paris numactl: 2017-02-14 03:52:02.333522 7f1e499bb940 -1 WARNING: experimental feature 'bluestore' is enabled
> > >> Feb 14 03:52:02 fr-paris numactl: Please be aware that this feature is experimental, untested,
> > >> Feb 14 03:52:02 fr-paris numactl: unsupported, and may result in data corruption, data loss,
> > >> Feb 14 03:52:02 fr-paris numactl: and/or irreparable damage to your
> > >> cluster. Do not use
> > >> Feb 14 03:52:02 fr-paris numactl: feature with important data.
> > >>
> > >> This seems to happen only in 11.2.0 and not in 11.1.x. Could you
> > >> please help us resolve this, either with a config change to limit
> > >> ceph-osd memory use, or by confirming whether this is a bug in the
> > >> current kraken release?
> > >>
> > >> Thanks,
> > >> Muthu
> > >>
> > >> _______________________________________________
> > >> ceph-users mailing list
> > >> ceph-users@lists.ceph.com
> > >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > >
> > > --
> > > С уважением / Best regards
> > > Илья Летковский / Ilya Letkouski
> > > Phone, Viber: +375 29 3237335
> > > Minsk, Belarus (GMT+3)
> >
> > _______________________________________________
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
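As a post-script on the sar figures quoted in the thread: the leak rate can be estimated directly from the kbmemused column. The snippet below is an illustrative sketch (the helper name and the choice of samples are ours; the values are copied from the sar output above):

```python
# Estimate memory growth from "sar -r" kbmemused samples quoted in the thread.

samples = [
    ("10:20:01", 132754368),  # (clock time, kbmemused in kB)
    ("10:50:01", 134205488),
    ("11:20:01", 138376076),
    ("11:50:01", 138688744),
]

def growth_rate_mb_per_hour(samples):
    """Average kbmemused growth between first and last sample, in MB/hour."""
    def hours(t):
        h, m, s = (int(x) for x in t.split(":"))
        return h + m / 60 + s / 3600
    (t0, kb0), (t1, kb1) = samples[0], samples[-1]
    return (kb1 - kb0) / 1024 / (hours(t1) - hours(t0))

print(f"{growth_rate_mb_per_hour(samples):.0f} MB/hour")
```

At roughly 3.9 GB/hour of growth, the ~26 GB still free in the last sample would be consumed within several hours, which is consistent with the OOM kill seen later in the logs.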