I experienced this as well while testing a tiny Ceph cluster...

HW spec:
- 3x Intel i7-4770K quad core
- 32Gb m2/ssd, 8Gb memory
- Dell PERC H200
- 6 x 3Tb Seagate
- CentOS 7.x
- Ceph 12.x
I also run 3 memory-hungry procs on the Ceph nodes, so there is obviously a memory problem here. Here are the steps I took to keep the oom-killer from killing the node:

/etc/rc.local:

    for i in $(pgrep ceph-mon); do echo -17 > /proc/$i/oom_score_adj; done
    for i in $(pgrep ceph-osd); do echo -17 > /proc/$i/oom_score_adj; done
    for i in $(pgrep ceph-mgr); do echo 50 > /proc/$i/oom_score_adj; done

/etc/sysctl.conf:

    vm.swappiness = 100
    vm.vfs_cache_pressure = 1000
    vm.min_free_kbytes = 512

/etc/ceph/ceph.conf:

    [osd]
    bluestore_cache_size = 52428800
    bluestore_cache_size_hdd = 52428800
    bluestore_cache_size_ssd = 52428800
    bluestore_cache_kv_max = 52428800

You're going to see memory page-{in,out} skyrocket with this setup, but it should keep the oom-killer at bay until a memory fix can be applied. Client performance to the cluster wasn't spectacular, but it wasn't terrible either; I was seeing +/- 60Mb/sec of bandwidth. Ultimately I upgraded the nodes to 16Gb.

/Chris C

On Tue, Oct 31, 2017 at 10:30 PM, shadow_lin <shadow_...@163.com> wrote:
> Hi Sage,
> We have tried compiling the latest ceph source code from github.
> The build is ceph version 12.2.1-249-g42172a4
> (42172a443183ffe6b36e85770e53fe678db293bf) luminous (stable).
> The memory problem seems better, but the memory usage of the osd still
> keeps increasing as more data is written into the rbd image, and the
> memory usage doesn't drop after the write is stopped.
> Could you specify in which commit the memory bug is fixed?
> Thanks
> 2017-11-01
> ------------------------------
> lin.yunfan
> ------------------------------
>
> *From:* Sage Weil <s...@newdream.net>
> *Sent:* 2017-10-24 20:03
> *Subject:* Re: [ceph-users] [luminous]OSD memory usage increase when writing a lot of data to cluster
> *To:* "shadow_lin" <shadow_...@163.com>
> *Cc:* "ceph-users" <ceph-users@lists.ceph.com>
>
> On Tue, 24 Oct 2017, shadow_lin wrote:
> > Hi All,
> > The cluster has 24 osds with 24 8TB hdds.
> > Each osd server has 2GB ram and runs 2 OSDs with 2 8TB HDDs. I know the
> > memory is below the recommended value, but these osd servers are ARM
> > servers so I can't do anything to add more ram.
> > I created a replicated (2 rep) pool and a 20TB image and mounted it on
> > the test server with an xfs fs.
> >
> > I have set ceph.conf to this (as other related posts suggested):
> > [osd]
> > bluestore_cache_size = 104857600
> > bluestore_cache_size_hdd = 104857600
> > bluestore_cache_size_ssd = 104857600
> > bluestore_cache_kv_max = 103809024
> >
> > osd map cache size = 20
> > osd map max advance = 10
> > osd map share max epochs = 10
> > osd pg epoch persisted max stale = 10
> >
> > The bluestore cache settings did improve the situation, but if I try to
> > write 1TB of data by dd command (dd if=/dev/zero of=test bs=1G
> > count=1000) to the rbd, the osd will eventually be killed by the oom
> > killer.
> > If I only write about 100G of data at once then everything is fine.
> >
> > Why does the osd memory usage keep increasing while writing?
> > Is there anything I can do to reduce the memory usage?
>
> There is a bluestore memory bug that was fixed just after 12.2.1 was
> released; it will be fixed in 12.2.2. In the meantime, you can consider
> running the latest luminous branch (not fully tested) from
> https://shaman.ceph.com/builds/ceph/luminous.
>
> sage
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
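The /etc/rc.local loops from Chris's workaround can be sketched as a small script. This is a sketch only: it assumes pgrep is available and that the script runs as root, since writing another process's /proc/<pid>/oom_score_adj requires privilege; the helper name is made up for illustration.

```shell
#!/bin/sh
# Sketch of the rc.local oom-killer workaround above. A negative
# oom_score_adj makes the kernel's oom-killer less likely to pick the
# process; a positive value makes it more likely. Valid range is
# -1000..1000.

set_oom_score_adj() {
    name=$1     # process name pattern matched by pgrep
    score=$2    # oom_score_adj value to write
    for pid in $(pgrep "$name"); do
        echo "$score" > "/proc/$pid/oom_score_adj"
    done
}

# Protect the mon and osd daemons; let the (restartable) mgr go first.
set_oom_score_adj ceph-mon -17
set_oom_score_adj ceph-osd -17
set_oom_score_adj ceph-mgr 50
```

Note this only covers processes running when rc.local fires; a daemon restarted later comes back with the default score of 0 until the loop runs again.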