[ceph-users] Re: ceph-mon store.db disk usage increase on OSD-Host fail

2020-03-12 Thread XuYun
We hit the same problem today while adding memory to OSD nodes, and it degraded the monitors’ performance considerably. I noticed that the db kept growing after an OSD was shut down, so I suspect it is caused by the warning reports collected by the mgr insights module. When I disabled the mgr
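If the insights module is indeed the culprit, disabling it and then compacting the mon stores might look roughly like this on Nautilus; a sketch, not a verified fix, and <id> is a placeholder for each monitor name:

  # check which mgr modules are enabled
  ceph mgr module ls
  # disable the insights module
  ceph mgr module disable insights
  # compact a monitor's store.db (run once per mon)
  ceph tell mon.<id> compact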

[ceph-users] Re: Nautilus OSD memory consumption?

2020-03-11 Thread XuYun
Hi Nigel, Any update on this issue? Br, Xu Yun > On Feb 27, 2020, at 10:10 AM, Nigel Williams wrote: > > On Thu, 27 Feb 2020 at 13:08, Nigel Williams wrote: >> On Thu, 27 Feb 2020 at 06:27, Anthony D'Atri wrote: >>> If the heap stats reported by telling the OSD `heap stats` are large, >>> telling each
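For reference, the `heap` commands mentioned are the tcmalloc admin commands exposed by the OSDs; checking and releasing unused heap across all OSDs would look roughly like:

  # print tcmalloc heap statistics for every OSD
  ceph tell osd.* heap stats
  # return freed-but-retained memory from tcmalloc back to the OS
  ceph tell osd.* heap release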

[ceph-users] Sporadic mgr segmentation fault

2020-04-21 Thread XuYun
Dear ceph users, We are experiencing sporadic mgr crashes in all three of our Ceph clusters (versions 14.2.6 and 14.2.8). The crash log is: 2020-04-17 23:10:08.986 7fed7fe07700 -1
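On Nautilus the crash module usually records such backtraces; gathering them for a tracker report could look roughly like this, assuming the crash module is enabled and <crash-id> stands for an entry from the listing:

  # list recorded daemon crashes
  ceph crash ls
  # show the full backtrace and metadata of one crash
  ceph crash info <crash-id>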

[ceph-users] Re: Sporadic mgr segmentation fault

2020-04-25 Thread XuYun
10: (clone()+0x6d) [0x7f8cf617a88d] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. Is there an issue opened for it? BR, Xu Yun > On Apr 23, 2020, at 10:28 AM, XuYun wrote: > > Thank you, Brad. We’ll try to upgrade to 14.2.9 today. > >> On Apr 23, 2020, at 7:21 AM, Brad Hubbard <
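The NOTE above refers to disassembling the crashing binary so the frame offsets can be resolved; for the mgr that would be roughly the following, assuming the usual packaged path and that debug symbols are installed:

  # disassemble ceph-mgr with interleaved source, for matching the backtrace offsets
  objdump -rdS /usr/bin/ceph-mgr > ceph-mgr.objdump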

[ceph-users] Re: Sporadic mgr segmentation fault

2020-04-22 Thread XuYun
Thank you, Brad. We’ll try to upgrade to 14.2.9 today. > On Apr 23, 2020, at 7:21 AM, Brad Hubbard wrote: > > On Tue, Apr 21, 2020 at 11:39 PM XuYun <yu...@me.com> wrote: >> >> Dear ceph users, >> >> We are experiencing sporadic mgr crashes in all three cep

[ceph-users] Re: Problem with OSD::osd_op_tp thread had timed out and other connected issues

2020-03-21 Thread XuYun
48GB used). > > On 21/03/2020 13.14, XuYun wrote: >> Bluestore requires more than 4 GB of memory per OSD; do you have enough memory? >> >>> On Mar 21, 2020, at 8:09 PM, Jan Pekař - Imatic wrote: >>> >>> Hello, >>> >>> I have ceph cluster version

[ceph-users] Re: Problem with OSD::osd_op_tp thread had timed out and other connected issues

2020-03-21 Thread XuYun
Bluestore requires more than 4 GB of memory per OSD; do you have enough memory? > On Mar 21, 2020, at 8:09 PM, Jan Pekař - Imatic wrote: > > Hello, > > I have a ceph cluster running version 14.2.7 (3d58626ebeec02d8385a4cefb92c6cbc3a45bfe8) > nautilus (stable) > > 4 nodes - each node 11 HDD, 1 SSD, 10Gbit network > >
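The 4 GB figure corresponds to the default BlueStore memory target; checking it and, if RAM is short, lowering it could be done roughly like this (osd.0 and the 2 GiB value are illustrative only):

  # show the value in effect on one OSD (default is 4294967296 bytes)
  ceph config show osd.0 osd_memory_target
  # example: lower the target to 2 GiB on memory-constrained nodes
  ceph config set osd osd_memory_target 2147483648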

[ceph-users] Re: ceph rbd volumes/images IO details

2020-03-07 Thread XuYun
You can enable the prometheus module of the mgr if you are running Nautilus. > On Mar 8, 2020, at 2:15 AM, M Ranga Swami Reddy wrote: > > On Fri, Mar 6, 2020 at 1:06 AM M Ranga Swami Reddy wrote: > >> Hello, >> Can we get the IOPS of any rbd image/volume? >> >> For ex: I have created volumes via OpenStack
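Roughly, enabling the exporter and the per-image RBD counters on Nautilus looks like the following; the pool names here are assumptions, not taken from the thread:

  # enable the mgr prometheus exporter (listens on port 9283 by default)
  ceph mgr module enable prometheus
  # export per-image RBD IO statistics for the listed pools
  ceph config set mgr mgr/prometheus/rbd_stats_pools "volumes,images"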

[ceph-users] Re: log_latency_fn slow operation

2020-03-07 Thread XuYun
I finally figured out this problem: swap memory was assigned to the OSD processes for some reason (vm.swappiness is already set to 0), which degraded the performance of the KV store. I restarted the OSDs and switched swap off; the warning now seems to have disappeared from the OSD logs. > On Mar 4, 2020, at 11:08 AM, 徐蕴 wrote: > > Hi,
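The workaround described amounts to something like the following plain-Linux steps on each OSD host (the restart target name assumes a package-based, non-containerized install):

  # discourage swapping of anonymous memory (already 0 in this case)
  sysctl vm.swappiness=0
  # disable swap entirely so OSD pages cannot be paged out
  swapoff -a
  # restart OSDs so previously swapped pages are repopulated in RAM
  systemctl restart ceph-osd.target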

[ceph-users] Re: Data loss by adding 2OSD causing Long heartbeat ping times

2020-05-07 Thread XuYun
We ran into some ping back/front problems after upgrading from filestore to bluestore. It turned out to be related to insufficient memory / swap usage. > On May 6, 2020, at 10:08 PM, Frank Schilder wrote: > > To answer some of my own questions: > > 1) Setting > > ceph osd set noout > ceph osd set nodown >
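A quick way to check whether OSD processes are actually sitting in swap is to read VmSwap from /proc; a small sketch, assuming the processes are named ceph-osd:

  # report swap usage of every running ceph-osd process
  for pid in $(pgrep ceph-osd); do
      printf 'pid %s: %s\n' "$pid" "$(awk '/^VmSwap/ {print $2, $3}' /proc/$pid/status)"
  done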

[ceph-users] Suspicious memory leakage

2020-06-30 Thread XuYun
Hi, We’ve observed some suspicious memory leak problems with the MGR since upgrading to Nautilus. Yesterday I upgraded our cluster to the latest 14.2.10, and the problem still seems reproducible. According to the monitoring chart (memory usage of the active mgr node), the memory consumption started

[ceph-users] Re: Suspicious memory leakage

2020-06-30 Thread XuYun
Seems the attached log file is missing: https://pastebin.com/wAULN20N > On Jun 30, 2020, at 1:26 PM, XuYun wrote: > > Hi, > > We’ve observed some suspicious memory leak problems with the MGR since upgrading to > Nautilus. > Yesterday I upgraded our cl

[ceph-users] Re: mgr log shows a lot of ms_handle_reset messages

2020-06-28 Thread XuYun
It seems that this log is produced by MDSDaemon, but we don’t have an MDS installed. > On Jun 28, 2020, at 9:33 PM, Zhenshi Zhou <deader...@gmail.com> wrote: > > From my point of view, I don't think it's an error. But the messages are much > more frequent. > > XuYun <

[ceph-users] mgr log shows a lot of ms_handle_reset messages

2020-06-28 Thread XuYun
Hi, We are running Ceph Nautilus (14.2.10 now), and every mgr reports a ‘ms_handle_reset’ message every second. Is this normal behavior? 2020-06-28 16:35:43.580 7fcdc96e6700 0 client.0 ms_handle_reset on v2:111.111.121.3:6807/7 2020-06-28 16:35:44.580 7fcdc96e6700 0 client.0

[ceph-users] Re: mgr log shows a lot of ms_handle_reset messages

2020-06-28 Thread XuYun
k you can increase the > log level and see if there is an error or not. > > XuYun <yu...@me.com> wrote on Sunday, Jun 28, 2020 at 4:39 PM: > Hi, > > We are running Ceph Nautilus (14.2.10 now), and every mgr reports a > ‘ms_handle_reset’ message every second. Would it be a normal be
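Raising the mgr log levels as suggested can be done through the centralized config; the debug values below are illustrative only:

  # increase messenger and mgr debug output on the mgr daemons
  ceph config set mgr debug_ms 1
  ceph config set mgr debug_mgr 20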

[ceph-users] Re: ceph mgr memory leak

2020-07-28 Thread XuYun
We have the same mgr memory leak problem. I suspect it’s related to the PID, which is used to identify the peer address. Maybe you could try setting ‘PidMode’ to ‘host’ in your deployment. > On Jul 28, 2020, at 2:44 AM, Frank Ritchie wrote: > > Hi all, > > When running containerized Ceph (Nautilus) is anyone
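‘PidMode’ here is the container-runtime (Docker API) setting; with a plain docker run it corresponds to the --pid flag, roughly as below, where the image name is only a placeholder:

  # share the host PID namespace so the mgr sees host PIDs instead of container-local ones
  docker run -d --net=host --pid=host <your-ceph-mgr-image>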