Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?
Hi, I upgraded to 12.2.7 two weeks ago, and I don't see any more memory increase! (I can't confirm that it was related to your patch.)

Thanks again for helping!

Regards,
Alexandre Derumier

- Original message -
From: "Zheng Yan"
To: "aderumier"
Cc: "ceph-users"
Sent: Tuesday 29 May 2018 04:40:27
Subject: Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?

Could you try patch https://github.com/ceph/ceph/pull/22240/files.

The leakage of MMDSBeacon messages can explain your issue.

Regards,
Yan, Zheng

On Mon, May 28, 2018 at 12:06 PM, Alexandre DERUMIER wrote:
>>> could you send me full output of dump_mempools
>
> # ceph daemon mds.ceph4-2.odiso.net dump_mempools
> {
> "bloom_filter": { "items": 41262668, "bytes": 41262668 },
> "bluestore_alloc": { "items": 0, "bytes": 0 },
> "bluestore_cache_data": { "items": 0, "bytes": 0 },
> "bluestore_cache_onode": { "items": 0, "bytes": 0 },
> "bluestore_cache_other": { "items": 0, "bytes": 0 },
> "bluestore_fsck": { "items": 0, "bytes": 0 },
> "bluestore_txc": { "items": 0, "bytes": 0 },
> "bluestore_writing_deferred": { "items": 0, "bytes": 0 },
> "bluestore_writing": { "items": 0, "bytes": 0 },
> "bluefs": { "items": 0, "bytes": 0 },
> "buffer_anon": { "items": 712726, "bytes": 106964870 },
> "buffer_meta": { "items": 15, "bytes": 1320 },
> "osd": { "items": 0, "bytes": 0 },
> "osd_mapbl": { "items": 0, "bytes": 0 },
> "osd_pglog": { "items": 0, "bytes": 0 },
> "osdmap": { "items": 216, "bytes": 12168 },
> "osdmap_mapping": { "items": 0, "bytes": 0 },
> "pgmap": { "items": 0, "bytes": 0 },
> "mds_co": { "items": 50741038, "bytes": 5114319203 },
> "unittest_1": { "items": 0, "bytes": 0 },
> "unittest_2": { "items": 0, "bytes": 0 },
> "total": { "items": 92716663, "bytes": 5262560229 }
> }
>
> ceph daemon mds.ceph4-2.odiso.net perf dump
> {
> "AsyncMessenger::Worker-0": {
>
"msgr_recv_messages": 1276789161, > "msgr_send_messages": 1317625246, > "msgr_recv_bytes": 10630409633633, > "msgr_send_bytes": 1093972769957, > "msgr_created_connections": 207, > "msgr_active_connections": 204, > "msgr_running_total_time": 63745.463077594, > "msgr_running_send_time": 22210.867549070, > "msgr_running_recv_time": 51944.624353942, > "msgr_running_fast_dispatch_time": 9185.274084187 > }, > "AsyncMessenger::Worker-1": { > "msgr_recv_messages": 641622644, > "msgr_send_messages": 616664293, > "msgr_recv_bytes": 7287546832466, > "msgr_send_bytes": 588278035895, > "msgr_created_connections": 494, > "msgr_active_connections": 494, > "msgr_running_total_time": 35390.081250881, > "msgr_running_send_time": 11559.689889195, > "msgr_running_recv_time": 29844.885712902, > "msgr_running_fast_dispatch_time": 6361.466445253 > }, > "AsyncMessenger::Worker-2": { > "msgr_recv_messages": 1972469623, > "msgr_send_messages": 1886060294, > "msgr_recv_bytes": 7924136565846, > "msgr_send_bytes": 5072502101797, > "msgr_created_connections": 181, > "msgr_active_connections": 176, > "msgr_running_total_time": 93257.811989806, > "msgr_running_send_time": 35556.662488302, > "msgr_running_recv_time": 81686.262228047, > "msgr_running_fast_dispatch_time": 6476.875317930 > }, > "finisher-PurgeQueue": { > "queue_len": 0, > "complete_latency": { &g
Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?
>> Could you try patch https://github.com/ceph/ceph/pull/22240/files.
>>
>> The leakage of MMDSBeacon messages can explain your issue.

Thanks. I can't test it in production for now, and I can't reproduce it in my test environment. I'll wait for the next luminous release to test it.

Thank you very much again! I'll keep you posted in this thread.

Regards,
Alexandre

- Original message -
From: "Zheng Yan"
To: "aderumier"
Cc: "ceph-users"
Sent: Tuesday 29 May 2018 04:40:27
Subject: Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?

Could you try patch https://github.com/ceph/ceph/pull/22240/files. The leakage of MMDSBeacon messages can explain your issue.

Regards,
Yan, Zheng

On Mon, May 28, 2018 at 12:06 PM, Alexandre DERUMIER wrote:
>>> could you send me full output of dump_mempools
>
> # ceph daemon mds.ceph4-2.odiso.net dump_mempools
> {
> "bloom_filter": { "items": 41262668, "bytes": 41262668 },
> "bluestore_alloc": { "items": 0, "bytes": 0 },
> "bluestore_cache_data": { "items": 0, "bytes": 0 },
> "bluestore_cache_onode": { "items": 0, "bytes": 0 },
> "bluestore_cache_other": { "items": 0, "bytes": 0 },
> "bluestore_fsck": { "items": 0, "bytes": 0 },
> "bluestore_txc": { "items": 0, "bytes": 0 },
> "bluestore_writing_deferred": { "items": 0, "bytes": 0 },
> "bluestore_writing": { "items": 0, "bytes": 0 },
> "bluefs": { "items": 0, "bytes": 0 },
> "buffer_anon": { "items": 712726, "bytes": 106964870 },
> "buffer_meta": { "items": 15, "bytes": 1320 },
> "osd": { "items": 0, "bytes": 0 },
> "osd_mapbl": { "items": 0, "bytes": 0 },
> "osd_pglog": { "items": 0, "bytes": 0 },
> "osdmap": { "items": 216, "bytes": 12168 },
> "osdmap_mapping": { "items": 0, "bytes": 0 },
> "pgmap": { "items": 0, "bytes": 0 },
> "mds_co": { "items": 50741038, "bytes": 5114319203 },
> "unittest_1": { "items": 0, "bytes": 0 },
> "unittest_2": { "items": 0, "bytes": 0 },
>
"total": { > "items": 92716663, > "bytes": 5262560229 > } > } > > > > > > ceph daemon mds.ceph4-2.odiso.net perf dump > { > "AsyncMessenger::Worker-0": { > "msgr_recv_messages": 1276789161, > "msgr_send_messages": 1317625246, > "msgr_recv_bytes": 10630409633633, > "msgr_send_bytes": 1093972769957, > "msgr_created_connections": 207, > "msgr_active_connections": 204, > "msgr_running_total_time": 63745.463077594, > "msgr_running_send_time": 22210.867549070, > "msgr_running_recv_time": 51944.624353942, > "msgr_running_fast_dispatch_time": 9185.274084187 > }, > "AsyncMessenger::Worker-1": { > "msgr_recv_messages": 641622644, > "msgr_send_messages": 616664293, > "msgr_recv_bytes": 7287546832466, > "msgr_send_bytes": 588278035895, > "msgr_created_connections": 494, > "msgr_active_connections": 494, > "msgr_running_total_time": 35390.081250881, > "msgr_running_send_time": 11559.689889195, > "msgr_running_recv_time": 29844.885712902, > "msgr_running_fast_dispatch_time": 6361.466445253 > }, > "AsyncMessenger::Worker-2": { > "msgr_recv_messages": 1972469623, > "msgr_send_messages": 1886060294, > "msgr_recv_bytes": 7924136565846, > "msgr_send_bytes": 5072502101797, > "msgr_created_connections": 181, > "msgr_active_connections": 176, > "msgr_running_total_time": 93257.811989806, > "msgr_running_send_time": 35556.662488302, > "msgr_running_recv_time":
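[Editorial aside, not part of the original thread: the question "which pool dominates?" in a `dump_mempools` output like the one quoted above can be answered programmatically. A minimal sketch, with the non-zero figures copied from the dump above; the JSON shape matches what the daemon prints.]

```python
import json

# Non-zero pools copied from the dump_mempools output quoted above
# (the all-zero bluestore/osd pools are omitted for brevity).
dump = json.loads("""
{
  "bloom_filter": {"items": 41262668, "bytes": 41262668},
  "buffer_anon":  {"items": 712726,   "bytes": 106964870},
  "buffer_meta":  {"items": 15,       "bytes": 1320},
  "osdmap":       {"items": 216,      "bytes": 12168},
  "mds_co":       {"items": 50741038, "bytes": 5114319203},
  "total":        {"items": 92716663, "bytes": 5262560229}
}
""")

# Rank every pool (except the aggregate) by bytes, largest first.
pools = {name: v["bytes"] for name, v in dump.items() if name != "total"}
for name, nbytes in sorted(pools.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:14s} {nbytes / 2**20:9.1f} MiB")
```

Here `mds_co` (the MDS cache pool) dominates, accounting for about 5.11 GB of the 5.26 GB total tracked by the mempools, so the tracked memory itself is within the configured cache size; the problem is the RSS beyond it.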
Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?
t; "req_lookup": 2162695254, > "req_lookuphash": 0, > "req_lookupino": 0, > "req_lookupname": 16114, > "req_lookupparent": 0, > "req_lookupsnap": 0, > "req_lssnap": 0, > "req_mkdir": 423120, > "req_mknod": 0, > "req_mksnap": 0, > "req_open": 549851331, > "req_readdir": 25836771, > "req_rename": 2865148, > "req_renamesnap": 0, > "req_rmdir": 143496, > "req_rmsnap": 0, > "req_rmxattr": 0, > "req_setattr": 6833015, > "req_setdirlayout": 0, > "req_setfilelock": 960105, > "req_setlayout": 0, > "req_setxattr": 2, > "req_symlink": 2561, > "req_unlink": 2966589 > }, > "mds_sessions": { > "session_count": 326, > "session_add": 472, > "session_remove": 146 > }, > "objecter": { > "op_active": 0, > "op_laggy": 0, > "op_send": 297200358, > "op_send_bytes": 943806252615, > "op_resend": 22, > "op_reply": 297200336, > "op": 297200336, > "op_r": 173655966, > "op_w": 123544370, > "op_rmw": 0, > "op_pg": 0, > "osdop_stat": 2843429, > "osdop_create": 5729675, > "osdop_read": 126350, > "osdop_write": 89171030, > "osdop_writefull": 365835, > "osdop_writesame": 0, > "osdop_append": 0, > "osdop_zero": 2, > "osdop_truncate": 15, > "osdop_delete": 4128067, > "osdop_mapext": 0, > "osdop_sparse_read": 0, > "osdop_clonerange": 0, > "osdop_getxattr": 46958217, > "osdop_setxattr": 11459350, > "osdop_cmpxattr": 0, > "osdop_rmxattr": 0, > "osdop_resetxattrs": 0, > "osdop_tmap_up": 0, > "osdop_tmap_put": 0, > "osdop_tmap_get": 0, > "osdop_call": 0, > "osdop_watch": 0, > "osdop_notify": 0, > "osdop_src_cmpxattr": 0, > "osdop_pgls": 0, > "osdop_pgls_filter": 0, > "osdop_other": 20547060, > "linger_active": 0, > "linger_send": 0, > "linger_resend": 0, > "linger_ping": 0, > "poolop_active": 0, > "poolop_send": 0, > "poolop_resend": 0, > "poolstat_active": 0, > "poolstat_send": 0, > "poolstat_resend": 0, > "statfs_active": 0, > "statfs_send": 0, > "statfs_resend": 0, > "command_active": 0, > "command_send": 0, > "command_resend": 0, > "map_epoch": 4048, > "map_full": 0, > "map_inc": 742, > 
"osd_sessions": 18, > "osd_session_open": 26, > "osd_session_close": 8, > "osd_laggy": 0, > "omap_wr": 6209755, > "omap_rd": 346748196, > "omap_del": 605991 > }, > "purge_queue": { > "pq_executing_ops": 0, > "pq_executing": 0, > "pq_executed": 3118819 > }, > "throttle-msgr_dispatch_throttler-mds": { > "val": 0, > "max": 104857600, > "get_started": 0, > "get": 3890881428, > "get_sum": 25554167806273, > "get_or_fail_fail": 0, > "get_or_fail_success": 3890881428, > "take": 0, > "take_sum": 0, > "put": 3890881428, > "put_sum": 25554167806273, > "wait": { > "avgcount": 0, > "sum": 0.0, > "avgtime&
Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?
23544370, "op_rmw": 0, "op_pg": 0, "osdop_stat": 2843429, "osdop_create": 5729675, "osdop_read": 126350, "osdop_write": 89171030, "osdop_writefull": 365835, "osdop_writesame": 0, "osdop_append": 0, "osdop_zero": 2, "osdop_truncate": 15, "osdop_delete": 4128067, "osdop_mapext": 0, "osdop_sparse_read": 0, "osdop_clonerange": 0, "osdop_getxattr": 46958217, "osdop_setxattr": 11459350, "osdop_cmpxattr": 0, "osdop_rmxattr": 0, "osdop_resetxattrs": 0, "osdop_tmap_up": 0, "osdop_tmap_put": 0, "osdop_tmap_get": 0, "osdop_call": 0, "osdop_watch": 0, "osdop_notify": 0, "osdop_src_cmpxattr": 0, "osdop_pgls": 0, "osdop_pgls_filter": 0, "osdop_other": 20547060, "linger_active": 0, "linger_send": 0, "linger_resend": 0, "linger_ping": 0, "poolop_active": 0, "poolop_send": 0, "poolop_resend": 0, "poolstat_active": 0, "poolstat_send": 0, "poolstat_resend": 0, "statfs_active": 0, "statfs_send": 0, "statfs_resend": 0, "command_active": 0, "command_send": 0, "command_resend": 0, "map_epoch": 4048, "map_full": 0, "map_inc": 742, "osd_sessions": 18, "osd_session_open": 26, "osd_session_close": 8, "osd_laggy": 0, "omap_wr": 6209755, "omap_rd": 346748196, "omap_del": 605991 }, "purge_queue": { "pq_executing_ops": 0, "pq_executing": 0, "pq_executed": 3118819 }, "throttle-msgr_dispatch_throttler-mds": { "val": 0, "max": 104857600, "get_started": 0, "get": 3890881428, "get_sum": 25554167806273, "get_or_fail_fail": 0, "get_or_fail_success": 3890881428, "take": 0, "take_sum": 0, "put": 3890881428, "put_sum": 25554167806273, "wait": { "avgcount": 0, "sum": 0.00000, "avgtime": 0.0 } }, "throttle-objecter_bytes": { "val": 0, "max": 104857600, "get_started": 0, "get": 0, "get_sum": 0, "get_or_fail_fail": 0, "get_or_fail_success": 0, "take": 297200336, "take_sum": 944272996789, "put": 272525107, "put_sum": 944272996789, "wait": { "avgcount": 0, "sum": 0.0, "avgtime": 0.0 } }, "throttle-objecter_ops": { "val": 0, "max": 1024, "get_started": 0, "get": 0, "get_sum": 0, "get_or_fail_fail": 0, 
"get_or_fail_success": 0, "take": 297200336, "take_sum": 297200336, "put": 297200336, "put_sum": 297200336, "wait": { "avgcount": 0, "sum": 0.0, "avgtime": 0.0 } }, "throttle-write_buf_throttle": { "val": 0, "max": 3758096384, "get_started": 0, "get": 3118819, "get_sum": 290050463, "get_or_fail_fail": 0, "get_or_fail_success": 3118819, "take": 0, "take_sum": 0, "put": 126240, "put_sum": 290050463, "wait": { "avgcount": 0, "sum": 0.0, "avgtime": 0.0 } }, "throttle-write_buf_throttle-0x55decea8e140": { "val": 117619, "max": 3758096384, "get_started": 0,
Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?
could you send me full output of dump_mempools

On Thu, May 24, 2018 at 7:22 PM, Alexandre DERUMIER wrote:
> Thanks!
>
> here the profile.pdf
>
> 10-15min profiling, I can't do it longer because my clients were lagging.
>
> but I think it should be enough to observe the rss memory increase.
>
> - Original message -
> From: "Zheng Yan"
> To: "aderumier"
> Cc: "ceph-users"
> Sent: Thursday 24 May 2018 11:34:20
> Subject: Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?
>
> On Tue, May 22, 2018 at 3:11 PM, Alexandre DERUMIER
> wrote:
>> Hi, some new stats: mds memory is now 16G,
>>
>> I have almost same number of items and bytes in cache vs some weeks ago when
>> mds was using 8G. (ceph 12.2.5)
>>
>> root@ceph4-2:~# while sleep 1; do ceph daemon mds.ceph4-2.odiso.net perf
>> dump | jq '.mds_mem.rss'; ceph daemon mds.ceph4-2.odiso.net dump_mempools |
>> jq -c '.mds_co'; done
>> 16905052
>> {"items":43350988,"bytes":5257428143}
>> 16905052
>> {"items":43428329,"bytes":5283850173}
>> 16905052
>> {"items":43209167,"bytes":5208578149}
>> 16905052
>> {"items":43177631,"bytes":5198833577}
>> 16905052
>> {"items":43312734,"bytes":5252649462}
>> 16905052
>> {"items":43355753,"bytes":5277197972}
>> 16905052
>> {"items":43700693,"bytes":5303376141}
>> 16905052
>> {"items":43115809,"bytes":5156628138}
>> ^C
>>
>> root@ceph4-2:~# ceph status
>> cluster:
>> id: e22b8e83-3036-4fe5-8fd5-5ce9d539beca
>> health: HEALTH_OK
>>
>> services:
>> mon: 3 daemons, quorum ceph4-1,ceph4-2,ceph4-3
>> mgr: ceph4-1.odiso.net(active), standbys: ceph4-2.odiso.net, ceph4-3.odiso.net
>> mds: cephfs4-1/1/1 up {0=ceph4-2.odiso.net=up:active}, 2 up:standby
>> osd: 18 osds: 18 up, 18 in
>> rgw: 3 daemons active
>>
>> data:
>> pools: 11 pools, 1992 pgs
>> objects: 75677k objects, 6045 GB
>> usage: 20579 GB used, 6246 GB / 26825 GB avail
>> pgs: 1992 active+clean
>>
>> io:
>> client: 14441 kB/s rd, 2550 kB/s wr, 371 op/s rd, 95 op/s wr
>>
>> root@ceph4-2:~# ceph daemon
mds.ceph4-2.odiso.net cache status >> { >> "pool": { >> "items": 44523608, >> "bytes": 5326049009 >> } >> } >> >> >> root@ceph4-2:~# ceph daemon mds.ceph4-2.odiso.net perf dump >> { >> "AsyncMessenger::Worker-0": { >> "msgr_recv_messages": 798876013, >> "msgr_send_messages": 825999506, >> "msgr_recv_bytes": 7003223097381, >> "msgr_send_bytes": 691501283744, >> "msgr_created_connections": 148, >> "msgr_active_connections": 146, >> "msgr_running_total_time": 39914.832387470, >> "msgr_running_send_time": 13744.704199430, >> "msgr_running_recv_time": 32342.160588451, >> "msgr_running_fast_dispatch_time": 5996.336446782 >> }, >> "AsyncMessenger::Worker-1": { >> "msgr_recv_messages": 429668771, >> "msgr_send_messages": 414760220, >> "msgr_recv_bytes": 5003149410825, >> "msgr_send_bytes": 396281427789, >> "msgr_created_connections": 132, >> "msgr_active_connections": 132, >> "msgr_running_total_time": 23644.410515392, >> "msgr_running_send_time": 7669.068710688, >> "msgr_running_recv_time": 19751.610043696, >> "msgr_running_fast_dispatch_time": 4331.023453385 >> }, >> "AsyncMessenger::Worker-2": { >> "msgr_recv_messages": 1312910919, >> "msgr_send_messages": 1260040403, >> "msgr_recv_bytes": 5330386980976, >> "msgr_send_bytes": 3341965016878, >> "msgr_created_connections": 143, >> "msgr_active_connections": 138, >> "msgr_running_total_time": 61696.635450100, >> "msgr_running_send_time": 23491.027014598, >> "msgr_running_recv_time": 53858.409319734, >> "msgr_running_fast_dispatch_time": 4312.451966809 >> }, >> "finisher-PurgeQueue": { >> "queue_len": 0, >> "complete_latency": { >> "avgcount": 1889416, >> "sum": 29224.227703697, >>
Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?
I'm not sure this is a cache issue. To me, this feels like a memory leak. I'm now at 129GB (haven't had a window to upgrade yet) on a configured 80GB cache.

[root@mds0 ceph-admin]# ceph daemon mds.mds0 cache status
{
    "pool": {
        "items": 166753076,
        "bytes": 71766944952
    }
}

Ran a 10-minute heap profile:

[root@mds0 ceph-admin]# ceph tell mds.mds0 heap start_profiler
2018-05-25 08:15:04.428519 7f3f657fa700 0 client.127046191 ms_handle_reset on 10.124.103.50:6800/2248223690
2018-05-25 08:15:04.447528 7f3f667fc700 0 client.127055541 ms_handle_reset on 10.124.103.50:6800/2248223690
mds.mds0 started profiler

[root@mds0 ceph-admin]# ceph tell mds.mds0 heap dump
2018-05-25 08:25:14.265450 7f1774ff9700 0 client.127057266 ms_handle_reset on 10.124.103.50:6800/2248223690
2018-05-25 08:25:14.356292 7f1775ffb700 0 client.127057269 ms_handle_reset on 10.124.103.50:6800/2248223690
mds.mds0 dumping heap profile now.

MALLOC:   123658130320 (117929.6 MiB) Bytes in use by application
MALLOC: +            0 (     0.0 MiB) Bytes in page heap freelist
MALLOC: +   6969713096 (  6646.8 MiB) Bytes in central cache freelist
MALLOC: +     26700832 (    25.5 MiB) Bytes in transfer cache freelist
MALLOC: +     54460040 (    51.9 MiB) Bytes in thread cache freelists
MALLOC: +    531034272 (   506.4 MiB) Bytes in malloc metadata
MALLOC:
MALLOC: = 131240038560 (125160.3 MiB) Actual memory used (physical + swap)
MALLOC: +   7426875392 (  7082.8 MiB) Bytes released to OS (aka unmapped)
MALLOC:
MALLOC: = 138666913952 (132243.1 MiB) Virtual address space used
MALLOC:
MALLOC:        7434952 Spans in use
MALLOC:             20 Thread heaps in use
MALLOC:           8192 Tcmalloc page size

Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
Bytes released to the OS take up virtual address space but no physical memory.
[root@mds0 ceph-admin]# ceph tell mds.mds0 heap stop_profiler 2018-05-25 08:25:26.394877 7fbe48ff9700 0 client.127047898 ms_handle_reset on 10.124.103.50:6800/2248223690 2018-05-25 08:25:26.736909 7fbe49ffb700 0 client.127035608 ms_handle_reset on 10.124.103.50:6800/2248223690 mds.mds0 stopped profiler [root@mds0 ceph-admin]# pprof --pdf /bin/ceph-mds /var/log/ceph/mds.mds0.profile.000* > profile.pdf On Thu, May 10, 2018 at 2:11 PM, Patrick Donnelly wrote: > On Thu, May 10, 2018 at 12:00 PM, Brady Deetz wrote: > > [ceph-admin@mds0 ~]$ ps aux | grep ceph-mds > > ceph1841 3.5 94.3 133703308 124425384 ? Ssl Apr04 1808:32 > > /usr/bin/ceph-mds -f --cluster ceph --id mds0 --setuser ceph --setgroup > ceph > > > > > > [ceph-admin@mds0 ~]$ sudo ceph daemon mds.mds0 cache status > > { > > "pool": { > > "items": 173261056, > > "bytes": 76504108600 > > } > > } > > > > So, 80GB is my configured limit for the cache and it appears the mds is > > following that limit. But, the mds process is using over 100GB RAM in my > > 128GB host. I thought I was playing it safe by configuring at 80. What > other > > things consume a lot of RAM for this process? > > > > Let me know if I need to create a new thread. > > The cache size measurement is imprecise pre-12.2.5 [1]. You should upgrade > ASAP. > > [1] https://tracker.ceph.com/issues/22972 > > -- > Patrick Donnelly > profile.pdf Description: Adobe PDF document ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
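[Editorial aside, not part of the original thread: the leak hypothesis in the message above can be quantified directly from the two numbers it quotes — tcmalloc's "Bytes in use by application" from the heap dump and the cache pool size from `cache status`. A minimal sketch:]

```python
# Figures copied from the message above.
app_bytes   = 123_658_130_320   # heap dump: "Bytes in use by application"
cache_bytes = 71_766_944_952    # cache status: pool.bytes (~71.8 GB)

# Memory held by the application that the cache accounting cannot see.
unaccounted = app_bytes - cache_bytes
print(f"unaccounted: {unaccounted / 2**30:.1f} GiB "
      f"({unaccounted / app_bytes:.0%} of application bytes)")
```

Roughly 48 GiB, about 42% of what the application holds, lies outside the cache's own accounting, which supports "memory leak" over "cache overshoot" here.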
Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?
On Fri, May 25, 2018 at 4:28 PM, Yan, Zheng wrote:
> I found some memory leak. could you please try
> https://github.com/ceph/ceph/pull/22240

the leak only affects multiple active mds, I think it's unrelated to your issue.

>
> On Fri, May 25, 2018 at 1:49 PM, Alexandre DERUMIER
> wrote:
>> Here the result:
>>
>> root@ceph4-2:~# ceph daemon mds.ceph4-2.odiso.net flush journal
>> {
>> "message": "",
>> "return_code": 0
>> }
>> root@ceph4-2:~# ceph daemon mds.ceph4-2.odiso.net config set mds_cache_size 1
>> {
>> "success": "mds_cache_size = '1' (not observed, change may require restart) "
>> }
>>
>> wait ...
>>
>> root@ceph4-2:~# ceph tell mds.ceph4-2.odiso.net heap stats
>> 2018-05-25 07:44:02.185911 7f4cad7fa700 0 client.50748489 ms_handle_reset on 10.5.0.88:6804/994206868
>> 2018-05-25 07:44:02.196160 7f4cae7fc700 0 client.50792764 ms_handle_reset on 10.5.0.88:6804/994206868
>> mds.ceph4-2.odiso.net tcmalloc heap stats:
>> MALLOC:  13175782328 (12565.4 MiB) Bytes in use by application
>> MALLOC: +          0 (    0.0 MiB) Bytes in page heap freelist
>> MALLOC: + 1774628488 ( 1692.4 MiB) Bytes in central cache freelist
>> MALLOC: +   34274608 (   32.7 MiB) Bytes in transfer cache freelist
>> MALLOC: +   57260176 (   54.6 MiB) Bytes in thread cache freelists
>> MALLOC: +  120582336 (  115.0 MiB) Bytes in malloc metadata
>> MALLOC:
>> MALLOC: = 15162527936 (14460.1 MiB) Actual memory used (physical + swap)
>> MALLOC: + 4974067712 ( 4743.6 MiB) Bytes released to OS (aka unmapped)
>> MALLOC:
>> MALLOC: = 20136595648 (19203.8 MiB) Virtual address space used
>> MALLOC:
>> MALLOC:      1852388 Spans in use
>> MALLOC:           18 Thread heaps in use
>> MALLOC:         8192 Tcmalloc page size
>>
>> Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
>> Bytes released to the OS take up virtual address space but no physical memory.
>> >> >> root@ceph4-2:~# ceph daemon mds.ceph4-2.odiso.net config set mds_cache_size 0 >> { >> "success": "mds_cache_size = '0' (not observed, change may require >> restart) " >> } >> >> - Mail original - >> De: "Zheng Yan" >> À: "aderumier" >> Envoyé: Vendredi 25 Mai 2018 05:56:31 >> Objet: Re: [ceph-users] ceph mds memory usage 20GB : is it normal ? >> >> On Thu, May 24, 2018 at 11:34 PM, Alexandre DERUMIER >> wrote: >>>>>Still don't find any clue. Does the cephfs have idle period. If it >>>>>has, could you decrease mds's cache size and check what happens. For >>>>>example, run following commands during the old period. >>> >>>>>ceph daemon mds.xx flush journal >>>>>ceph daemon mds.xx config set mds_cache_size 1; >>>>>"wait a minute" >>>>>ceph tell mds.xx heap stats >>>>>ceph daemon mds.xx config set mds_cache_size 0 >>> >>> ok thanks. I'll try this night. >>> >>> I have already mds_cache_memory_limit = 5368709120, >>> >>> does it need to remove it first before setting mds_cache_size 1 ? >> >> no >>> >>> >>> >>> >>> - Mail original - >>> De: "Zheng Yan" >>> À: "aderumier" >>> Cc: "ceph-users" >>> Envoyé: Jeudi 24 Mai 2018 16:27:21 >>> Objet: Re: [ceph-users] ceph mds memory usage 20GB : is it normal ? >>> >>> On Thu, May 24, 2018 at 7:22 PM, Alexandre DERUMIER >>> wrote: >>>> Thanks! >>>> >>>> >>>> here the profile.pdf >>>> >>>> 10-15min profiling, I can't do it longer because my clients where lagging. >>>> >>>> but I think it should be enough to observe the rss memory increase. >>>> >>>> >>> >>> Still don't find any clue. Does the cephfs have idle period. If it >>> has, could you decrease mds's cache size and check what happens. For >>> example, run following commands during the old period. >>> >>> ceph daemon mds.xx flush journal >>&
Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?
I found some memory leak. could you please try https://github.com/ceph/ceph/pull/22240 On Fri, May 25, 2018 at 1:49 PM, Alexandre DERUMIER wrote: > Here the result: > > > root@ceph4-2:~# ceph daemon mds.ceph4-2.odiso.net flush journal > { > "message": "", > "return_code": 0 > } > root@ceph4-2:~# ceph daemon mds.ceph4-2.odiso.net config set mds_cache_size > 1 > { > "success": "mds_cache_size = '1' (not observed, change may require > restart) " > } > > wait ... > > > root@ceph4-2:~# ceph tell mds.ceph4-2.odiso.net heap stats > 2018-05-25 07:44:02.185911 7f4cad7fa700 0 client.50748489 ms_handle_reset on > 10.5.0.88:6804/994206868 > 2018-05-25 07:44:02.196160 7f4cae7fc700 0 client.50792764 ms_handle_reset on > 10.5.0.88:6804/994206868 > mds.ceph4-2.odiso.net tcmalloc heap > stats: > MALLOC:13175782328 (12565.4 MiB) Bytes in use by application > MALLOC: +0 (0.0 MiB) Bytes in page heap freelist > MALLOC: + 1774628488 ( 1692.4 MiB) Bytes in central cache freelist > MALLOC: + 34274608 ( 32.7 MiB) Bytes in transfer cache freelist > MALLOC: + 57260176 ( 54.6 MiB) Bytes in thread cache freelists > MALLOC: +120582336 ( 115.0 MiB) Bytes in malloc metadata > MALLOC: > MALLOC: = 15162527936 (14460.1 MiB) Actual memory used (physical + swap) > MALLOC: + 4974067712 ( 4743.6 MiB) Bytes released to OS (aka unmapped) > MALLOC: > MALLOC: = 20136595648 (19203.8 MiB) Virtual address space used > MALLOC: > MALLOC:1852388 Spans in use > MALLOC: 18 Thread heaps in use > MALLOC: 8192 Tcmalloc page size > > Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()). > Bytes released to the OS take up virtual address space but no physical memory. 
> > > root@ceph4-2:~# ceph daemon mds.ceph4-2.odiso.net config set mds_cache_size 0 > { > "success": "mds_cache_size = '0' (not observed, change may require > restart) " > } > > - Mail original - > De: "Zheng Yan" > À: "aderumier" > Envoyé: Vendredi 25 Mai 2018 05:56:31 > Objet: Re: [ceph-users] ceph mds memory usage 20GB : is it normal ? > > On Thu, May 24, 2018 at 11:34 PM, Alexandre DERUMIER > wrote: >>>>Still don't find any clue. Does the cephfs have idle period. If it >>>>has, could you decrease mds's cache size and check what happens. For >>>>example, run following commands during the old period. >> >>>>ceph daemon mds.xx flush journal >>>>ceph daemon mds.xx config set mds_cache_size 1; >>>>"wait a minute" >>>>ceph tell mds.xx heap stats >>>>ceph daemon mds.xx config set mds_cache_size 0 >> >> ok thanks. I'll try this night. >> >> I have already mds_cache_memory_limit = 5368709120, >> >> does it need to remove it first before setting mds_cache_size 1 ? > > no >> >> >> >> >> - Mail original - >> De: "Zheng Yan" >> À: "aderumier" >> Cc: "ceph-users" >> Envoyé: Jeudi 24 Mai 2018 16:27:21 >> Objet: Re: [ceph-users] ceph mds memory usage 20GB : is it normal ? >> >> On Thu, May 24, 2018 at 7:22 PM, Alexandre DERUMIER >> wrote: >>> Thanks! >>> >>> >>> here the profile.pdf >>> >>> 10-15min profiling, I can't do it longer because my clients where lagging. >>> >>> but I think it should be enough to observe the rss memory increase. >>> >>> >> >> Still don't find any clue. Does the cephfs have idle period. If it >> has, could you decrease mds's cache size and check what happens. For >> example, run following commands during the old period. 
>> >> ceph daemon mds.xx flush journal >> ceph daemon mds.xx config set mds_cache_size 1; >> "wait a minute" >> ceph tell mds.xx heap stats >> ceph daemon mds.xx config set mds_cache_size 0 >> >> >>> >>> >>> - Mail original - >>> De: "Zheng Yan" >>> À: "aderumier" >>> Cc: "ceph-users" >>> Envoyé: Jeudi 24 Mai 2018 11:34:20 >>> Objet: Re: [ceph-users] ceph mds memory usage 20GB : is it normal ? >>> >>> On Tue, May 22, 2018 at 3:11 PM, Alexandre DERUMIER >>> wrote:
Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?
Here the result:

root@ceph4-2:~# ceph daemon mds.ceph4-2.odiso.net flush journal
{
    "message": "",
    "return_code": 0
}
root@ceph4-2:~# ceph daemon mds.ceph4-2.odiso.net config set mds_cache_size 1
{
    "success": "mds_cache_size = '1' (not observed, change may require restart) "
}

wait ...

root@ceph4-2:~# ceph tell mds.ceph4-2.odiso.net heap stats
2018-05-25 07:44:02.185911 7f4cad7fa700 0 client.50748489 ms_handle_reset on 10.5.0.88:6804/994206868
2018-05-25 07:44:02.196160 7f4cae7fc700 0 client.50792764 ms_handle_reset on 10.5.0.88:6804/994206868
mds.ceph4-2.odiso.net tcmalloc heap stats:
MALLOC:  13175782328 (12565.4 MiB) Bytes in use by application
MALLOC: +          0 (    0.0 MiB) Bytes in page heap freelist
MALLOC: + 1774628488 ( 1692.4 MiB) Bytes in central cache freelist
MALLOC: +   34274608 (   32.7 MiB) Bytes in transfer cache freelist
MALLOC: +   57260176 (   54.6 MiB) Bytes in thread cache freelists
MALLOC: +  120582336 (  115.0 MiB) Bytes in malloc metadata
MALLOC:
MALLOC: = 15162527936 (14460.1 MiB) Actual memory used (physical + swap)
MALLOC: + 4974067712 ( 4743.6 MiB) Bytes released to OS (aka unmapped)
MALLOC:
MALLOC: = 20136595648 (19203.8 MiB) Virtual address space used
MALLOC:
MALLOC:      1852388 Spans in use
MALLOC:           18 Thread heaps in use
MALLOC:         8192 Tcmalloc page size

Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
Bytes released to the OS take up virtual address space but no physical memory.

root@ceph4-2:~# ceph daemon mds.ceph4-2.odiso.net config set mds_cache_size 0
{
    "success": "mds_cache_size = '0' (not observed, change may require restart) "
}

- Original message -
From: "Zheng Yan"
To: "aderumier"
Sent: Friday 25 May 2018 05:56:31
Subject: Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?

On Thu, May 24, 2018 at 11:34 PM, Alexandre DERUMIER wrote:
>>> Still don't find any clue. Does the cephfs have idle period. If it
>>> has, could you decrease mds's cache size and check what happens.
>>> For example, run following commands during the idle period.
>
>>> ceph daemon mds.xx flush journal
>>> ceph daemon mds.xx config set mds_cache_size 1;
>>> "wait a minute"
>>> ceph tell mds.xx heap stats
>>> ceph daemon mds.xx config set mds_cache_size 0
>
> ok thanks. I'll try this tonight.
>
> I have already mds_cache_memory_limit = 5368709120,
>
> does it need to remove it first before setting mds_cache_size 1 ?

no

>
> - Original message -
> From: "Zheng Yan"
> To: "aderumier"
> Cc: "ceph-users"
> Sent: Thursday 24 May 2018 16:27:21
> Subject: Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?
>
> On Thu, May 24, 2018 at 7:22 PM, Alexandre DERUMIER
> wrote:
>> Thanks!
>>
>> here the profile.pdf
>>
>> 10-15min profiling, I can't do it longer because my clients were lagging.
>>
>> but I think it should be enough to observe the rss memory increase.
>>
>
> Still don't find any clue. Does the cephfs have idle period. If it
> has, could you decrease mds's cache size and check what happens. For
> example, run following commands during the idle period.
>
> ceph daemon mds.xx flush journal
> ceph daemon mds.xx config set mds_cache_size 1;
> "wait a minute"
> ceph tell mds.xx heap stats
> ceph daemon mds.xx config set mds_cache_size 0
>
>> - Original message -
>> From: "Zheng Yan"
>> To: "aderumier"
>> Cc: "ceph-users"
>> Sent: Thursday 24 May 2018 11:34:20
>> Subject: Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?
>>
>> On Tue, May 22, 2018 at 3:11 PM, Alexandre DERUMIER
>> wrote:
>>> Hi, some new stats: mds memory is now 16G,
>>>
>>> I have almost same number of items and bytes in cache vs some weeks ago
>>> when mds was using 8G. (ceph 12.2.5)
>>>
>>> root@ceph4-2:~# while sleep 1; do ceph daemon mds.ceph4-2.odiso.net perf
>>> dump | jq '.mds_mem.rss'; ceph daemon mds.ceph4-2.odiso.net dump_mempools |
>>> jq -c '.mds_co'; done
>>> 16905052
>>> {"items":43350988,"bytes":5257428143}
>>> 16905052
>>> {"items":434
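[Editorial aside, not part of the original thread: the heap stats in this message already hint at the answer to the cache-shrink test. Even with the cache forced down, tcmalloc still reports ~12.5 GiB in use by the application, well above the mds_cache_memory_limit quoted in the same message. A minimal sketch of that arithmetic:]

```python
# Figures copied from this message: heap stats after shrinking the cache,
# and the configured mds_cache_memory_limit.
in_use_bytes = 13_175_782_328   # heap stats: "Bytes in use by application"
cache_limit  = 5_368_709_120    # mds_cache_memory_limit (5 GiB)

# Memory still held by the MDS beyond the whole cache budget.
excess = in_use_bytes - cache_limit
print(f"excess over cache limit: {excess / 2**30:.1f} GiB")
```

About 7.3 GiB remains in use even if the entire cache budget were still full, which is why shrinking the cache did not reclaim the memory.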
Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?
>> Still don't find any clue. Does the cephfs have idle period. If it
>> has, could you decrease mds's cache size and check what happens. For
>> example, run following commands during the idle period.
>>
>> ceph daemon mds.xx flush journal
>> ceph daemon mds.xx config set mds_cache_size 1;
>> "wait a minute"
>> ceph tell mds.xx heap stats
>> ceph daemon mds.xx config set mds_cache_size 0

ok thanks. I'll try this tonight.

I have already mds_cache_memory_limit = 5368709120,

does it need to remove it first before setting mds_cache_size 1 ?

- Original message -
From: "Zheng Yan"
To: "aderumier"
Cc: "ceph-users"
Sent: Thursday 24 May 2018 16:27:21
Subject: Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?

On Thu, May 24, 2018 at 7:22 PM, Alexandre DERUMIER wrote:
> Thanks!
>
> here the profile.pdf
>
> 10-15min profiling, I can't do it longer because my clients were lagging.
>
> but I think it should be enough to observe the rss memory increase.
>

Still don't find any clue. Does the cephfs have idle period. If it has, could you decrease mds's cache size and check what happens. For example, run following commands during the idle period.

ceph daemon mds.xx flush journal
ceph daemon mds.xx config set mds_cache_size 1;
"wait a minute"
ceph tell mds.xx heap stats
ceph daemon mds.xx config set mds_cache_size 0

>
> - Original message -
> From: "Zheng Yan"
> To: "aderumier"
> Cc: "ceph-users"
> Sent: Thursday 24 May 2018 11:34:20
> Subject: Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?
>
> On Tue, May 22, 2018 at 3:11 PM, Alexandre DERUMIER
> wrote:
>> Hi, some new stats: mds memory is now 16G,
>>
>> I have almost same number of items and bytes in cache vs some weeks ago when
>> mds was using 8G.
(ceph 12.2.5) >> >> >> root@ceph4-2:~# while sleep 1; do ceph daemon mds.ceph4-2.odiso.net perf >> dump | jq '.mds_mem.rss'; ceph daemon mds.ceph4-2.odiso.net dump_mempools | >> jq -c '.mds_co'; done >> 16905052 >> {"items":43350988,"bytes":5257428143} >> 16905052 >> {"items":43428329,"bytes":5283850173} >> 16905052 >> {"items":43209167,"bytes":5208578149} >> 16905052 >> {"items":43177631,"bytes":5198833577} >> 16905052 >> {"items":43312734,"bytes":5252649462} >> 16905052 >> {"items":43355753,"bytes":5277197972} >> 16905052 >> {"items":43700693,"bytes":5303376141} >> 16905052 >> {"items":43115809,"bytes":5156628138} >> ^C >> >> >> >> >> root@ceph4-2:~# ceph status >> cluster: >> id: e22b8e83-3036-4fe5-8fd5-5ce9d539beca >> health: HEALTH_OK >> >> services: >> mon: 3 daemons, quorum ceph4-1,ceph4-2,ceph4-3 >> mgr: ceph4-1.odiso.net(active), standbys: ceph4-2.odiso.net, >> ceph4-3.odiso.net >> mds: cephfs4-1/1/1 up {0=ceph4-2.odiso.net=up:active}, 2 up:standby >> osd: 18 osds: 18 up, 18 in >> rgw: 3 daemons active >> >> data: >> pools: 11 pools, 1992 pgs >> objects: 75677k objects, 6045 GB >> usage: 20579 GB used, 6246 GB / 26825 GB avail >> pgs: 1992 active+clean >> >> io: >> client: 14441 kB/s rd, 2550 kB/s wr, 371 op/s rd, 95 op/s wr >> >> >> root@ceph4-2:~# ceph daemon mds.ceph4-2.odiso.net cache status >> { >> "pool": { >> "items": 44523608, >> "bytes": 5326049009 >> } >> } >> >> >> root@ceph4-2:~# ceph daemon mds.ceph4-2.odiso.net perf dump >> { >> "AsyncMessenger::Worker-0": { >> "msgr_recv_messages": 798876013, >> "msgr_send_messages": 825999506, >> "msgr_recv_bytes": 7003223097381, >> "msgr_send_bytes": 691501283744, >> "msgr_created_connections": 148, >> "msgr_active_connections": 146, >> "msgr_running_total_time": 39914.832387470, >> "msgr_running_send_time": 13744.704199430, >> "msgr_running_recv_time": 32342.160588451, >> "msgr_running_fast_dispatch_time": 5996.336446782 >> }, >> "AsyncMessenger::Worker-1": { >> "msgr_recv_messages": 429668771, >> 
"msgr_send_messages": 414760220, >> "msgr_recv_bytes": 5003149410
Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?
On Thu, May 24, 2018 at 7:22 PM, Alexandre DERUMIER wrote: > Thanks! > > > here the profile.pdf > > 10-15min profiling, I can't do it longer because my clients where lagging. > > but I think it should be enough to observe the rss memory increase. > > Still don't find any clue. Does the cephfs have idle period. If it has, could you decrease mds's cache size and check what happens. For example, run following commands during the old period. ceph daemon mds.xx flush journal ceph daemon mds.xx config set mds_cache_size 1; "wait a minute" ceph tell mds.xx heap stats ceph daemon mds.xx config set mds_cache_size 0 > > > - Mail original - > De: "Zheng Yan" > À: "aderumier" > Cc: "ceph-users" > Envoyé: Jeudi 24 Mai 2018 11:34:20 > Objet: Re: [ceph-users] ceph mds memory usage 20GB : is it normal ? > > On Tue, May 22, 2018 at 3:11 PM, Alexandre DERUMIER > wrote: >> Hi,some new stats, mds memory is not 16G, >> >> I have almost same number of items and bytes in cache vs some weeks ago when >> mds was using 8G. 
(ceph 12.2.5) >> >> >> root@ceph4-2:~# while sleep 1; do ceph daemon mds.ceph4-2.odiso.net perf >> dump | jq '.mds_mem.rss'; ceph daemon mds.ceph4-2.odiso.net dump_mempools | >> jq -c '.mds_co'; done >> 16905052 >> {"items":43350988,"bytes":5257428143} >> 16905052 >> {"items":43428329,"bytes":5283850173} >> 16905052 >> {"items":43209167,"bytes":5208578149} >> 16905052 >> {"items":43177631,"bytes":5198833577} >> 16905052 >> {"items":43312734,"bytes":5252649462} >> 16905052 >> {"items":43355753,"bytes":5277197972} >> 16905052 >> {"items":43700693,"bytes":5303376141} >> 16905052 >> {"items":43115809,"bytes":5156628138} >> ^C >> >> >> >> >> root@ceph4-2:~# ceph status >> cluster: >> id: e22b8e83-3036-4fe5-8fd5-5ce9d539beca >> health: HEALTH_OK >> >> services: >> mon: 3 daemons, quorum ceph4-1,ceph4-2,ceph4-3 >> mgr: ceph4-1.odiso.net(active), standbys: ceph4-2.odiso.net, >> ceph4-3.odiso.net >> mds: cephfs4-1/1/1 up {0=ceph4-2.odiso.net=up:active}, 2 up:standby >> osd: 18 osds: 18 up, 18 in >> rgw: 3 daemons active >> >> data: >> pools: 11 pools, 1992 pgs >> objects: 75677k objects, 6045 GB >> usage: 20579 GB used, 6246 GB / 26825 GB avail >> pgs: 1992 active+clean >> >> io: >> client: 14441 kB/s rd, 2550 kB/s wr, 371 op/s rd, 95 op/s wr >> >> >> root@ceph4-2:~# ceph daemon mds.ceph4-2.odiso.net cache status >> { >> "pool": { >> "items": 44523608, >> "bytes": 5326049009 >> } >> } >> >> >> root@ceph4-2:~# ceph daemon mds.ceph4-2.odiso.net perf dump >> { >> "AsyncMessenger::Worker-0": { >> "msgr_recv_messages": 798876013, >> "msgr_send_messages": 825999506, >> "msgr_recv_bytes": 7003223097381, >> "msgr_send_bytes": 691501283744, >> "msgr_created_connections": 148, >> "msgr_active_connections": 146, >> "msgr_running_total_time": 39914.832387470, >> "msgr_running_send_time": 13744.704199430, >> "msgr_running_recv_time": 32342.160588451, >> "msgr_running_fast_dispatch_time": 5996.336446782 >> }, >> "AsyncMessenger::Worker-1": { >> "msgr_recv_messages": 429668771, >> 
"msgr_send_messages": 414760220, >> "msgr_recv_bytes": 5003149410825, >> "msgr_send_bytes": 396281427789, >> "msgr_created_connections": 132, >> "msgr_active_connections": 132, >> "msgr_running_total_time": 23644.410515392, >> "msgr_running_send_time": 7669.068710688, >> "msgr_running_recv_time": 19751.610043696, >> "msgr_running_fast_dispatch_time": 4331.023453385 >> }, >> "AsyncMessenger::Worker-2": { >> "msgr_recv_messages": 1312910919, >> "msgr_send_messages": 1260040403, >> "msgr_recv_bytes": 5330386980976, >> "msgr_send_bytes": 3341965016878, >> "msgr_created_connections": 143, >> "msgr_active_connections": 138, >> "msgr_running_total_time": 61696.635450100, >> "msgr_running_send_time": 23491.027014598, >> "msgr_running_
Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?
Thanks! here the profile.pdf 10-15min profiling, I can't do it longer because my clients where lagging. but I think it should be enough to observe the rss memory increase. - Mail original - De: "Zheng Yan" À: "aderumier" Cc: "ceph-users" Envoyé: Jeudi 24 Mai 2018 11:34:20 Objet: Re: [ceph-users] ceph mds memory usage 20GB : is it normal ? On Tue, May 22, 2018 at 3:11 PM, Alexandre DERUMIER wrote: > Hi,some new stats, mds memory is not 16G, > > I have almost same number of items and bytes in cache vs some weeks ago when > mds was using 8G. (ceph 12.2.5) > > > root@ceph4-2:~# while sleep 1; do ceph daemon mds.ceph4-2.odiso.net perf dump > | jq '.mds_mem.rss'; ceph daemon mds.ceph4-2.odiso.net dump_mempools | jq -c > '.mds_co'; done > 16905052 > {"items":43350988,"bytes":5257428143} > 16905052 > {"items":43428329,"bytes":5283850173} > 16905052 > {"items":43209167,"bytes":5208578149} > 16905052 > {"items":43177631,"bytes":5198833577} > 16905052 > {"items":43312734,"bytes":5252649462} > 16905052 > {"items":43355753,"bytes":5277197972} > 16905052 > {"items":43700693,"bytes":5303376141} > 16905052 > {"items":43115809,"bytes":5156628138} > ^C > > > > > root@ceph4-2:~# ceph status > cluster: > id: e22b8e83-3036-4fe5-8fd5-5ce9d539beca > health: HEALTH_OK > > services: > mon: 3 daemons, quorum ceph4-1,ceph4-2,ceph4-3 > mgr: ceph4-1.odiso.net(active), standbys: ceph4-2.odiso.net, > ceph4-3.odiso.net > mds: cephfs4-1/1/1 up {0=ceph4-2.odiso.net=up:active}, 2 up:standby > osd: 18 osds: 18 up, 18 in > rgw: 3 daemons active > > data: > pools: 11 pools, 1992 pgs > objects: 75677k objects, 6045 GB > usage: 20579 GB used, 6246 GB / 26825 GB avail > pgs: 1992 active+clean > > io: > client: 14441 kB/s rd, 2550 kB/s wr, 371 op/s rd, 95 op/s wr > > > root@ceph4-2:~# ceph daemon mds.ceph4-2.odiso.net cache status > { > "pool": { > "items": 44523608, > "bytes": 5326049009 > } > } > > > root@ceph4-2:~# ceph daemon mds.ceph4-2.odiso.net perf dump > { > "AsyncMessenger::Worker-0": { > 
"msgr_recv_messages": 798876013, > "msgr_send_messages": 825999506, > "msgr_recv_bytes": 7003223097381, > "msgr_send_bytes": 691501283744, > "msgr_created_connections": 148, > "msgr_active_connections": 146, > "msgr_running_total_time": 39914.832387470, > "msgr_running_send_time": 13744.704199430, > "msgr_running_recv_time": 32342.160588451, > "msgr_running_fast_dispatch_time": 5996.336446782 > }, > "AsyncMessenger::Worker-1": { > "msgr_recv_messages": 429668771, > "msgr_send_messages": 414760220, > "msgr_recv_bytes": 5003149410825, > "msgr_send_bytes": 396281427789, > "msgr_created_connections": 132, > "msgr_active_connections": 132, > "msgr_running_total_time": 23644.410515392, > "msgr_running_send_time": 7669.068710688, > "msgr_running_recv_time": 19751.610043696, > "msgr_running_fast_dispatch_time": 4331.023453385 > }, > "AsyncMessenger::Worker-2": { > "msgr_recv_messages": 1312910919, > "msgr_send_messages": 1260040403, > "msgr_recv_bytes": 5330386980976, > "msgr_send_bytes": 3341965016878, > "msgr_created_connections": 143, > "msgr_active_connections": 138, > "msgr_running_total_time": 61696.635450100, > "msgr_running_send_time": 23491.027014598, > "msgr_running_recv_time": 53858.409319734, > "msgr_running_fast_dispatch_time": 4312.451966809 > }, > "finisher-PurgeQueue": { > "queue_len": 0, > "complete_latency": { > "avgcount": 1889416, > "sum": 29224.227703697, > "avgtime": 0.015467333 > } > }, > "mds": { > "request": 1822420924, > "reply": 1822420886, > "reply_latency": { > "avgcount": 1822420886, > "sum": 5258467.616943274, > "avgtime": 0.002885429 > }, > "forward": 0, > "dir_fetch": 116035485, > "dir_commit": 1865012, > "dir_split": 17, > "dir_merge": 24, > "inode_max": 2147483647,
Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?
t;req_symlink": 2520, > "req_unlink": 1589288 > }, > "mds_sessions": { > "session_count": 321, > "session_add": 383, > "session_remove": 62 > }, > "objecter": { > "op_active": 0, > "op_laggy": 0, > "op_send": 197932443, > "op_send_bytes": 605992324653, > "op_resend": 22, > "op_reply": 197932421, > "op": 197932421, > "op_r": 116256030, > "op_w": 81676391, > "op_rmw": 0, > "op_pg": 0, > "osdop_stat": 1518341, > "osdop_create": 4314348, > "osdop_read": 79810, > "osdop_write": 59151421, > "osdop_writefull": 237358, > "osdop_writesame": 0, > "osdop_append": 0, > "osdop_zero": 2, > "osdop_truncate": 9, > "osdop_delete": 2320714, > "osdop_mapext": 0, > "osdop_sparse_read": 0, > "osdop_clonerange": 0, > "osdop_getxattr": 29426577, > "osdop_setxattr": 8628696, > "osdop_cmpxattr": 0, > "osdop_rmxattr": 0, > "osdop_resetxattrs": 0, > "osdop_tmap_up": 0, > "osdop_tmap_put": 0, > "osdop_tmap_get": 0, > "osdop_call": 0, > "osdop_watch": 0, > "osdop_notify": 0, > "osdop_src_cmpxattr": 0, > "osdop_pgls": 0, > "osdop_pgls_filter": 0, > "osdop_other": 13551599, > "linger_active": 0, > "linger_send": 0, > "linger_resend": 0, > "linger_ping": 0, > "poolop_active": 0, > "poolop_send": 0, > "poolop_resend": 0, > "poolstat_active": 0, > "poolstat_send": 0, > "poolstat_resend": 0, > "statfs_active": 0, > "statfs_send": 0, > "statfs_resend": 0, > "command_active": 0, > "command_send": 0, > "command_resend": 0, > "map_epoch": 3907, > "map_full": 0, > "map_inc": 601, > "osd_sessions": 18, > "osd_session_open": 20, > "osd_session_close": 2, > "osd_laggy": 0, > "omap_wr": 3595801, > "omap_rd": 232070972, > "omap_del": 272598 > }, > "purge_queue": { > "pq_executing_ops": 0, > "pq_executing": 0, > "pq_executed": 1659514 > }, > "throttle-msgr_dispatch_throttler-mds": { > "val": 0, > "max": 104857600, > "get_started": 0, > "get": 2541455703, > "get_sum": 17148691767160, > "get_or_fail_fail": 0, > "get_or_fail_success": 2541455703, > "take": 0, > "take_sum": 0, > "put": 2541455703, > "put_sum": 
17148691767160, > "wait": { > "avgcount": 0, > "sum": 0.0, > "avgtime": 0.0 > } > }, > "throttle-objecter_bytes": { > "val": 0, > "max": 104857600, > "get_started": 0, > "get": 0, > "get_sum": 0, > "get_or_fail_fail": 0, > "get_or_fail_success": 0, > "take": 197932421, > "take_sum": 606323353310, > "put": 182060027, > "put_sum": 606323353310, > "wait": { > "avgcount": 0, > "sum": 0.0, > "avgtime": 0.0 > } > }, > "throttle-objecter_ops": { > "val": 0, > "max": 1024, > "get_started": 0, > "get": 0, > "get_sum": 0, > "get_or_fail_fail": 0, > "get_or_fail_success": 0, > "take": 197932421, > "take_sum": 197932421, > "put": 197932421, > "put_sum": 197932421, > "wait": { > "avgcount": 0, > "sum": 0.0, > "avgtime": 0.0 > } > }, > "throttle-write_buf_throttle": { > "val": 0, > "max": 3758096384, > "get_started": 0, > "get": 1659514, > "get_sum": 154334946, > "get_or_fail_fail": 0, > "get_or_fail_success": 1659514, > "take": 0, > "take_sum": 0, > "put": 79728, > "put_sum": 154334946, > "wait": { > "avgcount": 0, > "sum": 0.0, > "avgtime": 0.0 > } > }, > "throttle-write_buf_throttle-0x55decea8e140": { > "val": 255839, > "max": 3758096384, > "get_started": 0, > "get": 357717092, > "get_sum": 596677113363, > "get_or_fail_fail": 0, > "get_or_fail_success": 357717092, > "take": 0, > "take_sum": 0, > "put": 59071693, > "put_sum": 596676857524, > "wait": { > "avgcount": 0, > "sum": 0.0, > "avgtime": 0.0 > } > } > } > > Maybe there is memory leak. What is output of 'ceph tell mds.xx heap stats'. If the RSS size keeps increasing, please run profile heap for a period of time. ceph tell mds.xx heap start_profiler "wait some time" ceph tell mds.xx heap dump ceph tell mds.xx heap stop_profiler pprof --pdf /var/log/ceph/mds.xxx.profile.* > profile.pdf send profile.pdf to us Regards Yan, Zheng > > - Mail original - > De: "Webert de Souza Lima" > À: "ceph-users" > Envoyé: Lundi 14 Mai 2018 15:14:35 > Objet: Re: [ceph-users] ceph mds memory usage 20GB : is it normal ? 
> > On Sat, May 12, 2018 at 3:11 AM Alexandre DERUMIER < [ > mailto:aderum...@odiso.com | aderum...@odiso.com ] > wrote: > > > The documentation (luminous) say: > > > > > BQ_BEGIN >>mds cache size >> >>Description: The number of inodes to cache. A value of 0 indicates an >>unlimited number. It is recommended to use mds_cache_memory_limit to limit >>the amount of memory the MDS cache uses. >>Type: 32-bit Integer >>Default: 0 >> > BQ_END > > BQ_BEGIN > and, my mds_cache_memory_limit is currently at 5GB. > BQ_END > > yeah I have only suggested that because the high memory usage seemed to > trouble you and it might be a bug, so it's more of a workaround. > > Regards, > Webert Lima > DevOps Engineer at MAV Tecnologia > Belo Horizonte - Brasil > IRC NICK - WebertRLZ > > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?
On Sat, May 12, 2018 at 3:11 AM Alexandre DERUMIER wrote: > The documentation (luminous) says: > > >mds cache size > > > >Description: The number of inodes to cache. A value of 0 indicates an > unlimited number. It is recommended to use mds_cache_memory_limit to limit > the amount of memory the MDS cache uses. > >Type: 32-bit Integer > >Default: 0 > > and, my mds_cache_memory_limit is currently at 5GB. yeah I have only suggested that because the high memory usage seemed to trouble you and it might be a bug, so it's more of a workaround. Regards, Webert Lima DevOps Engineer at MAV Tecnologia *Belo Horizonte - Brasil* *IRC NICK - WebertRLZ* ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?
my cache is correctly capped at 5G currently here some stats: (mds has been restarted yesterday, using around 8,8gb, and cache capped at 5G). I'll try to sent some stats in 1 or 2 week, when the memory should be at 20g # while sleep 1; do ceph daemon mds.ceph4-2.odiso.net perf dump | jq '.mds_mem.rss'; ceph daemon mds.ceph4-2.odiso.net dump_mempools | jq -c '.mds_co'; done 8821728 {"items":44512173,"bytes":5346723108} 8821728 {"items":44647862,"bytes":5356139145} 8821728 {"items":43644205,"bytes":5129276043} 8821728 {"items":44134481,"bytes":5260485627} 8821728 {"items":44418491,"bytes":5338308734} 8821728 {"items":45091444,"bytes":5404019118} 8821728 {"items":44714180,"bytes":5322182878} 8821728 {"items":43853828,"bytes":5221597919} 8821728 {"items":44518074,"bytes":5323670444} 8821728 {"items":44679829,"bytes":5367219523} 8821728 {"items":44809929,"bytes":5382383166} 8821728 {"items":43441538,"bytes":5180408997} 8821728 {"items":44239001,"bytes":5349655543} 8821728 {"items":44558135,"bytes":5414566237} 8821728 {"items":44664773,"bytes":5433279976} 8821728 {"items":43433859,"bytes":5148008705} 8821728 {"items":43683053,"bytes":5236668693} 8821728 {"items":44248833,"bytes":5310420155} 8821728 {"items":45013698,"bytes":5381693077} 8821728 {"items":44928825,"bytes":5313048602} 8821728 {"items":43828630,"bytes":5146482155} 8821728 {"items":44005515,"bytes":5167930294} 8821728 {"items":44412223,"bytes":5182643376} 8821728 {"items":44842966,"bytes":5198073066} - Mail original - De: "aderumier" À: "Webert de Souza Lima" Cc: "ceph-users" Envoyé: Samedi 12 Mai 2018 08:11:04 Objet: Re: [ceph-users] ceph mds memory usage 20GB : is it normal ? Hi >>You could use "mds_cache_size" to limit number of CAPS untill you have this >>fixed, but I'd say for your number of caps and inodes, 20GB is normal. The documentation (luminous) say: " mds cache size Description: The number of inodes to cache. A value of 0 indicates an unlimited number. 
It is recommended to use mds_cache_memory_limit to limit the amount of memory the MDS cache uses. Type: 32-bit Integer Default: 0 " and, my mds_cache_memory_limit is currently at 5GB. - Mail original - De: "Webert de Souza Lima" À: "ceph-users" Envoyé: Vendredi 11 Mai 2018 20:18:27 Objet: Re: [ceph-users] ceph mds memory usage 20GB : is it normal ? You could use "mds_cache_size" to limit number of CAPS untill you have this fixed, but I'd say for your number of caps and inodes, 20GB is normal. this mds (jewel) here is consuming 24GB RAM: { "mds": { "request": 7194867047, "reply": 7194866688, "reply_latency": { "avgcount": 7194866688, "sum": 27779142.611775008 }, "forward": 0, "dir_fetch": 179223482, "dir_commit": 1529387896, "dir_split": 0, "inode_max": 300, "inodes": 3001264, "inodes_top": 160517, "inodes_bottom": 226577, "inodes_pin_tail": 2614170, "inodes_pinned": 2770689, "inodes_expired": 2920014835, "inodes_with_caps": 2743194, "caps": 2803568, "subtrees": 2, "traverse": 8255083028, "traverse_hit": 7452972311, "traverse_forward": 0, "traverse_discover": 0, "traverse_dir_fetch": 180547123, "traverse_remote_ino": 122257, "traverse_lock": 5957156, "load_cent": 18446743934203149911, "q": 54, "exported": 0, "exported_inodes": 0, "imported": 0, "imported_inodes": 0 } } Regards, Webert Lima DevOps Engineer at MAV Tecnologia Belo Horizonte - Brasil IRC NICK - WebertRLZ On Fri, May 11, 2018 at 3:13 PM Alexandre DERUMIER < [ mailto:aderum...@odiso.com | aderum...@odiso.com ] > wrote: Hi, I'm still seeing memory leak with 12.2.5. seem to leak some MB each 5 minutes. I'll try to resent some stats next weekend. - Mail original - De: "Patrick Donnelly" < [ mailto:pdonn...@redhat.com | pdonn...@redhat.com ] > À: "Brady Deetz" < [ mailto:bde...@gmail.com | bde...@gmail.com ] > Cc: "Alexandre Derumier" < [ mailto:aderum...@odiso.com | aderum...@odiso.com ] >, "ceph-users"
Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?
Hi >>You could use "mds_cache_size" to limit number of CAPS untill you have this >>fixed, but I'd say for your number of caps and inodes, 20GB is normal. The documentation (luminous) say: " mds cache size Description:The number of inodes to cache. A value of 0 indicates an unlimited number. It is recommended to use mds_cache_memory_limit to limit the amount of memory the MDS cache uses. Type: 32-bit Integer Default:0 " and, my mds_cache_memory_limit is currently at 5GB. - Mail original - De: "Webert de Souza Lima" À: "ceph-users" Envoyé: Vendredi 11 Mai 2018 20:18:27 Objet: Re: [ceph-users] ceph mds memory usage 20GB : is it normal ? You could use "mds_cache_size" to limit number of CAPS untill you have this fixed, but I'd say for your number of caps and inodes, 20GB is normal. this mds (jewel) here is consuming 24GB RAM: { "mds": { "request": 7194867047, "reply": 7194866688, "reply_latency": { "avgcount": 7194866688, "sum": 27779142.611775008 }, "forward": 0, "dir_fetch": 179223482, "dir_commit": 1529387896, "dir_split": 0, "inode_max": 300, "inodes": 3001264, "inodes_top": 160517, "inodes_bottom": 226577, "inodes_pin_tail": 2614170, "inodes_pinned": 2770689, "inodes_expired": 2920014835, "inodes_with_caps": 2743194, "caps": 2803568, "subtrees": 2, "traverse": 8255083028, "traverse_hit": 7452972311, "traverse_forward": 0, "traverse_discover": 0, "traverse_dir_fetch": 180547123, "traverse_remote_ino": 122257, "traverse_lock": 5957156, "load_cent": 18446743934203149911, "q": 54, "exported": 0, "exported_inodes": 0, "imported": 0, "imported_inodes": 0 } } Regards, Webert Lima DevOps Engineer at MAV Tecnologia Belo Horizonte - Brasil IRC NICK - WebertRLZ On Fri, May 11, 2018 at 3:13 PM Alexandre DERUMIER < [ mailto:aderum...@odiso.com | aderum...@odiso.com ] > wrote: Hi, I'm still seeing memory leak with 12.2.5. seem to leak some MB each 5 minutes. I'll try to resent some stats next weekend. 
----- Mail original ----- De: "Patrick Donnelly" < [ mailto:pdonn...@redhat.com | pdonn...@redhat.com ] > À: "Brady Deetz" < [ mailto:bde...@gmail.com | bde...@gmail.com ] > Cc: "Alexandre Derumier" < [ mailto:aderum...@odiso.com | aderum...@odiso.com ] >, "ceph-users" < [ mailto:ceph-users@lists.ceph.com | ceph-users@lists.ceph.com ] > Envoyé: Jeudi 10 Mai 2018 21:11:19 Objet: Re: [ceph-users] ceph mds memory usage 20GB : is it normal ? On Thu, May 10, 2018 at 12:00 PM, Brady Deetz < [ mailto:bde...@gmail.com | bde...@gmail.com ] > wrote: > [ceph-admin@mds0 ~]$ ps aux | grep ceph-mds > ceph 1841 3.5 94.3 133703308 124425384 ? Ssl Apr04 1808:32 > /usr/bin/ceph-mds -f --cluster ceph --id mds0 --setuser ceph --setgroup ceph > > > [ceph-admin@mds0 ~]$ sudo ceph daemon mds.mds0 cache status > { > "pool": { > "items": 173261056, > "bytes": 76504108600 > } > } > > So, 80GB is my configured limit for the cache and it appears the mds is > following that limit. But, the mds process is using over 100GB RAM in my > 128GB host. I thought I was playing it safe by configuring at 80. What other > things consume a lot of RAM for this process? > > Let me know if I need to create a new thread. The cache size measurement is imprecise pre-12.2.5 [1]. You should upgrade ASAP. [1] [ https://tracker.ceph.com/issues/22972 | https://tracker.ceph.com/issues/22972 ] -- Patrick Donnelly ___ ceph-users mailing list [ mailto:ceph-users@lists.ceph.com | ceph-users@lists.ceph.com ] [ http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com | http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ] ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
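The two cache knobs discussed in this thread have different units, which is easy to trip over: mds_cache_size counts inodes, while mds_cache_memory_limit is in bytes. The 5368709120 quoted above is exactly 5 GiB:

```python
# mds_cache_memory_limit is specified in bytes; mds_cache_size counts inodes.
# The limit quoted in this thread is exactly 5 GiB.
limit_bytes = 5368709120
assert limit_bytes == 5 * 2**30
print(f"{limit_bytes / 2**30:.0f} GiB")
```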
Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?
You could use "mds_cache_size" to limit number of CAPS untill you have this fixed, but I'd say for your number of caps and inodes, 20GB is normal. this mds (jewel) here is consuming 24GB RAM: { "mds": { "request": 7194867047, "reply": 7194866688, "reply_latency": { "avgcount": 7194866688, "sum": 27779142.611775008 }, "forward": 0, "dir_fetch": 179223482, "dir_commit": 1529387896, "dir_split": 0, "inode_max": 300, "inodes": 3001264, "inodes_top": 160517, "inodes_bottom": 226577, "inodes_pin_tail": 2614170, "inodes_pinned": 2770689, "inodes_expired": 2920014835, "inodes_with_caps": 2743194, "caps": 2803568, "subtrees": 2, "traverse": 8255083028, "traverse_hit": 7452972311, "traverse_forward": 0, "traverse_discover": 0, "traverse_dir_fetch": 180547123, "traverse_remote_ino": 122257, "traverse_lock": 5957156, "load_cent": 18446743934203149911, "q": 54, "exported": 0, "exported_inodes": 0, "imported": 0, "imported_inodes": 0 } } Regards, Webert Lima DevOps Engineer at MAV Tecnologia *Belo Horizonte - Brasil* *IRC NICK - WebertRLZ* On Fri, May 11, 2018 at 3:13 PM Alexandre DERUMIER wrote: > Hi, > > I'm still seeing memory leak with 12.2.5. > > seem to leak some MB each 5 minutes. > > I'll try to resent some stats next weekend. > > > - Mail original - > De: "Patrick Donnelly" > À: "Brady Deetz" > Cc: "Alexandre Derumier" , "ceph-users" < > ceph-users@lists.ceph.com> > Envoyé: Jeudi 10 Mai 2018 21:11:19 > Objet: Re: [ceph-users] ceph mds memory usage 20GB : is it normal ? > > On Thu, May 10, 2018 at 12:00 PM, Brady Deetz wrote: > > [ceph-admin@mds0 ~]$ ps aux | grep ceph-mds > > ceph 1841 3.5 94.3 133703308 124425384 ? Ssl Apr04 1808:32 > > /usr/bin/ceph-mds -f --cluster ceph --id mds0 --setuser ceph --setgroup > ceph > > > > > > [ceph-admin@mds0 ~]$ sudo ceph daemon mds.mds0 cache status > > { > > "pool": { > > "items": 173261056, > > "bytes": 76504108600 > > } > > } > > > > So, 80GB is my configured limit for the cache and it appears the mds is > > following that limit. 
But, the mds process is using over 100GB RAM in my > > 128GB host. I thought I was playing it safe by configuring at 80. What > other > > things consume a lot of RAM for this process? > > > > Let me know if I need to create a new thread. > > The cache size measurement is imprecise pre-12.2.5 [1]. You should upgrade > ASAP. > > [1] https://tracker.ceph.com/issues/22972 > > -- > Patrick Donnelly > > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?
Hi, I'm still seeing a memory leak with 12.2.5. It seems to leak a few MB every 5 minutes. I'll try to resend some stats next weekend. - Mail original - De: "Patrick Donnelly" À: "Brady Deetz" Cc: "Alexandre Derumier" , "ceph-users" Envoyé: Jeudi 10 Mai 2018 21:11:19 Objet: Re: [ceph-users] ceph mds memory usage 20GB : is it normal ? On Thu, May 10, 2018 at 12:00 PM, Brady Deetz wrote: > [ceph-admin@mds0 ~]$ ps aux | grep ceph-mds > ceph 1841 3.5 94.3 133703308 124425384 ? Ssl Apr04 1808:32 > /usr/bin/ceph-mds -f --cluster ceph --id mds0 --setuser ceph --setgroup ceph > > > [ceph-admin@mds0 ~]$ sudo ceph daemon mds.mds0 cache status > { > "pool": { > "items": 173261056, > "bytes": 76504108600 > } > } > > So, 80GB is my configured limit for the cache and it appears the mds is > following that limit. But, the mds process is using over 100GB RAM in my > 128GB host. I thought I was playing it safe by configuring at 80. What other > things consume a lot of RAM for this process? > > Let me know if I need to create a new thread. The cache size measurement is imprecise pre-12.2.5 [1]. You should upgrade ASAP. [1] https://tracker.ceph.com/issues/22972 -- Patrick Donnelly ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?
On Thu, May 10, 2018 at 12:00 PM, Brady Deetz wrote: > [ceph-admin@mds0 ~]$ ps aux | grep ceph-mds > ceph 1841 3.5 94.3 133703308 124425384 ? Ssl Apr04 1808:32 > /usr/bin/ceph-mds -f --cluster ceph --id mds0 --setuser ceph --setgroup ceph > > > [ceph-admin@mds0 ~]$ sudo ceph daemon mds.mds0 cache status > { > "pool": { > "items": 173261056, > "bytes": 76504108600 > } > } > > So, 80GB is my configured limit for the cache and it appears the mds is > following that limit. But, the mds process is using over 100GB RAM in my > 128GB host. I thought I was playing it safe by configuring at 80. What other > things consume a lot of RAM for this process? > > Let me know if I need to create a new thread. The cache size measurement is imprecise pre-12.2.5 [1]. You should upgrade ASAP. [1] https://tracker.ceph.com/issues/22972 -- Patrick Donnelly ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?
Hello Brady,

On Thu, May 10, 2018 at 7:35 AM, Brady Deetz wrote:
> I am now seeing the exact same issues you are reporting. A heap release did nothing for me.

I'm not sure it's the same issue...

> [root@mds0 ~]# ceph daemon mds.mds0 config get mds_cache_memory_limit
> {
>     "mds_cache_memory_limit": "80530636800"
> }

80G right? What was the memory use from `ps aux | grep ceph-mds`?

> [root@mds0 ~]# ceph daemon mds.mds0 perf dump
> {
> ...
>     "inode_max": 2147483647,
>     "inodes": 35853368,
>     "inodes_top": 23669670,
>     "inodes_bottom": 12165298,
>     "inodes_pin_tail": 18400,
>     "inodes_pinned": 2039553,
>     "inodes_expired": 142389542,
>     "inodes_with_caps": 831824,
>     "caps": 881384,

Your cap count is 2% of the inodes in cache; the pinned inodes are 5% of the total. Your cache should be getting trimmed, assuming the cache size (as measured by the MDS; there are fixes in 12.2.5 which improve its precision) is larger than your configured limit.

If the cache size is larger than the limit (use the `cache status` admin socket command), then we'd be interested in seeing a few seconds of the MDS debug log with higher debugging set (`config set debug_mds 20`).

--
Patrick Donnelly
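Patrick's percentages can be reproduced directly from the perf-dump fields he quotes; a quick sketch (plain arithmetic, no Ceph API involved):

```python
# Ratios behind the observation above: caps vs inodes in cache,
# and pinned inodes vs total, from the quoted perf-dump values.

inodes = 35_853_368
caps = 881_384
inodes_pinned = 2_039_553

cap_ratio = caps / inodes               # roughly "2%" of cached inodes have caps
pinned_ratio = inodes_pinned / inodes   # roughly "5%" of cached inodes are pinned

print(f"caps: {cap_ratio:.1%}, pinned: {pinned_ratio:.1%}")
```

Both ratios being small is what suggests the cache *should* be trimmable, pointing the investigation at the accounting rather than at pinned or capped inodes.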
ed": 0, > "num_recovering_processing": 0, > "num_recovering_enqueued": 0, > "num_recovering_prioritized": 0, > "recovery_started": 1, > "recovery_completed": 1, > "ireq_enqueue_scrub": 0, > "ireq_exportdir": 0, > "ireq_flush": 0, > "ireq_fragmentdir": 19058, > "ireq_fragstats": 0, > "ireq_inodestats": 0 > }, > "mds_log": { > "evadd": 108025412, > "evex": 108027485, > "evtrm": 108026461, > "ev": 25484, > "evexg": 0, > "evexd": 1024, > "segadd": 131605, > "segex": 131609, > "segtrm": 131608, > "seg": 31, > "segexg": 0, > "segexd": 1, > "expos": 5222483101644, > "wrpos": 5222526671740, > "rdpos": 5036811490502, > "jlat": { > "avgcount": 19597987, > "sum": 41720.071108694, > "avgtime": 0.002128793 > }, > "replayed": 26533 > }, > "mds_mem": { > "ino": 2087350, > "ino+": 5533126211, > "ino-": 5531038861, > "dir": 321262, > "dir+": 5672027, > "dir-": 5350765, > "dn": 2087920, > "dn+": 5553775487, > "dn-": 5551687567, > "cap": 3170853, > "cap+": 646307641, > "cap-": 643136788, > "rss": 12286508, > "heap": 313916, > "buf": 0 > }, > "mds_server": { > "dispatch_client_request": 651833084, > "dispatch_server_request": 0, > "handle_client_request": 608911096, > "handle_client_session": 5163844, > "handle_slave_request": 0, > "req_create": 754987, > "req_getattr": 5199299, > "req_getfilelock": 0, > "req_link": 170, > "req_lookup": 476304151, > "req_lookuphash": 0, > "req_lookupino": 0, > "req_lookupname": 16868, > "req_lookupparent": 0, > "req_lookupsnap": 0, > "req_lssnap": 0, > "req_mkdir": 12204, > "req_mknod": 0, > "req_mksnap": 0, > "req_open": 106156167, > "req_readdir": 20293077, > "req_rename": 28443, > "req_renamesnap": 0, > "req_rmdir": 17522, > "req_rmsnap": 0, > "req_rmxattr": 0, > "req_setattr": 34735, > "req_setdirlayout": 0, > "req_setfilelock": 238574, > "req_setlayout": 0, > "req_setxattr": 2, > "req_symlink": 122, > "req_unlink": 609565 > }, > "mds_sessions": { > "session_count": 307, > "session_add": 398, > "session_remove": 91 > }, > "objecter": { > 
"op_active": 0, > "op_laggy": 0, > "op_send": 60152761, > "op_send_bytes": 189780235877, > "op_resend": 4, > "op_reply": 60152757, > "op": 60152757, > "op_r": 32760612, > "op_w": 27392145, > "op_rmw": 0, > "op_pg": 0, > "osdop_stat": 1131412, > "osdop_create": 791110, > "osdop_read": 27868, > "osdop_write": 19625820, > "osdop_writefull": 81003, > "osdop_writesame": 0, > "osdop_append": 0, > "osdop_zero": 2, > "osdop_truncate": 4161, > "osdop_delete": 931372, > "osdop_mapext": 0, > "osdop_sparse_read": 0, > "osdop_clonerange": 0, > "osdop_getxattr": 9914736, > "osdop_setxattr": 1582220, >
quot;: 0, "poolop_active": 0, "poolop_send": 0, "poolop_resend": 0, "poolstat_active": 0, "poolstat_send": 0, "poolstat_resend": 0, "statfs_active": 0, "statfs_send": 0, "statfs_resend": 0, "command_active": 0, "command_send": 0, "command_resend": 0, "map_epoch": 3121, "map_full": 0, "map_inc": 76, "osd_sessions": 18, "osd_session_open": 20, "osd_session_close": 2, "osd_laggy": 0, "omap_wr": 2227270, "omap_rd": 65197068, "omap_del": 48058 }, "purge_queue": { "pq_executing_ops": 0, "pq_executing": 0, "pq_executed": 619458 }, "throttle-msgr_dispatch_throttler-mds": { "val": 0, "max": 104857600, "get_started": 0, "get": 831356927, "get_sum": 4299208168815, "get_or_fail_fail": 0, "get_or_fail_success": 831356927, "take": 0, "take_sum": 0, "put": 831356927, "put_sum": 4299208168815, "wait": { "avgcount": 0, "sum": 0.0, "avgtime": 0.0 } }, "throttle-objecter_bytes": { "val": 0, "max": 104857600, "get_started": 0, "get": 0, "get_sum": 0, "get_or_fail_fail": 0, "get_or_fail_success": 0, "take": 60152757, "take_sum": 189890861007, "put": 54571445, "put_sum": 189890861007, "wait": { "avgcount": 0, "sum": 0.0, "avgtime": 0.0 } }, "throttle-objecter_ops": { "val": 0, "max": 1024, "get_started": 0, "get": 0, "get_sum": 0, "get_or_fail_fail": 0, "get_or_fail_success": 0, "take": 60152757, "take_sum": 60152757, "put": 60152757, "put_sum": 60152757, "wait": { "avgcount": 0, "sum": 0.0, "avgtime": 0.0 } }, "throttle-write_buf_throttle": { "val": 0, "max": 3758096384, "get_started": 0, "get": 619458, "get_sum": 57609986, "get_or_fail_fail": 0, "get_or_fail_success": 619458, "take": 0, "take_sum": 0, "put": 27833, "put_sum": 57609986, "wait": { "avgcount": 0, "sum": 0.0, "avgtime": 0.0 } }, "throttle-write_buf_throttle-0x559471d00140": { "val": 105525, "max": 3758096384, "get_started": 0, "get": 108025412, "get_sum": 185715179864, "get_or_fail_fail": 0, "get_or_fail_success": 108025412, "take": 0, "take_sum": 0, "put": 19597987, "put_sum": 185715074339, "wait": { "avgcount": 0, 
"sum": 0.0, "avgtime": 0.0 } } } - Mail original - De: "Zheng Yan" À: "aderumier" Cc: "Patrick Donnelly" , "ceph-users" Envoyé: Mardi 17 Avril 2018 05:20:18 Objet: Re: [ceph-users] ceph mds memory usage 20GB : is it normal ? On Sat, Apr 14, 2018 at 9:23 PM, Alexandre DERUMIER wrote: > Hi, > > Still leaking again after update to 12.2.4, around 17G after 9 days > > > > > USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND > > ceph 629903 50.7 25.9 17473680 17082432 ? Ssl avril05 6498:21 > /usr/bin/ceph-mds -f --cluster ceph --id ceph4-1.odiso.net --setuser ceph > --setgroup ceph > > > > > > ~# ceph daemon mds.ceph4-1.odiso.net cache status > { > "pool": { > "items": 16019302, > "bytes": 5100941968 > } > } > > > &
quot;: 27007173, > "osdop_setxattr": 38801594, > "osdop_cmpxattr": 0, > "osdop_rmxattr": 0, > "osdop_resetxattrs": 0, > "osdop_tmap_up": 0, > "osdop_tmap_put": 0, > "osdop_tmap_get": 0, > "osdop_call": 0, > "osdop_watch": 0, > "osdop_notify": 0, > "osdop_src_cmpxattr": 0, > "osdop_pgls": 0, > "osdop_pgls_filter": 0, > "osdop_other": 10143158, > "linger_active": 0, > "linger_send": 0, > "linger_resend": 0, > "linger_ping": 0, > "poolop_active": 0, > "poolop_send": 0, > "poolop_resend": 0, > "poolstat_active": 0, > "poolstat_send": 0, > "poolstat_resend": 0, > "statfs_active": 0, > "statfs_send": 0, > "statfs_resend": 0, > "command_active": 0, > "command_send": 0, > "command_resend": 0, > "map_epoch": 3044, > "map_full": 0, > "map_inc": 160, > "osd_sessions": 18, > "osd_session_open": 20, > "osd_session_close": 2, > "osd_laggy": 0, > "omap_wr": 9743114, > "omap_rd": 191911089, > "omap_del": 684272 > }, > "purge_queue": { > "pq_executing_ops": 0, > "pq_executing": 0, > "pq_executed": 2316671 > }, > "throttle-msgr_dispatch_throttler-mds": { > "val": 0, > "max": 104857600, > "get_started": 0, > "get": 1884071270, > "get_sum": 12697353890803, > "get_or_fail_fail": 0, > "get_or_fail_success": 1884071270, > "take": 0, > "take_sum": 0, > "put": 1884071270, > "put_sum": 12697353890803, > "wait": { > "avgcount": 0, > "sum": 0.0, > "avgtime": 0.0 > } > }, > "throttle-objecter_bytes": { > "val": 0, > "max": 104857600, > "get_started": 0, > "get": 0, > "get_sum": 0, > "get_or_fail_fail": 0, > "get_or_fail_success": 0, > "take": 197270390, > "take_sum": 796529593788, > "put": 183928495, > "put_sum": 796529593788, > "wait": { > "avgcount": 0, > "sum": 0.0, > "avgtime": 0.0 > } > }, > "throttle-objecter_ops": { > "val": 0, > "max": 1024, > "get_started": 0, > "get": 0, > "get_sum": 0, > "get_or_fail_fail": 0, > "get_or_fail_success": 0, > "take": 197270390, > "take_sum": 197270390, > "put": 197270390, > "put_sum": 197270390, > "wait": { > "avgcount": 0, > "sum": 0.0, > "avgtime": 
0.0 > } > }, > "throttle-write_buf_throttle": { > "val": 0, > "max": 3758096384, > "get_started": 0, > "get": 2316671, > "get_sum": 215451035, > "get_or_fail_fail": 0, > "get_or_fail_success": 2316671, > "take": 0, > "take_sum": 0, > "put": 31223, > "put_sum": 215451035, > "wait": { > "avgcount": 0, > "sum": 0.0, > "avgtime": 0.0 > } > }, > "throttle-write_buf_throttle-0x563c33bea220": { > "val": 29763, > "max": 3758096384, > "get_started": 0, > "get": 293928039, > "get_sum": 765120443785, > "get_or_fail_fail": 0, > "get_or_fail_success": 293928039, > "take": 0, > "take_sum": 0, > "put": 62629276, > "put_sum": 765120414022, > "wait": { > "avgcount": 0, > "sum": 0.0, > "avgtime": 0.0 > } > } > } > I don't find any clue. Next time it happens, could you please try "ceph tell mds.xxx heap release" > > > # ceph status > cluster: > id: e22b8e83-3036-4fe5-8fd5-5ce9d539beca > health: HEALTH_OK > > services: > mon: 3 daemons, quorum ceph4-1,ceph4-2,ceph4-3 > mgr: ceph4-2.odiso.net(active), standbys: ceph4-3.odiso.net, > ceph4-1.odiso.net > mds: cephfs4-1/1/1 up {0=ceph4-1.odiso.net=up:active}, 2 up:standby > osd: 18 osds: 18 up, 18 in > > data: > pools: 11 pools, 1992 pgs > objects: 72258k objects, 5918 GB > usage: 20088 GB used, 6737 GB / 26825 GB avail > pgs: 1992 active+clean > > io: > client: 3099 kB/s rd, 6412 kB/s wr, 108 op/s rd, 481 op/s wr > > > - Mail original - > De: "Patrick Donnelly" > À: "aderumier" > Cc: "ceph-users" > Envoyé: Mardi 27 Mars 2018 20:35:08 > Objet: Re: [ceph-users] ceph mds memory usage 20GB : is it normal ? > > Hello Alexandre, > > On Thu, Mar 22, 2018 at 2:29 AM, Alexandre DERUMIER > wrote: >> Hi, >> >> I'm running cephfs since 2 months now, >> >> and my active msd memory usage is around 20G now (still growing). >> >> ceph 1521539 10.8 31.2 20929836 20534868 ? 
Ssl janv.26 8573:34 >> /usr/bin/ceph-mds -f --cluster ceph --id 2 --setuser ceph --setgroup ceph >> USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND >> >> >> this is on luminous 12.2.2 >> >> only tuning done is: >> >> mds_cache_memory_limit = 5368709120 >> >> >> (5GB). I known it's a soft limit, but 20G seem quite huge vs 5GB >> >> >> Is it normal ? > > No, that's definitely not normal! > > >> # ceph daemon mds.2 perf dump mds >> { >> "mds": { >> "request": 1444009197, >> "reply": 1443999870, >> "reply_latency": { >> "avgcount": 1443999870, >> "sum": 1657849.656122933, >> "avgtime": 0.001148095 >> }, >> "forward": 0, >> "dir_fetch": 51740910, >> "dir_commit": 9069568, >> "dir_split": 64367, >> "dir_merge": 58016, >> "inode_max": 2147483647, >> "inodes": 2042975, >> "inodes_top": 152783, >> "inodes_bottom": 138781, >> "inodes_pin_tail": 1751411, >> "inodes_pinned": 1824714, >> "inodes_expired": 7258145573, >> "inodes_with_caps": 1812018, >> "caps": 2538233, >> "subtrees": 2, >> "traverse": 1591668547, >> "traverse_hit": 1259482170, >> "traverse_forward": 0, >> "traverse_discover": 0, >> "traverse_dir_fetch": 30827836, >> "traverse_remote_ino": 7510, >> "traverse_lock": 86236, >> "load_cent": 144401980319, >> "q": 49, >> "exported": 0, >> "exported_inodes": 0, >> "imported": 0, >> "imported_inodes": 0 >> } >> } > > Can you also share `ceph daemon mds.2 cache status`, the full `ceph > daemon mds.2 perf dump`, and `ceph status`? > > Note [1] will be in 12.2.5 and may help with your issue. > > [1] https://github.com/ceph/ceph/pull/20527 > > -- > Patrick Donnelly > > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
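Putting the numbers from this exchange side by side shows why Zheng suspects a leak outside the cache accounting. A small Python sketch using the values quoted above (RSS from `ps`, cache pool from `cache status`), assuming the 5 GiB `mds_cache_memory_limit` from the original report still applied:

```python
# April report from this thread: the cache pool stays under the 5 GiB soft
# limit, yet process RSS is ~17 GiB, so most memory is unaccounted for.

limit_bytes = 5_368_709_120      # mds_cache_memory_limit (5 GiB)
cache_items = 16_019_302         # "items" from `cache status`
cache_bytes = 5_100_941_968      # "bytes" from `cache status`
rss_bytes = 17_082_432 * 1024    # ps reports RSS in KiB

bytes_per_item = cache_bytes / cache_items   # avg accounted size per cached object
unaccounted = rss_bytes - cache_bytes        # memory the cache accounting cannot see

print(cache_bytes <= limit_bytes)            # True: the cache itself respects the soft limit
print(f"{bytes_per_item:.0f} B/item, {unaccounted / 2**30:.1f} GiB unaccounted")
```

With the cache behaving, the ~12 GiB gap between RSS and the cache pool is what ultimately got traced to leaked messages (the MMDSBeacon fix mentioned at the top of the thread).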
>>Can you also share `ceph daemon mds.2 cache status`, the full `ceph daemon mds.2 perf dump`, and `ceph status`?

Sorry, too late: I needed to restart the mds daemon because I was out of memory :(

It seems stable for now (around 500mb). Not sure it was related, but I had a ganesha-nfs -> cephfs daemon running on this cluster (with no client connected to it).

>>Note [1] will be in 12.2.5 and may help with your issue.
>>[1] https://github.com/ceph/ceph/pull/20527

ok thanks !
Hi,

>>Did the fs have lots of mount/umount?

Not too much. I have around 300 ceph-fuse clients (12.2.2 && 12.2.4), and the ceph cluster is 12.2.2. Maybe when a client reboots, but that doesn't happen too often.

>>We recently found a memory leak bug in that area https://github.com/ceph/ceph/pull/20148

Ok, thanks. Do sessions occur only at mount/umount?

I have another cluster, with 64 fuse clients, where mds memory is around 500mb (with the default mds_cache_memory_limit, no tuning, and the ceph cluster on 12.2.4 instead of 12.2.2). Clients are also ceph-fuse 12.2.2 && 12.2.4.

I'll try to upgrade this buggy mds to 12.2.4 to see if it helps.

- Mail original -
De: "Zheng Yan"
À: "aderumier"
Cc: "ceph-users"
Envoyé: Vendredi 23 Mars 2018 01:08:46
Objet: Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?

Did the fs have lots of mount/umount? We recently found a memory leak bug in that area https://github.com/ceph/ceph/pull/20148

Regards
Yan, Zheng
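To check whether an upgrade actually stops the growth, one could periodically record the MDS RSS over days, the way the numbers in this thread were gathered. A minimal, generic helper (not part of Ceph; the sample text below is made up) that parses `VmRSS` from `/proc/<pid>/status`:

```python
# Parse the VmRSS line from a /proc/<pid>/status snapshot. Logging this value
# for the ceph-mds pid once an hour makes a slow leak easy to spot.

def vmrss_kib(status_text: str) -> int:
    """Extract VmRSS (in KiB) from /proc/<pid>/status content."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])  # field value is reported in kB
    raise ValueError("VmRSS not found")

# Hypothetical snapshot, using the RSS reported in this thread:
sample = "Name:\tceph-mds\nVmRSS:\t20534868 kB\nThreads:\t53\n"
print(vmrss_kib(sample))  # 20534868
```

In practice one would read the real file with `open(f"/proc/{pid}/status")` on the MDS host; the parsing is the same.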