Wow, thanks Nathan and NeilBrown. It is great to learn about SLUB merging, and it is awesome to have a reproducer. I have yet to trigger my original problem with slub_nomerge, but the slabinfo tool (in the kernel sources) can actually show merged caches:

# kernel/3.10.0-693.5.2.el7/tools/slabinfo -a
:t-0000112 <- sysfs_dir_cache kernfs_node_cache blkdev_integrity task_delay_info
:t-0000144 <- flow_cache cl_env_kmem
:t-0000160 <- sigqueue lov_object_kmem
:t-0000168 <- lovsub_object_kmem osc_extent_kmem
:t-0000176 <- vvp_object_kmem nfsd4_stateids
:t-0000192 <- ldlm_resources kiocb cred_jar inet_peer_cache key_jar file_lock_cache kmalloc-192 dmaengine-unmap-16 bio_integrity_payload
:t-0000216 <- vvp_session_kmem vm_area_struct
:t-0000256 <- biovec-16 ip_dst_cache bio-0 ll_file_data kmalloc-256 sgpool-8 filp request_sock_TCP rpc_tasks request_sock_TCPv6 skbuff_head_cache pool_workqueue lov_thread_kmem
:t-0000264 <- osc_lock_kmem numa_policy
:t-0000328 <- osc_session_kmem taskstats
:t-0000576 <- kioctx xfrm_dst_cache vvp_thread_kmem
:t-0001152 <- signal_cache lustre_inode_cache

This is not the machine that had the problem I described before, but the kernel version is the same, so I am assuming the cache merges are the same. The last line shows that signal_cache and lustre_inode_cache are aliases of the same merged slab (:t-0001152), so lustre_inode_cache allocations get accounted under signal_cache. See the sketch below for another way to confirm the merge.
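For the record, here is a minimal sketch of that cross-check (the symlink listing below is illustrative, not captured from my machine): on a SLUB kernel, every merged alias appears in /sys/kernel/slab as a symlink to the shared ":t-..." directory, so the merge can be confirmed without building the slabinfo tool:

# ls -l /sys/kernel/slab/signal_cache /sys/kernel/slab/lustre_inode_cache
lrwxrwxrwx 1 root root 0 ... /sys/kernel/slab/lustre_inode_cache -> :t-0001152
lrwxrwxrwx 1 root root 0 ... /sys/kernel/slab/signal_cache -> :t-0001152

And for anyone else trying NeilBrown's suggestion on CentOS 7, something like this should persist the boot parameter (then reboot and verify it shows up in /proc/cmdline):

# grubby --update-kernel=ALL --args="slub_nomerge"
# cat /proc/cmdline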
Regards,
Jacek Tomaka

On Thu, Apr 25, 2019 at 7:42 AM NeilBrown <[email protected]> wrote:
>
> Hi,
>  you seem to be able to reproduce this fairly easily.
> If so, could you please boot with the "slub_nomerge" kernel parameter
> and then reproduce the (apparent) memory leak.
> I'm hoping that this will show some other slab that is actually using
> the memory - a slab with very similar object-size to signal_cache that
> is, by default, being merged with signal_cache.
>
> Thanks,
> NeilBrown
>
>
> On Wed, Apr 24 2019, Nathan Dauchy - NOAA Affiliate wrote:
>
> > On Mon, Apr 15, 2019 at 9:18 PM Jacek Tomaka <[email protected]> wrote:
> >
> >> >signal_cache should have one entry for each process (or thread-group).
> >>
> >> That is what I thought as well; looking at the kernel source, allocations
> >> from signal_cache happen only during fork.
> >>
> >
> > I was recently chasing an issue with clients suffering from low memory and
> > saw that "signal_cache" was a major player. But the workload on those
> > clients was not doing a lot of forking. (and I don't *think* threading
> > either) Rather it was a LOT of metadata read operations.
> >
> > You can see the symptoms by a simple "du" on a Lustre file system:
> >
> > # grep signal_cache /proc/slabinfo
> > signal_cache         967   1092   1152   28    8 : tunables  0  0  0 : slabdata     39     39      0
> >
> > # du -s /mnt/lfs1/projects/foo
> > 339744908  /mnt/lfs1/projects/foo
> >
> > # grep signal_cache /proc/slabinfo
> > signal_cache      164724 164724   1152   28    8 : tunables  0  0  0 : slabdata   5883   5883      0
> >
> > # slabtop -s c -o | head -n 20
> >  Active / Total Objects (% used)    : 3660791 / 3662863 (99.9%)
> >  Active / Total Slabs (% used)      : 93019 / 93019 (100.0%)
> >  Active / Total Caches (% used)     : 72 / 107 (67.3%)
> >  Active / Total Size (% used)       : 836474.91K / 837502.16K (99.9%)
> >  Minimum / Average / Maximum Object : 0.01K / 0.23K / 12.75K
> >
> >    OBJS  ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
> >  164724  164724 100%    1.12K   5883       28    188256K signal_cache
> >  331712  331712 100%    0.50K  10366       32    165856K ldlm_locks
> >  656896  656896 100%    0.12K  20528       32     82112K kmalloc-128
> >  340200  339971  99%    0.19K   8100       42     64800K kmalloc-192
> >  162838  162838 100%    0.30K   6263       26     50104K osc_object_kmem
> >  744192  744192 100%    0.06K  11628       64     46512K kmalloc-64
> >  205128  205128 100%    0.19K   4884       42     39072K dentry
> >    4268    4256  99%    8.00K   1067        4     34144K kmalloc-8192
> >  162978  162978 100%    0.17K   3543       46     28344K vvp_object_kmem
> >  162792  162792 100%    0.16K   6783       24     27132K kvm_mmu_page_header
> >  162825  162825 100%    0.16K   6513       25     26052K sigqueue
> >   16368   16368 100%    1.02K    528       31     16896K nfs_inode_cache
> >   20385   20385 100%    0.58K    755       27     12080K inode_cache
> >
> > Repeat that for more (and bigger) directories and slab cache added up to
> > more than half the memory on this 24GB node.
> >
> > This is with CentOS-7.6 and lustre-2.10.5_ddn6.
> >
> > I worked around the problem by tackling the "ldlm_locks" memory usage with:
> > # lctl set_param ldlm.namespaces.lfs*.lru_max_age=10000
> >
> > ...but I did not find a way to reduce the "signal_cache".
> >
> > Regards,
> > Nathan

-- 
Jacek Tomaka
Geophysical Software Developer

DownUnder GeoSolutions
76 Kings Park Road
West Perth 6005 WA, Australia
tel +61 8 9287 4143
[email protected]
www.dug.com
