Re: Likely mem leak in 3.7
I've extensively tested 2844a48706e5 (tip at the time I compiled) for the
last few days and have been unable to reproduce.  This bug appears to be
fixed.  Thanks.

-JimC
--
James Cloos  OpenPGP: 1024D/ED7DAEA6
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: Likely mem leak in 3.7
> "DR" == David Rientjes writes:

I had to reboot to get some work done.  In order to re-create the missing
ram I had to use a btrfs fs for the temp files for a few emerge(1) runs.
It seems that the ram which was used for the cache of those temp files is
not recovered when the files are deleted?  Right now cached is about 4G
smaller than typical on previous kernels.

I also notice that there are ten 1G maps (end of /proc/meminfo); I wonder
whether those end up used inefficiently?

I compared the output below to a capture on 3.6 about 1¾ hours after boot;
the numbers do not seem much different, although the btrfs slabs have
different names.

DR> echo m > /proc/sysrq-trigger

[95432.729187] SysRq : Show Memory
[95432.729192] Mem-Info:
[95432.729194] Node 0 DMA per-cpu:
[95432.729196] CPU0: hi:0, btch: 1 usd: 0
[95432.729198] CPU1: hi:0, btch: 1 usd: 0
[95432.729199] CPU2: hi:0, btch: 1 usd: 0
[95432.729200] CPU3: hi:0, btch: 1 usd: 0
[95432.729201] Node 0 DMA32 per-cpu:
[95432.729203] CPU0: hi: 186, btch: 31 usd: 157
[95432.729205] CPU1: hi: 186, btch: 31 usd: 184
[95432.729206] CPU2: hi: 186, btch: 31 usd: 173
[95432.729207] CPU3: hi: 186, btch: 31 usd: 181
[95432.729208] Node 0 Normal per-cpu:
[95432.729210] CPU0: hi: 186, btch: 31 usd: 158
[95432.729211] CPU1: hi: 186, btch: 31 usd: 157
[95432.729212] CPU2: hi: 186, btch: 31 usd: 89
[95432.729214] CPU3: hi: 186, btch: 31 usd: 122
[95432.729218] active_anon:965802 inactive_anon:174825 isolated_anon:0
 active_file:162783 inactive_file:223765 isolated_file:0
 unevictable:0 dirty:4486 writeback:315 unstable:0
 free:178576 slab_reclaimable:113370 slab_unreclaimable:8715
 mapped:949576 shmem:940282 pagetables:21020 bounce:0
 free_cma:0
[95432.729221] Node 0 DMA free:15864kB min:88kB low:108kB high:132kB
 active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB
 unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15880kB
 mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB
 slab_reclaimable:0kB slab_unreclaimable:16kB kernel_stack:0kB
 pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB
 pages_scanned:0 all_unreclaimable? yes
[95432.729226] lowmem_reserve[]: 0 3168 11682 11682
[95432.729229] Node 0 DMA32 free:376380kB min:18300kB low:22872kB high:27448kB
 active_anon:1551176kB inactive_anon:345756kB active_file:486140kB
 inactive_file:361936kB unevictable:0kB isolated(anon):0kB isolated(file):0kB
 present:3244136kB mlocked:0kB dirty:1368kB writeback:88kB mapped:1883492kB
 shmem:1880344kB slab_reclaimable:157768kB slab_unreclaimable:3944kB
 kernel_stack:496kB pagetables:23592kB unstable:0kB bounce:0kB free_cma:0kB
 writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[95432.729235] lowmem_reserve[]: 0 0 8514 8514
[95432.729237] Node 0 Normal free:322060kB min:49188kB low:61484kB high:73780kB
 active_anon:2312032kB inactive_anon:353544kB active_file:164992kB
 inactive_file:533124kB unevictable:0kB isolated(anon):0kB isolated(file):0kB
 present:8718776kB mlocked:0kB dirty:16576kB writeback:1172kB mapped:1914812kB
 shmem:1880784kB slab_reclaimable:295712kB slab_unreclaimable:30900kB
 kernel_stack:3552kB pagetables:60488kB unstable:0kB bounce:0kB free_cma:0kB
 writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[95432.729242] lowmem_reserve[]: 0 0 0 0
[95432.729245] Node 0 DMA: 0*4kB 1*8kB 1*16kB 1*32kB 1*64kB 1*128kB 1*256kB
 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15864kB
[95432.729252] Node 0 DMA32: 36923*4kB 15958*8kB 4599*16kB 741*32kB 27*64kB
 0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 376428kB
[95432.729258] Node 0 Normal: 35337*4kB 93927*8kB 46306*16kB 19127*32kB
 6764*64kB 2788*128kB 1236*256kB 539*512kB 273*1024kB 84*2048kB 201*4096kB
 = 4902748kB
[95432.729265] 1329636 total pagecache pages
[95432.729267] 2823 pages in swap cache
[95432.729268] Swap cache stats: add 46261, delete 43438, find 709513/710665
[95432.729269] Free swap  = 3540840kB
[95432.729270] Total swap = 3670008kB
[95432.780343] 3080176 pages RAM
[95432.780345] 68408 pages reserved
[95432.780346] 7950465 pages shared
[95432.780347] 418322 pages non-shared

DR> zgrep CONFIG_SL[AU]B /proc/config.gz

CONFIG_SLUB_DEBUG=y
# CONFIG_SLAB is not set
CONFIG_SLUB=y
CONFIG_SLABINFO=y
# CONFIG_SLUB_DEBUG_ON is not set
# CONFIG_SLUB_STATS is not set

DR> cat /proc/slabinfo

slabinfo - version: 2.1
# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
ext4_groupinfo_1k     64     64    128   32    1 : tunables 0 0 0 : slabdata 2 2 0
fat_inode_cache       48     48    680   24    4 : tunables 0 0 0 : slabdata 2 2 0
fat_cache              0      0     40  102    1 : tunables 0 0 0 : slabdata 0 0 0
UDPLITEv6              0      0   1088   30    8 : tunables 0 0 0 : slabdata 0 0 0
UDPv6                120    120   1088   30    8 : tunables
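As a rough cross-check on captures like the one above, the per-cache memory
can be summed straight from a slabinfo dump.  This is only a sketch: it
assumes the version 2.1 column layout shown above (num_objs in column 3,
objsize in column 4), ignores allocator and partial-slab overhead (so it is
a lower bound), and `slab_kb` is a made-up helper name:

```shell
# slab_kb FILE -- sum num_objs * objsize over a slabinfo 2.1 dump,
# reporting the total in kB.  Per-slab overhead is ignored, so the
# real slab footprint is somewhat larger than this figure.
slab_kb() {
    awk 'NR > 2 { kb += $3 * $4 / 1024 }
         END    { printf "%.0f\n", kb }' "$1"
}

# typical use on the affected box (may need root):
#   slab_kb /proc/slabinfo
```

If the result is far below the missing 6G, the leak is probably not in the
slab caches at all, which matches the "total Space is much smaller than the
missing ram" observation in the original report.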
Re: Likely mem leak in 3.7
On Thu, 15 Nov 2012, James Cloos wrote:

> The kernel does not log anything relevant to this.

Can you do the following as root:

	dmesg -c > /dev/null
	echo m > /proc/sysrq-trigger
	dmesg > foo

and send foo inline in your reply?

> Slabinfo gives some odd output.  It seems to think there are negative
> quantities of some slabs:
>
> Name             Objects Objsize   Space Slabs/Part/Cpu              O/S O %Fr %Ef Flg
> :at-016             5632      16   90.1K 18446744073709551363/0/275  256 0   0 100 *a
> :t-048              3386      48  249.8K 18446744073709551558/22/119  85 0  36  65 *
> :t-120              1022     120  167.9K 18446744073709551604/14/53   34 0  34  73 *
> blkdev_requests      182     376  122.8K 18446744073709551604/7/27    21 1  46  55
> ext4_io_end          348    1128  393.2K 18446744073709551588/0/40    29 3   0  99 a

Can you send the output of

	zgrep CONFIG_SL[AU]B /proc/config.gz
	cat /proc/slabinfo
Likely mem leak in 3.7
Starting with 3.7-rc1, my workstation seems to lose ram.

Up until (and including) 3.6, used-(buffers+cached) was roughly the same
as sum(rss) (taking shared into account).  Now there is an approx 6G gap.

When the box first starts, it is clearly less swappy than with <= 3.6; I
can't tell whether that is related.  The reduced swappiness persists.

It seems to get worse when I update packages (it runs Gentoo).  The
portage tree and overlays are on btrfs filesystems.  As is /var/log (with
compression, except for the distfiles fs).  The compilations themselves
are done in a tmpfs.  I CCed l-b because of that apparent correlation.

My postgres db is on xfs (tested faster) and has a 3G shared segment, but
that recovers when the pg process is stopped; neither of those seems to be
implicated.  There are also several ext4 partitions, including / and /home.

Cgroups are configured, and openrc does put everything it starts into its
own directory under /sys/fs/cgroup/openrc.  But top(1) shows all of the
processes, and its idea of free mem does change with pg's use of its
shared segment.  So it doesn't *look* like the ram is hiding in some
cgroup.

The kernel does not log anything relevant to this.

Slabinfo gives some odd output.  It seems to think there are negative
quantities of some slabs:

Name             Objects Objsize   Space Slabs/Part/Cpu              O/S O %Fr %Ef Flg
:at-016             5632      16   90.1K 18446744073709551363/0/275  256 0   0 100 *a
:t-048              3386      48  249.8K 18446744073709551558/22/119  85 0  36  65 *
:t-120              1022     120  167.9K 18446744073709551604/14/53   34 0  34  73 *
blkdev_requests      182     376  122.8K 18446744073709551604/7/27    21 1  46  55
ext4_io_end          348    1128  393.2K 18446744073709551588/0/40    29 3   0  99 a

The largest entries it reports are:

Name             Objects Objsize   Space Slabs/Part/Cpu O/S O %Fr %Ef Flg
ext4_inode_cache   38448     864  106.1M 3201/566/39     37 3  17  31 a
:at-104           316429     104   36.5M 8840/3257/92    39 0  36  89 *a
btrfs_inode        13271     984   35.7M 1078/0/14       33 3   0  36 a
radix_tree_node    43785     560   34.7M 2075/1800/45    28 2  84  70 a
dentry             64281     192   14.3M 3439/1185/55    21 0  33  86 a
proc_inode_cache   15695     608   12.1M 693/166/51      26 2  22  78 a
inode_cache        10730     544    6.0M 349/0/21        29 2   0  96 a
task_struct          628    5896    4.3M 123/23/10        5 3  17  84

The total Space is much smaller than the missing ram.

The only other difference I see is that one process has left behind
several score zombies.  It is structured as a parent with several worker
kids, but the kids stay zombie even when the parent process is stopped and
restarted.  wchan shows that they are stuck in exit.  Their normal rss
isn't enough to account for the missing ram, even if it isn't reclaimed.
(Not to mention, ram != brains. :)

I haven't tried bisecting because of the time it takes to confirm the
problem (several hours of uptime).  I've only compiled (each of) the rc
tags, so v3.6 is the last known good and v3.7-rc1 is the first known bad.

If there is anything that I missed, please let me know!

-JimC
--
James Cloos  OpenPGP: 1024D/ED7DAEA6
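The used-(buffers+cached) vs sum(rss) comparison described above can be
scripted so the gap is reproducible from one command.  A sketch using only
/proc/meminfo and ps; note that sum(rss) counts shared pages once per
mapping process, so it is an upper bound on the true total:

```shell
# Memory "used" beyond what the page cache explains, in kB:
awk '/^MemTotal:/ { t = $2 }  /^MemFree:/ { f = $2 }
     /^Buffers:/  { b = $2 }  /^Cached:/  { c = $2 }
     END { printf "used-(buffers+cached): %d kB\n", t - f - b - c }' \
    /proc/meminfo

# Sum of per-process resident set sizes, in kB (overstates the true
# total because shared pages are counted in every mapping process):
ps -eo rss= | awk '{ s += $1 } END { printf "sum(rss): %d kB\n", s }'
```

Run at intervals, the difference between the two figures is the "gap"
being reported; on <= 3.6 it should stay small, on 3.7-rc kernels it
reportedly grows toward ~6G.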