Re: 11.2-STABLE kernel wired memory leak

2019-03-29 Thread Paul Mather
On Mar 29, 2019, at 5:52 AM, Robert Schulze  wrote:

> Hi,
> 
> I just want to report a similar issue here with 11.2-RELEASE-p8.
> 
> The affected machine has 64 GB of RAM and does daily backups from several
> machines at night; during the day there are parallel runs of clamav on a
> specific dataset.
> 
> One symptom is basic I/O performance: after upgrading from 11.1 to 11.2,
> backup times have increased, and are still increasing. After one
> week of operation, backup times have doubled - without having changed
> anything else.
> 
> Then there is this wired memory and the way too lazy reclaim of memory for
> user processes: the clamav scans start at 10:30 and get swapped out
> immediately. Although vfs.zfs.arc_max=48G, wired is at 62 GB before the
> scans, and it takes about 10 minutes for the scan processes to actually
> run from system RAM, not swap.
> 
> There is obviously something broken, as there are several threads with
> similar observations.


I am using FreeBSD 12 (both -RELEASE and -STABLE) and your comment about "way 
too lazy reclaim of memory" struck a chord with me.  On one system I regularly 
have hundreds of MB identified as being in the "Laundry" queue but FreeBSD 
hardly ever seems to do the laundry.  I see the same total for days.

When does FreeBSD decide to do its laundry?  Right now "top" is showing 835M in 
"Laundry" and the system is >99% idle.

How can I get the system to be more proactive about doing its housekeeping when 
it has idle time?

It would be much nicer to have it do laundry during a calm time rather than get 
all flustered when it's down to its last pair of socks (metaphorically 
speaking) and page even more stuff out to swap. :-)
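
(For watching whether the laundry thread ever makes progress, a minimal sketch,
assuming the vm.stats.vm.v_laundry_count sysctl that came in with the laundry
queue; top's "Laundry" figure is this page count times the page size:

  # poll the laundry queue size once a minute
  while :; do date; sysctl -n vm.stats.vm.v_laundry_count; sleep 60; done
)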

Cheers,

Paul.



Re: 11.2-STABLE kernel wired memory leak

2019-03-29 Thread Robert Schulze
Hi,

I just want to report a similar issue here with 11.2-RELEASE-p8.

The affected machine has 64 GB of RAM and does daily backups from several
machines at night; during the day there are parallel runs of clamav on a
specific dataset.

One symptom is basic I/O performance: after upgrading from 11.1 to 11.2,
backup times have increased, and are still increasing. After one
week of operation, backup times have doubled - without having changed
anything else.

Then there is this wired memory and the way too lazy reclaim of memory for
user processes: the clamav scans start at 10:30 and get swapped out
immediately. Although vfs.zfs.arc_max=48G, wired is at 62 GB before the
scans, and it takes about 10 minutes for the scan processes to actually
run from system RAM, not swap.

There is obviously something broken, as there are several threads with
similar observations.

with kind regards,
Robert Schulze


Re: 11.2-STABLE kernel wired memory leak

2019-02-20 Thread Mike Tancsa
On 2/20/2019 3:14 AM, Eugene Grosbein wrote:
> 20.02.2019 3:49, Mike Tancsa wrote:
>
>> On 2/19/2019 2:35 PM, Mike Tancsa wrote:
>>> dd if=/dev/zero of=/tanker/test bs=1m count=10
>> The box has 32G of RAM. If I do a
>>
>> # sysctl -w vfs.zfs.arc_max=12224866304
>> vfs.zfs.arc_max: 32224866304 -> 12224866304
>> #
>>
>> after WIRED memory is at 29G, it doesnt immediately reclaim it and there
>> is memory pressure.  Booting the box with
>> vfs.zfs.arc_max=12224866304
>> keeps WIRED at 15G
> Can you repeat your test with additional vm.v_free_min=131072 (512M) or 
> 262144 (1G)?
> For me, it makes kernel reclaim unused UMA memory quickly:
> first it goes from WIRED to FREE for a moment, then it is re-used as needed.

# sysctl -w vm.v_free_min=131072
vm.v_free_min: 51301 -> 131072
#

Keeping it at 1G seems to make things smoother and more stable, and
programs are not being swapped out!
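
(If that tuning holds up, a sketch of making it persistent across reboots;
vm.v_free_min is a writable sysctl, so it can go into /etc/sysctl.conf --
262144 is the 1G-of-pages value suggested above, adjust to taste:

  # /etc/sysctl.conf
  vm.v_free_min=262144
)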




>
>

-- 
---
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   




Re: 11.2-STABLE kernel wired memory leak

2019-02-20 Thread Eugene Grosbein
20.02.2019 3:49, Mike Tancsa wrote:

> On 2/19/2019 2:35 PM, Mike Tancsa wrote:
>> dd if=/dev/zero of=/tanker/test bs=1m count=10
> 
> The box has 32G of RAM. If I do a
> 
> # sysctl -w vfs.zfs.arc_max=12224866304
> vfs.zfs.arc_max: 32224866304 -> 12224866304
> #
> 
> after WIRED memory is at 29G, it doesnt immediately reclaim it and there
> is memory pressure.  Booting the box with
> vfs.zfs.arc_max=12224866304
> keeps WIRED at 15G

Can you repeat your test with additional vm.v_free_min=131072 (512M) or 262144 (1G)?
For me, it makes the kernel reclaim unused UMA memory quickly:
first it goes from WIRED to FREE for a moment, then it is re-used as needed.




Re: 11.2-STABLE kernel wired memory leak

2019-02-19 Thread Mike Tancsa
On 2/19/2019 2:35 PM, Mike Tancsa wrote:
> dd if=/dev/zero of=/tanker/test bs=1m count=10

The box has 32G of RAM. If I do a

# sysctl -w vfs.zfs.arc_max=12224866304
vfs.zfs.arc_max: 32224866304 -> 12224866304
#

after WIRED memory is at 29G, it doesn't immediately reclaim it and there
is memory pressure.  Booting the box with
vfs.zfs.arc_max=12224866304
keeps WIRED at 15G


last pid:  1049;  load averages:  2.02,  2.30,  1.07    up 0+00:04:21  20:17:43
38 processes:  2 running, 36 sleeping
CPU:  3.6% user,  0.0% nice,  5.9% system,  0.0% interrupt, 90.5% idle
Mem: 39M Active, 6212K Inact, 15G Wired, 30M Buf, 16G Free
ARC: 11G Total, 363K MFU, 11G MRU, 117M Anon, 23M Header, 850K Other
 11G Compressed, 11G Uncompressed, 1.01:1 Ratio
Swap:


Re-running the test but this time with swap,

# zpool create tanker raidz1 da0p1 da1p1 da2p1 da3p1 da4p1 da5p1 da6p1
da7p1 da8p1 da9p1 da10p1 da11p1 da12p1 da13p1 da14p1
# pstat -T
120/1043936 files
0M/65536M swap space
#

There is still 30G of Wired memory ?

last pid:  1096;  load averages:  0.78,  2.13,  1.25    up 0+00:06:42  20:33:21
35 processes:  1 running, 34 sleeping
CPU:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Mem: 976K Active, 274M Inact, 88M Laundry, 30G Wired, 1341M Buf, 722M Free
ARC: 1227K Total, 775K MFU, 339K MRU, 32K Anon, 9216 Header, 69K Other
 176K Compressed, 1179K Uncompressed, 6.70:1 Ratio
Swap: 64G Total, 64G Free

I start a new dd on a freshly created pool,

CPU:  0.0% user,  0.0% nice, 31.4% system,  0.8% interrupt, 67.8% idle
Mem: 1396K Active, 460K Inact, 18M Laundry, 30G Wired, 1341M Buf, 720M Free
ARC: 27G Total, 54K MFU, 26G MRU, 1380M Anon, 56M Header, 3459K Other
 26G Compressed, 26G Uncompressed, 1.00:1 Ratio
Swap: 64G Total, 73M Used, 64G Free

and the box starts to swap

last pid:  1104;  load averages: 11.61,  6.10,  3.04    up 0+00:09:32  20:36:11
36 processes:  2 running, 34 sleeping
CPU:  0.1% user,  0.0% nice, 95.3% system,  2.1% interrupt,  2.4% idle
Mem: 1212K Active, 124K Inact, 688K Laundry, 30G Wired, 1341M Buf, 819M Free
ARC: 27G Total, 54K MFU, 24G MRU, 1620M Anon, 58M Header, 5572K Other
 24G Compressed, 24G Uncompressed, 1.01:1 Ratio
Swap: 64G Total, 90M Used, 64G Free



(avg's one-liner script shows)
61,724,160  arc_buf_hdr_t_full  55,263,744  6,460,416
111,783,936 512 109,534,208 2,249,728
555,884,000 UMA_Slabs   553,374,640 2,509,360
1,280,966,656   zio_data_buf_131072 905,576,448 375,390,208
28,142,604,288  abd_chunk   27,965,505,536  177,098,752
30,431,435,848  TOTAL   29,663,389,468  768,046,380

# vmstat -z | sed 's/:/,/' | awk -F, '{printf "%10s %s\n",
$2*$5/1024/1024, $1}' | sort -k1,1 -rn | head
   358 zio_data_buf_131072
   168.895 abd_chunk
   18.5738 zio_cache
 14.25 zio_buf_786432
    14 zio_buf_1048576
    13.987 BUF TRIE
    13.125 zio_buf_655360
 9.625 zio_buf_917504
  8.75 zio_buf_458752
 8 zio_buf_524288
#

As the box is netbooted over NFS, when it swaps out some of the NFS
helper apps the box gets a bit unresponsive, e.g. if I do a find
/usr/src -type f | xargs md5 the box will stall out for periods of time.
If I limit ARC to, say, 12G, all is well.
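
(For reference, a sketch of pinning that limit at boot via the same
vfs.zfs.arc_max loader tunable mentioned elsewhere in this thread;
12884901888 is simply 12G expressed in bytes:

  # /boot/loader.conf
  vfs.zfs.arc_max="12884901888"   # 12G
)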


    ---Mike


-- 
---
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   



Re: 11.2-STABLE kernel wired memory leak

2019-02-19 Thread Mike Tancsa
On 2/12/2019 11:49 AM, Eugene Grosbein wrote:
> 12.02.2019 23:34, Mark Johnston wrote:
>
>> I suspect that the "leaked" memory is simply being used to cache UMA
>> items.  Note that the values in the FREE column of vmstat -z output are
>> quite large.  The cached items are reclaimed only when the page daemon
>> wakes up to reclaim memory; if there are no memory shortages, large
>> amounts of memory may accumulate in UMA caches.  In this case, the sum
>> of the product of columns 2 and 5 gives a total of roughly 4GB cached.
> Forgot to note, that before I got system to single user mode, there was heavy 
> swap usage (over 3.5GB)
> and heavy page-in/page-out, 10-20 megabytes per second and system was 
> crawling slow due to pageing.

I just ran into this issue on a RELENG12 box I was getting ready for the
FreeBSD netperf cluster. It seems pretty easy to trigger. I created a 12-disk
raidz pool and did a simple dd test, and the box started to run out of memory:

pid 776 (rpc.statd), jid 0, uid 0, was killed: out of swap space
pid 784 (rpc.lockd), jid 0, uid 0, was killed: out of swap space

CPU:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Mem: 1120K Active, 628K Inact, 264K Laundry, 30G Wired, 26M Buf, 1133M Free
ARC: 28G Total, 73K MFU, 28G MRU, 32K Anon, 56M Header, 1713K Other
 27G Compressed, 27G Uncompressed, 1.00:1 Ratio

zpool create tanker raidz1 da0p1 da1p1 da2p1 da3p1 da4p1 da5p1 da6p1
da7p1 da8p1 da9p1 da10p1 da11p1

dd if=/dev/zero of=/tanker/test bs=1m count=10

zpool destroy tanker

last pid:  1078;  load averages:  0.37,  1.32,  0.84    up 0+00:11:44  19:22:03
32 processes:  1 running, 31 sleeping
CPU:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Mem: 564K Active, 792K Inact, 1000K Laundry, 30G Wired, 26M Buf, 1046M Free
Swap:

# vmstat -z | awk -F, '{printf "%10s %s\n", $2*$5/1024/1024, $1}' | sort
-k1,1 -rn | head
   73899.1 mbuf_cluster:  2048
 23693 mbuf_packet:    256
   4912.17 socket: 872
   4719.04 unpcb:  256
   147.354 udpcb:   32
   147.345 udp_inpcb:  488
   28.8717 tcpcb:  976
   28.8717 tcp_inpcb:  488
   11.6294 mbuf_jumbo_page:   4096
   2.98672 ripcb:  488
#





> -- 
> ---
> Mike Tancsa, tel +1 519 651 3400 x203
> Sentex Communications, m...@sentex.net
> Providing Internet services since 1994 www.sentex.net
> Cambridge, Ontario Canada   


Re: 11.2-STABLE kernel wired memory leak

2019-02-15 Thread Lev Serebryakov
Hello Eugene,

Tuesday, February 12, 2019, 10:18:09 PM, you wrote:

> Do you have/had some memory pressure here? Growth of swap usage?
 After several days, ARC is even smaller, but Wired is the same:

Mem: 88M Active, 904M Inact, 29G Wired, 1121M Free
ARC: 13G Total, 5288M MFU, 6937M MRU, 2880K Anon, 44M Header, 1036M Other
 11G Compressed, 15G Uncompressed, 1.32:1 Ratio

  :-\

 Still, "lowmem_uptime" is zero, so there is no "low memory" situation, but ARC
becomes smaller and smaller (and the hit rate goes down too) and it is not clear
where all this "Wired" RAM is going :-(

 I had 16G on this system with the same load about half a year ago, and 14G was
the typical ARC size with a 97% hit rate; now it has 32G (other things are
unchanged) and the ARC is 13G with a 94% hit rate. And all memory is still Wired!

-- 
Best regards,
 Lev    mailto:l...@freebsd.org



Re: 11.2-STABLE kernel wired memory leak

2019-02-14 Thread Stefan Esser
Am 13.02.19 um 10:59 schrieb Andriy Gapon:
> On 12/02/2019 20:17, Eugene Grosbein wrote:
>> 13.02.2019 1:14, Eugene Grosbein wrote:
>>
>>> Use following command to see how much memory is wasted in your case:
>>>
>>> vmstat -z | awk -F, '{printf "%10s %s\n", $2*$5/1024/1024, $1}' | sort 
>>> -k1,1 -rn | head
>>
>> Oops, small correction:
>>
>> vmstat -z | sed 's/:/,/' | awk -F, '{printf "%10s %s\n", $2*$5/1024/1024, 
>> $1}' | sort -k1,1 -rn | head
> 
> I have a much uglier but somewhat more informative "one-liner" for
> post-processing vmstat -z output:
> 
> vmstat -z | tail +3 | awk -F '[:,] *' 'BEGIN { total=0; cache=0; used=0 } {u =
> $2 * $4; c = $2 * $5; t = u + c; cache += c; used += u; total += t; name=$1;
> gsub(" ", "_", name); print t, name, u, c} END { print total, "TOTAL", used,
> cache } ' | sort -n | perl -a -p -e 'while (($j, $_) = each(@F)) { 1 while
> s/^(-?\d+)(\d{3})/$1,$2/; print $_, " "} print "\n"' | column -t
> 
> This would be much nicer as a small python script.

Or, since you are already using perl:


#!/usr/local/bin/perl5
#
# Summarize "vmstat -z" output: per-zone total/used/cached megabytes,
# piped through sort(1); extra arguments are passed on to sort.

open STDIN, "vmstat -z |" or die "Failed to run vmstat program";
open STDOUT, "| sort -n @ARGV -k 2" or die "Failed to pipe through sort";

$fmt="%-30s %8.3f %8.3f %8.3f %6.1f%%\n";
while (<STDIN>) {
    ($n, $s, undef, $u, $c) = split /[:,] */;
    next unless $s > 0;
    $n =~ s/ /_/g;
    $s /= 1024 * 1024;  # item size in MB
    $u *= $s;           # MB currently in use
    $c *= $s;           # MB allocated but free (cached)
    $t =  $u + $c;
    next unless $t > 0;
    printf $fmt, $n, $t, $u, $c, 100 * $c / $t;
    $cache += $c;
    $used  += $u;
    $total += $t;
}
printf $fmt, "TOTAL", $total, $used, $cache, 100 * $cache / $total;
close STDOUT;


This script takes additional parameters, e.g. "-r" to reverse the
sort or "-r -k5" to print those entries with the highest percentage
of unused memory at the top.

(I chose to suppress lines with a current "total" of 0 - you may
want to remove the "next" command above the printf in the loop to
see them, but they did not seem useful to me.)
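
A quick usage sketch (the file name here is made up; save the script under
whatever name you like):

  perl uma-zones.pl | tail          # largest zones at the bottom
  perl uma-zones.pl -r | head       # largest zones first
  perl uma-zones.pl -r -k5 | head   # highest percentage of cached/unused memory first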

> Or, even, we could add a sort option for vmstat -z / -m.

Yes, and the hardest part might be to select option characters
for the various useful report formats. ;-)

Regards, STefan


Re: 11.2-STABLE kernel wired memory leak

2019-02-13 Thread Eugene Grosbein
On 13.02.2019 19:04, Eugene Grosbein wrote:

> 12.02.2019 23:34, Mark Johnston wrote:
> 
>> On Tue, Feb 12, 2019 at 11:14:31PM +0700, Eugene Grosbein wrote:
>>> Hi!
>>>
>>> Long story short: 11.2-STABLE/amd64 r335757 leaked over 4600MB kernel wired 
>>> memory over 81 days uptime
>>> out of 8GB total RAM.
>>>
>>> Details follow.
>>>
>>> I have a workstation running Xorg, Firefox, Thunderbird, LibreOffice and 
>>> occasionally VirtualBox for single VM.
>>>
>>> It has two identical 320GB HDDs combined with single graid-based array with 
>>> "Intel"
>>> on-disk format having 3 volumes:
>>> - one "RAID1" volume /dev/raid/r0 occupies the first 10GB of each HDD;
>>> - two "SINGLE" volumes /dev/raid/r1 and /dev/raid/r2 that utilize "tails" 
>>> of HDDs (310GB each).
>>>
>>> /dev/raid/r0 (10GB) has MBR partitioning and two slices:
>>> - /dev/raid/r0s1 (8GB) is used for swap;
>>> - /dev/raid/r0s2 (2GB) is used by non-redundant ZFS pool named "os" that 
>>> contains only
>>> root file system (177M used) and /usr file system (340M used).
>>>
>>> There is also second pool (ZMIRROR) named "z" built directly on top of 
>>> /dev/raid/r[12] volumes,
>>> this pool contains all other file systems including /var, /home, 
>>> /usr/ports, /usr/local, /usr/{src|obj} etc.
>>>
>>> # zpool list
>>> NAME   SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAGCAP  DEDUP  HEALTH  
>>> ALTROOT
>>> os1,98G   520M  1,48G- -55%25%  1.00x  ONLINE  -
>>> z  288G  79,5G   209G- -34%27%  1.00x  ONLINE  -
>>>
>>> This way I have swap outside of ZFS, boot blocks and partitioning mirrored 
>>> by means of GEOM_RAID and
>>> can use local console to break to single user mode to unmount all file 
>>> system other than root and /usr
>>> and can even export bigger ZFS pool "z". And I did that to see that ARC 
>>> usage
>>> (limited with vfs.zfs.arc_max="3G" in /boot/loader.conf) dropped from over 
>>> 2500MB 
>>> down to 44MB but "Wired" stays high. Now after I imported "z" back and 
>>> booted to multiuser mode
>>> top(1) shows:
>>>
>>> last pid: 51242;  load averages:  0.24,  0.16,  0.13  up 81+02:38:38  22:59:18
>>> 104 processes: 1 running, 103 sleeping
>>> CPU:  0.0% user,  0.0% nice,  0.4% system,  0.2% interrupt, 99.4% idle
>>> Mem: 84M Active, 550M Inact, 4K Laundry, 4689M Wired, 2595M Free
>>> ARC: 273M Total, 86M MFU, 172M MRU, 64K Anon, 1817K Header, 12M Other
>>>  117M Compressed, 333M Uncompressed, 2.83:1 Ratio
>>> Swap: 8192M Total, 940K Used, 8191M Free
>>>
>>> I have KDB and DDB in my custom kernel also. How do I debug the leak 
>>> further?
>>>
>>> I use nvidia-driver-340-340.107 driver for GK208 [GeForce GT 710B] video 
>>> card.
>>> Here are outputs of "vmstat -m": 
>>> http://www.grosbein.net/freebsd/leak/vmstat-m.txt
>>> and "vmstat -z": http://www.grosbein.net/freebsd/leak/vmstat-z.txt
>>
>> I suspect that the "leaked" memory is simply being used to cache UMA
>> items.  Note that the values in the FREE column of vmstat -z output are
>> quite large.  The cached items are reclaimed only when the page daemon
>> wakes up to reclaim memory; if there are no memory shortages, large
>> amounts of memory may accumulate in UMA caches.  In this case, the sum
>> of the product of columns 2 and 5 gives a total of roughly 4GB cached.
> 
> After another day with mostly idle system, "Wired" increased to more than 6GB 
> out of 8GB total.
> I've tried to increase sysctl vm.v_free_min from default 12838 (50MB)
> upto 1048576 (4GB) and "Wired" dropped a bit but it is still huge, 5060M:
> 
> last pid: 61619;  load averages:  1.05,  0.78,  0.40  up 81+22:33:09  18:53:49
> 119 processes: 1 running, 118 sleeping
> CPU:  0.0% user,  0.0% nice, 50.0% system,  0.0% interrupt, 50.0% idle
> Mem: 47M Active, 731M Inact, 4K Laundry, 5060M Wired, 2080M Free
> ARC: 3049M Total, 216M MFU, 2428M MRU, 64K Anon, 80M Header, 325M Other
>  2341M Compressed, 5874M Uncompressed, 2.51:1 Ratio
> Swap: 8192M Total, 940K Used, 8191M Free
> 
> # sysctl vm.v_free_min
> vm.v_free_min: 1048576
> # sysctl vm.stats.vm.v_free_count
> vm.stats.vm.v_free_count: 533232
> 
> ARC probably cached results of nightly periodic jobs traversing file system 
> trees
> and hit its limit (3G).
> 
> Still cannot understand where have another 2G (5G-3G) of wired memory leaked 
> to?

After several more hours of mostly idle time with the quoted settings, memory
finally got moved from the "Wired" to the "Free" category. I got the system to
single user mode again, exported the bigger ZFS pool again, and ARC dropped
down to 18M and Wired to less than 500M. After re-import and boot to multiuser
with Xorg running, Wired is less than 800M.

So, there seems to be no leak, but it takes an insanely long time for Wired
memory to be freed, even with an extreme value for vm.v_free_min. Even heavy
memory pressure does not speed things up.

It seems the page daemon is somewhat broken and fails to free wired memory in
a timely manner.


Re: 11.2-STABLE kernel wired memory leak

2019-02-13 Thread Karl Denninger

On 2/13/2019 03:59, Andriy Gapon wrote:
> On 12/02/2019 20:17, Eugene Grosbein wrote:
>> 13.02.2019 1:14, Eugene Grosbein wrote:
>>
>>> Use following command to see how much memory is wasted in your case:
>>>
>>> vmstat -z | awk -F, '{printf "%10s %s\n", $2*$5/1024/1024, $1}' | sort 
>>> -k1,1 -rn | head
>> Oops, small correction:
>>
>> vmstat -z | sed 's/:/,/' | awk -F, '{printf "%10s %s\n", $2*$5/1024/1024, 
>> $1}' | sort -k1,1 -rn | head
> I have a much uglier but somewhat more informative "one-liner" for
> post-processing vmstat -z output:
>
> vmstat -z | tail +3 | awk -F '[:,] *' 'BEGIN { total=0; cache=0; used=0 } {u =
> $2 * $4; c = $2 * $5; t = u + c; cache += c; used += u; total += t; name=$1;
> gsub(" ", "_", name); print t, name, u, c} END { print total, "TOTAL", used,
> cache } ' | sort -n | perl -a -p -e 'while (($j, $_) = each(@F)) { 1 while
> s/^(-?\d+)(\d{3})/$1,$2/; print $_, " "} print "\n"' | column -t
>
> This would be much nicer as a small python script.
> Or, even, we could add a sort option for vmstat -z / -m.

On 12-STABLE, r343809, /without/ my patch set (and I've yet to see the
sort of stall problems I used to have regularly with 11.1):

root@NewFS:/disk/karl # sh vmstat.script
0  0   0
0  AIO 0  0
0  AIOCB   0  0
0  AIOLIO  0  0
0  AIOP    0  0
0  DMAR_MAP_ENTRY  0  0
0  FFS1_dinode 0  0
0  FPU_save_area   0  0
0  IPFW_IPv6_dynamic_states    0  0
0  IPFW_parent_dynamic_states  0  0
0  IPsec_SA_lft_c  0  0
0  LTS_VFS_Cache   0  0
0  MAC_labels  0  0
0  NCLNODE 0  0
0  STS_VFS_Cache   0  0
0  arc_buf_hdr_t_l2only    0  0
0  audit_record    0  0
0  cryptodesc  0  0
0  cryptop 0  0
0  domainset   0  0
0  itimer  0  0
0  mbuf_jumbo_16k  0  0
0  mbuf_jumbo_9k   0  0
0  nvme_request    0  0
0  procdesc    0  0
0  rentr   0  0
0  sctp_asconf 0  0
0  sctp_asconf_ack 0  0
0  sctp_asoc   0  0
0  sctp_chunk  0  0
0  sctp_ep 0  0
0  sctp_raddr  0  0
0  sctp_readq  0  0
0  sctp_stream_msg_out 0  0
0  sio_cache   0  0
0  tcp_log 0  0
0  tcp_log_bucket  0  0
0  tcp_log_node    0  0
0  tfo 0  0
0  tfo_ccache_entries  0  0
0  udplite_inpcb   0  0
0  umtx_pi 0  0
0  vtnet_tx_hdr    0  0
0  zio_buf_10485760    0  0
0  zio_buf_12582912    0  0
0  zio_buf_14680064    0  0
0  zio_buf_1572864 0  0
0  zio_buf_16777216    0  0
0  zio_buf_1835008 0  0
0  zio_buf_2097152 0  0
0  zio_buf_2621440 0  0
0  zio_buf_3145728 0  0
0  zio_buf_3670016 0  0
0  zio_buf_4194304 0  0
0  zio_buf_5242880 0  0
0  zio_buf_6291456 0  0
0  zio_buf_7340032 0  0
0  zio_buf_8388608 0  0
0  zio_data_buf_1048576    0  0
0  zio_data_buf_10485760   0  0
0  zio_data_buf_12582912   0  0
0  zio_data_buf_1310720    0  0
0  zio_data_buf_14680064   0  0
0 

Re: 11.2-STABLE kernel wired memory leak

2019-02-13 Thread Eugene Grosbein
12.02.2019 23:34, Mark Johnston wrote:

> On Tue, Feb 12, 2019 at 11:14:31PM +0700, Eugene Grosbein wrote:
>> Hi!
>>
>> Long story short: 11.2-STABLE/amd64 r335757 leaked over 4600MB kernel wired 
>> memory over 81 days uptime
>> out of 8GB total RAM.
>>
>> Details follow.
>>
>> I have a workstation running Xorg, Firefox, Thunderbird, LibreOffice and 
>> occasionally VirtualBox for single VM.
>>
>> It has two identical 320GB HDDs combined with single graid-based array with 
>> "Intel"
>> on-disk format having 3 volumes:
>> - one "RAID1" volume /dev/raid/r0 occupies the first 10GB of each HDD;
>> - two "SINGLE" volumes /dev/raid/r1 and /dev/raid/r2 that utilize "tails" of 
>> HDDs (310GB each).
>>
>> /dev/raid/r0 (10GB) has MBR partitioning and two slices:
>> - /dev/raid/r0s1 (8GB) is used for swap;
>> - /dev/raid/r0s2 (2GB) is used by non-redundant ZFS pool named "os" that 
>> contains only
>> root file system (177M used) and /usr file system (340M used).
>>
>> There is also second pool (ZMIRROR) named "z" built directly on top of 
>> /dev/raid/r[12] volumes,
>> this pool contains all other file systems including /var, /home, /usr/ports, 
>> /usr/local, /usr/{src|obj} etc.
>>
>> # zpool list
>> NAME   SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAGCAP  DEDUP  HEALTH  
>> ALTROOT
>> os1,98G   520M  1,48G- -55%25%  1.00x  ONLINE  -
>> z  288G  79,5G   209G- -34%27%  1.00x  ONLINE  -
>>
>> This way I have swap outside of ZFS, boot blocks and partitioning mirrored 
>> by means of GEOM_RAID and
>> can use local console to break to single user mode to unmount all file 
>> system other than root and /usr
>> and can even export bigger ZFS pool "z". And I did that to see that ARC usage
>> (limited with vfs.zfs.arc_max="3G" in /boot/loader.conf) dropped from over 
>> 2500MB 
>> down to 44MB but "Wired" stays high. Now after I imported "z" back and 
>> booted to multiuser mode
>> top(1) shows:
>>
>> last pid: 51242;  load averages:  0.24,  0.16,  0.13  up 81+02:38:38  22:59:18
>> 104 processes: 1 running, 103 sleeping
>> CPU:  0.0% user,  0.0% nice,  0.4% system,  0.2% interrupt, 99.4% idle
>> Mem: 84M Active, 550M Inact, 4K Laundry, 4689M Wired, 2595M Free
>> ARC: 273M Total, 86M MFU, 172M MRU, 64K Anon, 1817K Header, 12M Other
>>  117M Compressed, 333M Uncompressed, 2.83:1 Ratio
>> Swap: 8192M Total, 940K Used, 8191M Free
>>
>> I have KDB and DDB in my custom kernel also. How do I debug the leak further?
>>
>> I use nvidia-driver-340-340.107 driver for GK208 [GeForce GT 710B] video 
>> card.
>> Here are outputs of "vmstat -m": 
>> http://www.grosbein.net/freebsd/leak/vmstat-m.txt
>> and "vmstat -z": http://www.grosbein.net/freebsd/leak/vmstat-z.txt
> 
> I suspect that the "leaked" memory is simply being used to cache UMA
> items.  Note that the values in the FREE column of vmstat -z output are
> quite large.  The cached items are reclaimed only when the page daemon
> wakes up to reclaim memory; if there are no memory shortages, large
> amounts of memory may accumulate in UMA caches.  In this case, the sum
> of the product of columns 2 and 5 gives a total of roughly 4GB cached.

After another day with a mostly idle system, "Wired" increased to more than
6GB out of 8GB total.
I've tried to increase the sysctl vm.v_free_min from the default 12838 (50MB)
up to 1048576 (4GB) and "Wired" dropped a bit, but it is still huge, 5060M:

last pid: 61619;  load averages:  1.05,  0.78,  0.40  up 81+22:33:09  18:53:49
119 processes: 1 running, 118 sleeping
CPU:  0.0% user,  0.0% nice, 50.0% system,  0.0% interrupt, 50.0% idle
Mem: 47M Active, 731M Inact, 4K Laundry, 5060M Wired, 2080M Free
ARC: 3049M Total, 216M MFU, 2428M MRU, 64K Anon, 80M Header, 325M Other
 2341M Compressed, 5874M Uncompressed, 2.51:1 Ratio
Swap: 8192M Total, 940K Used, 8191M Free

# sysctl vm.v_free_min
vm.v_free_min: 1048576
# sysctl vm.stats.vm.v_free_count
vm.stats.vm.v_free_count: 533232

ARC probably cached the results of nightly periodic jobs traversing the file
system trees and hit its limit (3G).

Still, I cannot understand where the other 2G (5G-3G) of wired memory has
leaked to.

USED:
#  vmstat -z | sed 's/:/,/' | awk -F, '{printf "%10s %s\n", $2*$4/1024/1024, 
$1}' | sort -k1,1 -rn | head
2763,2 abd_chunk
   196,547 zio_buf_16384
   183,711 dnode_t
   128,304 zio_buf_512
   96,3062 VNODE
   79,0076 arc_buf_hdr_t_full
  66,5 zio_data_buf_131072
   63,0772 UMA Slabs
   61,6484 256
   61,2564 dmu_buf_impl_t

FREE:
#  vmstat -z | sed 's/:/,/' | awk -F, '{printf "%10s %s\n", $2*$5/1024/1024, 
$1}' | sort -k1,1 -rn | head
   245,301 dnode_t
   209,086 zio_buf_512
   110,163 dmu_buf_impl_t
   31,2598 64
21,656 256
   10,9262 swblk
   10,6295 128
9,0379 RADIX NODE
   8,54521 L VFS Cache
7,4917 512


Re: 11.2-STABLE kernel wired memory leak

2019-02-13 Thread Andriy Gapon
On 12/02/2019 20:17, Eugene Grosbein wrote:
> 13.02.2019 1:14, Eugene Grosbein wrote:
> 
>> Use following command to see how much memory is wasted in your case:
>>
>> vmstat -z | awk -F, '{printf "%10s %s\n", $2*$5/1024/1024, $1}' | sort -k1,1 
>> -rn | head
> 
> Oops, small correction:
> 
> vmstat -z | sed 's/:/,/' | awk -F, '{printf "%10s %s\n", $2*$5/1024/1024, 
> $1}' | sort -k1,1 -rn | head

I have a much uglier but somewhat more informative "one-liner" for
post-processing vmstat -z output:

vmstat -z | tail +3 | awk -F '[:,] *' 'BEGIN { total=0; cache=0; used=0 } {u =
$2 * $4; c = $2 * $5; t = u + c; cache += c; used += u; total += t; name=$1;
gsub(" ", "_", name); print t, name, u, c} END { print total, "TOTAL", used,
cache } ' | sort -n | perl -a -p -e 'while (($j, $_) = each(@F)) { 1 while
s/^(-?\d+)(\d{3})/$1,$2/; print $_, " "} print "\n"' | column -t

This would be much nicer as a small python script.
Or, even, we could add a sort option for vmstat -z / -m.

-- 
Andriy Gapon


Re: 11.2-STABLE kernel wired memory leak

2019-02-12 Thread Lev Serebryakov
Hello Eugene,

Tuesday, February 12, 2019, 10:18:09 PM, you wrote:

>>  I'm have same problem.
>> 
>>  According to top(1) I have 29G Wired, but only 17G Total ARC (12G
>> difference! System has 32G of RAM), and this statistic shows:
>> 
>> 5487.5 zio_data_buf_524288
>>920.125 zio_data_buf_131072
>>626 zio_buf_131072
>>468 zio_data_buf_1048576
>>398.391 zio_buf_16384
>>305.464 dnode_t
>>227.989 zio_buf_512
>>  171.5 zio_data_buf_458752
>> 141.75 zio_data_buf_393216
>>116.456 dmu_buf_impl_t
>> 
>>  So, more than 6G (!) is not used in ARC, but hold by ZFS anyway.

> dnode_t and dmu_buf_impl_t are parts of ZFS too,
> so these numbers represent about 9G, not 6G.

> Do you have/had some memory pressure here? Growth of swap usage?
 I don't have memory pressure right now, but according to my previous
experience, ARC will not grow anymore even under heavy disk load (I don't
have vfs.zfs.arc_max set).

 Before the new ARC (vfs.zfs.abd_scatter_enabled) I typically had ALL
memory occupied by ARC, Wired memory was almost exactly equal to ARC, and
the ARC hit rate was higher (but I don't have exact numbers, unfortunately).

 Now I have "vfs.zfs.abd_scatter_enabled=0", but Wired is still much larger
than ARC under any disk load (it is mostly a torrent box).

-- 
Best regards,
 Lev    mailto:l...@freebsd.org



Re: 11.2-STABLE kernel wired memory leak

2019-02-12 Thread Mike Tancsa
On 2/12/2019 2:38 PM, Eugene Grosbein wrote:
>
> mbuf* numbers represent memory being wasted IMHO.
>
> In contrast to ZFS memory that can contain useful cached data,
> contents of freed mbufs cannot be re-used, right?
>
> Some amount of "free" mbufs in the zone's just fine to serve bursts of traffic
> without need to grow the zone but your numbers are way too big IMHO.
>
> Do you have memory pressure here? Growth of swap usage?
>
I had to radically cut ARC down, otherwise the box would start swapping
all sorts of processes out


% sysctl -a vfs.zfs.arc_max
vfs.zfs.arc_max: 12307744768


    ---Mike



-- 
---
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   



Re: 11.2-STABLE kernel wired memory leak

2019-02-12 Thread Eugene Grosbein
13.02.2019 2:18, Mike Tancsa wrote:

> On an nfs server, serving a few large files, my 32G box is showing
> 
> vmstat -z | sed 's/:/,/' | awk -F, '{printf "%10s %s\n",
> $2*$5/1024/1024, $1}' | sort -k1,1 -rn | head
>11014.3 abd_chunk
> 2090.5 zio_data_buf_131072
>1142.67 mbuf_jumbo_page
>1134.25 zio_buf_131072
> 355.28 mbuf_jumbo_9k
> 233.42 zio_cache
>163.099 arc_buf_hdr_t_full
>130.738 128
>97.2812 zio_buf_16384
>96.5099 UMA Slabs
> 
> 
> CPU:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> Mem: 1348K Active, 98M Inact, 3316K Laundry, 30G Wired, 1022M Free
> ARC: 11G Total, 7025M MFU, 3580M MRU, 11M Anon, 78M Header, 681M Other
>  9328M Compressed, 28G Uncompressed, 3.05:1 Ratio
> Swap: 64G Total, 13M Used, 64G Free
> 
> 
> Right now its OK, but prior to limiting ARC, I had an issue with memory
> and the disk thrashing due to swapping
> 
> pid 643 (devd), uid 0, was killed: out of swap space

The mbuf* numbers represent memory being wasted, IMHO.

In contrast to ZFS memory that can contain useful cached data,
the contents of freed mbufs cannot be re-used, right?

Some amount of "free" mbufs in the zone is just fine to serve bursts of traffic
without the need to grow the zone, but your numbers are way too big IMHO.

Do you have memory pressure here? Growth of swap usage?



Re: 11.2-STABLE kernel wired memory leak

2019-02-12 Thread Mike Tancsa
On 2/12/2019 1:03 PM, Eugene Grosbein wrote:
> It seems page daemon is broken somehow as it did not reclaim several
> gigs of wired memory
> despite of long period of vm thrashing:
>
> $ sed 's/:/,/' vmstat-z.txt | awk -F, '{printf "%10s %s\n", $2*$5/1024/1024, 
> $1}' | sort -k1,1 -rn | head
>   1892 abd_chunk
>454.629 dnode_t
> 351.35 zio_buf_512
>228.391 zio_buf_16384
>173.968 dmu_buf_impl_t
> 130.25 zio_data_buf_131072
>93.6887 VNODE
>81.6978 arc_buf_hdr_t_full
>74.9368 256
>57.4102 4096

On an nfs server, serving a few large files, my 32G box is showing

vmstat -z | sed 's/:/,/' | awk -F, '{printf "%10s %s\n",
$2*$5/1024/1024, $1}' | sort -k1,1 -rn | head
   11014.3 abd_chunk
    2090.5 zio_data_buf_131072
   1142.67 mbuf_jumbo_page
   1134.25 zio_buf_131072
    355.28 mbuf_jumbo_9k
    233.42 zio_cache
   163.099 arc_buf_hdr_t_full
   130.738 128
   97.2812 zio_buf_16384
   96.5099 UMA Slabs


CPU:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Mem: 1348K Active, 98M Inact, 3316K Laundry, 30G Wired, 1022M Free
ARC: 11G Total, 7025M MFU, 3580M MRU, 11M Anon, 78M Header, 681M Other
 9328M Compressed, 28G Uncompressed, 3.05:1 Ratio
Swap: 64G Total, 13M Used, 64G Free


Right now it's OK, but prior to limiting ARC, I had an issue with memory
and the disk thrashing due to swapping:

pid 643 (devd), uid 0, was killed: out of swap space


    ---Mike



> - 
> ---
> Mike Tancsa, tel +1 519 651 3400 x203
> Sentex Communications, m...@sentex.net
> Providing Internet services since 1994 www.sentex.net
> Cambridge, Ontario Canada   


Re: 11.2-STABLE kernel wired memory leak

2019-02-12 Thread Eugene Grosbein
13.02.2019 1:50, Lev Serebryakov wrote:

>  I'm have same problem.
> 
>  According to top(1) I have 29G Wired, but only 17G Total ARC (12G
> difference! System has 32G of RAM), and this statistic shows:
> 
> 5487.5 zio_data_buf_524288
>920.125 zio_data_buf_131072
>626 zio_buf_131072
>468 zio_data_buf_1048576
>398.391 zio_buf_16384
>305.464 dnode_t
>227.989 zio_buf_512
>  171.5 zio_data_buf_458752
> 141.75 zio_data_buf_393216
>116.456 dmu_buf_impl_t
> 
>  So, more than 6G (!) is not used in ARC, but hold by ZFS anyway.

dnode_t and dmu_buf_impl_t are parts of ZFS too,
so these numbers represent about 9G, not 6G.

Do you have/had some memory pressure here? Growth of swap usage?


Re: 11.2-STABLE kernel wired memory leak

2019-02-12 Thread Eugene Grosbein
13.02.2019 1:18, Mark Johnston wrote:

> Depending on the system's workload, it is possible for the caches to
> grow quite quickly after a reclaim.  If you are able to run kgdb on the
> kernel, you can find the time of the last reclaim by comparing the
> values of lowmem_uptime and time_uptime.

(kgdb) p time_uptime
$1 = 7019265
(kgdb) p lowmem_uptime
$2 = 7000568
(kgdb) quit
# uptime
 2:08  up 81 days,  5:48, 5 users, load averages: 0,19 0,13 0,09
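
(For reference, the arithmetic those two values imply:
time_uptime - lowmem_uptime = 7019265 - 7000568 = 18697 seconds,
so the last low-memory reclaim ran roughly 5.2 hours before this kgdb session,
even with 81 days of uptime.)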



Re: 11.2-STABLE kernel wired memory leak

2019-02-12 Thread Lev Serebryakov
On 12.02.2019 21:48, Eugene Grosbein wrote:

> I will reach the console next day only. Is it wise to use kgdb over ssh for 
> running remote system? :-)
 It works for me :-)

BTW, mine is:

(kgdb) p time_uptime
$1 = 81369
(kgdb) p lowmem_uptime
$2 = 0

 (yes, this system have been rebooted less than a day ago).

-- 
// Lev Serebryakov





Re: 11.2-STABLE kernel wired memory leak

2019-02-12 Thread Mark Johnston
On Wed, Feb 13, 2019 at 01:48:21AM +0700, Eugene Grosbein wrote:
> 13.02.2019 1:42, Mark Johnston wrote:
> 
> >> Yes, I have debugger compiled into running kernel and have console access.
> >> What commands should I use?
> > 
> > I meant kgdb(1).  If you can run that, try:
> > 
> > (kgdb) p time_uptime
> > (kgdb) p lowmem_uptime
> > 
> > If you are willing to drop the system into DDB, do so and run
> > 
> > db> x time_uptime
> > db> x lowmem_uptime
> 
> I will reach the console next day only. Is it wise to use kgdb over ssh for 
> running remote system? :-)

It should be fine.  kgdb opens /dev/(k)mem read-only.


Re: 11.2-STABLE kernel wired memory leak

2019-02-12 Thread Eugene Grosbein
13.02.2019 1:42, Mark Johnston wrote:

>> Yes, I have debugger compiled into running kernel and have console access.
>> What commands should I use?
> 
> I meant kgdb(1).  If you can run that, try:
> 
> (kgdb) p time_uptime
> (kgdb) p lowmem_uptime
> 
> If you are willing to drop the system into DDB, do so and run
> 
> db> x time_uptime
> db> x lowmem_uptime

I will only reach the console the next day. Is it wise to use kgdb over ssh on
a running remote system? :-)





Re: 11.2-STABLE kernel wired memory leak

2019-02-12 Thread Lev Serebryakov
On 12.02.2019 21:37, Eugene Grosbein wrote:

>> vfs.zfs.arc_max=1216348160
> 
> Each line shows how many megabytes is allocated but currently unused by 
> corresponding
> UMA zone (and unavailable for other consumers). Your numbers are pretty low,
> you have nothing to worry about IMHO. I have hundreds of megabytes and 
> gigabytes there.
 I have the same problem.

 According to top(1) I have 29G Wired, but only 17G Total ARC (a 12G
difference! The system has 32G of RAM), and this statistic shows:

5487.5 zio_data_buf_524288
   920.125 zio_data_buf_131072
   626 zio_buf_131072
   468 zio_data_buf_1048576
   398.391 zio_buf_16384
   305.464 dnode_t
   227.989 zio_buf_512
 171.5 zio_data_buf_458752
141.75 zio_data_buf_393216
   116.456 dmu_buf_impl_t

 So, more than 6G (!) is not used in ARC, but is held by ZFS anyway.

-- 
// Lev Serebryakov





Re: 11.2-STABLE kernel wired memory leak

2019-02-12 Thread Mark Johnston
On Wed, Feb 13, 2019 at 01:40:06AM +0700, Eugene Grosbein wrote:
> 13.02.2019 1:18, Mark Johnston wrote:
> 
> > On Wed, Feb 13, 2019 at 01:03:37AM +0700, Eugene Grosbein wrote:
> >> 12.02.2019 23:34, Mark Johnston wrote:
> >>
> >>> I suspect that the "leaked" memory is simply being used to cache UMA
> >>> items.  Note that the values in the FREE column of vmstat -z output are
> >>> quite large.  The cached items are reclaimed only when the page daemon
> >>> wakes up to reclaim memory; if there are no memory shortages, large
> >>> amounts of memory may accumulate in UMA caches.  In this case, the sum
> >>> of the product of columns 2 and 5 gives a total of roughly 4GB cached.
> >>>
>  as well as "sysctl hw": 
>  http://www.grosbein.net/freebsd/leak/sysctl-hw.txt
>  and "sysctl vm": http://www.grosbein.net/freebsd/leak/sysctl-vm.txt
> >>
> >> It seems page daemon is broken somehow as it did not reclaim several gigs 
> >> of wired memory
> >> despite of long period of vm thrashing:
> > 
> > Depending on the system's workload, it is possible for the caches to
> > grow quite quickly after a reclaim.  If you are able to run kgdb on the
> > kernel, you can find the time of the last reclaim by comparing the
> > values of lowmem_uptime and time_uptime.
> 
> Yes, I have debugger compiled into running kernel and have console access.
> What commands should I use?

I meant kgdb(1).  If you can run that, try:

(kgdb) p time_uptime
(kgdb) p lowmem_uptime

If you are willing to drop the system into DDB, do so and run

db> x time_uptime
db> x lowmem_uptime
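
(A minimal sketch of the kgdb route against the live kernel, assuming the
stock kernel path; run as root:

 # kgdb /boot/kernel/kernel /dev/mem

and issue the two p commands above at the (kgdb) prompt. Running kgdb with no
arguments also defaults to the currently running kernel.)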


Re: 11.2-STABLE kernel wired memory leak

2019-02-12 Thread Eugene Grosbein
13.02.2019 1:18, Mark Johnston wrote:

> On Wed, Feb 13, 2019 at 01:03:37AM +0700, Eugene Grosbein wrote:
>> 12.02.2019 23:34, Mark Johnston wrote:
>>
>>> I suspect that the "leaked" memory is simply being used to cache UMA
>>> items.  Note that the values in the FREE column of vmstat -z output are
>>> quite large.  The cached items are reclaimed only when the page daemon
>>> wakes up to reclaim memory; if there are no memory shortages, large
>>> amounts of memory may accumulate in UMA caches.  In this case, the sum
>>> of the product of columns 2 and 5 gives a total of roughly 4GB cached.
>>>
 as well as "sysctl hw": http://www.grosbein.net/freebsd/leak/sysctl-hw.txt
 and "sysctl vm": http://www.grosbein.net/freebsd/leak/sysctl-vm.txt
>>
>> It seems page daemon is broken somehow as it did not reclaim several gigs of 
>> wired memory
>> despite of long period of vm thrashing:
> 
> Depending on the system's workload, it is possible for the caches to
> grow quite quickly after a reclaim.  If you are able to run kgdb on the
> kernel, you can find the time of the last reclaim by comparing the
> values of lowmem_uptime and time_uptime.

Yes, I have the debugger compiled into the running kernel and have console
access. What commands should I use?




Re: 11.2-STABLE kernel wired memory leak

2019-02-12 Thread Eugene Grosbein
13.02.2019 1:29, Kurt Jaeger wrote:

> On a 8 GB 11.2p8 box doing mostly routing:
> 
>   42.3047 abd_chunk
> 40 zio_buf_131072
>  31.75 zio_data_buf_131072
>19.8901 swblk
>12.9224 RADIX NODE
>11.7344 zio_buf_16384
>10.0664 zio_data_buf_12288
>9.84375 zio_data_buf_40960
>  9.375 zio_data_buf_81920
>7.96875 zio_data_buf_98304
> 
> So, how do I understand this ? Please note that I use:
> 
> vfs.zfs.arc_max=1216348160

Each line shows how many megabytes are allocated but currently unused by the
corresponding UMA zone (and thus unavailable to other consumers). Your numbers
are pretty low; you have nothing to worry about, IMHO. I have hundreds of
megabytes and even gigabytes there.






Re: 11.2-STABLE kernel wired memory leak

2019-02-12 Thread Kurt Jaeger
Hi!

> > Use following command to see how much memory is wasted in your case:
> > 
> > vmstat -z | awk -F, '{printf "%10s %s\n", $2*$5/1024/1024, $1}' | sort 
> > -k1,1 -rn | head
> 
> Oops, small correction:
> 
> vmstat -z | sed 's/:/,/' | awk -F, '{printf "%10s %s\n", $2*$5/1024/1024, 
> $1}' | sort -k1,1 -rn | head

On a 8 GB 11.2p8 box doing mostly routing:

  42.3047 abd_chunk
40 zio_buf_131072
 31.75 zio_data_buf_131072
   19.8901 swblk
   12.9224 RADIX NODE
   11.7344 zio_buf_16384
   10.0664 zio_data_buf_12288
   9.84375 zio_data_buf_40960
 9.375 zio_data_buf_81920
   7.96875 zio_data_buf_98304

So, how do I interpret this? Please note that I use:

vfs.zfs.arc_max=1216348160

-- 
p...@opsec.eu    +49 171 3101372    One year to go !


Re: 11.2-STABLE kernel wired memory leak

2019-02-12 Thread Eugene Grosbein
13.02.2019 1:14, Eugene Grosbein wrote:

> Use following command to see how much memory is wasted in your case:
> 
> vmstat -z | awk -F, '{printf "%10s %s\n", $2*$5/1024/1024, $1}' | sort -k1,1 
> -rn | head

Oops, small correction:

vmstat -z | sed 's/:/,/' | awk -F, '{printf "%10s %s\n", $2*$5/1024/1024, $1}' 
| sort -k1,1 -rn | head



Re: 11.2-STABLE kernel wired memory leak

2019-02-12 Thread Mark Johnston
On Wed, Feb 13, 2019 at 01:03:37AM +0700, Eugene Grosbein wrote:
> 12.02.2019 23:34, Mark Johnston wrote:
> 
> > I suspect that the "leaked" memory is simply being used to cache UMA
> > items.  Note that the values in the FREE column of vmstat -z output are
> > quite large.  The cached items are reclaimed only when the page daemon
> > wakes up to reclaim memory; if there are no memory shortages, large
> > amounts of memory may accumulate in UMA caches.  In this case, the sum
> > of the product of columns 2 and 5 gives a total of roughly 4GB cached.
> > 
> >> as well as "sysctl hw": http://www.grosbein.net/freebsd/leak/sysctl-hw.txt
> >> and "sysctl vm": http://www.grosbein.net/freebsd/leak/sysctl-vm.txt
> 
> It seems page daemon is broken somehow as it did not reclaim several gigs of 
> wired memory
> despite of long period of vm thrashing:

Depending on the system's workload, it is possible for the caches to
grow quite quickly after a reclaim.  If you are able to run kgdb on the
kernel, you can find the time of the last reclaim by comparing the
values of lowmem_uptime and time_uptime.

> $ sed 's/:/,/' vmstat-z.txt | awk -F, '{printf "%10s %s\n", $2*$5/1024/1024, 
> $1}' | sort -k1,1 -rn | head
>   1892 abd_chunk
>454.629 dnode_t
> 351.35 zio_buf_512
>228.391 zio_buf_16384
>173.968 dmu_buf_impl_t
> 130.25 zio_data_buf_131072
>93.6887 VNODE
>81.6978 arc_buf_hdr_t_full
>74.9368 256
>57.4102 4096
> 


Re: 11.2-STABLE kernel wired memory leak

2019-02-12 Thread Eugene Grosbein
13.02.2019 0:57, Garrett Wollman wrote:

> In article 
> eu...@grosbein.net writes:
> 
>> Long story short: 11.2-STABLE/amd64 r335757 leaked over 4600MB kernel
>> wired memory over 81 days uptime
>> out of 8GB total RAM.
> 
> Not a whole lot of evidence yet, but anecdotally I'm seeing the same
> thing on some huge-memory NFS servers running releng/11.2.  They seem
> to run fine for a few weeks, then mysteriously start swapping
> continuously, a few hundred pages a second.  The continues for hours
> at a time, and then stops just as mysteriously.  Over time the total
> memory dedicated to ZFS ARC goes down but there's no decrease in wired
> memory.  I've tried disabling swap, but this seems to make the server
> unstable.  I have yet to find any obvious commonality (aside from the
> fact that these are all large-memory NFS servers which don't do much
> of anything else -- the only software running on them is related to
> managing and monitoring the NFS service).

I am starting to understand the issue. FreeBSD 11 has the uma(9) zone
allocator for kernel subsystems, and vmstat -z shows some statistics for UMA
zones.

When a subsystem that uses UMA frees its memory (including networking mbufs or
ZFS ARC), some kernel memory blocks are moved from the USED to the FREE
category inside the corresponding UMA zone (see vmstat -z again), but this
memory stays unavailable to other consumers, including userland applications,
until the page daemon reclaims this "FREE" memory back into the global free
pool. This part seems to be broken in 11.2-STABLE.

Use the following command to see how much memory is wasted in your case:

vmstat -z | awk -F, '{printf "%10s %s\n", $2*$5/1024/1024, $1}' | sort -k1,1 
-rn | head



Re: 11.2-STABLE kernel wired memory leak

2019-02-12 Thread Eugene Grosbein
12.02.2019 23:34, Mark Johnston wrote:

> I suspect that the "leaked" memory is simply being used to cache UMA
> items.  Note that the values in the FREE column of vmstat -z output are
> quite large.  The cached items are reclaimed only when the page daemon
> wakes up to reclaim memory; if there are no memory shortages, large
> amounts of memory may accumulate in UMA caches.  In this case, the sum
> of the product of columns 2 and 5 gives a total of roughly 4GB cached.
> 
>> as well as "sysctl hw": http://www.grosbein.net/freebsd/leak/sysctl-hw.txt
>> and "sysctl vm": http://www.grosbein.net/freebsd/leak/sysctl-vm.txt

It seems the page daemon is broken somehow, as it did not reclaim several gigs
of wired memory despite a long period of VM thrashing:

$ sed 's/:/,/' vmstat-z.txt | awk -F, '{printf "%10s %s\n", $2*$5/1024/1024, 
$1}' | sort -k1,1 -rn | head
  1892 abd_chunk
   454.629 dnode_t
351.35 zio_buf_512
   228.391 zio_buf_16384
   173.968 dmu_buf_impl_t
130.25 zio_data_buf_131072
   93.6887 VNODE
   81.6978 arc_buf_hdr_t_full
   74.9368 256
   57.4102 4096



Re: 11.2-STABLE kernel wired memory leak

2019-02-12 Thread Garrett Wollman
In article 
eu...@grosbein.net writes:

>Long story short: 11.2-STABLE/amd64 r335757 leaked over 4600MB kernel
>wired memory over 81 days uptime
>out of 8GB total RAM.

Not a whole lot of evidence yet, but anecdotally I'm seeing the same
thing on some huge-memory NFS servers running releng/11.2.  They seem
to run fine for a few weeks, then mysteriously start swapping
continuously, a few hundred pages a second.  This continues for hours
at a time, and then stops just as mysteriously.  Over time the total
memory dedicated to ZFS ARC goes down but there's no decrease in wired
memory.  I've tried disabling swap, but this seems to make the server
unstable.  I have yet to find any obvious commonality (aside from the
fact that these are all large-memory NFS servers which don't do much
of anything else -- the only software running on them is related to
managing and monitoring the NFS service).

-GAWollman



Re: 11.2-STABLE kernel wired memory leak

2019-02-12 Thread Karl Denninger

On 2/12/2019 10:49, Eugene Grosbein wrote:
> 12.02.2019 23:34, Mark Johnston wrote:
>
>> I suspect that the "leaked" memory is simply being used to cache UMA
>> items.  Note that the values in the FREE column of vmstat -z output are
>> quite large.  The cached items are reclaimed only when the page daemon
>> wakes up to reclaim memory; if there are no memory shortages, large
>> amounts of memory may accumulate in UMA caches.  In this case, the sum
>> of the product of columns 2 and 5 gives a total of roughly 4GB cached.
> Forgot to note, that before I got system to single user mode, there was heavy 
> swap usage (over 3.5GB)
> and heavy page-in/page-out, 10-20 megabytes per second and system was 
> crawling slow due to pageing.

This is a manifestation of the general issue I've had an ongoing
"discussion" about in a long-running thread on bugzilla: the
interaction between UMA, ARC and the VM system.

The short version is that the VM system does pathological things,
including paging out working set when there is a large amount of
allocated-but-unused UMA, and the means by which the ARC code is "told"
that it needs to release RAM also interacts with the same mechanisms and
exacerbates the problem.

I've basically given up on getting anything effective to deal with this
merged into the code and have my own private set of patches that I
published for a while in that thread (and which had some collaborative
development and testing) but have given up on seeing anything meaningful
put into the base code.  To the extent I need them in a given workload
and environment I simply apply them on my own and go on my way.

I don't have enough experience with 12 yet to know if the same approach
will be necessary there (that is, what if any changes got into the 12.x
code), and never ran 11.2 much, choosing to stay on 11.1 where said
patches may not have been the most-elegant means of dealing with it but
were successful.  There was also a phabricator thread on this but I
don't know what part of it, if any, got merged (it was more-elegant, in
my opinion, than what I had coded up.)  Under certain workloads here
without the patches I was experiencing "freezes" due to unnecessary
page-outs onto spinning rust that in some cases reached into
double-digit *seconds.*  With them the issue was entirely resolved.

At the core of the issue is that "something" has to be taught that,
*before* the pager starts evicting working set to swap, RAM should be
released if you have large amounts of UMA allocated to ARC but not in
use; and beyond that, if you have ARC allocated and in use but are
approaching the point where the VM is going to page working set out, you
need to come up with some meaningful way of deciding whether to release
some of the ARC rather than take the page hit -- and in virtually every
case the answer to that question is to release the RAM consumed by ARC.
Part of the issue is that UMA can be allocated for other things besides
ARC, yet you really only want to release the ARC-related UMA that is
allocated-but-unused in this instance.

The logic is IMHO pretty simple on this -- a page-out of a process that
will run again always requires TWO disk operations -- one to page it out
right now and a second at a later time to page it back in.  A released
ARC cache *MAY* (if there would have been a cache hit in the future)
require ONE disk operation (to retrieve it from disk.)

Two is always greater than one, and one is never worse than "maybe one
later"; therefore choosing two *definite* disk I/Os (or one definite I/O
now and one possible one later) instead of one *possible* disk I/O later
is always a net loss -- and thus IMHO substantial effort should be made
to avoid doing that.
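
(The same argument as a back-of-the-envelope expected-cost comparison,
restating the paragraphs above rather than adding anything new:

  cost(page out working set)  = 1 definite write now + 1 read when it runs again  = 2 I/Os
  cost(drop cached ARC data)  = P(future cache hit) * 1 read                     <= 1 I/O
)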

-- 
Karl Denninger
k...@denninger.net 
/The Market Ticker/
/[S/MIME encrypted email preferred]/




Re: 11.2-STABLE kernel wired memory leak

2019-02-12 Thread Eugene Grosbein
12.02.2019 23:49, Eugene Grosbein wrote:
> 12.02.2019 23:34, Mark Johnston wrote:
> 
>> I suspect that the "leaked" memory is simply being used to cache UMA
>> items.  Note that the values in the FREE column of vmstat -z output are
>> quite large.  The cached items are reclaimed only when the page daemon
>> wakes up to reclaim memory; if there are no memory shortages, large
>> amounts of memory may accumulate in UMA caches.  In this case, the sum
>> of the product of columns 2 and 5 gives a total of roughly 4GB cached.
> 
> Forgot to note, that before I got system to single user mode, there was heavy 
> swap usage (over 3.5GB)
> and heavy page-in/page-out, 10-20 megabytes per second and system was 
> crawling slow due to pageing.

There was a significant memory shortage due to Firefox having over 5GB RSS,
plus other processes like Xorg, parts of xfce4, etc. Still, over 4GB of Wired
memory remained.




Re: 11.2-STABLE kernel wired memory leak

2019-02-12 Thread Eugene Grosbein
12.02.2019 23:34, Mark Johnston wrote:

> I suspect that the "leaked" memory is simply being used to cache UMA
> items.  Note that the values in the FREE column of vmstat -z output are
> quite large.  The cached items are reclaimed only when the page daemon
> wakes up to reclaim memory; if there are no memory shortages, large
> amounts of memory may accumulate in UMA caches.  In this case, the sum
> of the product of columns 2 and 5 gives a total of roughly 4GB cached.

Forgot to note that before I got the system to single user mode, there was
heavy swap usage (over 3.5GB) and heavy page-in/page-out, 10-20 megabytes per
second, and the system was crawling slow due to paging.



Re: 11.2-STABLE kernel wired memory leak

2019-02-12 Thread Mark Johnston
On Tue, Feb 12, 2019 at 11:14:31PM +0700, Eugene Grosbein wrote:
> Hi!
> 
> Long story short: 11.2-STABLE/amd64 r335757 leaked over 4600MB kernel wired 
> memory over 81 days uptime
> out of 8GB total RAM.
> 
> Details follow.
> 
> I have a workstation running Xorg, Firefox, Thunderbird, LibreOffice and 
> occasionally VirtualBox for single VM.
> 
> It has two identical 320GB HDDs combined with single graid-based array with 
> "Intel"
> on-disk format having 3 volumes:
> - one "RAID1" volume /dev/raid/r0 occupies the first 10GB of each HDD;
> - two "SINGLE" volumes /dev/raid/r1 and /dev/raid/r2 that utilize "tails" of 
> HDDs (310GB each).
> 
> /dev/raid/r0 (10GB) has MBR partitioning and two slices:
> - /dev/raid/r0s1 (8GB) is used for swap;
> - /dev/raid/r0s2 (2GB) is used by non-redundant ZFS pool named "os" that 
> contains only
> root file system (177M used) and /usr file system (340M used).
> 
> There is also second pool (ZMIRROR) named "z" built directly on top of 
> /dev/raid/r[12] volumes,
> this pool contains all other file systems including /var, /home, /usr/ports, 
> /usr/local, /usr/{src|obj} etc.
> 
> # zpool list
> NAME   SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAGCAP  DEDUP  HEALTH  
> ALTROOT
> os1,98G   520M  1,48G- -55%25%  1.00x  ONLINE  -
> z  288G  79,5G   209G- -34%27%  1.00x  ONLINE  -
> 
> This way I have swap outside of ZFS, boot blocks and partitioning mirrored by 
> means of GEOM_RAID and
> can use local console to break to single user mode to unmount all file system 
> other than root and /usr
> and can even export bigger ZFS pool "z". And I did that to see that ARC usage
> (limited with vfs.zfs.arc_max="3G" in /boot/loader.conf) dropped from over 
> 2500MB 
> down to 44MB but "Wired" stays high. Now after I imported "z" back and booted 
> to multiuser mode
> top(1) shows:
> 
> last pid: 51242;  load averages:  0.24,  0.16,  0.13  up 81+02:38:38  22:59:18
> 104 processes: 1 running, 103 sleeping
> CPU:  0.0% user,  0.0% nice,  0.4% system,  0.2% interrupt, 99.4% idle
> Mem: 84M Active, 550M Inact, 4K Laundry, 4689M Wired, 2595M Free
> ARC: 273M Total, 86M MFU, 172M MRU, 64K Anon, 1817K Header, 12M Other
>  117M Compressed, 333M Uncompressed, 2.83:1 Ratio
> Swap: 8192M Total, 940K Used, 8191M Free
> 
> I have KDB and DDB in my custom kernel also. How do I debug the leak further?
> 
> I use nvidia-driver-340-340.107 driver for GK208 [GeForce GT 710B] video card.
> Here are outputs of "vmstat -m": 
> http://www.grosbein.net/freebsd/leak/vmstat-m.txt
> and "vmstat -z": http://www.grosbein.net/freebsd/leak/vmstat-z.txt

I suspect that the "leaked" memory is simply being used to cache UMA
items.  Note that the values in the FREE column of vmstat -z output are
quite large.  The cached items are reclaimed only when the page daemon
wakes up to reclaim memory; if there are no memory shortages, large
amounts of memory may accumulate in UMA caches.  In this case, the sum
of the product of columns 2 and 5 gives a total of roughly 4GB cached.
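
(For anyone who wants that total computed directly, a sketch using the same
column layout as the per-zone one-liners elsewhere in this thread; the header
lines contribute nothing to the sum:

vmstat -z | sed 's/:/,/' | awk -F, '{sum += $2*$5} END {printf "%.1f MB cached in UMA free lists\n", sum/1024/1024}'
)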

> as well as "sysctl hw": http://www.grosbein.net/freebsd/leak/sysctl-hw.txt
> and "sysctl vm": http://www.grosbein.net/freebsd/leak/sysctl-vm.txt