Re: numa involved in instability and swap usage despite RAM free?

2018-06-26 Thread Shane Ambler
On 27/06/2018 07:52, Steve Kargl wrote:
> On Tue, Jun 26, 2018 at 02:39:27PM -0700, Adrian Chadd wrote:
>> On Mon, 25 Jun 2018 at 11:23, Steve Kargl
>>  wrote:
>>>
>>> On Sun, Jun 24, 2018 at 12:03:29PM +0200, Alexander Leidinger wrote:

 I don't have hard evidence, but there is enough "smell" to open up a
 discussion...

 Short:
 Can it be that enabling numa in the kernel is the reason why some
 people see instability with zfs and usage of swap while a lot of free
 RAM is available?
>>>
>>> Interesting observation.  I do have NUMA in my kernel, and swap
>>> seems to be used instead of recycling/freeing inactive memory.
>>> Top shows
>>>
>>> Mem: 506M Active, 27G Inact, 98M Laundry, 2735M Wired, 1474M Buf, 1536M Free
>>> Swap: 16G Total, 120M Used, 16G Free

From someone who has had memory issues since 10.1 (bug 194654), I have
recently realised something that seems to make sense of what I see.

The arc_max setting is a limit on the ZFS ARC, and that RAM gets wired
to prevent it being swapped out; this makes sense.

vm.max_wired is a value I had thought was ignored, but I now see that
it and arc_max are two wired-memory limits that are not connected to
each other. max_wired appears to default to 30% of kmem_size.

Memory wired under both limits is counted together in
vm.stats.vm.v_wire_count, which is the wired value shown by top. This
appears to be how I can see 9G wired when max_wired is at 5G.
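
For anyone wanting to compare the same numbers on their own machine,
something like this shows the pieces involved (sysctl names as on
stable/11; v_wire_count and max_wired are counted in pages while
physmem and arc_max are in bytes, if I read them right):

sysctl hw.physmem                # physical RAM, in bytes
sysctl hw.pagesize               # multiply page counts by this
sysctl vfs.zfs.arc_max           # ZFS ARC cap, in bytes
sysctl vm.max_wired              # wiring limit, in pages
sysctl vm.stats.vm.v_wire_count  # total wired as shown by top, in pages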

The implication is that the sum (arc_max + max_wired) can be set to
more than the physically installed RAM. I can verify that with 8G
installed, if the two values add up to more than 7G, you are left with
no choice but a hard reset. Since upgrading to 16G I have been more
vigilant and not allowed more than 10G to be wired, so I haven't had
that problem in a year and a half.

With arc_max usually defaulting to RAM minus 1G, and max_wired here at
5G, it is easy to see that the current defaults are dangerous.

I have not seen max_wired mentioned in relation to ZFS, but it seems it
should be taken into account when setting arc_max, to prevent
over-wiring RAM.
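
One way to do that is to pin both limits explicitly so their sum stays
well below physical RAM. A minimal sketch, with sizes purely
illustrative for a 16G machine rather than a recommendation:

In /boot/loader.conf (the ARC cap is a boot-time tunable):

vfs.zfs.arc_max="8G"

In /etc/sysctl.conf (max_wired is a runtime sysctl, in pages):

# 1048576 x 4K pages = 4G
vm.max_wired=1048576

That keeps arc_max + max_wired at 12G on a 16G machine, leaving
headroom for everything else that wires memory.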

Close to three weeks ago I applied review D7538 to my everyday desktop
running stable/11. Until a few days ago I had no swap usage at all; it
now stands at 9M. In the last few years of monitoring wired usage while
trying to find a solution, I have never seen less than 1G of swap usage
after an hour of uptime. If nothing else, D7538 makes the ARC more
willing to be released.


-- 
FreeBSD - the place to B...Storing Data

Shane Ambler



Re: numa involved in instability and swap usage despite RAM free?

2018-06-26 Thread Steve Kargl
On Tue, Jun 26, 2018 at 02:39:27PM -0700, Adrian Chadd wrote:
> On Mon, 25 Jun 2018 at 11:23, Steve Kargl
>  wrote:
> >
> > On Sun, Jun 24, 2018 at 12:03:29PM +0200, Alexander Leidinger wrote:
> > >
> > > I don't have hard evidence, but there is enough "smell" to open up a
> > > discussion...
> > >
> > > Short:
> > > Can it be that enabling numa in the kernel is the reason why some
> > > people see instability with zfs and usage of swap while a lot of free
> > > RAM is available?
> >
> > Interesting observation.  I do have NUMA in my kernel, and swap
> > seems to be used instead of recycling/freeing inactive memory.
> > Top shows
> >
> > Mem: 506M Active, 27G Inact, 98M Laundry, 2735M Wired, 1474M Buf, 1536M Free
> > Swap: 16G Total, 120M Used, 16G Free
> >
> > Perhaps I don't understand what is meant by inactive memory.  I
> > thought that it means the memory is still available in the buffer
> > cache, but nothing is currently using what is there.
> >
> 
> Aren't there now per-domain VM counters you can query via sysctl?
> Maybe they'd help in diagnosing what's going on.
> 

I upgraded to r335642 yesterday.  I haven't seen the swapping
problem yet, although I've tried to force it.  There are 158
sysctl knobs whose names contain the string "vm".  Do you have a
pointer to any particular one to monitor?

-- 
Steve


Re: numa involved in instability and swap usage despite RAM free?

2018-06-26 Thread Adrian Chadd
Hi,

Aren't there now per-domain VM counters you can query via sysctl?
Maybe they'd help in diagnosing what's going on.
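
If I remember the layout right, on recent head they live under the
vm.domain tree, e.g.:

sysctl vm.domain
sysctl vm.domain.0.stats.free_count vm.domain.1.stats.free_count

(Names from memory, so treat them as a starting point rather than
gospel.)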



-adrian

On Mon, 25 Jun 2018 at 11:23, Steve Kargl
 wrote:
>
> On Sun, Jun 24, 2018 at 12:03:29PM +0200, Alexander Leidinger wrote:
> >
> > I don't have hard evidence, but there is enough "smell" to open up a
> > discussion...
> >
> > Short:
> > Can it be that enabling numa in the kernel is the reason why some
> > people see instability with zfs and usage of swap while a lot of free
> > RAM is available?
>
> Interesting observation.  I do have NUMA in my kernel, and swap
> seems to be used instead of recycling/freeing inactive memory.
> Top shows
>
> Mem: 506M Active, 27G Inact, 98M Laundry, 2735M Wired, 1474M Buf, 1536M Free
> Swap: 16G Total, 120M Used, 16G Free
>
> Perhaps I don't understand what is meant by inactive memory.  I
> thought that it means the memory is still available in the buffer
> cache, but nothing is currently using what is there.
>
> --
> Steve


Re: numa involved in instability and swap usage despite RAM free?

2018-06-25 Thread Steve Kargl
On Sun, Jun 24, 2018 at 12:03:29PM +0200, Alexander Leidinger wrote:
> 
> I don't have hard evidence, but there is enough "smell" to open up a  
> discussion...
> 
> Short:
> Can it be that enabling numa in the kernel is the reason why some  
> people see instability with zfs and usage of swap while a lot of free  
> RAM is available?

Interesting observation.  I do have NUMA in my kernel, and swap
seems to be used instead of recycling/freeing inactive memory.
Top shows

Mem: 506M Active, 27G Inact, 98M Laundry, 2735M Wired, 1474M Buf, 1536M Free
Swap: 16G Total, 120M Used, 16G Free

Perhaps I don't understand what is meant by inactive memory.  I
thought that it means the memory is still available in the buffer
cache, but nothing is currently using what is there.

--  
Steve


Re: numa involved in instability and swap usage despite RAM free?

2018-06-24 Thread Mark Millard
Alexander Leidinger <Alexander at leidinger.net> wrote on
Sun Jun 24 10:03:49 UTC 2018:

> Short:
> Can it be that enabling numa in the kernel is the reason why some  
> people see instability with zfs and usage of swap while a lot of free  
> RAM is available?

[It will likely be a few months before I again have access to the
environment these notes are based on. It has been about a month
since I last had access.]

On an AMD Ryzen Threadripper 1950X (16 cores, 2 HW threads per core)
I enabled:

options NUMA
options MAXMEMDOM=2

in fairly recent times. This is a UFS context, not a ZFS one. I'd
not been explicitly controlling how things run (so using defaults).
This is head with debugging disabled (via including GENERIC and
overriding).
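
For what it's worth, that the kernel actually split memory into the
expected number of domains can be checked at runtime (sysctl name as
on head around that time, if I have it right):

# should report 2 with options MAXMEMDOM=2 and both nodes populated
sysctl vm.ndomains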

I did not have the swap usage problem when doing many buildworld/
buildkernel runs (self-hosted and cross builds for several targets),
nor when doing a poudriere bulk -a (with ALLOW_MAKE_JOBS=yes). This
was a FreeBSD native boot context at the time. (I usually run the
same drives under Hyper-V but have not seen the problem there
either.) For native FreeBSD I used -j32 (for buildworld/buildkernel
but also for the bulk -a) and under Hyper-V I used -j28. The machine
has 96 GiBytes of ECC RAM total (48 GiBytes per NUMA node).

I'm not sure how common it is for NUMA to be enabled, nor how common
various MAXMEMDOM settings are. I'd not be surprised if various
folks reporting problems had not explicitly enabled NUMA or set an
explicit MAXMEMDOM figure.

It may be that they all have ZFS in common in fairly recent times.

(I'm ignoring examples of long-latency I/O on the same device as
in-use swap partitions: that leads to Out Of Memory process killing
without the swap being mostly used. Some reported swap problems
involve this sort of issue on small systems unlikely to be using
ZFS.)


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)
