Re: poudriere, swap full and top says memory is free ?

2019-09-14 Thread Don Lewis
On 14 Sep, John-Mark Gurney wrote:
> Kurt Jaeger wrote this message on Sat, Sep 14, 2019 at 19:38 +0200:
>> - a poudriere build
>> - of a list of ports
>> - on 12.0-RELEASE-p10
>> - on a 4 core+4 hyperthreads CPU, an Intel(R) Xeon(R) CPU E3-1230 v6
>>   @ 3.50GHz
>> - with 32 GB RAM
>> - zpool with 2x 500 GB SSDs as a mirror
>> 
>> and right now, this can be seen:
>> 
>> last pid: 90922;  load averages:  5.02,  5.14,  5.73    up 0+03:53:08  19:31:05
>> 82 processes:  6 running, 76 sleeping
>> CPU: 60.6% user,  0.0% nice,  2.1% system,  0.0% interrupt, 37.3% idle
>> Mem: 4598M Active, 2854M Inact, 11G Laundry, 6409M Wired, 6375M Free
>> ARC: 3850M Total, 1721M MFU, 2090M MRU, 665K Anon, 19M Header, 19M Other
>>  3406M Compressed, 3942M Uncompressed, 1.16:1 Ratio
>> Swap: 18G Total, 18G Used, 396K Free, 99% Inuse, 68K In
>> 
>> So: Swap is full, approx. 6 GB memory is reported as free.
>> 
>> This is surprising. Can I somehow tune this in any way, so that
>> the memory available is used for the build ? Or is the problem somewhere
>> else ?
> 
> Are you sure that this hasn't just recently completed a large link of
> something like Chromium?  There are known to be compiles that can take
> many GB's of memory and if they recently exited, there hasn't been time
> to swap stuff back in...  or is this the steady state over the entire
> compile?

This is sort of an odd case.  I suspect that swap filled and then a
process that was using a large amount of memory but no swap exited or
was killed.  That freed a bunch of memory, but no swap.

I'm pretty sure that when a memory page is paged back in from swap, the
copy in swap is retained and not deallocated.  Under memory pressure,
that allows the page to be stolen without having to write it back out to
swap again, unless it is re-dirtied in the meantime.
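If that is what happened here, it should be visible from userland; a
couple of stock FreeBSD commands for watching swap-device usage next to
the page-queue counters behind top's figures (nothing poudriere-specific
is assumed):

```shell
# Per-device swap usage, human-readable (swapinfo is in the base system).
swapinfo -h
# Page-queue counts behind top's Free/Laundry lines (in pages, not bytes).
sysctl vm.stats.vm.v_free_count vm.stats.vm.v_laundry_count
```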

___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: spurious out of swap kills

2019-09-14 Thread Don Lewis
On 13 Sep, Mark Millard wrote:
> Don Lewis truckman at FreeBSD.org wrote on
> Thu Sep 12 23:00:19 UTC 2019 :
> 
> . . .
>> Nevertheless, I see these errors,
>> with rustc being the usual victim:
>> 
>> Sep 11 23:21:43 zipper kernel: pid 16581 (rustc), jid 43, uid 65534, was 
>> killed: out of swap space
>> Sep 12 02:48:23 zipper kernel: pid 1209 (rustc), jid 62, uid 65534, was 
>> killed: out of swap space
> . . .
> 
> Unfortunately, the wording of this type of message is a misnomer for
> what drives the kills: it is actually driven by being unable to gain
> more free memory but FreeBSD will not swap-out processes that stay
> runnable (or are running), only ones that are waiting. Even a single
> process that stays runnable and keeps lots of RAM in the active
> category can lead to kills when swap is unused or little used. So the
> kill-behavior is very workload dependent.
> 
> Real "out of swap" conditions tend to also have messages
> similar to:
> 
> Aug  5 17:54:01 sentinel kernel: swap_pager_getswapspace(32): failed
> 
> If you are not seeing such messages, then it is likely that
> the amount of swap space still free is not the actual thing
> driving the kills.
> 
> Are you seeing "swap_pager_getswapspace(32): failed" messages?
> 
> (It used to be that the system simply leaves the dirty pages in
> memory when a swap_pager_getswapspace failed message is produced.
> Of itself, it did not cause a kill. I do not know about now.)

I'm only getting the "out of swap" error.  Yes, it is misleading because
there are still tens of GB of free swap.



Re: spurious out of swap kills

2019-09-14 Thread Don Lewis
On 13 Sep, Konstantin Belousov wrote:
> On Thu, Sep 12, 2019 at 05:42:00PM -0700, Don Lewis wrote:
>> On 12 Sep, Mark Johnston wrote:
>> > On Thu, Sep 12, 2019 at 04:00:17PM -0700, Don Lewis wrote:
>> >> My poudriere machine is running 13.0-CURRENT and gets updated to the
>> >> latest version of -CURRENT periodically.  At least in the last week or
>> >> so, I've been seeing occasional port build failures when building my
>> >> default set of ports, and I finally had some time to do some
>> >> investigation.
>> >> 
>> >> It's a 16-thread Ryzen machine, with 64 GB of RAM and 40 GB of swap.
>> >> Poudriere is configured with
>> >>   USE_TMPFS="wrkdir data localbase"
>> >> and I have
>> >>   .if ${.CURDIR:M*/www/chromium}
>> >>   MAKE_JOBS_NUMBER=16
>> >>   .else
>> >>   MAKE_JOBS_NUMBER=7
>> >>   .endif
>> >> in /usr/local/etc/poudriere.d/make.conf, since this gives me the best
>> >> overall build time for my set of ports.  This hits memory pretty hard,
>> >> especially when chromium, firefox, libreoffice, and both versions of
>> >> openoffice are all building at the same time.  During this time, the
>> >> amount of space consumed by tmpfs for /wrkdir gets large when building
>> >> these large ports.  There is not enough RAM to hold it all, so some of
>> >> the older data spills over to swap.  Swap usage peaks at about 10 GB,
>> >> leaving about 30 GB of free swap.  Nevertheless, I see these errors,
>> >> with rustc being the usual victim:
>> >> 
>> >> Sep 11 23:21:43 zipper kernel: pid 16581 (rustc), jid 43, uid 65534, was 
>> >> killed: out of swap space
>> >> Sep 12 02:48:23 zipper kernel: pid 1209 (rustc), jid 62, uid 65534, was 
>> >> killed: out of swap space
>> >> 
>> >> Top shows the size of rustc being about 2 GB, so I doubt that it
>> >> suddenly needs an additional 30 GB of swap.
>> >> 
>> >> I'm wondering if there might be a transient kmem shortage that is
>> >> causing a malloc(..., M_NOWAIT) failure in the swap allocation path
>> >> that is the cause of the problem.
>> > 
>> > Perhaps this is a consequence of r351114?  To confirm this, you might
>> > try increasing the value of vm.pfault_oom_wait to a larger value, like
>> > 20 or 30, and see if the OOM kills still occur.
>> 
>> I wonder if increasing vm.pfault_oom_attempts might also be a good idea.
> If you are sure that you cannot exhaust your swap space, set
> attempts to -1 to disable this mechanism.

I had success just by increasing vm.pfault_oom_attempts from 3 to 10.
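For reference, the knobs discussed here can be persisted across reboots;
a minimal sketch of the /etc/sysctl.conf entries, using the value from
this thread (the 10 s default for vm.pfault_oom_wait is assumed):

```shell
# /etc/sysctl.conf (FreeBSD): the page-fault OOM grace period is roughly
# vm.pfault_oom_wait * vm.pfault_oom_attempts (default 10 s * 3 = 30 s;
# with attempts=10 it becomes ~100 s).
vm.pfault_oom_attempts=10
# Or, per Konstantin's suggestion, disable the mechanism entirely when
# swap cannot be exhausted:
# vm.pfault_oom_attempts=-1
```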

> Basically, page fault handler waits for vm.pfault_oom_wait *
> vm.pfault_oom_attempts for a page allocation before killing the process.
> Default is 30 secs, and if you cannot get a page for 30 secs, there is
> something very wrong with the machine.

There is nothing really wrong with the machine.  The load is just high.
Probably pretty bad for interactivity, but throughput is just fine, with
CPU %idle pretty much pegged at zero the whole time.

I kept an eye on the machine for a while during a run with the new
tuning.  Most of the time, free memory bounced between 2 and 4 GB, with
little page out activity.  There were about 60 running processes, most
of which were writing to 16 tmpfs filesystems.  Sometimes free memory
dropped into the 1 to 2 GB range and pageouts spiked.  This condition
could persist for 30 seconds or more, which is probably the reason for
the OOM kills with the default tuning.  I sometimes saw free memory drop
below 1 GB.  The lowest I saw was 470 MB.  I'm guessing that this code
fails page allocation when free memory is below some threshold to avoid
potential deadlocks.

Swap on this machine consists of a gmirror pair of partitions on a pair
of 1 TB WD Green drives that are now in their third computer.  The
remainder of each drive is used for the mirrored vdev of the system
zpool.  They were not terribly fast even when new, but they are mostly
fast enough to keep all the CPU cores busy, except during poudriere
startup and wind-down when there isn't enough work to go around.  I
could spend money on faster storage, but it wouldn't decrease poudriere
run time much.  It is probably close enough to the limit that I would
need faster storage if I swapped the Ryzen for a Threadripper.



Re: spurious out of swap kills

2019-09-14 Thread Mark Millard
Konstantin Belousov kostikbel at gmail.com wrote on
Fri Sep 13 05:53:41 UTC 2019 :

> Basically, page fault handler waits for vm.pfault_oom_wait *
> vm.pfault_oom_attempts for a page allocation before killing the process.
> Default is 30 secs, and if you cannot get a page for 30 secs, there is
> something very wrong with the machine.

The following was not for something like a Ryzen, but for
an armv7 board using a USB device for the file system and
swap/paging partition. Still it may be a suggestive
example of writing out a large amount of laundry.

There was an exchange I had with Warner L. which implied that long
waits in the queue are easy to get when trying to write out the
laundry (or other such) in low-end contexts. I extract some of it
below.

dT: 1.006s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
   56    312      0      0    0.0    312  19985  142.6      0      0    0.0   99.6| da0

Note: L(q) could be a lot bigger than 56 but I work with the
example figures that I used at the time and that Warner commented on.
The 142.6 ms/w includes time waiting in the queue and was vastly
more stable than the L(q) figures.
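For anyone reproducing this, output in the shape shown above comes from
gstat(8); a sketch of the invocation (the device name is an assumption
from the sample):

```shell
# Watch queue length (L(q)), per-op write latency (ms/w) and %busy for
# da0 only, refreshing once per second.  -f takes a regex matched
# against provider names.
gstat -f '^da0$' -I 1s
```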

Warner wrote, in part:

QUOTE
142.6ms/write is the average of the time that the operations that completed 
during the polling interval took to complete. There's no estimating here.

So, at 6 or 7 per second for the operation to complete, coupled with a parallel 
factor of 1 (typical for low end junk flash), we wind up with 56 operations in 
the queue taking 8-10s to complete.
END QUOTE
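Warner's arithmetic checks out; a quick sketch using only the figures
quoted above (56 queued operations, 6-7 completions per second):

```shell
# Queue drain time = queue length / completion rate.
awk 'BEGIN {
  qlen = 56                      # L(q) from the gstat sample
  for (rate = 6; rate <= 7; rate++)
    printf "at %d ops/s: %.1f s to drain\n", rate, qlen / rate
}'
```

This prints roughly 9.3 s and 8.0 s, matching the 8-10 s estimate.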

Things went on from there but part of it was based on a
reporting patch that Mark Johnston had provided.

Me:
It appears to me that, compared to an observed capacity of
roughly 20 MiBytes/sec for writes, large amounts of bytes are
being queued up to be written in a short time, and it simply
takes a while for the backlog to be finished.


Warner:
Yes. That matches my expectation as well. In other devices, I've found that I 
needed to rate-limit things to more like 50-75% of the max value to keep 
variance in performance low. It's the whole reason I wrote the CAM I/O 
scheduler.

Me:
The following is from multiple such runs, several manually
stopped but some killed because of sustained low free
memory. I had left vm.pageout_oom_seq=12 in place for this,
making the kills easier to get than the 120 figure would. It
does not take very long generally for some sort of message to
show up.

(Added Note: 65s and 39s were at the large end of what I reported
at the time.)

. . .
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 164064, size: 12288
waited 65s for async swap write
waited 65s for swap buffer
waited 65s for async swap write
waited 65s for async swap write
waited 65s for async swap write
v_free_count: 955, v_inactive_count: 1
Aug 20 06:11:49 pine64 kernel: pid 1047 (stress), uid 0, was killed: out of 
swap space
waited 5s for async swap write
waited 5s for swap buffer
waited 5s for async swap write
waited 5s for async swap write
waited 5s for async swap write
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 314021, size: 12288
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 314084, size: 32768
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 314856, size: 32768
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 314638, size: 131072
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 312518, size: 4096
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 312416, size: 16384
waited 39s for async swap write
waited 39s for swap buffer
waited 39s for async swap write
waited 39s for async swap write
waited 39s for async swap write
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 314802, size: 24576
. . .

Warner:
These numbers are consistent with the theory that the swap device becomes 
overwhelmed, spiking latency and causing crappy down-stream effects. You can 
use the I/O scheduler to limit the write rates at the low end. You might also 
be able to schedule a lower write queue depth at the top end as well, but I've 
not seen good ways to do that.


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



Re: poudriere, swap full and top says memory is free ?

2019-09-14 Thread Kurt Jaeger
Hi!

> > Mem: 4598M Active, 2854M Inact, 11G Laundry, 6409M Wired, 6375M Free
> > ARC: 3850M Total, 1721M MFU, 2090M MRU, 665K Anon, 19M Header, 19M Other
> >  3406M Compressed, 3942M Uncompressed, 1.16:1 Ratio
> > Swap: 18G Total, 18G Used, 396K Free, 99% Inuse, 68K In
> > 
> > So: Swap is full, approx. 6 GB memory is reported as free.

> > This is surprising. Can I somehow tune this in any way, so that
> > the memory available is used for the build ? Or is the problem somewhere
> > else ?
> 
> Are you sure that this hasn't just recently completed a large link of
> something like Chromium?

Yes, because I plot memory/swap/etc using nagios. It's not only
a spike.

> There are known to be compiles that can take
> many GB's of memory and if they recently exited, there hasn't been time
> to swap stuff back in...  or is this the steady state over the entire
> compile?

Building a few ports (firefox, libreoffice etc) takes some time,
so it has been stable during that phase.

-- 
p...@opsec.eu            +49 171 3101372            One year to go !


Re: head -r352274 buildkernel targetting armv7 failure: am335x/am335x_dmtpps.c:304:3: error: implicit declaration of function 'spinlock_enter' is invalid in C99 [-Werror,-Wimplicit-function-declaration]

2019-09-14 Thread Mark Millard



On 2019-Sep-14, at 11:21, Ian Lepore  wrote:

> On Sat, 2019-09-14 at 11:05 -0700, Mark Millard via freebsd-arm wrote:
>> After updating my amd64 context to head -r352274,
>> attempting an amd64->armv7 cross buildworld buildkernel
>> ended up failing with:
>> 
>> 
>> --- am335x_dmtpps.o ---
>> /usr/src/sys/arm/ti/am335x/am335x_dmtpps.c:304:3: error: implicit declaration of function 'spinlock_enter' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
>>         mtx_lock_spin(&sc->pps_mtx);
>>         ^
>> (...shortened...)
>> . . .
>> 
>> (spinlock_enter was not the only example.)
>> 
>> 
> 
> My bad, I forgot to include  when I switched the code to
> spinlocks.  Should be fixed by r352333.

Thanks.

It is interesting that:

https://ci.freebsd.org/job/FreeBSD-head-armv7-build/6042/

shows a successful build of -r352274 (the last before
-r352275 broke both arm and aarch64). Prior builds also
were successful.

I've manually applied your update to -r352274 and am rebuilding
from scratch.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



Re: poudriere, swap full and top says memory is free ?

2019-09-14 Thread John-Mark Gurney
Kurt Jaeger wrote this message on Sat, Sep 14, 2019 at 19:38 +0200:
> - a poudriere build
> - of a list of ports
> - on 12.0-RELEASE-p10
> - on a 4 core+4 hyperthreads CPU, an Intel(R) Xeon(R) CPU E3-1230 v6
>   @ 3.50GHz
> - with 32 GB RAM
> - zpool with 2x 500 GB SSDs as a mirror
> 
> and right now, this can be seen:
> 
> last pid: 90922;  load averages:  5.02,  5.14,  5.73    up 0+03:53:08  19:31:05
> 82 processes:  6 running, 76 sleeping
> CPU: 60.6% user,  0.0% nice,  2.1% system,  0.0% interrupt, 37.3% idle
> Mem: 4598M Active, 2854M Inact, 11G Laundry, 6409M Wired, 6375M Free
> ARC: 3850M Total, 1721M MFU, 2090M MRU, 665K Anon, 19M Header, 19M Other
>  3406M Compressed, 3942M Uncompressed, 1.16:1 Ratio
> Swap: 18G Total, 18G Used, 396K Free, 99% Inuse, 68K In
> 
> So: Swap is full, approx. 6 GB memory is reported as free.
> 
> This is surprising. Can I somehow tune this in any way, so that
> the memory available is used for the build ? Or is the problem somewhere
> else ?

Are you sure that this hasn't just recently completed a large link of
something like Chromium?  There are known to be compiles that can take
many GB's of memory and if they recently exited, there hasn't been time
to swap stuff back in...  or is this the steady state over the entire
compile?

-- 
  John-Mark Gurney  Voice: +1 415 225 5579

 "All that I will do, has been done, All that I have, has not."


Re: head -r352274 buildkernel targetting armv7 failure: am335x/am335x_dmtpps.c:304:3: error: implicit declaration of function 'spinlock_enter' is invalid in C99 [-Werror,-Wimplicit-function-declaration]

2019-09-14 Thread Ian Lepore
On Sat, 2019-09-14 at 11:05 -0700, Mark Millard via freebsd-arm wrote:
> After updating my amd64 context to head -r352274,
> attempting an amd64->armv7 cross buildworld buildkernel
> ended up failing with:
> 
> 
> --- am335x_dmtpps.o ---
> /usr/src/sys/arm/ti/am335x/am335x_dmtpps.c:304:3: error: implicit declaration of function 'spinlock_enter' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
>         mtx_lock_spin(&sc->pps_mtx);
>         ^
> /usr/src/sys/sys/mutex.h:383:26: note: expanded from macro
> 'mtx_lock_spin'
> #define mtx_lock_spin(m)        mtx_lock_spin_flags((m), 0)
> ^
> /usr/src/sys/sys/mutex.h:452:2: note: expanded from macro
> 'mtx_lock_spin_flags'
> mtx_lock_spin_flags_((m), (opts), LOCK_FILE, LOCK_LINE)
> ^
> /usr/src/sys/sys/mutex.h:429:2: note: expanded from macro
> 'mtx_lock_spin_flags_'
> __mtx_lock_spin((m), curthread, (opts), (file), (line))
> ^
> /usr/src/sys/sys/mutex.h:258:2: note: expanded from macro
> '__mtx_lock_spin'
> spinlock_enter();
>\
> ^
> /usr/src/sys/arm/ti/am335x/am335x_dmtpps.c:304:3: error: this function declaration is not a prototype [-Werror,-Wstrict-prototypes]
> /usr/src/sys/sys/mutex.h:383:26: note: expanded from macro
> 'mtx_lock_spin'
> #define mtx_lock_spin(m)        mtx_lock_spin_flags((m), 0)
> ^
> /usr/src/sys/sys/mutex.h:452:2: note: expanded from macro
> 'mtx_lock_spin_flags'
> mtx_lock_spin_flags_((m), (opts), LOCK_FILE, LOCK_LINE)
> ^
> /usr/src/sys/sys/mutex.h:429:2: note: expanded from macro
> 'mtx_lock_spin_flags_'
> __mtx_lock_spin((m), curthread, (opts), (file), (line))
> ^
> /usr/src/sys/sys/mutex.h:258:2: note: expanded from macro
> '__mtx_lock_spin'
> spinlock_enter();
>\
> ^
> . . .
> 
> (spinlock_enter was not the only example.)
> 
> 

My bad, I forgot to include  when I switched the code to
spinlocks.  Should be fixed by r352333.

-- Ian



head -r352274 buildkernel targetting armv7 failure: am335x/am335x_dmtpps.c:304:3: error: implicit declaration of function 'spinlock_enter' is invalid in C99 [-Werror,-Wimplicit-function-declaration]

2019-09-14 Thread Mark Millard
After updating my amd64 context to head -r352274,
attempting an amd64->armv7 cross buildworld buildkernel
ended up failing with:


--- am335x_dmtpps.o ---
/usr/src/sys/arm/ti/am335x/am335x_dmtpps.c:304:3: error: implicit declaration of function 'spinlock_enter' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
        mtx_lock_spin(&sc->pps_mtx);
        ^
/usr/src/sys/sys/mutex.h:383:26: note: expanded from macro 'mtx_lock_spin'
#define mtx_lock_spin(m)        mtx_lock_spin_flags((m), 0)
^
/usr/src/sys/sys/mutex.h:452:2: note: expanded from macro 'mtx_lock_spin_flags'
mtx_lock_spin_flags_((m), (opts), LOCK_FILE, LOCK_LINE)
^
/usr/src/sys/sys/mutex.h:429:2: note: expanded from macro 'mtx_lock_spin_flags_'
__mtx_lock_spin((m), curthread, (opts), (file), (line))
^
/usr/src/sys/sys/mutex.h:258:2: note: expanded from macro '__mtx_lock_spin'
spinlock_enter();   \
^
/usr/src/sys/arm/ti/am335x/am335x_dmtpps.c:304:3: error: this function declaration is not a prototype [-Werror,-Wstrict-prototypes]
/usr/src/sys/sys/mutex.h:383:26: note: expanded from macro 'mtx_lock_spin'
#define mtx_lock_spin(m)        mtx_lock_spin_flags((m), 0)
^
/usr/src/sys/sys/mutex.h:452:2: note: expanded from macro 'mtx_lock_spin_flags'
mtx_lock_spin_flags_((m), (opts), LOCK_FILE, LOCK_LINE)
^
/usr/src/sys/sys/mutex.h:429:2: note: expanded from macro 'mtx_lock_spin_flags_'
__mtx_lock_spin((m), curthread, (opts), (file), (line))
^
/usr/src/sys/sys/mutex.h:258:2: note: expanded from macro '__mtx_lock_spin'
spinlock_enter();   \
^
. . .

(spinlock_enter was not the only example.)

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



poudriere, swap full and top says memory is free ?

2019-09-14 Thread Kurt Jaeger
Hi!

I'm running

- a poudriere build
- of a list of ports
- on 12.0-RELEASE-p10
- on a 4 core+4 hyperthreads CPU, an Intel(R) Xeon(R) CPU E3-1230 v6
  @ 3.50GHz
- with 32 GB RAM
- zpool with 2x 500 GB SSDs as a mirror

and right now, this can be seen:

last pid: 90922;  load averages:  5.02,  5.14,  5.73    up 0+03:53:08  19:31:05
82 processes:  6 running, 76 sleeping
CPU: 60.6% user,  0.0% nice,  2.1% system,  0.0% interrupt, 37.3% idle
Mem: 4598M Active, 2854M Inact, 11G Laundry, 6409M Wired, 6375M Free
ARC: 3850M Total, 1721M MFU, 2090M MRU, 665K Anon, 19M Header, 19M Other
 3406M Compressed, 3942M Uncompressed, 1.16:1 Ratio
Swap: 18G Total, 18G Used, 396K Free, 99% Inuse, 68K In

So: Swap is full, approx. 6 GB memory is reported as free.

This is surprising. Can I somehow tune this in any way, so that
the memory available is used for the build ? Or is the problem somewhere
else ?

Running similar builds on 12.0 without patches reported
swap_pager_getswapspace(24): failed
messages.

-- 
p...@opsec.eu            +49 171 3101372            One year to go !


Re: spurious out of swap kills

2019-09-14 Thread bob prohaska
On Fri, Sep 13, 2019 at 10:59:58PM -0700, Mark Millard wrote:
> bob prohaska fbsd at www.zefox.net wrote on
> Fri Sep 13 16:24:57 UTC 2019 :
> 
> > Not sure this is relevant, but in compiling chromium on a Pi3 with 6 GB
> > of swap the job completed successfully some months ago, with peak swap 
> > use around 3.5 GB. The swap layout was sub-optimal, with a 2 GB partition
> > combined with a 4 GB partition. A little over 4GB total seems usable. 
> > 
> > A few days ago the same attempt stopped with a series of OOMA kills,
> > but in each case simply restarting allowed the compile to pick up
> > where it left off and continue, eventually finishing with a runnable
> > version of chromium. In this case swap use peaked a little over 4 GB.
> > 
> > Might this suggest the machine isn't freeing swap in a timely manner?
> 
> Are you saying that your increases to:
> 
> vm.pageout_oom_seq
> 
> no longer prove sufficient? What value for vm.pageout_oom_seq were
> you using that got the recent failures?
> 
Correct. The initial value was 2048, later raised to 4096. As far as I
could tell the change didn't help. No explicit -j value was set for
make, but no more than four jobs were observed in top.

A log of storage activity along with swap total and the last two 
console messages is at
http://www.zefox.net/~fbsd/rpi3/swaptests/r351586/swapscript.log
along with a sorted list of total swap use, which can be used as
a sort of index to the log file. 

The initial "out of swap space" at the very beginning
is a relic from before logging started. 

Da0 is a Sandisk SDCZ80 usb 3.0 device, mmcsd0 is a Samsung
Evo + 128 GB device.

The two points of curiosity to me are:
1. Why did swap use increase from 3.5 GB months ago to 4.2 GB now?
2. Why does stopping and restarting make (which would seem to free
un-needed swap) allow the job to finish?

> If more or different configuration/tuning is required, I'm going to
> eventually want to learn about it as well.
> 
You will have some company.

Thanks for reading,

bob prohaska



Re: spurious out of swap kills

2019-09-14 Thread Mark Millard
Don Lewis truckman at FreeBSD.org wrote on
Thu Sep 12 23:00:19 UTC 2019 :

. . .
> Nevertheless, I see these errors,
> with rustc being the usual victim:
> 
> Sep 11 23:21:43 zipper kernel: pid 16581 (rustc), jid 43, uid 65534, was 
> killed: out of swap space
> Sep 12 02:48:23 zipper kernel: pid 1209 (rustc), jid 62, uid 65534, was 
> killed: out of swap space
. . .

Unfortunately, the wording of this type of message is a misnomer for
what drives the kills: it is actually driven by being unable to gain
more free memory but FreeBSD will not swap-out processes that stay
runnable (or are running), only ones that are waiting. Even a single
process that stays runnable and keeps lots of RAM in the active
category can lead to kills when swap is unused or little used. So the
kill-behavior is very workload dependent.

Real "out of swap" conditions tend to also have messages
similar to:

Aug  5 17:54:01 sentinel kernel: swap_pager_getswapspace(32): failed

If you are not seeing such messages, then it is likely that
the amount of swap space still free is not the actual thing
driving the kills.

Are you seeing "swap_pager_getswapspace(32): failed" messages?

(It used to be that the system simply leaves the dirty pages in
memory when a swap_pager_getswapspace failed message is produced.
Of itself, it did not cause a kill. I do not know about now.)

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



Re: spurious out of swap kills

2019-09-14 Thread Mark Millard
bob prohaska fbsd at www.zefox.net wrote on
Fri Sep 13 16:24:57 UTC 2019 :

> Not sure this is relevant, but in compiling chromium on a Pi3 with 6 GB
> of swap the job completed successfully some months ago, with peak swap 
> use around 3.5 GB. The swap layout was sub-optimal, with a 2 GB partition
> combined with a 4 GB partition. A little over 4GB total seems usable. 
> 
> A few days ago the same attempt stopped with a series of OOMA kills,
> but in each case simply restarting allowed the compile to pick up
> where it left off and continue, eventually finishing with a runnable
> version of chromium. In this case swap use peaked a little over 4 GB.
> 
> Might this suggest the machine isn't freeing swap in a timely manner?

Are you saying that your increases to:

vm.pageout_oom_seq

no longer prove sufficient? What value for vm.pageout_oom_seq were
you using that got the recent failures?

(Mark Johnston's slow_swap.diff patch [and related] for investigations
of some OOM-killing contributions never became official and has not
been updated to apply to current code. It has been over a year since
those patches were used for the arm small-board-computer investigations
that led to my learning about vm.pageout_oom_seq.)
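For anyone wanting to experiment with the same knob, a minimal sketch
(FreeBSD; the value shown is only illustrative, the default is 12):

```shell
# /etc/sysctl.conf (FreeBSD): number of failed page-reclaim passes the
# pagedaemon tolerates before OOM kills start; larger values delay
# kills under sustained memory pressure.
vm.pageout_oom_seq=120
```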

If more or different configuration/tuning is required, I'm going to
eventually want to learn about it as well.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)
