Re: poudriere: swap_pager: out of swap space

2024-01-12 Thread Lexi Winter
Ronald Klop:
> On 1/11/24 03:21, Lexi Winter wrote:
> > i'm building packages with poudriere on a system with 32GB memory, with
> > tmpfs and md disabled in poudriere (so it's using ZFS only) and with the
> > ZFS ARC limited to 8GB.

> My first guess would be that you are using a tmpfs tmp dir which uses
> swap as the backing-storage which is now full.  Configure poudriere
> with USE_TMPFS=no to prevent this.

hello,

tmpfs is already disabled in poudriere.conf:

USE_TMPFS=no

a couple of people suggested changing PARALLEL_JOBS or similar settings.
i'm still trying to find the best combination of PARALLEL_JOBS and
MAKE_JOBS_NUMBER, but what's confusing me here is that the system is
claiming to be out of memory while having plenty of free memory - so it
doesn't seem that too many PARALLEL_JOBS is eating the memory.

in the meantime i've added 16GB of extra swap (previously it had 2GB),
and this seems to have fixed the errors, but i'm not sure why this is
necessary.

Mem: 2440M Active, 18G Inact, 1696M Laundry, 8580M Wired, 1565M Buf, 1143M Free
ARC: 1959M Total, 548M MFU, 371M MRU, 22M Anon, 29M Header, 987M Other
 404M Compressed, 1439M Uncompressed, 3.56:1 Ratio
Swap: 18G Total, 3090M Used, 15G Free, 16% Inuse

thanks, lexi.




Re: 15 & 14: ram_attach vs. its using regions_to_avail vs. "bus_alloc_resource" can lead to: panic("ram_attach: resource %d failed to attach", rid)

2024-01-12 Thread Mark Millard
On Jan 12, 2024, at 09:57, Doug Rabson  wrote:

> On Sat, 30 Sept 2023 at 08:47, Mark Millard  wrote:
> ram_attach is based on regions_to_avail but that is a problem for
> its later bus_alloc_resource use --and that can lead to:
> 
> panic("ram_attach: resource %d failed to attach", rid);
> 
> Unfortunately, the known example is use of EDK2 on RPi4B
> class systems, not what is considered the supported way.
> The panic happens for main [so: 15] and will happen once
> the cortex-a72 handling in 14.0-* is in a build fixed by:
> 
> • git: 906bcc44641d - releng/14.0 - arm64: Fix errata workarounds that 
> depend on smccc Andrew Turner
> 
> The lack of the fix leads to an earlier panic as stands.
> 
> 
> sys/kern/subr_physmem.c 's regions_to_avail is based on ignoring
> phys_avail and using only hwregions and exregions. In other words,
> in part:
> 
>  * Initially dump_avail and phys_avail are identical.  Boot time memory
>  * allocations remove extents from phys_avail that may still be included
>  * in dumps.
> 
> This means that early, dedicated memory allocations are treated
> as available for general use by regions_to_avail . The distinction
> is visible in the  boot -v output in that:
> 
> real memory  = 3138154496 (2992 MB)
> Physical memory chunk(s):
> 0x20 - 0x002b7f, 727711744 bytes (177664 pages)
> 0x002ce3a000 - 0x003385, 111304704 bytes (27174 pages)
> 0x00338c - 0x00338c6fff, 28672 bytes (7 pages)
> 0x0033a3 - 0x0036ef, 55377920 bytes (13520 pages)
> 0x00372e - 0x003b2f, 67239936 bytes (16416 pages)
> 0x004000 - 0x00bb3dcfff, 2067648512 bytes (504797 pages)
> avail memory = 3027378176 (2887 MB)
> 
> does not list the wider:
> 
> 0x004000 - 0x00bfff
> 
> because of phys_avail . But the earlier dump based on hwregions and
> exregions shows:
> 
> Physical memory chunk(s):
>   0x001d - 0x001e, 0 MB ( 32 pages)
>   0x0020 - 0x338c6fff,   822 MB ( 210631 pages)
>   0x3392 - 0x3b2f,   121 MB (  31200 pages)
>   0x4000 - 0xbfff,  2048 MB ( 524288 pages)
> Excluded memory regions:
>   0x001d - 0x001e, 0 MB ( 32 pages) NoAlloc 
>   0x2b80 - 0x2ce39fff,22 MB (   5690 pages) NoAlloc 
>   0x3386 - 0x338b, 0 MB ( 96 pages) NoAlloc 
>   0x3392 - 0x33a2, 1 MB (272 pages) NoAlloc 
>   0x36f0 - 0x372d, 3 MB (992 pages) NoAlloc 
> 
> which indicates:
> 
>   0x4000 - 0xbfff
> 
> is available as far as it is concerned.
> 
> (Note some code works/displays in terms of: 0x4000 - 0xc000
> instead.)
> 
> For aarch64 , sys/arm64/arm64/nexus.c has a nexus_alloc_resource
> that is used as bus_alloc_resource . It ends up rejecting the
> RPi4B boot via using the result of the call in ram_attach:
> 
> if (bus_alloc_resource(dev, SYS_RES_MEMORY, &rid, start, end,
>     end - start, 0) == NULL)
>         panic("ram_attach: resource %d failed to attach", rid);
> 
> as shown by the just-prior start/end pair sequence messages:
> 
> ram0: reserving memory region:   20-2b80
> ram0: reserving memory region:   2ce3a000-3386
> ram0: reserving memory region:   338c-338c7000
> ram0: reserving memory region:   33a3-36f0
> ram0: reserving memory region:   372e-3b30
> ram0: reserving memory region:   4000-c000
> panic: ram_attach: resource 5 failed to attach
> 
> I do not see anything about this that looks inherently RPi*
> specific for possibly ending up with an analogous panic. So
> I expect the example is sufficient context to identify a
> problem is present, despite EDK2 use not being normal for
> RPi4B's and the like as far as FreeBSD is concerned.
> 
> I'm not quite clear why phys_avail changes

Do not be confused by the common label applied to distinct
data: note "phys_avail" vs. "hwregions", despite both being
printed under the label "Physical memory chunk(s):" :

static void
cpu_startup(void *dummy)
{
    vm_paddr_t size;
    int i;

    printf("real memory  = %ju (%ju MB)\n", ptoa((uintmax_t)realmem),
        ptoa((uintmax_t)realmem) / 1024 / 1024);

    if (bootverbose) {
        printf("Physical memory chunk(s):\n");
        for (i = 0; phys_avail[i + 1] != 0; i += 2) {
            size = phys_avail[i + 1] - phys_avail[i];
            printf("%#016jx - %#016jx, %ju bytes (%ju pages)\n",
                (uintmax_t)phys_avail[i],
                (uintmax_t)phys_avail[i + 1] - 1,
                (uintmax_t)size, (uintmax_t)size / PAGE_SIZE);
        }
    }
 . .

vs.

physmem_dump_tables(int (*prfunc)(const char *, ...) __printflike(1, 2))
{
    size_t i;
    int flags;
    uintmax_t addr, size;
    const unsigned int mbyte = 1024 * 1024;

    prfunc("Physical memory chunk(s):\n");
    for (i = 0; i < hwcnt; ++i) {
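
For readers following along, the cpu_startup loop quoted above treats
phys_avail as a flat array of start/end pairs, terminated by an end entry
of zero, and prints end - 1 as the last byte of each chunk. Below is a
minimal userland sketch of that layout only. The paddr_t typedef and the
walk_avail name are assumptions made up for illustration; this is not
kernel code.

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the kernel's vm_paddr_t; an assumption for this sketch. */
typedef uint64_t paddr_t;

/*
 * Walk an array laid out like phys_avail:
 *   [start0, end0, start1, end1, ..., 0, 0]
 * Each end is exclusive and the list stops at an end entry of 0.
 * Prints one line per chunk and returns the total bytes covered.
 */
uint64_t
walk_avail(const paddr_t *avail)
{
	uint64_t total = 0;
	int i;

	for (i = 0; avail[i + 1] != 0; i += 2) {
		uint64_t size = avail[i + 1] - avail[i];

		/* Show the last byte of the chunk (end - 1), like the quoted loop. */
		printf("%#jx - %#jx, %ju bytes\n",
		    (uintmax_t)avail[i], (uintmax_t)(avail[i + 1] - 1),
		    (uintmax_t)size);
		total += size;
	}
	return (total);
}
```

Calling walk_avail on a made-up table such as { 0x1000, 0x3000, 0x8000,
0x9000, 0, 0 } prints two chunks and returns 0x3000. The same walk over
hwregions would use a count (hwcnt) instead of a zero terminator, which
is one reason the two "Physical memory chunk(s):" dumps can differ.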
  

Re: noatime on ufs2

2024-01-12 Thread Tomek CEDRO
On Fri, Jan 12, 2024 at 6:15 PM Dag-Erling Smørgrav wrote:
> Tomek CEDRO writes:
> > I am reading this interesting discussion and please verify my general
> > understanding:
> > 1. There is a request for change in core OS / FS mechanism of file
> > access time (atime) because of problem with mailing application?
>
> The atime mechanism is considered harmful by many because every file
> access results in a write which (even if coalesced) not only impacts
> performance but also increases wear on SSDs.  Many people turn it off.
> Even the FreeBSD installer turns it off when installing to ZFS, except
> on `/var/mail` which is a separate filesystem precisely so that it can
> have atime enabled independently of the rest of the system.  There is a
> proposal to turn it off by default.(..)

Okay, the discussion got some (and enough) traction, and it's about
changing defaults only, not the underlying kernel or filesystem code
that implements atime, so this can be done as an option that the
user will see and can change easily in the installer.. and good
old atime can still be used wherever it is needed :-)

> > 2. Linux change of approach to atime that keeps its value only around
> > last 24h so we should also change it in FreeBSD?
> >
> > 3. "realtime" is the alternative solution to keep atime intact?
>
> The Linux approach is an alternative mechanism dubbed “relatime”
> (relative access time) which instead of updating the access time on
> every access, does so only if the previous atime is either older than
> the current mtime or more than 24 h ago.

rel-atime roger! :-) :-)

Thanks DES :-)
Tomek

--
CeDeROM, SQ7MHZ, http://www.tomek.cedro.info



Re: 15 & 14: ram_attach vs. its using regions_to_avail vs. "bus_alloc_resource" can lead to: panic("ram_attach: resource %d failed to attach", rid)

2024-01-12 Thread Doug Rabson
On Sat, 30 Sept 2023 at 08:47, Mark Millard  wrote:

> ram_attach is based on regions_to_avail but that is a problem for
> its later bus_alloc_resource use --and that can lead to:
>
> panic("ram_attach: resource %d failed to attach", rid);
>
> Unfortunately, the known example is use of EDK2 on RPi4B
> class systems, not what is considered the supported way.
> The panic happens for main [so: 15] and will happen once
> the cortex-a72 handling in 14.0-* is in a build fixed by:
>
> • git: 906bcc44641d - releng/14.0 - arm64: Fix errata workarounds that
> depend on smccc Andrew Turner
>
> The lack of the fix leads to an earlier panic as stands.
>
>
> sys/kern/subr_physmem.c 's regions_to_avail is based on ignoring
> phys_avail and using only hwregions and exregions. In other words,
> in part:
>
>  * Initially dump_avail and phys_avail are identical.  Boot time memory
>  * allocations remove extents from phys_avail that may still be included
>  * in dumps.
>
> This means that early, dedicated memory allocations are treated
> as available for general use by regions_to_avail . The distinction
> is visible in the  boot -v output in that:
>
> real memory  = 3138154496 (2992 MB)
> Physical memory chunk(s):
> 0x20 - 0x002b7f, 727711744 bytes (177664 pages)
> 0x002ce3a000 - 0x003385, 111304704 bytes (27174 pages)
> 0x00338c - 0x00338c6fff, 28672 bytes (7 pages)
> 0x0033a3 - 0x0036ef, 55377920 bytes (13520 pages)
> 0x00372e - 0x003b2f, 67239936 bytes (16416 pages)
> 0x004000 - 0x00bb3dcfff, 2067648512 bytes (504797 pages)
> avail memory = 3027378176 (2887 MB)
>
> does not list the wider:
>
> 0x004000 - 0x00bfff
>
> because of phys_avail . But the earlier dump based on hwregions and
> exregions shows:
>
> Physical memory chunk(s):
>   0x001d - 0x001e, 0 MB ( 32 pages)
>   0x0020 - 0x338c6fff,   822 MB ( 210631 pages)
>   0x3392 - 0x3b2f,   121 MB (  31200 pages)
>   0x4000 - 0xbfff,  2048 MB ( 524288 pages)
> Excluded memory regions:
>   0x001d - 0x001e, 0 MB ( 32 pages) NoAlloc
>   0x2b80 - 0x2ce39fff,22 MB (   5690 pages) NoAlloc
>   0x3386 - 0x338b, 0 MB ( 96 pages) NoAlloc
>   0x3392 - 0x33a2, 1 MB (272 pages) NoAlloc
>   0x36f0 - 0x372d, 3 MB (992 pages) NoAlloc
>
> which indicates:
>
>   0x4000 - 0xbfff
>
> is available as far as it is concerned.
>
> (Note some code works/displays in terms of: 0x4000 - 0xc000
> instead.)
>
> For aarch64 , sys/arm64/arm64/nexus.c has a nexus_alloc_resource
> that is used as bus_alloc_resource . It ends up rejecting the
> RPi4B boot via using the result of the call in ram_attach:
>
> if (bus_alloc_resource(dev, SYS_RES_MEMORY, &rid, start, end,
>     end - start, 0) == NULL)
>         panic("ram_attach: resource %d failed to attach", rid);
>
> as shown by the just-prior start/end pair sequence messages:
>
> ram0: reserving memory region:   20-2b80
> ram0: reserving memory region:   2ce3a000-3386
> ram0: reserving memory region:   338c-338c7000
> ram0: reserving memory region:   33a3-36f0
> ram0: reserving memory region:   372e-3b30
> ram0: reserving memory region:   4000-c000
> panic: ram_attach: resource 5 failed to attach
>
> I do not see anything about this that looks inherently RPi*
> specific for possibly ending up with an analogous panic. So
> I expect the example is sufficient context to identify a
> problem is present, despite EDK2 use not being normal for
> RPi4B's and the like as far as FreeBSD is concerned.
>
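
As an illustration of the point quoted above, that regions_to_avail works
only from hwregions and exregions, here is a rough sketch of subtracting
excluded ranges from one hardware range. The struct region type and the
subtract_excluded function are hypothetical names, and the sketch assumes
the excluded list is sorted by start address and non-overlapping. It is
not the code in sys/kern/subr_physmem.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical range type for this sketch. */
struct region {
	uint64_t start;	/* first byte */
	uint64_t end;	/* one past the last byte */
};

/*
 * Subtract the excluded regions from one hardware region and store the
 * remaining pieces in out[], which must have room for nex + 1 entries.
 * Assumes ex[] is sorted by start and its entries do not overlap.
 * Returns the number of pieces written.
 */
size_t
subtract_excluded(struct region hw, const struct region *ex, size_t nex,
    struct region *out)
{
	uint64_t cur = hw.start;
	size_t i, n = 0;

	for (i = 0; i < nex && cur < hw.end; i++) {
		if (ex[i].end <= cur)
			continue;	/* exclusion lies entirely below cur */
		if (ex[i].start >= hw.end)
			break;		/* exclusion lies entirely above hw */
		if (ex[i].start > cur) {
			/* Gap between cur and the exclusion is usable. */
			out[n].start = cur;
			out[n].end = ex[i].start;
			n++;
		}
		cur = ex[i].end;	/* skip past the excluded range */
	}
	if (cur < hw.end) {
		/* Tail of the hardware region after the last exclusion. */
		out[n].start = cur;
		out[n].end = hw.end;
		n++;
	}
	return (n);
}
```

Nothing in this calculation consults phys_avail, so memory that early
boot allocations have already removed from phys_avail still shows up in
the result. That matches the difference between the two dumps quoted
above.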

I'm not quite clear why phys_avail changes and why that is triggered by the
906bcc44641d commit. I'm wondering if it makes sense to arrange for
ram_attach to happen before acpi, e.g. using BUS_PASS_ORDER_FIRST?

Doug.


Re: noatime on ufs2

2024-01-12 Thread Dag-Erling Smørgrav
Tomek CEDRO  writes:
> I am reading this interesting discussion and please verify my general
> understanding:
>
> 1. There is a request for change in core OS / FS mechanism of file
> access time (atime) because of problem with mailing application?

The atime mechanism is considered harmful by many because every file
access results in a write which (even if coalesced) not only impacts
performance but also increases wear on SSDs.  Many people turn it off.
Even the FreeBSD installer turns it off when installing to ZFS, except
on `/var/mail` which is a separate filesystem precisely so that it can
have atime enabled independently of the rest of the system.  There is a
proposal to turn it off by default.

> 2. Linux change of approach to atime that keeps its value only around
> last 24h so we should also change it in FreeBSD?
>
> 3. "realtime" is the alternative solution to keep atime intact?

The Linux approach is an alternative mechanism dubbed “relatime”
(relative access time) which instead of updating the access time on
every access, does so only if the previous atime is either older than
the current mtime or more than 24 h ago.
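
The rule as described here can be written down as a small predicate. This
is only a sketch of the description above, with a made-up function name
and constant; it is not taken from the Linux or FreeBSD sources, and a
real implementation may consider more than these two conditions.

```c
#include <stdbool.h>
#include <time.h>

/* 24 hours in seconds; the threshold named in the description above. */
#define RELATIME_MAX_AGE	(24 * 60 * 60)

/*
 * Hypothetical helper: would a relatime-style policy rewrite the stored
 * access time when the file is read at time "now"?
 */
bool
relatime_should_update(time_t atime, time_t mtime, time_t now)
{
	/* Stored atime is older than the last modification. */
	if (atime < mtime)
		return (true);
	/* Stored atime is more than 24 hours old. */
	if (now - atime > RELATIME_MAX_AGE)
		return (true);
	/* Otherwise leave atime alone and avoid the write. */
	return (false);
}
```

With this rule a file that is read many times within a day causes at most
one atime write, while "has this been read since it was last modified?"
still gets a correct answer.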

> Why change well known standardized and widely used mechanism that is
> here for decades?

Because it's harmful and most people don't use it.

> If there is a problem with an application why change core OS/FS with
> all possible negative consequences and not fix the application?

There is not “a problem with an application”.  No application actually
requires atime to function properly because developers know that atime
is a) not universally supported and b) often disabled even when
supported.  There is however a problem with disk performance and
lifetime being degraded.

> Wouldn't that break POSIX / backward compatibility?

No.  Many people, and the FreeBSD installer, already turn it off.  The
relatime mechanism would restore atime functionality while causing much
less harm, in theory.  I'm not sure it would make much difference in
practice considering that we have nightly scripts which would trigger
atime updates even with relatime.

DES
-- 
Dag-Erling Smørgrav - d...@freebsd.org



Re: noatime on ufs2

2024-01-12 Thread Alexander Leidinger

On 2024-01-11 18:15, Rodney W. Grimes wrote:

>> On 2024-01-10 22:49, Mark Millard wrote:
>>
>>> I never use atime, always noatime, for UFS. That said, I'd never
>>> propose changing the long standing defaults for commands and calls.
>>> I'd avoid:
>>
>> [good points I fully agree on]
>>
>> There's one possibility which nobody talked about yet... changing the
>> default to noatime at install time in fstab / zfs set.


> Perhaps you should take a closer look at what bsdinstall does
> when it creates a zfs install pool and boot environment, you
> might just find that noatime is already set everywhere but
> on /var/mail:
>
> /usr/libexec/bsdinstall/zfsboot:: ${ZFSBOOT_POOL_CREATE_OPTIONS:=-O compress=lz4 -O atime=off}
>
> /usr/libexec/bsdinstall/zfsboot:/var/mail   atime=on


While zfs is a part of what I talked about, it is not the complete
picture. bsdinstall covers UFS and ZFS, and we should keep them in sync
in this regard, ideally with an option the user can modify. Personally I
don't mind if the default setting for this option is noatime. A
quick search in the scripts of bsdinstall didn't reveal to me what we
use for UFS. I assume we use atime.


>> I fully agree to not violate POLA by changing the default to noatime
>> in any FS. I always set noatime everywhere on systems I take care of,
>> no exceptions (any user visible mail is handled via maildir/IMAP, not
>> mbox). I haven't made up my mind if it would be a good idea to change
>> bsdinstall to set noatime (after asking the user about it, and later
>> maybe offer the possibility to use relatime in case it gets
>> implemented). I think it is at least worthwhile to discuss this
>> possibility (including what the default setting of bsdinstall should
>> be for this option).


> Little late... iirc it's been that way since day one of zfs support
> in bsdinstall.


Which I don't mind, as this is what I use anyway. But the correct way 
would be to let the user decide.


Bye,
Alexander.

--
http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org    netch...@freebsd.org  : PGP 0x8F31830F9F2772BF

