Re: [RFC] Making mount_nfs to attempt NFSv4 before NFSv3 and NFSv2?

2022-01-03 Thread Alexander Leidinger via freebsd-current
Quoting Rick Macklem  (from Tue, 4 Jan 2022  
03:18:36 +):



Konstantin Belousov wrote:
[good stuff snipped]

The v4 NFS is very different from v3, it is not an upgrade, it is rather
a different network filesystem with some (significant) similarities to v3.

That said, it should be fine changing the defaults, but you need to ensure
that reasonable scenarios, like the changed FreeBSD client mounting
from v3-only server, still work correctly.  The change should be made in a
way that only affects client that connects to the server that has both
v4 and v3.

A particular test case that needs to be done is the diskless NFS root fs.
This case must use NFSv3 and if it is not the default, it might break?
I am not really set up to test this at this time.
(There are assorted reasons that NFSv4 does not, or at least might not,
 work for a diskless root fs, but that is a separate topic.)

Other than testing diskless NFS root file systems, I do not have a
strong opinion w.r.t. whether the default should change.

If the default stays as NFSv3, a fallback to NFSv4 could be done, which
would handle the NFSv4 only server case. (No one uses NFSv2 any more,
so the fallback to NFSv2 is almost irrelevant, imho.)


As you participate in interoperability tests, would it make sense to
check how those other implementations handle this case? I naively
assume you have some contacts or a mailing list you could use for that.


Bye,
Alexander.


--
http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org   netch...@freebsd.org  : PGP 0x8F31830F9F2772BF



[RFC] Making mount_nfs to attempt NFSv4 before NFSv3 and NFSv2?

2022-01-03 Thread Xin Li via freebsd-current

Hi,

Currently, mount_nfs will attempt to use NFSv3 and fall back to NFSv2.
The manual page says:


 nfsv2   Use the NFS Version 2 protocol (the default is to try
 version 3 first then version 2).  Note that NFS version 2
 has a file size limit of 2 gigabytes.

And the code agrees, too:


	if (trymntmode == V4) {
		nfsvers = 4;
		mntvers = 3; /* Workaround for GCC. */
	} else if (trymntmode == V2) {
		nfsvers = 2;
		mntvers = 1;
	} else {
		nfsvers = 3;
		mntvers = 3;
	}


When trymntmode == ANY, which is the default, mount_nfs would attempt 
NFSv3, and if rpcb_getaddr() returned RPC_PROGVERSMISMATCH, it would try 
again with trymntmode = V2.


Nowadays, it seems that NFSv4 is becoming more and more popular.  If a 
server is providing only NFSv4 service, when mounting without -o nfsv4, 
the user would receive a message like:


RPCPROG_MNT: RPC:Timed out

A friend of mine who is using TrueNAS core hit this yesterday, while his
Linux client worked just fine.  It took me some time to figure out the
root cause.  It seems that modern Linux distributions have been using
NFSv4 by default for some time.


So I think it makes sense to teach mount_nfs to attempt NFSv4, then 
NFSv3 and NFSv2.  However, this might be a POLA violation and we would
like to know if there are any objections.


(I've attached a patch but I haven't actually tested it yet).

Cheers,

From eb6e60233d840d072d0280325ca2cb37455dc2f1 Mon Sep 17 00:00:00 2001
From: Xin LI 
Date: Mon, 3 Jan 2022 10:48:17 -0800
Subject: [PATCH] mount_nfs: Attempt NFSv4 before NFSv3 and NFSv2.

---
 sbin/mount_nfs/mount_nfs.8 |  6 +++---
 sbin/mount_nfs/mount_nfs.c | 29 ++++++++++++++++++++---------
 2 files changed, 23 insertions(+), 12 deletions(-)

diff --git a/sbin/mount_nfs/mount_nfs.8 b/sbin/mount_nfs/mount_nfs.8
index 648cb2128e90..741b5c24a080 100644
--- a/sbin/mount_nfs/mount_nfs.8
+++ b/sbin/mount_nfs/mount_nfs.8
@@ -28,7 +28,7 @@
 .\"	@(#)mount_nfs.8	8.3 (Berkeley) 3/29/95
 .\" $FreeBSD$
 .\"
-.Dd July 10, 2021
+.Dd January 10, 2022
 .Dt MOUNT_NFS 8
 .Os
 .Sh NAME
@@ -216,8 +216,8 @@ This option requires the
 .Cm nfsv4
 option.
 .It Cm nfsv2
-Use the NFS Version 2 protocol (the default is to try version 3 first
-then version 2).
+Use the NFS Version 2 protocol (the default is to try version 4 first,
+then version 3, then version 2).
 Note that NFS version 2 has a file size limit of 2 gigabytes.
 .It Cm nfsv3
 Use the NFS Version 3 protocol.
diff --git a/sbin/mount_nfs/mount_nfs.c b/sbin/mount_nfs/mount_nfs.c
index e1eaf206e982..e6d7e0afbfb7 100644
--- a/sbin/mount_nfs/mount_nfs.c
+++ b/sbin/mount_nfs/mount_nfs.c
@@ -125,6 +125,7 @@ static enum mountmode {
 	ANY,
 	V2,
 	V3,
+	V3orV2,
 	V4
 } mountmode = ANY;
 
@@ -777,15 +778,21 @@ nfs_tryproto(struct addrinfo *ai, char *hostp, char *spec, char **errstr,
 	}
 
 tryagain:
-	if (trymntmode == V4) {
+	switch (trymntmode) {
+	case V4:
+	case ANY:
 		nfsvers = 4;
 		mntvers = 3; /* Workaround for GCC. */
-	} else if (trymntmode == V2) {
-		nfsvers = 2;
-		mntvers = 1;
-	} else {
+		break;
+	case V3orV2:
+	case V3:
 		nfsvers = 3;
 		mntvers = 3;
+		break;
+	case V2:
+		nfsvers = 2;
+		mntvers = 1;
+		break;
 	}
 
 	if (portspec != NULL) {
@@ -799,10 +806,14 @@ nfs_tryproto(struct addrinfo *ai, char *hostp, char *spec, char **errstr,
 
 		if (!rpcb_getaddr(NFS_PROGRAM, nfsvers, nconf, &nfs_nb,
 		hostp)) {
-			if (rpc_createerr.cf_stat == RPC_PROGVERSMISMATCH &&
-			trymntmode == ANY) {
-				trymntmode = V2;
-				goto tryagain;
+			if (rpc_createerr.cf_stat == RPC_PROGVERSMISMATCH) {
+				if (trymntmode == ANY) {
+					trymntmode = V3orV2;
+					goto tryagain;
+				} else if (trymntmode == V3orV2) {
+					trymntmode = V2;
+					goto tryagain;
+				}
 			}
 			snprintf(errbuf, sizeof errbuf, "[%s] %s:%s: %s",
 			netid, hostp, spec,
-- 
2.34.1



Re: git: 5e6a2d6eb220 - main - Reapply: move libc++ from /usr/lib to /lib [add /usr/lib/libc++.so.1 -> ../../lib/libc++.so.1 ?]

2021-12-31 Thread Mark Millard via freebsd-current



On 2021-Dec-31, at 18:18, Mark Millard  wrote:

> On 2021-Dec-31, at 17:46, Mark Millard  wrote:
> 
>> On 2021-Dec-31, at 15:04, John Baldwin  wrote:
>> 
>>> On 12/31/21 2:59 PM, Mark Millard wrote:
>>>> On 2021-Dec-31, at 14:28, Mark Millard  wrote:
>>>>> On 2021-Dec-30, at 14:04, John Baldwin  wrote:
>>>>>
>>>>>> On 12/30/21 1:09 PM, Mark Millard wrote:
>>>>>>> On 2021-Dec-30, at 13:05, Mark Millard  wrote:
>>>>>>>> This asks a question in a different direction that my prior
>>>>>>>> reports about my builds vs. Cy's reported build.
>>>>>>>>
>>>>>>>> Background:
>>>>>>>>
>>>>>>>> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/usr/lib/libc++.so:GROUP
>>>>>>>>  ( /lib/libc++.so.1 /usr/lib/libcxxrt.so
>>>>>>>> and:
>>>>>>>> lrwxr-xr-x  1 root  wheel  23 Dec 29 13:17:01 2021
>>>>>>>> /usr/lib/libcxxrt.so -> ../../lib/libcxxrt.so.1
>>>>>>>>
>>>>>>>> Why did libc++.so.1 not get a:
>>>>>>>>
>>>>>>>> /usr/lib/libc++.so.1 -> ../../lib/libc++.so.1
>>>>>>> I forgot to remove the .1 on the left hand side:
>>>>>>> /usr/lib/libc++.so -> ../../lib/libc++.so.1
>>>>>>
>>>>>> Because for libc++.so we don't just symlink to the current version of
>>>>>> the library
>>>>>> (as we do for most other shared libraries) to tell the compiler what to
>>>>>> link against
>>>>>> for -lc++, instead we use a linker script that tells the compiler to
>>>>>> link against
>>>>>> both of those libraries when -lc++ is encountered.
>>>>>
>>>>> A better identification of what looks odd to me is the
>>>>> path variations in:
>>>>>
>>>>> # more /usr/lib/libc++.so
>>>> Another not great day on my part: That path alone makes
>>>> the mix of /lib/ and /usr/lib/ use involved, given the
>>>> reference to /lib/libc++.so.1 . That would still be true
>>>> if the other path had been /lib/libcxxrt.so .
>>> 
>>> /usr/lib/libc++.so is only used by the compiler/linker when linking a 
>>> binary.
>>> The resulting binary has the associated paths (/lib/libc++.so.1 and
>>> /usr/lib/libcxxrt.so.1) in its DT_NEEDED.  So it is fine for the .so to be
>>> in /usr/lib.  This is the same with /usr/lib/libc.so vs /lib/libc.so.7.
>>> 
>>> However, your point about libcxxrt.so.1 is valid.  It needs to also be moved
>>> to /lib if libc++.so.1 is moved to /lib.  Doing so will also require yet 
>>> another
>>> depend-clean.sh fixup (well, probably just adjusting the one I added to
>>> check the libcxxrt path instead of libc++ path).
>> 
>> Hmm. Looking (now after having updated so /lib/libc++.so.1
>> is in use, not that this is any different here):
>> 
>> # ls -Tld /lib/libcxx* /usr/lib/libcxx*
>> -r--r--r--  1 root  wheel  131656 Dec 31 14:19:49 2021 /lib/libcxxrt.so.1
>> -r--r--r--  1 root  wheel  355764 Dec 24 15:19:42 2021 /usr/lib/libcxxrt.a
>> lrwxr-xr-x  1 root  wheel  23 Dec 31 14:19:49 2021 /usr/lib/libcxxrt.so 
>> -> ../../lib/libcxxrt.so.1
>> 
>> # more /usr/lib/libc++.so 
>> /* $FreeBSD$ */
>> GROUP ( /lib/libc++.so.1 /usr/lib/libcxxrt.so )
>> 
>> So: no actual reference to /usr/lib/libcxxrt.so.1 but
>> a reference in the DT_NEEDED to /usr/lib/libcxxrt.so ?
>> 
>> May be just /usr/lib/libc++.so needs different text in order
>> for DT_NEEDED to have different text related to libcxxrt in
>> future build activities, avoiding /usr/lib/ ?
>> 
>> 
>> For reference:
>> 
>> # uname -apKU
>> FreeBSD amd64_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #27 
>> main-n252090-5650d340ad66-dirty: Fri Dec 31 06:00:41 PST 2021 
>> root@amd64_ZFS:/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-NODBG
>>   amd64 amd64 1400046 1400046
>> 
> 
> In a aarch64 context I looked at an old executable via ldd -a :
> 
> # ldd -a bt
> /usr/home/root/c_tests/bt:
>   libexecinfo.so.1 => /usr/lib/libexecinfo.so.1 (0x41c19000)
>   libc++.so.1 => /lib/libc++.so.1 (0x42484000)
>   libcxxrt.so.1 => /lib/libcxxrt.so.1 (0x43038000)
>   libm.so.5 => /lib/libm.so.5 (0x44a4c000)
>   libc.so.7 => /lib/libc.so.7 (0x439ce000)
> /usr/lib/libexecinfo.so.1:
>   libelf.so.2 => /lib/libelf.so.2 (0x4581e000)
>   libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x46e4f000)
>   libc.so.7 => /lib/libc.so.7 (0x439ce000)
> /lib/libc++.so.1:
>   libcxxrt.so.1 => /lib/libcxxrt.so.1 (0x43038000)
>   libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x46e4f000)
>   libc.so.7 => /lib/libc.so.7 (0x439ce000)
> /lib/libcxxrt.so.1:
>   libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x46e4f000)
>   libc.so.7 => /lib/libc.so.7 (0x439ce000)
> /lib/libm.so.5:
>   libc.so.7 => /lib/libc.so.7 (0x439ce000)
> /lib/libelf.so.2:
>   libc.so.7 => /lib/libc.so.7 (0x439ce000)
> /lib/libgcc_s.so.1:
>   libc.so.7 => /lib/libc.so.7 (0x439ce000)
> 
> Looks like something already deals with finding
> /lib/libcxxrt.so.1 . But it is not obvious what
> path it started with and how much processing was
> done (or when) to end up with /lib/libc++.so.1
> showing.
> 
> But there is still a /usr/lib/ reference overall:
> 
> /usr/lib/libexecinfo.so.1
> 
> Bu

Re: git: 5e6a2d6eb220 - main - Reapply: move libc++ from /usr/lib to /lib [add /usr/lib/libc++.so.1 -> ../../lib/libc++.so.1 ?]

2021-12-31 Thread Mark Millard via freebsd-current
On 2021-Dec-31, at 17:46, Mark Millard  wrote:

> On 2021-Dec-31, at 15:04, John Baldwin  wrote:
> 
>> On 12/31/21 2:59 PM, Mark Millard wrote:
>>> On 2021-Dec-31, at 14:28, Mark Millard  wrote:
>>>> On 2021-Dec-30, at 14:04, John Baldwin  wrote:
>>>>
>>>>> On 12/30/21 1:09 PM, Mark Millard wrote:
>>>>>> On 2021-Dec-30, at 13:05, Mark Millard  wrote:
>>>>>>> This asks a question in a different direction that my prior
>>>>>>> reports about my builds vs. Cy's reported build.
>>>>>>>
>>>>>>> Background:
>>>>>>>
>>>>>>> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/usr/lib/libc++.so:GROUP
>>>>>>>  ( /lib/libc++.so.1 /usr/lib/libcxxrt.so
>>>>>>> and:
>>>>>>> lrwxr-xr-x  1 root  wheel  23 Dec 29 13:17:01 2021
>>>>>>> /usr/lib/libcxxrt.so -> ../../lib/libcxxrt.so.1
>>>>>>>
>>>>>>> Why did libc++.so.1 not get a:
>>>>>>>
>>>>>>> /usr/lib/libc++.so.1 -> ../../lib/libc++.so.1
>>>>>> I forgot to remove the .1 on the left hand side:
>>>>>> /usr/lib/libc++.so -> ../../lib/libc++.so.1
>>>>>
>>>>> Because for libc++.so we don't just symlink to the current version of the
>>>>> library
>>>>> (as we do for most other shared libraries) to tell the compiler what to
>>>>> link against
>>>>> for -lc++, instead we use a linker script that tells the compiler to link
>>>>> against
>>>>> both of those libraries when -lc++ is encountered.
>>>>
>>>> A better identification of what looks odd to me is the
>>>> path variations in:
>>>>
>>>> # more /usr/lib/libc++.so
>>> Another not great day on my part: That path alone makes
>>> the mix of /lib/ and /usr/lib/ use involved, given the
>>> reference to /lib/libc++.so.1 . That would still be true
>>> if the other path had been /lib/libcxxrt.so .
>> 
>> /usr/lib/libc++.so is only used by the compiler/linker when linking a binary.
>> The resulting binary has the associated paths (/lib/libc++.so.1 and
>> /usr/lib/libcxxrt.so.1) in its DT_NEEDED.  So it is fine for the .so to be
>> in /usr/lib.  This is the same with /usr/lib/libc.so vs /lib/libc.so.7.
>> 
>> However, your point about libcxxrt.so.1 is valid.  It needs to also be moved
>> to /lib if libc++.so.1 is moved to /lib.  Doing so will also require yet 
>> another
>> depend-clean.sh fixup (well, probably just adjusting the one I added to
>> check the libcxxrt path instead of libc++ path).
> 
> Hmm. Looking (now after having updated so /lib/libc++.so.1
> is in use, not that this is any different here):
> 
> # ls -Tld /lib/libcxx* /usr/lib/libcxx*
> -r--r--r--  1 root  wheel  131656 Dec 31 14:19:49 2021 /lib/libcxxrt.so.1
> -r--r--r--  1 root  wheel  355764 Dec 24 15:19:42 2021 /usr/lib/libcxxrt.a
> lrwxr-xr-x  1 root  wheel  23 Dec 31 14:19:49 2021 /usr/lib/libcxxrt.so 
> -> ../../lib/libcxxrt.so.1
> 
> # more /usr/lib/libc++.so 
> /* $FreeBSD$ */
> GROUP ( /lib/libc++.so.1 /usr/lib/libcxxrt.so )
> 
> So: no actual reference to /usr/lib/libcxxrt.so.1 but
> a reference in the DT_NEEDED to /usr/lib/libcxxrt.so ?
> 
> May be just /usr/lib/libc++.so needs different text in order
> for DT_NEEDED to have different text related to libcxxrt in
> future build activities, avoiding /usr/lib/ ?
> 
> 
> For reference:
> 
> # uname -apKU
> FreeBSD amd64_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #27 
> main-n252090-5650d340ad66-dirty: Fri Dec 31 06:00:41 PST 2021 
> root@amd64_ZFS:/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-NODBG
>   amd64 amd64 1400046 1400046
> 

In an aarch64 context I looked at an old executable via ldd -a :

# ldd -a bt
/usr/home/root/c_tests/bt:
libexecinfo.so.1 => /usr/lib/libexecinfo.so.1 (0x41c19000)
libc++.so.1 => /lib/libc++.so.1 (0x42484000)
libcxxrt.so.1 => /lib/libcxxrt.so.1 (0x43038000)
libm.so.5 => /lib/libm.so.5 (0x44a4c000)
libc.so.7 => /lib/libc.so.7 (0x439ce000)
/usr/lib/libexecinfo.so.1:
libelf.so.2 => /lib/libelf.so.2 (0x4581e000)
libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x46e4f000)
libc.so.7 => /lib/libc.so.7 (0x439ce000)
/lib/libc++.so.1:
libcxxrt.so.1 => /lib/libcxxrt.so.1 (0x43038000)
libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x46e4f000)
libc.so.7 => /lib/libc.so.7 (0x439ce000)
/lib/libcxxrt.so.1:
libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x46e4f000)
libc.so.7 => /lib/libc.so.7 (0x439ce000)
/lib/libm.so.5:
libc.so.7 => /lib/libc.so.7 (0x439ce000)
/lib/libelf.so.2:
libc.so.7 => /lib/libc.so.7 (0x439ce000)
/lib/libgcc_s.so.1:
libc.so.7 => /lib/libc.so.7 (0x439ce000)

Looks like something already deals with finding
/lib/libcxxrt.so.1 . But it is not obvious what
path it started with and how much processing was
done (or when) to end up with /lib/libc++.so.1
showing.

But there is still a /usr/lib/ reference overall:

/usr/lib/libexecinfo.so.1

But this is because the old program turned out to
be an old experiment:

# more bt.c 
// bt.c
// from releng/12 (12.2?) context (pre-llvm12), but not releng/13 :
// # cc -o bt bt.c -lexecinf

Re: git: 5e6a2d6eb220 - main - Reapply: move libc++ from /usr/lib to /lib [add /usr/lib/libc++.so.1 -> ../../lib/libc++.so.1 ?]

2021-12-31 Thread Mark Millard via freebsd-current



On 2021-Dec-31, at 15:04, John Baldwin  wrote:

> On 12/31/21 2:59 PM, Mark Millard wrote:
>> On 2021-Dec-31, at 14:28, Mark Millard  wrote:
>>> On 2021-Dec-30, at 14:04, John Baldwin  wrote:
>>> 
>>>> On 12/30/21 1:09 PM, Mark Millard wrote:
>>>>> On 2021-Dec-30, at 13:05, Mark Millard  wrote:
>>>>>> This asks a question in a different direction that my prior
>>>>>> reports about my builds vs. Cy's reported build.
>>>>>>
>>>>>> Background:
>>>>>>
>>>>>> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/usr/lib/libc++.so:GROUP
>>>>>>  ( /lib/libc++.so.1 /usr/lib/libcxxrt.so
>>>>>> and:
>>>>>> lrwxr-xr-x  1 root  wheel  23 Dec 29 13:17:01 2021
>>>>>> /usr/lib/libcxxrt.so -> ../../lib/libcxxrt.so.1
>>>>>>
>>>>>> Why did libc++.so.1 not get a:
>>>>>>
>>>>>> /usr/lib/libc++.so.1 -> ../../lib/libc++.so.1
>>>>> I forgot to remove the .1 on the left hand side:
>>>>> /usr/lib/libc++.so -> ../../lib/libc++.so.1
>>>>
>>>> Because for libc++.so we don't just symlink to the current version of the
>>>> library
>>>> (as we do for most other shared libraries) to tell the compiler what to
>>>> link against
>>>> for -lc++, instead we use a linker script that tells the compiler to link
>>>> against
>>>> both of those libraries when -lc++ is encountered.
>>> 
>>> A better identification of what looks odd to me is the
>>> path variations in:
>>> 
>>> # more /usr/lib/libc++.so
>> Another not great day on my part: That path alone makes
>> the mix of /lib/ and /usr/lib/ use involved, given the
>> reference to /lib/libc++.so.1 . That would still be true
>> if the other path had been /lib/libcxxrt.so .
> 
> /usr/lib/libc++.so is only used by the compiler/linker when linking a binary.
> The resulting binary has the associated paths (/lib/libc++.so.1 and
> /usr/lib/libcxxrt.so.1) in its DT_NEEDED.  So it is fine for the .so to be
> in /usr/lib.  This is the same with /usr/lib/libc.so vs /lib/libc.so.7.
> 
> However, your point about libcxxrt.so.1 is valid.  It needs to also be moved
> to /lib if libc++.so.1 is moved to /lib.  Doing so will also require yet 
> another
> depend-clean.sh fixup (well, probably just adjusting the one I added to
> check the libcxxrt path instead of libc++ path).

Hmm. Looking (now after having updated so /lib/libc++.so.1
is in use, not that this is any different here):

# ls -Tld /lib/libcxx* /usr/lib/libcxx*
-r--r--r--  1 root  wheel  131656 Dec 31 14:19:49 2021 /lib/libcxxrt.so.1
-r--r--r--  1 root  wheel  355764 Dec 24 15:19:42 2021 /usr/lib/libcxxrt.a
lrwxr-xr-x  1 root  wheel  23 Dec 31 14:19:49 2021 /usr/lib/libcxxrt.so -> 
../../lib/libcxxrt.so.1

# more /usr/lib/libc++.so 
/* $FreeBSD$ */
GROUP ( /lib/libc++.so.1 /usr/lib/libcxxrt.so )

So: no actual reference to /usr/lib/libcxxrt.so.1 but
a reference in the DT_NEEDED to /usr/lib/libcxxrt.so ?

Maybe just /usr/lib/libc++.so needs different text in order
for DT_NEEDED to have different text related to libcxxrt in
future build activities, avoiding /usr/lib/ ?


For reference:

# uname -apKU
FreeBSD amd64_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #27 
main-n252090-5650d340ad66-dirty: Fri Dec 31 06:00:41 PST 2021 
root@amd64_ZFS:/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-NODBG
  amd64 amd64 1400046 1400046

===
Mark Millard
marklmi at yahoo.com




Re: git: 5e6a2d6eb220 - main - Reapply: move libc++ from /usr/lib to /lib [add /usr/lib/libc++.so.1 -> ../../lib/libc++.so.1 ?]

2021-12-31 Thread Mark Millard via freebsd-current
On 2021-Dec-31, at 14:28, Mark Millard  wrote:

> On 2021-Dec-30, at 14:04, John Baldwin  wrote:
> 
>> On 12/30/21 1:09 PM, Mark Millard wrote:
>>> On 2021-Dec-30, at 13:05, Mark Millard  wrote:
>>>> This asks a question in a different direction that my prior
>>>> reports about my builds vs. Cy's reported build.
>>>>
>>>> Background:
>>>>
>>>> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/usr/lib/libc++.so:GROUP
>>>>  ( /lib/libc++.so.1 /usr/lib/libcxxrt.so
>>>> and:
>>>> lrwxr-xr-x  1 root  wheel  23 Dec 29 13:17:01 2021
>>>> /usr/lib/libcxxrt.so -> ../../lib/libcxxrt.so.1
>>>>
>>>> Why did libc++.so.1 not get a:
>>>>
>>>> /usr/lib/libc++.so.1 -> ../../lib/libc++.so.1
>>> I forgot to remove the .1 on the left hand side:
>>> /usr/lib/libc++.so -> ../../lib/libc++.so.1
>> 
>> Because for libc++.so we don't just symlink to the current version of the 
>> library
>> (as we do for most other shared libraries) to tell the compiler what to link 
>> against
>> for -lc++, instead we use a linker script that tells the compiler to link 
>> against
>> both of those libraries when -lc++ is encountered.
> 
> A better identification of what looks odd to me is the
> path variations in:
> 
> # more /usr/lib/libc++.so

Another not-so-great day on my part: the /usr/lib/libc++.so path by
itself already means a mix of /lib/ and /usr/lib/ is involved, given
the reference to /lib/libc++.so.1 . That would still be true
if the other path had been /lib/libcxxrt.so .

I guess I've just not figured out what specific, detailed
issue(s) the move to /lib/libc++.so.1 covers vs. not,
given the /usr/lib/libc++.so and /usr/lib/libcxxrt.so
paths.

I'm not using anything with /usr/lib/ being on a different
file system than /lib so I'll definitely not observe any
problems. And it might be a waste to try to clear my
confusions at this point, given how the day is going.

> /* $FreeBSD$ */
> GROUP ( /lib/libc++.so.1 /usr/lib/libcxxrt.so )
> 
> So /usr/lib/ still has to be available (so, possibly, mounted)
> for C++ because of the /usr/lib/libcxxrt.so reference? If so,
> why the move of libc++.so.1 to /lib/ ?
> 
>> I have finally reproduced Cy's build error locally and am testing my fix.  
>> If it
>> works I'll commit it.
>> 
> 


===
Mark Millard
marklmi at yahoo.com




Re: git: 5e6a2d6eb220 - main - Reapply: move libc++ from /usr/lib to /lib [add /usr/lib/libc++.so.1 -> ../../lib/libc++.so.1 ?]

2021-12-31 Thread Mark Millard via freebsd-current



On 2021-Dec-30, at 14:04, John Baldwin  wrote:

> On 12/30/21 1:09 PM, Mark Millard wrote:
>> On 2021-Dec-30, at 13:05, Mark Millard  wrote:
>>> This asks a question in a different direction that my prior
>>> reports about my builds vs. Cy's reported build.
>>> 
>>> Background:
>>> 
>>> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/usr/lib/libc++.so:GROUP
>>>  ( /lib/libc++.so.1 /usr/lib/libcxxrt.so
>>> and:
>>> lrwxr-xr-x  1 root  wheel  23 Dec 29 13:17:01 2021
>>> /usr/lib/libcxxrt.so -> ../../lib/libcxxrt.so.1
>>> 
>>> Why did libc++.so.1 not get a:
>>> 
>>> /usr/lib/libc++.so.1 -> ../../lib/libc++.so.1
>> I forgot to remove the .1 on the left hand side:
>> /usr/lib/libc++.so -> ../../lib/libc++.so.1
> 
> Because for libc++.so we don't just symlink to the current version of the 
> library
> (as we do for most other shared libraries) to tell the compiler what to link 
> against
> for -lc++, instead we use a linker script that tells the compiler to link 
> against
> both of those libraries when -lc++ is encountered.

A better identification of what looks odd to me is the
path variations in:

# more /usr/lib/libc++.so
/* $FreeBSD$ */
GROUP ( /lib/libc++.so.1 /usr/lib/libcxxrt.so )

So /usr/lib/ still has to be available (so, possibly, mounted)
for C++ because of the /usr/lib/libcxxrt.so reference? If so,
why the move of libc++.so.1 to /lib/ ?

> I have finally reproduced Cy's build error locally and am testing my fix.  If 
> it
> works I'll commit it.
> 


===
Mark Millard
marklmi at yahoo.com




Re: git: 5e6a2d6eb220 - main - Reapply: move libc++ from /usr/lib to /lib

2021-12-31 Thread Mark Millard via freebsd-current
On 2021-Dec-30, at 19:15, Mark Millard  wrote:
> 
>> On 2021-Dec-30, at 15:14, Cy Schubert  wrote:
>> 
>> In message <3140c5f6-495f-441c-aa6b-542f3bc53...@yahoo.com>, Mark Millard 
>> write
>> s:
>>> On 2021-Dec-30, at 11:52, Mark Millard  wrote:
>> . . .
>> It was a NO_CLEAN build. A CLEAN build resolved it.
>> 
>> There were no mods to this, my prod tree, except for some upcoming ipfilter 
>> commits intended for the new year.
>> 
>> One would think a META_MODE build would also fail if NO_CLEAN fails.
> 
> In the following, the path of the file in which the text was found comes
> after the line(s) with the found text:
> 
> CMD @rm -f libc++.so.1 libc++.so
> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/lib/libc++/libc++.so.1.full.meta
> 
> CMD install -U  -S -C -o root -g wheel -m 444   libc++.ld  
> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/usr/lib/libc++.so
> R 74586 
> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/usr/lib/libc++.so
> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/lib/libc++/_libinstall.meta
> 
> I expect those suggest that META_MODE tracks the file's status and
> the status of related files enough --and so it leads to the update
> that NO_CLEAN did not do.
> 
> Overall you basically reported that NO_CLEAN did not do the rm
> of libc++.so --so it apparently did not do some of the related
> lib/libc++/libc++.so.1.full.meta activity that involved that
> remove.
> 
> Given the removal happened under META_MODE, it also led to the
> install happening to re-create the file.
> 
> Such is what I would expect (or hope) for META_MODE use.
> 

Dumb mistake on my part above: I paid only attention to the build,
and not to what was installed (or left in place) by installworld .
Despite the buildworld activity indicated above, installworld had
left in place:

# more /usr/lib/libc++.so
/* $FreeBSD$ */
GROUP ( /usr/lib/libc++.so.1 /usr/lib/libcxxrt.so )

(with a modification date back on 2021-Aug-18).

(I'm dealing with updating to more recent commits now. So
hopefully an update will happen this time.)

===
Mark Millard
marklmi at yahoo.com




Re: Problems compiling kernel

2021-12-31 Thread Mark Millard via freebsd-current
On 2021-Dec-30, at 13:27, Mark Millard  wrote:

>> Dear all,
>> 
>> on a system updated yesterday I get
>> 
>> tuexen_at_head:~/freebsd-src % git branch
>> * main
>> tuexen_at_head:~/freebsd-src % git pull
>> Already up to date.
>> tuexen_at_head:~/freebsd-src % uname -a
>> FreeBSD head 14.0-CURRENT FreeBSD 14.0-CURRENT #1 main-n252035-63f7f3921bd: 
>> Thu Dec 30 11:33:16 CET 2021 
>> root_at_head:/usr/obj/usr/home/tuexen/freebsd-src/amd64.amd64/sys/TCP  amd64
>> tuexen_at_head:~/freebsd-src % sudo make -j 4 kernel KERNCONF=TCP
>> ld-elf.so.1: Shared object "libc++.so.1" not found, required by "cc"
>> make: "/usr/home/tuexen/freebsd-src/share/mk/bsd.compiler.mk" line 201: 
>> warning: "cc -v 2>&1 | grep "gcc version"" returned non-zero status
>> make: "/usr/home/tuexen/freebsd-src/share/mk/bsd.compiler.mk" line 205: 
>> Unable to determine compiler type for CC=cc.  Consider setting COMPILER_TYPE.
>> 
>> make: stopped in /usr/home/tuexen/freebsd-src
>> tuexen_at_head:~/freebsd-src % 
>> 
>> any idea what I did wrong and how to fix it?
> 
> The problem is in FreeBSD itself from:
> 
> git: 6b1c5775d1c2 - main - Move libc++ from /usr/lib to /lib Ed Maste
> (2021-Dec-29)
> 
> until the revert:
> 
> git: b6f7942cbcbd - main - Revert "Move libc++ from /usr/lib to /lib" Ed Maste
> 
> or, the fixed commit, if you want /lib/libc++.so.1 :
> 
> git: 5e6a2d6eb220 - main - Reapply: move libc++ from /usr/lib to /lib Dimitry 
> Andric
> (2021-Dec-30)
> 
> 6b1c5775d1c2 did not actually cause /lib/libc++.so.1 to be installed
> but still put it at /usr/lib/libc++.so.1 . But its delete-old-libs
> did remove /usr/lib/libc++.so.1 . (I suffered this too.)
> 
> A solution is to find libc++.so.1 in your build tree and to
> copy it to one of the two places (old or new). So, for example:

Just correcting an error in what I wrote above.

The "old or new" part of this was wrong: the system
still had . . .

# more /usr/lib/libc++.so
/* $FreeBSD$ */
GROUP ( /usr/lib/libc++.so.1 /usr/lib/libcxxrt.so )

So only "old" was the fully correct place to copy
libc++.so.1 to, presuming that /usr/lib/libc++.so
was left as above.

> .../amd64.amd64/lib/libc++/libc++.so.1
> 
> or, maybe:
> 
> .../amd64.amd64/tmp/lib/libc++.so.1
> 
> Some old trees used for chroot use or the like can also
> be a source for doing such a copy. (That is what I did.)
> 
> There will likely be another commit making it nicer
> for NO_CLEAN style builds. 5e6a2d6eb220 is okay for
> META_MODE builds or complete rebuilds.
> 
> I also wonder if they will create a:
> 
> /usr/lib/libc++.so -> ../../lib/libc++.so.1
> 
> or not, analogous to the existing:
> 
> /usr/lib/libcxxrt.so -> ../../lib/libcxxrt.so.1
> 


===
Mark Millard
marklmi at yahoo.com




Re: git: 5e6a2d6eb220 - main - Reapply: move libc++ from /usr/lib to /lib

2021-12-30 Thread Mark Millard via freebsd-current
> On 2021-Dec-30, at 15:14, Cy Schubert  wrote:
> 
> In message <3140c5f6-495f-441c-aa6b-542f3bc53...@yahoo.com>, Mark Millard 
> write
> s:
>> On 2021-Dec-30, at 11:52, Mark Millard  wrote:
>> 
 This commit results in a different error.
 
 ld: error: /export/obj/opt/src/git-src/amd64.amd64/tmp/usr/lib/libc++.so:2
>> : 
 cannot find /usr/lib/libc++.so.1 inside /export/obj/opt/src/git-src/amd64.
>> am
 d64/tmp
>>> GROUP ( /usr/lib/libc++.so.1 /usr/lib/libcxxrt.so )
>>>   ^
 c++: error: linker command failed with exit code 1 (use -v to see 
 invocation)
 *** [libclang_rt.asan-x86_64.so.full] Error code 1
 
 make[6]: stopped in /opt/src/git-src/lib/libclang_rt/asan_dynamic
>>> 
>>> Working in a system that had the file removed and then
>>> manually put back after the upgrade, what I see after this
>>> new rebuild (not installed) is:
>>> 
>>> # grep -r 'GROUP.*/lib.*/libc++.so' /usr/obj/BUILDs/main-amd64-nodbg-clang/
>> usr/main-src/amd64.amd64/
>>> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/lib/libc++/
>> libc++.ld:GROUP ( /lib/libc++.so.1 /usr/lib/libcxxrt.so )
>>> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/obj-lib32/t
>> mp/usr/lib32/libc++.so:GROUP ( /usr/lib32/libc++.so.1 /usr/lib32/libcxxrt.so 
>> )
>>> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/obj-lib32/l
>> ib/libc++/libc++.ld:GROUP ( /usr/lib32/libc++.so.1 /usr/lib32/libcxxrt.so )
>>> grep: /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/sys/G
>> ENERIC-NODBG/modules/usr/main-src/sys/modules/twa/opt_twa.h: No such file or 
>> directory
>>> grep: /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/u
>> sr/include/dev/ic/esp.h: No such file or directory
>>> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/usr/lib
>> /libc++.so:GROUP ( /lib/libc++.so.1 /usr/lib/libcxxrt.so
>>> 
>>> That has /lib/libc++.so.1 (outside lib32 materials).
>>> 
>>> But it also has: /tmp/usr/lib/libc++.so and is that a problem?
>>> 
>>> And, checking on when the files were modified:
>>> 
>>> # ls -Tld `grep -rl 'GROUP.*/lib.*/libc++.so' /usr/obj/BUILDs/main-amd64-no
>> dbg-clang/usr/main-src/amd64.amd64/`
>>> grep: /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/sys/G
>> ENERIC-NODBG/modules/usr/main-src/sys/modules/twa/opt_twa.h: No such file or 
>> directory
>>> grep: /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/u
>> sr/include/dev/ic/esp.h: No such file or directory
>>> -rw-r--r--  1 root  wheel  64 Dec 30 08:30:43 2021 /usr/obj/BUILDs/main-amd
>> 64-nodbg-clang/usr/main-src/amd64.amd64/lib/libc++/libc++.ld
>>> -rw-r--r--  1 root  wheel  72 Dec 30 08:22:11 2021 /usr/obj/BUILDs/main-amd
>> 64-nodbg-clang/usr/main-src/amd64.amd64/obj-lib32/lib/libc++/libc++.ld
>>> -r--r--r--  1 root  wheel  72 Aug 19 03:09:03 2021 /usr/obj/BUILDs/main-amd
>> 64-nodbg-clang/usr/main-src/amd64.amd64/obj-lib32/tmp/usr/lib32/libc++.so
>>> -r--r--r--  1 root  wheel  64 Dec 30 08:30:43 2021 /usr/obj/BUILDs/main-amd
>> 64-nodbg-clang/usr/main-src/amd64.amd64/tmp/usr/lib/libc++.so
>>> 
>>> So lib/libc++/libc++.ld and tmp/usr/lib/libc++.so both had been
>>> updated.
>>> 
>>> I used META_MODE.
>>> 
>>> So I do not get a full match to what is reported but the use of
>>> the tmp/usr/lib/libc++.so path does seem odd.
>>> 
>>> I've not looked at what a system from before the first move of
>>> libc++.so.1 does. I may be able to check that in a while.
>> 
>> So I've now looked at a build (not installed) that was done on:
>> 
>> # uname -apKU
>> FreeBSD CA72_16Gp_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #29 main-n252010-254e
>> 4e5b77d7-dirty: Tue Dec 28 16:04:12 PST 2021 root@CA72_16Gp_ZFS:/usr/obj/
>> BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA7
>> 2  arm64 aarch64 1400045 1400045
>> 
>> which is before the original attempt to move libc++.so.1 . It shows:
>> 
>> # grep -r 'GROUP.*/lib.*/libc++.so' /usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/ | more
>> grep: /usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/tmp/usr/include/dev/ic/esp.h: No such file or directory
>> /usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/lib/libc++/libc++.ld:GROUP ( /lib/libc++.so.1 /usr/lib/libcxxrt.so )
>> /usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/tmp/usr/lib/libc++.so:GROUP ( /lib/libc++.so.1 /usr/lib/libcxxrt.so )
>> 
>> Again the tmp/usr/lib/libc++.so path but the content has /lib/libc++.so.1 .
>> 
>> Again it was a META_MODE build.
>> 
>> https://ci.freebsd.org and https://ci.freebsd.org show
>> successful builds at this point.
>> 
>> 
>> It looks like Cy may need to report more about the context
>> for the reported build failure.
> 
> It was a NO_CLEAN build. A CLEAN build resolved it.
> 
> There were no mods to this, my prod tree, except for some upcoming ipfilter 
> commits intended for the new year.

Re: Problems compiling kernel

2021-12-30 Thread Mark Millard via freebsd-current
> Dear all,
> 
> on a system updated yesterday I get
> 
> tuexen_at_head:~/freebsd-src % git branch
> * main
> tuexen_at_head:~/freebsd-src % git pull
> Already up to date.
> tuexen_at_head:~/freebsd-src % uname -a
> FreeBSD head 14.0-CURRENT FreeBSD 14.0-CURRENT #1 main-n252035-63f7f3921bd: 
> Thu Dec 30 11:33:16 CET 2021 
> root_at_head:/usr/obj/usr/home/tuexen/freebsd-src/amd64.amd64/sys/TCP  amd64
> tuexen_at_head:~/freebsd-src % sudo make -j 4 kernel KERNCONF=TCP
> ld-elf.so.1: Shared object "libc++.so.1" not found, required by "cc"
> make: "/usr/home/tuexen/freebsd-src/share/mk/bsd.compiler.mk" line 201: 
> warning: "cc -v 2>&1 | grep "gcc version"" returned non-zero status
> make: "/usr/home/tuexen/freebsd-src/share/mk/bsd.compiler.mk" line 205: 
> Unable to determine compiler type for CC=cc.  Consider setting COMPILER_TYPE.
> 
> make: stopped in /usr/home/tuexen/freebsd-src
> tuexen_at_head:~/freebsd-src % 
> 
> any idea what I did wrong and how to fix it?

The problem is in FreeBSD itself, from:

git: 6b1c5775d1c2 - main - Move libc++ from /usr/lib to /lib Ed Maste
(2021-Dec-29)

until the revert:

git: b6f7942cbcbd - main - Revert "Move libc++ from /usr/lib to /lib" Ed Maste

or, the fixed commit, if you want /lib/libc++.so.1 :

git: 5e6a2d6eb220 - main - Reapply: move libc++ from /usr/lib to /lib Dimitry 
Andric
(2021-Dec-30)

6b1c5775d1c2 did not actually cause /lib/libc++.so.1 to be installed
but still put it at /usr/lib/libc++.so.1 . But its delete-old-libs
did remove /usr/lib/libc++.so.1 . (I suffered this too.)

A solution is to find libc++.so.1 in your build tree and to
copy it to one of the two places (old or new). So, for example:

.../amd64.amd64/lib/libc++/libc++.so.1

or, maybe:

.../amd64.amd64/tmp/lib/libc++.so.1

Some old trees used for chroot use or the like can also
be a source for doing such a copy. (That is what I did.)
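
A minimal sketch of that copy step as a POSIX sh function. The name
restore_libcxx and its two arguments are illustrative only and are not
part of the FreeBSD build system; the two candidate paths are the ones
listed above, relative to the object tree root. It relies on cp(1),
which should still run while libc++.so.1 is missing because it does not
link against libc++.

```shell
#!/bin/sh
# Sketch: copy a built libc++.so.1 from an object tree into a library
# directory. restore_libcxx, objtop and destlib are hypothetical names.
restore_libcxx() {
    objtop=$1    # object tree root, e.g. the .../amd64.amd64 directory
    destlib=$2   # target directory: /usr/lib (old) or /lib (new)
    # Try the two candidate locations mentioned in the message, in order.
    for cand in "$objtop/lib/libc++/libc++.so.1" \
                "$objtop/tmp/lib/libc++.so.1"; do
        if [ -f "$cand" ]; then
            cp -p "$cand" "$destlib/libc++.so.1" || return 1
            echo "copied $cand to $destlib"
            return 0
        fi
    done
    echo "no libc++.so.1 found under $objtop" >&2
    return 1
}
```

It would be run as root, for example as
restore_libcxx /path/to/amd64.amd64 /usr/lib , with the object tree path
adjusted to the local build.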

There will likely be another commit making it nicer
for NO_CLEAN style builds. 5e6a2d6eb220 is okay for
META_MODE builds or complete rebuilds.

I also wonder if they will create a:

/usr/lib/libc++.so -> ../../lib/libc++.so.1

or not, analogous to the existing:

/usr/lib/libcxxrt.so -> ../../lib/libcxxrt.so.1



===
Mark Millard
marklmi at yahoo.com




Re: git: 5e6a2d6eb220 - main - Reapply: move libc++ from /usr/lib to /lib [add /usr/lib/libc++.so.1 -> ../../lib/libc++.so.1 ?]

2021-12-30 Thread Mark Millard via freebsd-current



On 2021-Dec-30, at 13:05, Mark Millard  wrote:

> This asks a question in a different direction than my prior
> reports about my builds vs. Cy's reported build.
> 
> Background:
> 
> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/usr/lib/libc++.so:GROUP
>  ( /lib/libc++.so.1 /usr/lib/libcxxrt.so
> and:
> lrwxr-xr-x  1 root  wheel  23 Dec 29 13:17:01 2021 /usr/lib/libcxxrt.so
> -> ../../lib/libcxxrt.so.1
> 
> Why did libc++.so.1 not get a:
> 
> /usr/lib/libc++.so.1 -> ../../lib/libc++.so.1

I forgot to remove the .1 on the left hand side:

/usr/lib/libc++.so -> ../../lib/libc++.so.1

> ? Why is it that only libcxxrt.so.1 has such?
> 
> Seems odd to me that the structure of things would
> be this different between libcxxrt.so.1 and
> libc++.so.1 .
> 



===
Mark Millard
marklmi at yahoo.com




Re: git: 5e6a2d6eb220 - main - Reapply: move libc++ from /usr/lib to /lib [add /usr/lib/libc++.so.1 -> ../../lib/libc++.so.1 ?]

2021-12-30 Thread Mark Millard via freebsd-current
This asks a question in a different direction than my prior
reports about my builds vs. Cy's reported build.

Background:

/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/usr/lib/libc++.so:GROUP
 ( /lib/libc++.so.1 /usr/lib/libcxxrt.so
and:
lrwxr-xr-x  1 root  wheel  23 Dec 29 13:17:01 2021 /usr/lib/libcxxrt.so
-> ../../lib/libcxxrt.so.1

Why did libc++.so.1 not get a:

/usr/lib/libc++.so.1 -> ../../lib/libc++.so.1

? Why is it that only libcxxrt.so.1 has such?

Seems odd to me that the structure of things would
be this different between libcxxrt.so.1 and
libc++.so.1 .
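
The two listings in the Background suggest the names are different
kinds of file: libc++.so shows up in grep output with a GROUP line (a
text linker script), while libcxxrt.so is listed by ls as a symlink.
A small sketch for checking what a given name is on a particular
system; describe_so is a hypothetical helper name, and it says nothing
about why the build installs the two differently.

```shell
#!/bin/sh
# Sketch: report whether a library name such as /usr/lib/libc++.so is a
# symlink or an ld linker script. describe_so is a hypothetical name.
describe_so() {
    path=$1
    if [ -L "$path" ]; then
        # Symlink: show where it points (it may be dangling).
        echo "symlink -> $(readlink "$path")"
    elif [ -f "$path" ] && grep -q '^GROUP' "$path" 2>/dev/null; then
        # Linker script: a small text file; show its GROUP line.
        echo "linker script: $(grep '^GROUP' "$path")"
    elif [ -e "$path" ]; then
        echo "regular file or other"
    else
        echo "missing"
    fi
}
```

For example: describe_so /usr/lib/libc++.so ; describe_so /usr/lib/libcxxrt.so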

===
Mark Millard
marklmi at yahoo.com




Re: git: 5e6a2d6eb220 - main - Reapply: move libc++ from /usr/lib to /lib

2021-12-30 Thread Mark Millard via freebsd-current
On 2021-Dec-30, at 11:52, Mark Millard  wrote:

>> This commit results in a different error.
>> 
>> ld: error: /export/obj/opt/src/git-src/amd64.amd64/tmp/usr/lib/libc++.so:2: 
>> cannot find /usr/lib/libc++.so.1 inside /export/obj/opt/src/git-src/amd64.amd64/tmp
>> >>> GROUP ( /usr/lib/libc++.so.1 /usr/lib/libcxxrt.so )
>> >>> ^
>> c++: error: linker command failed with exit code 1 (use -v to see 
>> invocation)
>> *** [libclang_rt.asan-x86_64.so.full] Error code 1
>> 
>> make[6]: stopped in /opt/src/git-src/lib/libclang_rt/asan_dynamic
> 
> Working in a system that had the file removed and then
> manually put back after the upgrade, what I see after this
> new rebuild (not installed) is:
> 
> # grep -r 'GROUP.*/lib.*/libc++.so' 
> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/
> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/lib/libc++/libc++.ld:GROUP
>  ( /lib/libc++.so.1 /usr/lib/libcxxrt.so )
> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/obj-lib32/tmp/usr/lib32/libc++.so:GROUP
>  ( /usr/lib32/libc++.so.1 /usr/lib32/libcxxrt.so )
> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/obj-lib32/lib/libc++/libc++.ld:GROUP
>  ( /usr/lib32/libc++.so.1 /usr/lib32/libcxxrt.so )
> grep: 
> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-NODBG/modules/usr/main-src/sys/modules/twa/opt_twa.h:
>  No such file or directory
> grep: 
> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/usr/include/dev/ic/esp.h:
>  No such file or directory
> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/usr/lib/libc++.so:GROUP
>  ( /lib/libc++.so.1 /usr/lib/libcxxrt.so
> 
> That has /lib/libc++.so.1 (outside lib32 materials).
> 
> But it also has: /tmp/usr/lib/libc++.so and is that a problem?
> 
> And, checking on when the files were modified:
> 
> # ls -Tld `grep -rl 'GROUP.*/lib.*/libc++.so' 
> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/`
> grep: 
> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-NODBG/modules/usr/main-src/sys/modules/twa/opt_twa.h:
>  No such file or directory
> grep: 
> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/usr/include/dev/ic/esp.h:
>  No such file or directory
> -rw-r--r--  1 root  wheel  64 Dec 30 08:30:43 2021 
> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/lib/libc++/libc++.ld
> -rw-r--r--  1 root  wheel  72 Dec 30 08:22:11 2021 
> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/obj-lib32/lib/libc++/libc++.ld
> -r--r--r--  1 root  wheel  72 Aug 19 03:09:03 2021 
> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/obj-lib32/tmp/usr/lib32/libc++.so
> -r--r--r--  1 root  wheel  64 Dec 30 08:30:43 2021 
> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/usr/lib/libc++.so
> 
> So lib/libc++/libc++.ld and tmp/usr/lib/libc++.so both had been
> updated.
> 
> I used META_MODE.
> 
> So I do not get a full match to what is reported but the use of
> the tmp/usr/lib/libc++.so path does seem odd.
> 
> I've not looked at what a system from before the first move of
> libc++.so.1 does. I may be able to check that in a while.

So I've now looked at a build (not installed) that was done on:

# uname -apKU
FreeBSD CA72_16Gp_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #29 
main-n252010-254e4e5b77d7-dirty: Tue Dec 28 16:04:12 PST 2021 
root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
  arm64 aarch64 1400045 1400045

which is before the original attempt to move libc++.so.1 . It shows:

# grep -r 'GROUP.*/lib.*/libc++.so' 
/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/ | more
grep: 
/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/tmp/usr/include/dev/ic/esp.h:
 No such file or directory
/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/lib/libc++/libc++.ld:GROUP
 ( /lib/libc++.so.1 /usr/lib/libcxxrt.so )
/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/tmp/usr/lib/libc++.so:GROUP
 ( /lib/libc++.so.1 /usr/lib/libcxxrt.so )

Again the tmp/usr/lib/libc++.so path but the content has /lib/libc++.so.1 .

Again it was a META_MODE build.

https://ci.freebsd.org and https://ci.freebsd.org show
successful builds at this point.


It looks like Cy may need to report more about the context
for the reported build failure.


===
Mark Millard
marklmi at yahoo.com




Re: git: 5e6a2d6eb220 - main - Reapply: move libc++ from /usr/lib to /lib

2021-12-30 Thread Mark Millard via freebsd-current
> This commit results in a different error.
> 
> ld: error: /export/obj/opt/src/git-src/amd64.amd64/tmp/usr/lib/libc++.so:2: 
> cannot find /usr/lib/libc++.so.1 inside /export/obj/opt/src/git-src/amd64.amd64/tmp
> >>> GROUP ( /usr/lib/libc++.so.1 /usr/lib/libcxxrt.so )
> >>> ^
> c++: error: linker command failed with exit code 1 (use -v to see 
> invocation)
> *** [libclang_rt.asan-x86_64.so.full] Error code 1
> 
> make[6]: stopped in /opt/src/git-src/lib/libclang_rt/asan_dynamic

Working in a system that had the file removed and then
manually put back after the upgrade, what I see after this
new rebuild (not installed) is:

# grep -r 'GROUP.*/lib.*/libc++.so' 
/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/
/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/lib/libc++/libc++.ld:GROUP
 ( /lib/libc++.so.1 /usr/lib/libcxxrt.so )
/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/obj-lib32/tmp/usr/lib32/libc++.so:GROUP
 ( /usr/lib32/libc++.so.1 /usr/lib32/libcxxrt.so )
/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/obj-lib32/lib/libc++/libc++.ld:GROUP
 ( /usr/lib32/libc++.so.1 /usr/lib32/libcxxrt.so )
grep: 
/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-NODBG/modules/usr/main-src/sys/modules/twa/opt_twa.h:
 No such file or directory
grep: 
/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/usr/include/dev/ic/esp.h:
 No such file or directory
/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/usr/lib/libc++.so:GROUP
 ( /lib/libc++.so.1 /usr/lib/libcxxrt.so

That has /lib/libc++.so.1 (outside lib32 materials).

But it also has: /tmp/usr/lib/libc++.so and is that a problem?

And, checking on when the files were modified:

# ls -Tld `grep -rl 'GROUP.*/lib.*/libc++.so' 
/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/`
grep: 
/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-NODBG/modules/usr/main-src/sys/modules/twa/opt_twa.h:
 No such file or directory
grep: 
/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/usr/include/dev/ic/esp.h:
 No such file or directory
-rw-r--r--  1 root  wheel  64 Dec 30 08:30:43 2021 
/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/lib/libc++/libc++.ld
-rw-r--r--  1 root  wheel  72 Dec 30 08:22:11 2021 
/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/obj-lib32/lib/libc++/libc++.ld
-r--r--r--  1 root  wheel  72 Aug 19 03:09:03 2021 
/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/obj-lib32/tmp/usr/lib32/libc++.so
-r--r--r--  1 root  wheel  64 Dec 30 08:30:43 2021 
/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/usr/lib/libc++.so

So lib/libc++/libc++.ld and tmp/usr/lib/libc++.so both had been
updated.

I used META_MODE.

So I do not get a full match to what is reported but the use of
the tmp/usr/lib/libc++.so path does seem odd.

I've not looked at what a system from before the first move of
libc++.so.1 does. I may be able to check that in a while.
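
The reported error comes from a tmp/usr/lib/libc++.so whose GROUP line
still names /usr/lib/libc++.so.1, while the rebuilt tree here names
/lib/libc++.so.1. A minimal sketch for spotting such leftover scripts in
an object tree follows. stale_libcxx_scripts is a hypothetical helper
name, and whether a leftover file explains any particular failure is an
assumption to verify, not something this check proves.

```shell
#!/bin/sh
# Sketch: list linker scripts under an object tree whose GROUP line
# still names the old /usr/lib/libc++.so.1 location.
# stale_libcxx_scripts is a hypothetical helper name.
stale_libcxx_scripts() {
    objtop=$1
    # -r recurse, -l print matching file names only, -s stay quiet about
    # files that cannot be opened (for example dangling symlinks).
    # lib32 scripts name /usr/lib32/... and therefore do not match.
    grep -rls 'GROUP.* /usr/lib/libc++\.so\.1' "$objtop"
}
```

An empty result (exit status 1 from grep) means no script in the tree
refers to the old location.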


===
Mark Millard
marklmi at yahoo.com




Re: git: 6b1c5775d1c2 - main - Move libc++ from /usr/lib to /lib

2021-12-29 Thread Mark Millard via freebsd-current
Building 33812d60b960 ( so after 6b1c5775d1c2 ) and installing
and rebooting did not put in place a /lib/libc++.so.1 but
delete-old-libs removed the /usr/lib/libc++.so.1 .

(Luckily my environment has sufficient recent near-redundancy
to recover easily by putting in place a /usr/lib/libc++.so.1 .)

===
Mark Millard
marklmi at yahoo.com




Is amd64 buildworld supposed to be working for WITH_ASAN= WITH_UBSAN= (both used)? [It fails to link various things.]

2021-12-29 Thread Mark Millard via freebsd-current
In order to avoid the following for WITH_ASAN= WITH_UBSAN= (both used),
so also WITH_LLVM_BINUTILS= in use:

--- all_subdir_usr.bin/clang ---
--- llvm-as.full ---
ld: error: undefined symbol: compressBound
>>> referenced by Compression.cpp:51 
>>> (/usr/main-src/contrib/llvm-project/llvm/lib/Support/Compression.cpp:51)
>>>   Compression.o:(llvm::zlib::compress(llvm::StringRef, 
>>> llvm::SmallVectorImpl&, int)) in archive 
>>> /usr/obj/BUILDs/main-amd64-nodbg-clang-alt/usr/main-src/amd64.amd64/lib/clang/libllvm/libllvm.a
. . .
ld: error: undefined symbol: compress2
. . .
ld: error: undefined symbol: uncompress
. . .
ld: error: undefined symbol: crc32
>>> referenced by Compression.cpp:85 
>>> (/usr/main-src/contrib/llvm-project/llvm/lib/Support/Compression.cpp:85)
>>>   Compression.o:(llvm::zlib::crc32(llvm::StringRef)) in archive 
>>> /usr/obj/BUILDs/main-amd64-nodbg-clang-alt/usr/main-src/amd64.amd64/lib/clang/libllvm/libllvm.a

I hacked in:

# git -C /usr/main-src/ diff /usr/main-src/usr.bin/clang/
diff --git a/usr.bin/clang/llvm.prog.mk b/usr.bin/clang/llvm.prog.mk
index 3a708805d3ea..74bed2ecd314 100644
--- a/usr.bin/clang/llvm.prog.mk
+++ b/usr.bin/clang/llvm.prog.mk
@@ -25,6 +25,7 @@ PACKAGE=  clang
 .if ${.MAKE.OS} == "FreeBSD" || !defined(BOOTSTRAPPING)
 LIBADD+=   execinfo
 LIBADD+=   tinfow
+LIBADD+=   z
 .endif
 LIBADD+=   pthread

which avoided the specific problem.

But the next build attempt then failed on missing Apple ObjC material
and something involving the name renderscript :

--- lldb.full ---
ld: error: undefined symbol: 
lldb_private::formatters::CMTimeSummaryProvider(lldb_private::ValueObject&, 
lldb_private::Stream&, lldb_private::TypeSummaryOptions const&)
>>> referenced by compressed_pair.h:61 
>>> (/usr/obj/BUILDs/main-amd64-nodbg-clang-alt/usr/main-src/amd64.amd64/tmp/usr/include/c++/v1/__memory/compressed_pair.h:61)
>>>   ObjCLanguage.o:(void 
>>> std::__1::__call_once_proxy
>>>  >(void*)) in archive 
>>> /usr/obj/BUILDs/main-amd64-nodbg-clang-alt/usr/main-src/amd64.amd64/lib/clang/liblldb/liblldb.a
ld: error: undefined symbol: lldb_private::AppleObjCRuntimeV2::Initialize()
. . .
ld: error: undefined symbol: lldb_private::CFBasicHash::IsValid() const
>>> referenced by NSDictionary.cpp:715 
>>> (/usr/main-src/contrib/llvm-project/lldb/source/Plugins/Language/ObjC/NSDictionary.cpp:715)
>>>   
>>> NSDictionary.o:(lldb_private::formatters::NSCFDictionarySyntheticFrontEnd::CalculateNumChildren())
>>>  in archive 
>>> /usr/obj/BUILDs/main-amd64-nodbg-clang-alt/usr/main-src/amd64.amd64/lib/clang/liblldb/liblldb.a
>>> referenced by NSSet.cpp:564 
>>> (/usr/main-src/contrib/llvm-project/lldb/source/Plugins/Language/ObjC/NSSet.cpp:564)
>>>   
>>> NSSet.o:(lldb_private::formatters::NSCFSetSyntheticFrontEnd::CalculateNumChildren())
>>>  in archive 
>>> /usr/obj/BUILDs/main-amd64-nodbg-clang-alt/usr/main-src/amd64.amd64/lib/clang/liblldb/liblldb.a
. . .
ld: error: undefined symbol: 
lldb_private::lldb_renderscript::fixupX86FunctionCalls(llvm::Module&)
. . .

Is the "ObjC" related material and the "renderscript" related
material even supposed to be referenced or linked in? If yes,
what is missing to allow the links to complete?

I've not tried to get past this. There could be more that
fails to build after lldb.


I'm not sure if WITH_ASAN= WITH_UBSAN= is supposed to do
anything for buildkernel but I've not managed to get a
buildworld to finish everything yet.


Maybe src.conf is just ahead of what the build environment
is set up for?


For reference, at the time I was having the following system
rebuild itself with WITH_ASAN= WITH_UBSAN= added (manually
line-split for readability):

# uname -apKU
FreeBSD amd64_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #21
main-n252010-254e4e5b77d7-dirty: Tue Dec 28 15:54:08 PST 2021
root@amd64_ZFS:/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-NODBG
amd64 amd64 1400045 1400045



===
Mark Millard
marklmi at yahoo.com




Re: Panic: Page Fault in Kernel: Yesterday's CURRENT

2021-12-21 Thread Michael Butler via freebsd-current

Confirmed. The kernel at ..

FreeBSD 14.0-CURRENT #0 f06f1d1fdb9: Mon Dec 20 12:24:51 EST 2021

 .. boots successfully.

The kernel at ..

FreeBSD 14.0-CURRENT #1 553af8f1ec7: Tue Dec 21 15:16:10 EST 2021

 .. fails immediately after printing something like ..

Timecounters tick every 1.000 msec
Timecounter "TSC" frequency 701570048 Hz quality 800

 .. but before initializing ipfw as it used to,

Michael

On 12/21/21 12:01, Michael Butler via freebsd-current wrote:

I have an old pentium-3 that also won't boot kernels built after Dec 6th.

I suspect the commits listed below but, with the device being remote and 
having no DRAC, I'm struggling to test this theory.


The relevant commits ..

commit 553af8f1ec71d397b5b4fd5876622b9269936e63
Author: Mark Johnston 
Date:   Mon Dec 6 10:42:19 2021 -0500

     x86: Perform late TSC calibration before LAPIC timer calibration

commit 62d09b46ad7508ae74d462e49234f0a80f91ff69
Author: Mark Johnston 
Date:   Mon Dec 6 10:42:10 2021 -0500

     x86: Defer LAPIC calibration until after timecounters are available

It's currently running git rev e43d081f352 and I have a kernel at git 
rev f06f1d1fdb969fa7a0a6eefa030d8536f365eb6e to test later this evening,


 Michael


On 12/17/21 15:07, Larry Rosenman wrote:

On 12/17/2021 1:36 pm, Mark Johnston wrote:

On Fri, Dec 10, 2021 at 10:43:19AM -0600, Larry Rosenman wrote:

14-2021_12_07-1217 -  -  1.87G 2021-12-07 12:17
14-2021_12_09-1957 NR /  121G  2021-12-09 19:57

If that's any help


I can't tell what this is saying.  A kernel built on the 7th does not
crash, or...?  Which revision did you update from before you started
seeing crashes?

From a kgdb session it'd be useful to see output from

(kgdb) frame 8
(kgdb) p/x *tmp

to start.



Correct, the 7th didn't panic, but the 9th did, and yesterday's too.

Grrr
ler in borg in /mnt🔒 on ☁️  (us-east-1)
❯ kgdb -c /var/crash/vmcore.0  /mnt/boot/kernel/kernel
GNU gdb (GDB) 11.1 [GDB v11.1 for FreeBSD]
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
<http://gnu.org/licenses/gpl.html>

This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd14.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
 <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /mnt/boot/kernel/kernel...
(No debugging symbols found in /mnt/boot/kernel/kernel)
Failed to open vmcore: /var/crash/vmcore.0: Permission denied
(kgdb) bt
No stack.
(kgdb) quit

ler in borg in /mnt🔒 on ☁️  (us-east-1) took 6s
❯ sudo chmod +r /var/crash/*

ler in borg in /mnt🔒 on ☁️  (us-east-1)
❯ kgdb -c /var/crash/vmcore.0  /mnt/boot/kernel/kernel
GNU gdb (GDB) 11.1 [GDB v11.1 for FreeBSD]
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
<http://gnu.org/licenses/gpl.html>

This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd14.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
 <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /mnt/boot/kernel/kernel...
(No debugging symbols found in /mnt/boot/kernel/kernel)
/wrkdirs/usr/ports/devel/gdb/work-py37/gdb-11.1/gdb/thread.c:1345: 
internal-error: void switch_to_thread(thread_info *): Assertion `thr 
!= NULL' failed.

A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) n

This is a bug, please report it.  For instructions, see:
<https://www.gnu.org/software/gdb/bugs/>.

/wrkdirs/usr/ports/devel/gdb/work-py37/gdb-11.1/gdb/thread.c:1345: 
internal-error: void switch_to_thread(thread_info *): Assertion `thr 
!= NULL' failed.

A problem internal to GDB has been detected,
further debugging may prove unreliable.
Create a core file of GDB? (y or n) n
Command aborted.
(kgdb) bt
No thread selected.
(kgdb) fr 8
No thread selected.
(kgdb)


On 12/10/2021 10:36 am, Alexander Motin wrote:

Re: Panic: Page Fault in Kernel: Yesterday's CURRENT

2021-12-21 Thread Michael Butler via freebsd-current

I have an old pentium-3 that also won't boot kernels built after Dec 6th.

I suspect the commits listed below but, with the device being remote and 
having no DRAC, I'm struggling to test this theory.


The relevant commits ..

commit 553af8f1ec71d397b5b4fd5876622b9269936e63
Author: Mark Johnston 
Date:   Mon Dec 6 10:42:19 2021 -0500

x86: Perform late TSC calibration before LAPIC timer calibration

commit 62d09b46ad7508ae74d462e49234f0a80f91ff69
Author: Mark Johnston 
Date:   Mon Dec 6 10:42:10 2021 -0500

x86: Defer LAPIC calibration until after timecounters are available

It's currently running git rev e43d081f352 and I have a kernel at git 
rev f06f1d1fdb969fa7a0a6eefa030d8536f365eb6e to test later this evening,


Michael


On 12/17/21 15:07, Larry Rosenman wrote:

On 12/17/2021 1:36 pm, Mark Johnston wrote:

On Fri, Dec 10, 2021 at 10:43:19AM -0600, Larry Rosenman wrote:

14-2021_12_07-1217 -  -  1.87G 2021-12-07 12:17
14-2021_12_09-1957 NR /  121G  2021-12-09 19:57

If that's any help


I can't tell what this is saying.  A kernel built on the 7th does not
crash, or...?  Which revision did you update from before you started
seeing crashes?

From a kgdb session it'd be useful to see output from

(kgdb) frame 8
(kgdb) p/x *tmp

to start.



Correct, the 7th didn't panic, but the 9th did, and yesterday's too.

Grrr
ler in borg in /mnt🔒 on ☁️  (us-east-1)
❯ kgdb -c /var/crash/vmcore.0  /mnt/boot/kernel/kernel
GNU gdb (GDB) 11.1 [GDB v11.1 for FreeBSD]
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
<http://gnu.org/licenses/gpl.html>

This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd14.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /mnt/boot/kernel/kernel...
(No debugging symbols found in /mnt/boot/kernel/kernel)
Failed to open vmcore: /var/crash/vmcore.0: Permission denied
(kgdb) bt
No stack.
(kgdb) quit

ler in borg in /mnt🔒 on ☁️  (us-east-1) took 6s
❯ sudo chmod +r /var/crash/*

ler in borg in /mnt🔒 on ☁️  (us-east-1)
❯ kgdb -c /var/crash/vmcore.0  /mnt/boot/kernel/kernel
GNU gdb (GDB) 11.1 [GDB v11.1 for FreeBSD]
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
<http://gnu.org/licenses/gpl.html>

This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd14.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /mnt/boot/kernel/kernel...
(No debugging symbols found in /mnt/boot/kernel/kernel)
/wrkdirs/usr/ports/devel/gdb/work-py37/gdb-11.1/gdb/thread.c:1345: 
internal-error: void switch_to_thread(thread_info *): Assertion `thr != 
NULL' failed.

A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) n

This is a bug, please report it.  For instructions, see:
<https://www.gnu.org/software/gdb/bugs/>.

/wrkdirs/usr/ports/devel/gdb/work-py37/gdb-11.1/gdb/thread.c:1345: 
internal-error: void switch_to_thread(thread_info *): Assertion `thr != 
NULL' failed.

A problem internal to GDB has been detected,
further debugging may prove unreliable.
Create a core file of GDB? (y or n) n
Command aborted.
(kgdb) bt
No thread selected.
(kgdb) fr 8
No thread selected.
(kgdb)


On 12/10/2021 10:36 am, Alexander Motin wrote:
> Hi Larry,
>
> This looks like some use-after-free or otherwise corrupted callout
> structure.  Unfortunately the backtrace does not tell what was the
> callout.  When was the previous update to look what could change?
>
> On 10.12.2021 11:24, Larry Rosenman wrote:
>> FreeBSD borg.lerctr.org 14.0-CURRENT FreeBSD 14.0-CURRENT #15
>> main-n251537-ab639f2398b: Thu Dec  9 19:45:37 CST 2021
>> r...@borg.lerctr.org:/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL
>> amd64
>>
>> VMCORE *IS* available.
>>
>>
>>
>>
>> Unread portion of the kernel message buffer:
>> kernel trap 12 with interrupts disabled
>>
>>
>> Fatal trap 12: page fault while in kernel mode
>> cpuid = 0; apic id = 20
>> fault virtual address   = 0x0
>> fault code 

Re: git: 30780c3f584a - stable/13 - README.md: correct GPL expansion

2021-12-18 Thread Mark Millard via freebsd-current
On 2021-Dec-18, at 09:30, Ed Maste  wrote:

> On Fri, 17 Dec 2021 at 11:09, Mark Millard  wrote:
>> 
>> I'm confused, beyond just LGPL claims in the (fairly
>> current) source code, but GPL more generally:
>> 
>> # grep -rl "SPDX.*GPL" /usr/main-src/
> 
> You need to exclude the ones with SPDX tags like:
> SPDX-License-Identifier: BSD-2-Clause OR GPL-2.0
> 
> but also note that this text in README.md is just documenting the
> top-level gnu/ subdirectory.

# grep -r "SPDX.*GPL" /usr/main-src/ | egrep -vi "(mit|bsd|Linux-OpenIB)" | 
grep -v sys/contrib/device-tree/ | more
/usr/main-src/sys/gnu/gcov/gcc_4_7.c:// SPDX-License-Identifier: GPL-2.0
/usr/main-src/sys/gnu/gcov/gcov_fs.c:// SPDX-License-Identifier: GPL-2.0
/usr/main-src/sys/dts/include/dt-bindings/soc/qcom,tcsr.h:/* 
SPDX-License-Identifier: GPL-2.0 */

But . . .

# grep -r "SPDX.*GPL" /usr/main-src/ | egrep -vi "(mit|bsd|Linux-OpenIB)" | 
grep sys/contrib/device-tree/ | wc
    3104    9958  345089
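
A sketch of a slightly stricter variant of the pipeline above. The
egrep -v there filters whole "path:line" records, so a path containing
"bsd" or "mit" would also be dropped; this version applies the same
exclusion words to the SPDX tag lines only. gpl_only_files is a
hypothetical helper name, and the word list is carried over from the
command above, so it stays a rough heuristic rather than a license
audit.

```shell
#!/bin/sh
# Sketch: list files having an SPDX tag line that names a GPL-family
# license with no MIT/BSD/Linux-OpenIB alternative on that same line.
# gpl_only_files is a hypothetical helper name.
gpl_only_files() {
    srcdir=$1
    # First pass: every file with any SPDX...GPL line.
    grep -rls 'SPDX.*GPL' "$srcdir" |
    while IFS= read -r f; do
        # Keep the file if at least one such line survives the exclusion.
        if grep 'SPDX.*GPL' "$f" | grep -Evqi 'mit|bsd|Linux-OpenIB'; then
            echo "$f"
        fi
    done
}
```

For example: gpl_only_files /usr/main-src/sys/gnu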

===
Mark Millard
marklmi at yahoo.com




Re: git: 30780c3f584a - stable/13 - README.md: correct GPL expansion

2021-12-17 Thread Mark Millard via freebsd-current
From: Ed Maste  wrote on
Date: Fri, 17 Dec 2021 08:42:49 -0500 :

> On Fri, 17 Dec 2021 at 05:12, Baptiste Daroussin  wrote:
> >
> > > -gnu  Various commands and libraries under the GNU Public License.
> > > - Please see gnu/COPYING* for more information.
> > > +gnu  Various commands and libraries under the GNU General Public
> > > + License.  Please see gnu/COPYING* for more information.
> >
> > Which is wrong ;)
> >
> > There no library left under the GNU general public license, only one under 
> > LGPL
> > which will be soon gone.
> >
> > As for the commands, various is now a bit overrated, only diff3 is under GPL
> 
> Good point, in main right now we have LGPL dialog and libdialog, and
> GPL diff3, with additional ones in the stable branches.

I'm confused, beyond just LGPL claims in the (fairly
current) source code, but GPL more generally:

# grep -rl "SPDX.*GPL" /usr/main-src/
/usr/main-src/sys/compat/linuxkpi/common/include/linux/net_dim.h
/usr/main-src/sys/ofed/include/rdma/signature.h
/usr/main-src/sys/ofed/include/rdma/ib_smi.h
/usr/main-src/sys/ofed/include/rdma/rdmavt_qp.h
/usr/main-src/sys/ofed/include/rdma/opa_smi.h
. . .
/usr/main-src/sys/ofed/drivers/infiniband/util/madeye.c
/usr/main-src/sys/ofed/drivers/infiniband/core/ib_cq.c
/usr/main-src/sys/ofed/drivers/infiniband/core/ib_uverbs_uapi.c
/usr/main-src/sys/ofed/drivers/infiniband/core/ib_umem.c
. . .
/usr/main-src/sys/dev/isci/scil/sati_types.h
/usr/main-src/sys/dev/isci/scil/sati_mode_sense_10.c
/usr/main-src/sys/dev/isci/scil/sati_unmap.c
/usr/main-src/sys/dev/isci/scil/scic_sds_stp_packet_request.c
/usr/main-src/sys/dev/isci/scil/scic_sds_smp_request.h
/usr/main-src/sys/dev/isci/scil/sati_translator_sequence.h
/usr/main-src/sys/dev/isci/scil/scic_sds_stp_request.c
/usr/main-src/sys/dev/isci/scil/sati_mode_sense.c
. . .
/usr/main-src/sys/contrib/device-tree/Bindings/chrome/google,cros-ec-typec.yaml
/usr/main-src/sys/contrib/device-tree/Bindings/dsp/fsl,dsp.yaml
/usr/main-src/sys/contrib/device-tree/Bindings/reset/allwinner,sun6i-a31-clock-reset.yaml
/usr/main-src/sys/contrib/device-tree/Bindings/reset/brcm,bcm7216-pcie-sata-rescal.yaml
/usr/main-src/sys/contrib/device-tree/Bindings/reset/brcm,bcm6345-reset.yaml
/usr/main-src/sys/contrib/device-tree/Bindings/reset/brcm,bcm4908-misc-pcie-reset.yaml
. . .
/usr/main-src/sys/contrib/device-tree/include/dt-bindings/input/ti-drv260x.h
/usr/main-src/sys/contrib/device-tree/include/dt-bindings/input/gpio-keys.h
/usr/main-src/sys/contrib/device-tree/include/dt-bindings/input/cros-ec-keyboard.h
/usr/main-src/sys/contrib/device-tree/include/dt-bindings/input/input.h
/usr/main-src/sys/contrib/device-tree/include/dt-bindings/input/atmel-maxtouch.h
/usr/main-src/sys/contrib/device-tree/include/dt-bindings/input/linux-event-codes.h
. . .
/usr/main-src/sys/contrib/device-tree/COPYING
/usr/main-src/sys/contrib/device-tree/src/openrisc/or1klitex.dts
/usr/main-src/sys/contrib/device-tree/src/openrisc/or1ksim.dts
/usr/main-src/sys/contrib/device-tree/src/xtensa/lx200mx.dts
/usr/main-src/sys/contrib/device-tree/src/xtensa/kc705_nommu.dts
/usr/main-src/sys/contrib/device-tree/src/xtensa/ml605.dts
/usr/main-src/sys/contrib/device-tree/src/xtensa/virt.dts
/usr/main-src/sys/contrib/device-tree/src/xtensa/csp.dts
. . .
/usr/main-src/sys/contrib/dev/iwlwifi/iwl-op-mode.h
/usr/main-src/sys/contrib/dev/iwlwifi/iwl-io.h
/usr/main-src/sys/contrib/dev/iwlwifi/iwl-fh.h
/usr/main-src/sys/contrib/dev/iwlwifi/iwl-eeprom-read.c
/usr/main-src/sys/contrib/dev/iwlwifi/fw/dbg.h
/usr/main-src/sys/contrib/dev/iwlwifi/fw/pnvm.c
. . .
/usr/main-src/sys/fs/fuse/fuse_kernel.h
/usr/main-src/sys/gnu/gcov/gcc_4_7.c
/usr/main-src/sys/gnu/gcov/gcov_fs.c
/usr/main-src/sys/dts/arm/qcom-ipq4018-rt-ac58u.dts
/usr/main-src/sys/dts/include/dt-bindings/soc/qcom,tcsr.h
/usr/main-src/sys/cam/scsi/scsi_ses.h
/usr/main-src/sys/cam/scsi/scsi_enc.h
/usr/main-src/share/man/man4/pvscsi.4
/usr/main-src/share/man/man4/vmci.4


For reference:

# cd /usr/main-src/
# ~/fbsd-based-on-what-commit.sh 
branch: main
merge-base: 22c4ab6cb015dc99eb82504e5fd957662cded3c3
merge-base: CommitDate: 2021-12-07 19:29:26 +
22c4ab6cb015 (HEAD -> main, freebsd/main, freebsd/HEAD) sys/_bitset.h: Fix fall-out from commit 5e04571cf3c
n251456 (--first-parent --count for merge-base)

===
Mark Millard
marklmi at yahoo.com




On amd64 main-n251456-22c4ab6cb015-dirty (Dec.-7): /boot/kernel/ng_ubt.ko is getting "symbol sysctl___net_bluetooth undefined"

2021-12-15 Thread Mark Millard via freebsd-current
Back when I upgraded the ThreadRipper 1950X amd64 system to (line split for 
readability):

# uname -apKU
FreeBSD CA72_16Gp_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #25 
main-n251456-22c4ab6cb015-dirty:
Tue Dec  7 19:38:53 PST 2021
root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
 
arm64 aarch64 1400043 1400043

I started getting notices like:

Dec  7 18:38:57 amd64_ZFS kernel: link_elf_obj: symbol sysctl___net_bluetooth undefined
Dec  7 18:38:57 amd64_ZFS kernel: linker_load_file: /boot/kernel/ng_ubt.ko - unsupported file type

(Not that I use the bluetooth on the system.)

Is this expected for a kernel of that vintage?

For reference:

# more /usr/main-src/sys/amd64/conf/GENERIC-NODBG
#
# GENERIC -- Custom configuration for the amd64/amd64
#

include "GENERIC"

ident   GENERIC-NODBG

makeoptions     DEBUG=-g                # Build kernel with gdb(1) debug symbols
makeoptions     WITH_CTF=1              # Run ctfconvert(1) for DTrace support

options         NUMA
options         MAXMEMDOM=2

#options        ALT_BREAK_TO_DEBUGGER

options         KDB                     # Enable kernel debugger support

# For minimum debugger support (stable branch) use:
options         KDB_TRACE               # Print a stack trace for a panic
options         DDB                     # Enable the kernel debugger

# Extra stuff:
#options        VERBOSE_SYSINIT=0       # Enable verbose sysinit messages
#options        BOOTVERBOSE=1
#options        BOOTHOWTO=RB_VERBOSE
#options        KTR
#options        KTR_MASK=KTR_TRAP
##options       KTR_CPUMASK=0xF
#options        KTR_VERBOSE
#options        ACPI_DEBUG

# Disable any extra checking for. . .
nooptions       DEADLKRES               # Would enable the deadlock resolver
nooptions       INVARIANTS              # Would enable calls of extra sanity checking
nooptions       INVARIANT_SUPPORT       # Would enable extra sanity checks of internal structures, required by INVARIANTS
nooptions       WITNESS                 # Would enable checks to detect deadlocks and cycles
nooptions       WITNESS_SKIPSPIN        # Would enable running witness on spinlocks for speed
nooptions       DIAGNOSTIC
nooptions       MALLOC_DEBUG_MAXZONES

# Kernel Sanitizers
nooptions       COVERAGE                # Would enable generic kernel coverage. Used by KCOV
nooptions       KCOV                    # Would enable Kernel Coverage Sanitizer
# Warning: KUBSAN can result in a kernel too large for loader to load
nooptions       KUBSAN                  # Would enable Kernel Undefined Behavior Sanitizer

device  iwm
device  iwmfw


sysctl___net_bluetooth seems to be from one or more of:

# grep -r "net_bluetooth[^_]" /usr/main-src/sys/ | more
/usr/main-src/sys/netgraph/bluetooth/common/ng_bluetooth.c:SYSCTL_INT(_net_bluetooth, OID_AUTO, version,
/usr/main-src/sys/netgraph/bluetooth/common/ng_bluetooth.c:SYSCTL_NODE(_net_bluetooth, OID_AUTO, hci, CTLFLAG_RW | CTLFLAG_MPSAFE, 0,
/usr/main-src/sys/netgraph/bluetooth/common/ng_bluetooth.c:SYSCTL_NODE(_net_bluetooth, OID_AUTO, l2cap, CTLFLAG_RW | CTLFLAG_MPSAFE, 0,
/usr/main-src/sys/netgraph/bluetooth/common/ng_bluetooth.c:SYSCTL_NODE(_net_bluetooth, OID_AUTO, rfcomm, CTLFLAG_RW | CTLFLAG_MPSAFE, 0,
/usr/main-src/sys/netgraph/bluetooth/common/ng_bluetooth.c:SYSCTL_NODE(_net_bluetooth, OID_AUTO, sco, CTLFLAG_RW | CTLFLAG_MPSAFE, 0,
/usr/main-src/sys/netgraph/bluetooth/drivers/ubt/ng_ubt.c:SYSCTL_INT(_net_bluetooth, OID_AUTO, usb_isoc_enable, CTLFLAG_RWTUN | CTLFLAG_MPSAFE,
/usr/main-src/sys/netgraph/bluetooth/include/ng_bluetooth.h:SYSCTL_DECL(_net_bluetooth);


===
Mark Millard
marklmi at yahoo.com




Re: CURRENT: ZFS freezes system beyond reboot

2021-12-15 Thread Mark Millard via freebsd-current
From: FreeBSD User  wrote on
Date: Wed, 15 Dec 2021 18:55:09 +0100 :

> . . .
> 
> It is spooky, if not to say "buggy", if ZFS is capable of freezing the whole
> box even if the essential operating system stuff is isolated on a dedicated
> UFS filesystem.

I would guess that, when ZFS is in use, everything related to (for
example) the ARC counts as "essential operating system stuff": the ARC
is tied to wired memory usage, and it greatly changes the wired memory
usage pattern and sizing compared to a system where ZFS is not
involved (UFS only).

(I only use ZFS in a simpler context, however.)

===
Mark Millard
marklmi at yahoo.com




Re: question on socket server

2021-12-15 Thread Ronald Klop via freebsd-current

Hi,

Your program first waits for the first client to connect, so nothing is
written anywhere until then.
You can check by running "nc -v localhost " in another terminal.
After the first client disconnects, the while loop keeps running and the
print will return 0, which means failure.

Something like this will improve things:

    if (0 == print "test\015\012") {
        return;
    }

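For comparison, the same idea can be sketched outside Perl. The block below is Python (an assumption for illustration; the thread itself uses Net::Server), the helper name send_until_disconnect is made up, and socket.socketpair() merely stands in for an accepted client connection. The loop stops at the first write that fails because the peer has gone away:

```python
import socket


def send_until_disconnect(conn, lines):
    """Send each line with CRLF; stop at the first write that fails.

    Returns the number of lines that were written successfully.
    """
    sent = 0
    for line in lines:
        try:
            conn.sendall(line.encode("ascii") + b"\r\n")
        except (BrokenPipeError, ConnectionResetError):
            # The peer closed the connection: stop instead of looping forever.
            break
        sent += 1
    return sent


if __name__ == "__main__":
    # socketpair() gives two connected endpoints without needing a port.
    server_side, client_side = socket.socketpair()
    print(send_until_disconnect(server_side, ["apple", "banana"]))
    client_side.close()  # simulate the client disconnecting
    print(send_until_disconnect(server_side, ["cherry"]))
    server_side.close()
```

Python ignores SIGPIPE by default, which is why the failed write surfaces as an exception here; a Perl server may need to handle or ignore SIGPIPE before print can report the failure.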

Regards and happy hacking,
Ronald.

PS: I think this does not have much to do with freebsd-current. You might move
it to https://lists.freebsd.org/subscription/freebsd-perl or some generic Perl
forum or mailing list.


From: Piper H 
Date: Wednesday, 15 December 2021 11:55
To: Ronald Klop 
CC: freebsd-current@freebsd.org
Subject: Re: question on socket server


But I wrote this program to listen on port  and send a random string to the 
socket every 0.25 seconds. No client connects to the port, yet the server 
just keeps running without a problem. :( So I am still not sure...
 
use strict;


package MyPackage;
use base qw(Net::Server);


my @fruit=qw(
...
);


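sub process_request {
    my $self = shift;
    $| = 1;                         # unbuffered output
    my $max = scalar @fruit;

    while (1) {
        # pick a random entry from @fruit
        my $id1 = int(rand($max));
        my $str = $fruit[$id1];

        print "$str\015\012";       # CRLF-terminated line
        select(undef, undef, undef, 0.25);   # sleep for 0.25 seconds
    }
}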
 
MyPackage->run(port => , ipv => '*');
 
On Wed, Dec 15, 2021 at 6:51 PM Ronald Klop  wrote:


Hi,

Just try it!

I think you will get an error saying that you are writing to a socket that is not connected.
From "man 2 write":
"[EPIPE] An attempt is made to write to a socket of type SOCK_STREAM
that is not connected to a peer socket."

See also "man 2 send"  and "man 2 socket" for a lot more information.

So it depends a bit on the type of socket you created.

Regards and happy hacking,
Ronald.

 
From: Piper H 

Date: Wednesday, 15 December 2021 07:52
To: freebsd-current@freebsd.org
Subject: question on socket server


Hello

I have little knowledge about socket programming.
My question: suppose I have made a socket server that listens on a
port. The server prints data to the socket, but no client ever connects
to the port, so the data is never consumed. What will happen to
the server then? Will the OS kernel be flooded with junk bytes?

Thanks for your help.
Piper







Re: question on socket server

2021-12-15 Thread Ronald Klop via freebsd-current

Hi,

Just try it!

I think you will get an error saying that you are writing to a socket that is not connected.

From "man 2 write":

"[EPIPE] An attempt is made to write to a socket of type SOCK_STREAM
that is not connected to a peer socket."

See also "man 2 send"  and "man 2 socket" for a lot more information.

So it depends a bit on the type of socket you created.
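
A quick way to "just try it" is sketched below in Python (chosen here for illustration; it is not the language used in this thread). The function name write_without_peer is hypothetical. It creates a listening TCP socket with no connected peer and attempts to send on it. Which error comes back is an assumption that depends on the operating system: EPIPE and ENOTCONN are both plausible.

```python
import errno
import socket


def write_without_peer():
    """Try to send on a listening TCP socket that has no connected peer.

    Returns the symbolic errno name of the resulting OSError, or None if
    the send unexpectedly succeeds.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
        listener.listen(1)
        try:
            listener.send(b"test\r\n")
        except OSError as exc:
            return errno.errorcode.get(exc.errno, str(exc.errno))
        return None
    finally:
        listener.close()


if __name__ == "__main__":
    print(write_without_peer())
```

Either way the data is rejected immediately and is not queued in the kernel.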

Regards and happy hacking,
Ronald.


From: Piper H 
Date: Wednesday, 15 December 2021 07:52
To: freebsd-current@freebsd.org
Subject: question on socket server


Hello

I have little knowledge about socket programming.
My question: suppose I have made a socket server that listens on a
port. The server prints data to the socket, but no client ever connects
to the port, so the data is never consumed. What will happen to
the server then? Will the OS kernel be flooded with junk bytes?

Thanks for your help.
Piper




Re: /usr/share/zfs/compatibility.d/openzfs-2.1-freebsd vs. openzfs-2.1-linux vs. FreeBSD main [so: 14]: edonr status

2021-12-14 Thread Mark Millard via freebsd-current
On 2021-Dec-14, at 17:35, Alexander Motin  wrote:

> On 14.12.2021 20:21, Mark Millard wrote:
>> I presume that because of FreeBSD's releng/13.0 and stable/13 (and
>> releng/13.? futures) that:
>> 
>> /usr/share/zfs/compatibility.d/openzfs-2.1-freebsd
>> 
>> will never have edonr added to the file. Sound right?
> 
> FreeBSD stable/13 is tracking the still-alive upstream zfs-2.1-release
> branch.  It is still updated periodically, but primarily with bug fixes.

I infer from the above that:

/usr/share/zfs/compatibility.d/openzfs-2.1-freebsd

is unlikely to be changed in a way that makes it inaccurate for
releng/13.0, at least as long as 13.0 is a supported FreeBSD release,
and probably the same holds for all of releng/13.? .

>> Is there going to be a /usr/share/zfs/compatibility.d/openzfs-2.*-freebsd*
>> that has edonr as well (instead of using an openzfs-2.1-linux file for
>> such)? If yes, when does the file show up? Does main get drafts of the
>> file over time until there is a releng/14.0 that would have the final
>> version?
> 
> FreeBSD main though tracks openzfs master branch, and as a moving target
> it has no compatibility definitions.  I'd expect by the time of FreeBSD
> stable/14 branching there to be some new openzfs branch it could switch
> to, but so far AFAIK there were no specific announcements yet.  And
> enabled edonr is a step toward not differentiating FreeBSD and Linux
> compatibility settings any more.

I infer from the above that it will be much closer to releng/14.0 's
time frame before there will be an additional:

/usr/share/zfs/compatibility.d/openzfs-*-freebsd*

( or a name that does not even mention freebsd or linux but
applies to releng/14.0 ). Good to know.



I could imagine FreeBSD having links with names making it
clear what each FreeBSD release should use for a matching
feature-list file when one is desired. For example:

# ln -s openzfs-2.1-freebsd openzfs-freebsd-13.0-RELEASE

I had to do multiple comparisons to know for sure what
file to use to have a match: it was not obvious from my
background knowledge.
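
The "multiple comparisons" amount to looking for a feature list that equals the pool's feature set exactly. A minimal sketch of that lookup follows, in Python (not used in this thread). The function name matching_sets is made up, and the feature lists are trimmed, hypothetical stand-ins containing only names visible in the diff quoted in this thread, not the full contents of the real files.

```python
def matching_sets(pool_features, compat_sets):
    """Return the names of the feature sets that equal pool_features exactly."""
    wanted = set(pool_features)
    return sorted(
        name for name, feats in compat_sets.items() if set(feats) == wanted
    )


# Trimmed, hypothetical feature lists (NOT the complete real files).
COMMON = {
    "allocation_classes", "async_destroy", "bookmark_v2",
    "device_rebuild", "device_removal", "draid",
    "embedded_data", "empty_bpobj", "enabled_txg",
}
COMPAT = {
    "openzfs-2.1-freebsd": COMMON,
    "openzfs-2.1-linux": COMMON | {"edonr"},
}

if __name__ == "__main__":
    print(matching_sets(COMMON, COMPAT))
```

With the real files, the sets would be read from /usr/share/zfs/compatibility.d/ instead of being written inline.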

(From what you have reported, I'd not expect stable/* or
main to have such links.)



Thanks for the information. I know better what to do now.

>>> On 14.12.2021 19:36, Mark Millard via freebsd-current wrote:
>>>> I just noticed that main reports that my pools were created
>>>> implicitly matching openzfs-2.1-freebsd (and without
>>>> an explicit compatibility assignment) but, under main, zpool
>>>> import and zpool status for those pools report a new, disabled
>>>> feature. Turns out the issue matches what the diff below shows
>>>> as present for openzfs-2.1-linux but not for
>>>> openzfs-2.1-freebsd :
>>>> 
>>>> # diff -u /usr/share/zfs/compatibility.d/openzfs-2.1-[fl]*
>>>> --- /usr/share/zfs/compatibility.d/openzfs-2.1-freebsd  2021-12-07 21:23:21.573542000 -0800
>>>> +++ /usr/share/zfs/compatibility.d/openzfs-2.1-linux    2021-12-07 21:23:21.581738000 -0800
>>>> @@ -1,4 +1,4 @@
>>>> -# Features supported by OpenZFS 2.1 on FreeBSD
>>>> +# Features supported by OpenZFS 2.1 on Linux
>>>> allocation_classes
>>>> async_destroy
>>>> bookmark_v2
>>>> @@ -7,6 +7,7 @@
>>>> device_rebuild
>>>> device_removal
>>>> draid
>>>> +edonr
>>>> embedded_data
>>>> empty_bpobj
>>>> enabled_txg
>>>> 
>>>> So I've taken to updating my existing zpools via:
>>>> 
>>>> zpool set compatibility=openzfs-2.1-freebsd NAME
>>>> 
>>>> because I use them under releng/13 and stable/13 and main
>>>> and do not want edonr accidentally enabled.
>>>> 
>>>> It is not obvious to me if edonr being present for main
>>>> is deliberate or not.
>>>> 
>>>> For reference:
>>>> 
>>>> # grep edonr /usr/share/zfs/compatibility.d/*
>>>> /usr/share/zfs/compatibility.d/openzfs-2.0-linux:edonr
>>>> /usr/share/zfs/compatibility.d/openzfs-2.1-linux:edonr
>>>> /usr/share/zfs/compatibility.d/openzfsonosx-1.7.0:edonr
>>>> /usr/share/zfs/compatibility.d/openzfsonosx-1.8.1:edonr
>>>> /usr/share/zfs/compatibility.d/openzfsonosx-1.9.3:edonr
>>>> /usr/share/zfs/compatibility.d/openzfsonosx-1.9.4:edonr
>>>> /usr/share/zfs/compatibility.d/ubuntu-18.04:edonr
>>>> /usr/share/zfs/compatibility.d/ubuntu-20.04:edonr
>>>> /usr/share/zfs/compatibility.d/zol-0.7:edonr
>>>> /usr/share/zfs/compatibility.d/zol-0.8:edonr
>>>> 
>>>> I happened to do this activity in an aarch64 context, in
>>>> case that matters.
>>> 
>> 
> 

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)




Re: /usr/share/zfs/compatibility.d/openzfs-2.1-freebsd vs. openzfs-2.1-linux vs. FreeBSD main [so: 14]: edonr status

2021-12-14 Thread Mark Millard via freebsd-current
On 2021-Dec-14, at 16:54, Alexander Motin  wrote:

> Mark,
> 
> Support for edonr checksums was added to FreeBSD main about a month ago:
> https://github.com/openzfs/zfs/pull/12735 .  In 13 it is indeed still
> not supported.  But you should not worry too much about it, since even
> an enabled but not activated feature should not cause problems with pool
> import by older versions.  And it will become activated only when you
> explicitly set checksum=edonr for some dataset.  Some other
> features do activate immediately on enable, but compression and
> checksumming algorithms generally should not, with the exception of lz4,
> which was optional originally but became the default later.
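
The distinction between a disabled, an enabled, and an active feature can be checked from the feature@ pool properties. The sketch below is Python (not used in this thread); the function names, the pool name tank, and the sample text are hypothetical, written in the column layout that "zpool get" prints. It is a simplification: it treats only active features that the older system lacks as a problem, and it ignores read-only-compatible features.

```python
# Hypothetical sample in the NAME / PROPERTY / VALUE / SOURCE layout.
SAMPLE = """\
NAME  PROPERTY              VALUE     SOURCE
tank  feature@edonr         enabled   local
tank  feature@lz4_compress  active    local
tank  feature@draid         disabled  local
"""


def feature_states(text):
    """Map feature name -> state (disabled, enabled or active)."""
    states = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1].startswith("feature@"):
            states[fields[1][len("feature@"):]] = fields[2]
    return states


def blocking_features(states, unsupported):
    """Features that are active AND unsupported by the older system."""
    return sorted(f for f in unsupported if states.get(f) == "active")


if __name__ == "__main__":
    states = feature_states(SAMPLE)
    print(states)
    print(blocking_features(states, {"edonr"}))
```

In this sample edonr is only enabled, so it is not reported as blocking, which matches the explanation quoted above.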

I presume that because of FreeBSD's releng/13.0 and stable/13 (and
releng/13.? futures) that:

/usr/share/zfs/compatibility.d/openzfs-2.1-freebsd

will never have edonr added to the file. Sound right?

Is there going to be a /usr/share/zfs/compatibility.d/openzfs-2.*-freebsd*
that has edonr as well (instead of using an openzfs-2.1-linux file for
such)? If yes, when does the file show up? Does main get drafts of the
file over time until there is a releng/14.0 that would have the final
version?

> On 14.12.2021 19:36, Mark Millard via freebsd-current wrote:
>> I just noticed that main reports that my pools were created
>> implicitly matching openzfs-2.1-freebsd (and without
>> an explicit compatibility assignment) but, under main, zpool
>> import and zpool status for those pools report a new, disabled
>> feature. Turns out the issue matches what the diff below shows
>> as present for openzfs-2.1-linux but not for
>> openzfs-2.1-freebsd :
>> 
>> # diff -u /usr/share/zfs/compatibility.d/openzfs-2.1-[fl]*
>> --- /usr/share/zfs/compatibility.d/openzfs-2.1-freebsd  2021-12-07 21:23:21.573542000 -0800
>> +++ /usr/share/zfs/compatibility.d/openzfs-2.1-linux    2021-12-07 21:23:21.581738000 -0800
>> @@ -1,4 +1,4 @@
>> -# Features supported by OpenZFS 2.1 on FreeBSD
>> +# Features supported by OpenZFS 2.1 on Linux
>> allocation_classes
>> async_destroy
>> bookmark_v2
>> @@ -7,6 +7,7 @@
>> device_rebuild
>> device_removal
>> draid
>> +edonr
>> embedded_data
>> empty_bpobj
>> enabled_txg
>> 
>> So I've taken to updating my existing zpools via:
>> 
>> zpool set compatibility=openzfs-2.1-freebsd NAME
>> 
>> because I use them under releng/13 and stable/13 and main
>> and do not want edonr accidentally enabled.
>> 
>> It is not obvious to me if edonr being present for main
>> is deliberate or not.
>> 
>> For reference:
>> 
>> # grep edonr /usr/share/zfs/compatibility.d/*
>> /usr/share/zfs/compatibility.d/openzfs-2.0-linux:edonr
>> /usr/share/zfs/compatibility.d/openzfs-2.1-linux:edonr
>> /usr/share/zfs/compatibility.d/openzfsonosx-1.7.0:edonr
>> /usr/share/zfs/compatibility.d/openzfsonosx-1.8.1:edonr
>> /usr/share/zfs/compatibility.d/openzfsonosx-1.9.3:edonr
>> /usr/share/zfs/compatibility.d/openzfsonosx-1.9.4:edonr
>> /usr/share/zfs/compatibility.d/ubuntu-18.04:edonr
>> /usr/share/zfs/compatibility.d/ubuntu-20.04:edonr
>> /usr/share/zfs/compatibility.d/zol-0.7:edonr
>> /usr/share/zfs/compatibility.d/zol-0.8:edonr
>> 
>> I happened to do this activity in an aarch64 context, in
>> case that matters.
> 
> 

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)




Re: /usr/share/zfs/compatibility.d/openzfs-2.1-freebsd vs. openzfs-2.1-linux vs. FreeBSD main [so: 14]: edonr status

2021-12-14 Thread Mark Millard via freebsd-current
On 2021-Dec-14, at 16:36, Mark Millard  wrote:

> I just noticed that main reports that my pools were created
> implicitly matching openzfs-2.1-freebsd (and without
> an explicit compatibility assignment) but, under main, zpool
> import and zpool status for those pools report a new, disabled
> feature. Turns out the issue matches what the diff below shows
> as present for openzfs-2.1-linux but not for
> openzfs-2.1-freebsd :
> 
> # diff -u /usr/share/zfs/compatibility.d/openzfs-2.1-[fl]*
> --- /usr/share/zfs/compatibility.d/openzfs-2.1-freebsd  2021-12-07 21:23:21.573542000 -0800
> +++ /usr/share/zfs/compatibility.d/openzfs-2.1-linux    2021-12-07 21:23:21.581738000 -0800
> @@ -1,4 +1,4 @@
> -# Features supported by OpenZFS 2.1 on FreeBSD
> +# Features supported by OpenZFS 2.1 on Linux
> allocation_classes
> async_destroy
> bookmark_v2
> @@ -7,6 +7,7 @@
> device_rebuild
> device_removal
> draid
> +edonr
> embedded_data
> empty_bpobj
> enabled_txg
> 
> So I've taken to updating my existing zpools via:
> 
> zpool set compatibility=openzfs-2.1-freebsd NAME
> 
> because I use them under releng/13 and stable/13 and main
> and do not want edonr accidentally enabled.
> 
> It is not obvious to me if edonr being present for main
> is deliberate or not.
> 
> For reference:
> 
> # grep edonr /usr/share/zfs/compatibility.d/*
> /usr/share/zfs/compatibility.d/openzfs-2.0-linux:edonr
> /usr/share/zfs/compatibility.d/openzfs-2.1-linux:edonr
> /usr/share/zfs/compatibility.d/openzfsonosx-1.7.0:edonr
> /usr/share/zfs/compatibility.d/openzfsonosx-1.8.1:edonr
> /usr/share/zfs/compatibility.d/openzfsonosx-1.9.3:edonr
> /usr/share/zfs/compatibility.d/openzfsonosx-1.9.4:edonr
> /usr/share/zfs/compatibility.d/ubuntu-18.04:edonr
> /usr/share/zfs/compatibility.d/ubuntu-20.04:edonr
> /usr/share/zfs/compatibility.d/zol-0.7:edonr
> /usr/share/zfs/compatibility.d/zol-0.8:edonr
> 
> I happened to do this activity in an aarch64 context, in
> case that matters.
> 

Hmm. After (re-)import zpool status seems to track the
compatibility assignment: no complaint in . . .

# zpool import -f -N -R /zamd64-mnt -t zamd64 zpamd64
# zpool status zpamd64
  pool: zpamd64
 state: ONLINE
config:

        NAME            STATE     READ WRITE CKSUM
        zpamd64         ONLINE       0     0     0
          gpt/amd64zfs  ONLINE       0     0     0

errors: No known data errors

However there is the following (done after the above):

# zpool export zpamd64
# zpool import
. . .

   pool: zamd64
 id: 4513815084006659826
  state: ONLINE
status: Some supported features are not enabled on the pool.
(Note that they may be intentionally disabled if the
'compatibility' property is set.)
 action: The pool can be imported using its name or numeric identifier, though
some features will not be available without an explicit 'zpool upgrade'.
 config:

zamd64  ONLINE
  gpt/amd64zfs  ONLINE


This may be expected/intentional, but it was not an obvious expectation
to me.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)




/usr/share/zfs/compatibility.d/openzfs-2.1-freebsd vs. openzfs-2.1-linux vs. FreeBSD main [so: 14]: edonr status

2021-12-14 Thread Mark Millard via freebsd-current
I just noticed that main reports that my pools were created
implicitly matching openzfs-2.1-freebsd (and without
an explicit compatibility assignment) but, under main, zpool
import and zpool status for those pools report a new, disabled
feature. Turns out the issue matches what the diff below shows
as present for openzfs-2.1-linux but not for
openzfs-2.1-freebsd :

# diff -u /usr/share/zfs/compatibility.d/openzfs-2.1-[fl]*
--- /usr/share/zfs/compatibility.d/openzfs-2.1-freebsd  2021-12-07 21:23:21.573542000 -0800
+++ /usr/share/zfs/compatibility.d/openzfs-2.1-linux    2021-12-07 21:23:21.581738000 -0800
@@ -1,4 +1,4 @@
-# Features supported by OpenZFS 2.1 on FreeBSD
+# Features supported by OpenZFS 2.1 on Linux
 allocation_classes
 async_destroy
 bookmark_v2
@@ -7,6 +7,7 @@
 device_rebuild
 device_removal
 draid
+edonr
 embedded_data
 empty_bpobj
 enabled_txg
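
The same comparison can be done programmatically. The sketch below is in Python (not something used in this thread), and the helper names feature_names and only_in_second are made up for illustration. It assumes the compatibility files list feature names separated by whitespace or commas, with # starting a comment. The demo writes trimmed stand-in files to a temporary directory rather than reading the real ones.

```python
import os
import tempfile


def feature_names(path):
    """Return the set of feature names listed in a compatibility file."""
    names = set()
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.split("#", 1)[0]  # drop comments
            names.update(line.replace(",", " ").split())
    return names


def only_in_second(first_path, second_path):
    """Features listed in second_path but missing from first_path."""
    return sorted(feature_names(second_path) - feature_names(first_path))


if __name__ == "__main__":
    # Trimmed stand-ins for the two real files; only names visible in the
    # diff above are used.
    with tempfile.TemporaryDirectory() as tmp:
        freebsd = os.path.join(tmp, "openzfs-2.1-freebsd")
        linux = os.path.join(tmp, "openzfs-2.1-linux")
        with open(freebsd, "w", encoding="utf-8") as fh:
            fh.write("# Features supported by OpenZFS 2.1 on FreeBSD\n"
                     "draid\nembedded_data\n")
        with open(linux, "w", encoding="utf-8") as fh:
            fh.write("# Features supported by OpenZFS 2.1 on Linux\n"
                     "draid\nedonr\nembedded_data\n")
        print(only_in_second(freebsd, linux))
```

On a real system the two paths under /usr/share/zfs/compatibility.d/ would be passed in directly.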

So I've taken to updating my existing zpools via:

zpool set compatibility=openzfs-2.1-freebsd NAME

because I use them under releng/13 and stable/13 and main
and do not want edonr accidentally enabled.

It is not obvious to me if edonr being present for main
is deliberate or not.

For reference:

# grep edonr /usr/share/zfs/compatibility.d/*
/usr/share/zfs/compatibility.d/openzfs-2.0-linux:edonr
/usr/share/zfs/compatibility.d/openzfs-2.1-linux:edonr
/usr/share/zfs/compatibility.d/openzfsonosx-1.7.0:edonr
/usr/share/zfs/compatibility.d/openzfsonosx-1.8.1:edonr
/usr/share/zfs/compatibility.d/openzfsonosx-1.9.3:edonr
/usr/share/zfs/compatibility.d/openzfsonosx-1.9.4:edonr
/usr/share/zfs/compatibility.d/ubuntu-18.04:edonr
/usr/share/zfs/compatibility.d/ubuntu-20.04:edonr
/usr/share/zfs/compatibility.d/zol-0.7:edonr
/usr/share/zfs/compatibility.d/zol-0.8:edonr

I happened to do this activity in an aarch64 context, in
case that matters.


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)




Re: 14-current: unable to boot after upgrade (installworld)

2021-12-10 Thread Toomas Soome via freebsd-current



> On 9. Dec 2021, at 20:54, Sergey Dyatko  wrote:
> 
> tiger@dl:~ % gpart show
> 
> =>        40  3907029088  da4  GPT  (1.8T)
>           40        1024    1  freebsd-boot  (512K)
>         1064         984       - free -  (492K)
>         2048  3907026944    2  freebsd-zfs  (1.8T)
>   3907028992         136       - free -  (68K)
> 
> =>        40  3907029088  da5  GPT  (1.8T)
>           40        1024    1  freebsd-boot  (512K)
>         1064         984       - free -  (492K)
>         2048  3907026944    2  freebsd-zfs  (1.8T)
>   3907028992         136       - free -  (68K)
> 
> =>        40  3907029088  da6  GPT  (1.8T)
>           40        1024    1  freebsd-boot  (512K)
>         1064         984       - free -  (492K)
>         2048  3907026944    2  freebsd-zfs  (1.8T)
>   3907028992         136       - free -  (68K)
> 
> =>        40  3907029088  da7  GPT  (1.8T)
>           40        1024    1  freebsd-boot  (512K)
>         1064         984       - free -  (492K)
>         2048  3907026944    2  freebsd-zfs  (1.8T)
>   3907028992         136       - free -  (68K)
> 
> I'm not sure about video, everything happens faster than I can see :-) but
> sometimes the system does not freeze and I can enter commands. Can this
> help in some way?
> 

Maybe, maybe not. On the one hand, a BTX register dump may help us identify 
the possible location or give other clues: eip=004c from your screenshot tells 
us that some structure with function pointers must have been corrupted; it 
looks like a NULL pointer dereference caused by this corruption. So the 
investigation should try to identify what is causing such corruption.

Since it was booting before, does the old loader start? I see the iKVM window 
has a record menu entry; can it be used to record the whole incident?

rgds,
toomas


> On Thu, 9 Dec 2021 at 18:19, Toomas Soome wrote:
> 
>> 
>> 
>>> On 9. Dec 2021, at 20:06, Sergey Dyatko  wrote:
>>> 
>>> I was sure the installer did it when I reinstalled the system from
>>> scratch. I can load 14-current successfully after boot via PXE and
>>> installworld with 13-current
>>> now I did the following:
>>> 1) boot from HDDs FreeBSD 14.0-CURRENT #0 main-n251494-f953785b3df (with
>>> 'old' world)
>>> 2) run installworld (f953785b3df)
>>> 3) run `gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 da${D}`
>>> command, where D=4..7
>>> root@dl:/usr/src # zpool status
>>>   pool: zroot
>>>  state: ONLINE
>>> config:
>>> 
>>>     NAME        STATE     READ WRITE CKSUM
>>>     zroot       ONLINE       0     0     0
>>>       mirror-0  ONLINE       0     0     0
>>>         da4p2   ONLINE       0     0     0
>>>         da5p2   ONLINE       0     0     0
>>>       mirror-1  ONLINE       0     0     0
>>>         da6p2   ONLINE       0     0     0
>>>         da7p2   ONLINE       0     0     0
>>> errors: No known data errors
>>> 
>>> after `shutdown -r now` system still doesn't boot with the same error.
>>> As far as I can see, there is /boot/lua/config.lua present, but when I
>>> try to run command more /boot/lua/config.lua system hangs with following
>>> error:
>>> https://imgur.com/5p0xu6W.png
>>> 
>> 
>> You seem to get 2x BTX panic, could you try to create video from console,
>> so we could get register dumps?
>> 
>> can this system do UEFI boot and if so, can you test it? Could you post
>> partition tables?
>> 
>> rgds,
>> toomas
>> 
>> 
>>> 
>>> On Thu, 9 Dec 2021 at 15:56, Warner Losh wrote:
>>> 
>>>> On Thu, Dec 9, 2021 at 7:58 AM Tomoaki AOKI wrote:
>>>> 
>>>>> On Thu, 9 Dec 2021 13:36:10 +
>>>>> "Sergey V. Dyatko"  wrote:
>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> Yesterday I tried to upgrade old 13-current (svn rev r368473) to fresh
>>>>>> 14-current from git, it looked like this:
>>>>>> 1) git pull https://git.freebsd.org/src.git /usr/src
>>>>>> 2) cd /usr/src ; make buildworld; make kernel
>>>>>> 3) shutdown -r now
>>>>>> after that I _successfully_ booted into 14-current and continued with
>>>>>> etcupdate -p
>>>>>> make installworld
>>>>>> etcupdate -B
>>>>>> shutdown -r now
>>>>>> 
>>>>>> but after that server doesn't come back. After I connected to this
>>>>>> server via IPMI ip-kvm I saw following (sorry for external link):
>>>>>> https://i.imgur.com/jH6MHd2.png
>>>>>> 
>>>>>> Well. There was a migration to zol between r368473 and current 'main'
>>>>>> branch so I decided to install fresh 14-current from snapshot
>>>>>> FreeBSD-14.0-CURRENT-amd64-20211202-610d908f8a6-251253 in order to
>>>>>> avoid possible problems
>>>>>> 
>>>>>> and again, after make kernel and reboot OS runs, but after
>>>>>> installworld I ended up in the same situation
>>>>>> 
>>>>>> thoughts ?
>>>>>> 
>>>>>> --
>>>>>> wbr, Sergey
>>>>>> 
>>>>>> 
>>>>> 
>>>>> Bootcode should be updated.
>>>>> The procedure you wrote doesn't seem to update it.
>>>>> 
>>>> 
>>>> The posted error is one you get when you can't read the filesystem,
>>>> which is why you need to update

Re: HEADS-UP: ASLR for 64-bit executables enabled by default on main

2021-12-10 Thread Daniel O'Connor via freebsd-current



> On 17 Nov 2021, at 09:00, Marcin Wojtas  wrote:
> As of b014e0f15bc7 the ASLR (Address Space Layout
> Randomization) feature becomes enabled for the all 64-bit
> binaries by default.

Firstly, thank you for your efforts here, it is appreciated :)

I am finding that the lang/sdcc port is crashing with a seg fault and the core 
dump is no help to me at all:
[freebsd14 7:06] /usr/ports/lang/sdcc/work/sdcc-4.0.0/device/lib >sudo gdb ../../bin/sdcc sdcc.core
GNU gdb (GDB) 11.1 [GDB v11.1 for FreeBSD]

Reading symbols from ../../bin/sdcc...
[New LWP 100122]
Core was generated by `../../bin/sdcc -I../../device/include -I../../device/include/mcs51 -mds390 --nos'.
Program terminated with signal SIGSEGV, Segmentation fault.
Invalid permissions for mapped object.
#0  0x000804e3fbc0 in setrlimit () from /lib/libc.so.7
(gdb) info thread
  Id   Target Id         Frame
* 1    LWP 100122        0x000804e3fbc0 in setrlimit () from /lib/libc.so.7
(gdb) bt
#0  0x000804e3fbc0 in setrlimit () from /lib/libc.so.7
Backtrace stopped: Cannot access memory at address 0x7f87fd08

If I disable ASLR (via proccontrol) then it does not crash, but I am not sure 
how I can debug it further.

I've raised a bug https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=260303 if 
you (or anyone else) have suggestions for what to try.

Thanks.

--
Daniel O'Connor
"The nice thing about standards is that there
are so many of them to choose from."
 -- Andrew Tanenbaum




Re: CURRENT: llvm13 seem to miscompile dns/bind916 (9.16.23)

2021-12-10 Thread Daniel O'Connor via freebsd-current



> On 25 Nov 2021, at 18:50, FreeBSD User  wrote:
> 
> Running CURRENT (FreeBSD 14.0-CURRENT #7 main-n250911-a11983366ea7: Mon Nov
> 22 18:17:54 CET 2021 amd64) troubles me with our DNS server/service.
> At approximately the same time, we switched on CURRENT to the current LLVM13
> version; also, after compiling a fresh OS with LLVM13, the upgrade from
> bind-9.16.22 to bind-9.16.23 took place, and ASLR became the default.
> 
> Since then named is crashing with a mysterious segmentation fault (see PR
> 259921, https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=259921).
> 
> Disabling ASLR, as recommended to check whether ASLR triggers the SegFault,
> did not solve the problem, so I suspect a miscompilation due to llvm13.
> 
> On 13-STABLE bind-9.16.23 seems not to have this behaviour.
> 
> I'm floating like a dead man in the water, can someone help?

lang/sdcc also seg faults 
(https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=260303), although there 
disabling ASLR via proccontrol does fix it.

Can you show a stacktrace for your seg fault?
--
Daniel O'Connor
"The nice thing about standards is that there
are so many of them to choose from."
 -- Andrew Tanenbaum




Re: 14-current: unable to boot after upgrade (installworld)

2021-12-09 Thread Toomas Soome via freebsd-current



> On 9. Dec 2021, at 20:06, Sergey Dyatko  wrote:
> 
> I was sure the installer did it when I reinstalled the system from scratch. I
> can load 14-current successfully after boot via PXE and installworld with
> 13-current
> now I did the following:
> 1) boot from HDDs FreeBSD 14.0-CURRENT #0 main-n251494-f953785b3df (with
> 'old' world)
> 2) run installworld (f953785b3df)
> 3) run `gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 da${D}`
> command, where D=4..7
> root@dl:/usr/src # zpool status
>   pool: zroot
>  state: ONLINE
> config:
> 
>     NAME        STATE     READ WRITE CKSUM
>     zroot       ONLINE       0     0     0
>       mirror-0  ONLINE       0     0     0
>         da4p2   ONLINE       0     0     0
>         da5p2   ONLINE       0     0     0
>       mirror-1  ONLINE       0     0     0
>         da6p2   ONLINE       0     0     0
>         da7p2   ONLINE       0     0     0
> errors: No known data errors
> 
> after `shutdown -r now` system still doesn't boot with the same error. As
> far as I can see, there is /boot/lua/config.lua present, but when I try to run
> command more /boot/lua/config.lua system hangs with following error:
> https://imgur.com/5p0xu6W.png
> 
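
As a sketch of step 3 in the quoted procedure, the per-disk commands for D=4..7 can be generated and reviewed before anything is run. This is Python (not used in this thread) and the function name bootcode_commands is made up; the gpart arguments are copied from the quoted command. The block only prints the commands and does not execute them.

```python
def bootcode_commands(disks, partition_index=1):
    """Build one 'gpart bootcode' argument list per disk (dry run only)."""
    return [
        ["gpart", "bootcode", "-b", "/boot/pmbr", "-p", "/boot/gptzfsboot",
         "-i", str(partition_index), disk]
        for disk in disks
    ]


if __name__ == "__main__":
    # da4 through da7, as in the quoted message.
    for argv in bootcode_commands(f"da{n}" for n in range(4, 8)):
        print(" ".join(argv))
```

Printing first makes it easy to confirm that every member disk of the pool is covered before the boot blocks are rewritten.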

You seem to get the BTX panic twice; could you try to record a video of the 
console, so we can get the register dumps?

Can this system do UEFI boot, and if so, can you test it? Could you also post 
the partition tables?

rgds,
toomas


> 
> On Thu, 9 Dec 2021 at 15:56, Warner Losh wrote:
> 
>> On Thu, Dec 9, 2021 at 7:58 AM Tomoaki AOKI wrote:
>> 
>>> On Thu, 9 Dec 2021 13:36:10 +
>>> "Sergey V. Dyatko"  wrote:
>>> 
>>>> Hi,
>>>> 
>>>> Yesterday I tried to upgrade old 13-current (svn rev r368473) to fresh
>>>> 14-current from git, it looked like this:
>>>> 1) git pull https://git.freebsd.org/src.git /usr/src
>>>> 2) cd /usr/src ; make buildworld; make kernel
>>>> 3) shutdown -r now
>>>> after that I _successfully_ booted into 14-current and continued with
>>>> etcupdate -p
>>>> make installworld
>>>> etcupdate -B
>>>> shutdown -r now
>>>> 
>>>> but after that server doesn't come back. After I connected to this
>>>> server via IPMI ip-kvm I saw following (sorry for external link):
>>>> https://i.imgur.com/jH6MHd2.png
>>>> 
>>>> Well. There was a migration to zol between r368473 and current 'main'
>>>> branch so I decided to install fresh 14-current from snapshot
>>>> FreeBSD-14.0-CURRENT-amd64-20211202-610d908f8a6-251253 in order to
>>>> avoid possible problems
>>>> 
>>>> and again, after make kernel and reboot OS runs, but after installworld
>>>> I ended up in the same situation
>>>> 
>>>> thoughts ?
>>>> 
>>>> --
>>>> wbr, Sergey
>>>> 
>>>> 
>>> 
>>> Bootcode should be updated.
>>> The procedure you wrote doesn't seem to update it.
>>> 
>> 
>> The posted error is one you get when you can't read the filesystem, which
>> is why you need to update
>> boot blocks across the OpenZFS change.
>> 
>> Warner
>> 




Rock64 configuration fails to boot for main 22c4ab6cb015 but worked for main 06bd74e1e39c (Nov 21): e.MMC mishandled?

2021-12-08 Thread Mark Millard via freebsd-current
[ Note: w...@freebsd.org is only a guess, based on:
https://lists.freebsd.org/archives/dev-commits-src-main/2021-December/001931.html
 ]

Attempting to update to:

main-n251456-22c4ab6cb015-dirty: Tue Dec  7 19:38:53 PST 2021

resulted in boot failure (showing some boot -v output):

. . .
mmc0: Probing bus
. . .
mmc0: SD probe: failed
mmc0: MMC probe: OK (OCR: 0x40ff8080)
mmc0: Current OCR: 0x00ff8080
mmc0: Probing cards
mmc0: New card detected (CID 150100444a4e423452079f43b2ae6313)
mmc0: New card detected (CSD d02701320f5903fff6dbffef8e40400d)
mmc0: Card at relative address 0x0002 added:
mmc0:  card: MMCHC DJNB4R 0.7 SN REPLACED MFG 06/2016 by 21 0x
mmc0:  quirks: 0
mmc0:  bus: 8bit, 200MHz (HS400 with enhanced strobe timing)
mmc0:  memory: 244277248 blocks, erase sector 1024 blocks
mmc0: setting transfer rate to 150.000MHz (HS200 timing)
mmcsd0: taking advantage of TRIM
mmcsd0: cache size 65536KB
mmcsd0: 125GB  at mmc0 
150.0MHz/8bit/1016-block
mmcsd0boot0: 4MB partition 1 at mmcsd0
mmcsd0boot1: 4MB partition 2 at mmcsd0
mmcsd0rpmb: 4MB partition 3 at mmcsd0
. . .
Release APs...done
regulator: shutting down unused regulators
GEOM: new disk mmcsd0
regulator: shutting down vcc_sd... GEOM: new disk mmcsd0boot0
busy
GEOM: new disk mmcsd0boot1
Trying to mount root from ufs:/dev/gpt/Rock64root []...
Unresolved linked clock found: hdmi_phy
Unresolved linked clock found: usb480m_phy
mmcsd0: Error indicated: 4 Failed

Note the above line. It seems to be unique to
the failure. Continuing the output . . .

uhub2: 1 port with 1 removable, self powered
uhub1: 2 ports with 2 removable, self powered
uhub0: 1 port with 1 removable, self powered
uhub3: 1 port with 1 removable, self powered
ugen4.2:  at usbus4
umass0 on uhub1
umass0:  on 
usbus4
umass0:  SCSI over Bulk-Only; quirks = 0x
umass0:0:0: Attached to scbus0
pass0 at umass-sim0 bus 0 scbus0 target 0 lun 0
pass0:  Fixed Direct Access SPC-4 SCSI device
pass0: Serial Number REPLACED
pass0: 400.000MB/s transfers
da0 at umass-sim0 bus 0 scbus0 target 0 lun 0
da0:  Fixed Direct Access SPC-4 SCSI device
da0: Serial Number REPLACED
da0: 400.000MB/s transfers
da0: 953869MB (1953525168 512 byte sectors)
da0: quirks=0x2
da0: Delete methods: 

Nothing more after that.

An older kernel (1400042) that happened to be available boots
the same configuration when used instead (same world) . . .

main-n250903-06bd74e1e39c-dirty: Sun Nov 21 23:02:57 PST 2021 got:

mmc0: Probing bus
. . .
mmc0: SD probe: failed
mmc0: MMC probe: OK (OCR: 0x40ff8080)
mmc0: Current OCR: 0x00ff8080
mmc0: Probing cards
mmc0: New card detected (CID 150100444a4e423452079f43b2ae6313)
mmc0: New card detected (CSD d02701320f5903fff6dbffef8e40400d)
mmc0: Card at relative address 0x0002 added:
mmc0:  card: MMCHC DJNB4R 0.7 SN REPLACED MFG 06/2016 by 21 0x
mmc0:  quirks: 0
mmc0:  bus: 8bit, 200MHz (HS400 with enhanced strobe timing)
mmc0:  memory: 244277248 blocks, erase sector 1024 blocks
mmc0: setting transfer rate to 52.000MHz (high speed timing)

Note the lack of trying "150.000MHz (HS200 timing)".  Continuing
the output . . .

mmc0: setting bus width to 8 bits high speed timing
mmcsd0: taking advantage of TRIM
mmcsd0: cache size 65536KB
mmcsd0: 125GB  at mmc0 
52.0MHz/8bit/1016-block
mmcsd0boot0: 4MB partition 1 at mmcsd0
mmcsd0boot1: 4MB partition 2 at mmcsd0
mmcsd0rpmb: 4MB partition 3 at mmcsd0

Note: The media is actually an e.MMC . Continuing the output . . .

. . .
Release APs...done
regulator: shutting down unused regulators
GEOM: new disk mmcsd0
regulator: shutting down vcc_sd... Trying to mount root from 
ufs:/dev/gpt/Rock64root []...
GEOM: new disk mmcsd0boot0
busy
GEOM: new disk mmcsd0boot1
Unresolved linked clock found: hdmi_phy
Unresolved linked clock found: usb480m_phy
Root mount waiting for: usbus1 usbus2 usbus3 usbus4 CAM
uhub1: 1 port with 1 removable, self powered
uhub0: 2 ports with 2 removable, self powered
uhub3: 1 port with 1 removable, self powered
uhub2: 1 port with 1 removable, self powered
ugen4.2:  at usbus4
umass0 on uhub0
umass0:  on 
usbus4
umass0:  SCSI over Bulk-Only; quirks = 0x
umass0:0:0: Attached to scbus0
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
GEOM: new disk da0
pass0 at umass-sim0 bus 0 scbus0 target 0 lun 0
pass0:  Fixed Direct Access SPC-4 SCSI device
pass0: Serial Number REPLACED
pass0: 400.000MB/s transfers
da0 at umass-sim0 bus 0 scbus0 target 0 lun 0
da0:  Fixed Direct Access SPC-4 SCSI device
da0: Serial Number REPLACED
da0: 400.000MB/s transfers
da0: 953869MB (1953525168 512 byte sectors)
da0: quirks=0x2
da0: Delete methods: 
random: unblocking device.
Warning: bad time from time-of-day clock, system time will not be set accurately
Dual Console: Serial Primary, Video Secondary
start_init: trying /sbin/init
. . .

(I'll stop with that.)
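The difference between the two boots is easiest to see when only the timing
selection lines are kept. A minimal sketch, assuming each boot -v capture was
saved to a file (the file names in the comment are hypothetical):

```sh
#!/bin/sh
# Print only the MMC timing / bus-width selection lines from a saved boot
# log, so a failing and a working capture can be compared directly.
mmc_timing() {
    grep -E 'mmc[0-9]+: setting (transfer rate|bus width)' "$1"
}

# Example usage (hypothetical file names):
#   mmc_timing boot-22c4ab6cb015.log
#   mmc_timing boot-06bd74e1e39c.log
```

For the logs above this would show "150.000MHz (HS200 timing)" for the failing
kernel and "52.000MHz (high speed timing)" for the working one.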

new boot message: "/etc/rc: WARNING: $zfskeys_enable is not set properly - see rc.conf(5)."?

2021-12-08 Thread Mark Millard via freebsd-current
As of my update to (some line splitting applied):

# uname -apKU
FreeBSD CA72_UFS 14.0-CURRENT FreeBSD 14.0-CURRENT #25 
main-n251456-22c4ab6cb015-dirty:
Tue Dec  7 19:38:53 PST 2021
root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
arm64 aarch64 1400043 1400043

my boot sequences are getting a report of:

# dmesg -a | grep zfsk
/etc/rc: WARNING: $zfskeys_enable is not set properly - see rc.conf(5).

for the likes of a system (aarch64 example) based on:

# gpart show
=>        40  2000409184  ada0  GPT  (954G)
          40      409600     1  efi  (200M)
      409640  1740636160     2  freebsd-ufs  (830G)
  1741045800   117440512     3  freebsd-swap  (56G)
  1858486312   134217728     4  freebsd-swap  (64G)
  1992704040     7705184        - free -  (3.7G)

But I also get the notice for a system (aarch64 again) based on:

# gpart show
=>        40  1875384928  nda1  GPT  (894G)
          40      532480     1  efi  (260M)
      532520        2008        - free -  (1.0M)
      534528   515899392     2  freebsd-swap  (246G)
   516433920    20971520        - free -  (10G)
   537405440  1337979528     3  freebsd-zfs  (638G)

The amd64 system gets the same message.

The note to see rc.conf(5) is misleading for main 22c4ab6cb015 :

# man 5 rc.conf | grep zfsk
#
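One plausible reading, offered as an assumption rather than a confirmed
diagnosis: the rc framework checks zfskeys_enable as a yes/no knob, and on
this build the variable has no default value, so the check sees an empty
string. The sketch below is a simplified illustration modeled loosely on how
such a check behaves; it is not the actual rc.subr source, and the function
name check_yes_no is hypothetical.

```sh
#!/bin/sh
# Simplified illustration of a yes/no rc variable check. Modeled loosely on
# the behaviour described in rc.subr(8); NOT the real implementation.
check_yes_no() {
    eval _value=\$${1}
    case $_value in
    [Yy][Ee][Ss]|[Tt][Rr][Uu][Ee]|[Oo][Nn]|1)
        return 0 ;;
    [Nn][Oo]|[Ff][Aa][Ll][Ss][Ee]|[Oo][Ff][Ff]|0)
        return 1 ;;
    *)
        # Neither yes-like nor no-like (for example unset): warn.
        echo "WARNING: \$${1} is not set properly - see rc.conf(5)." >&2
        return 1 ;;
    esac
}

unset zfskeys_enable
check_yes_no zfskeys_enable || :   # unset: prints the warning
zfskeys_enable="NO"
check_yes_no zfskeys_enable || :   # explicit NO: silent
```

If that reading is right, an explicit zfskeys_enable="NO" in /etc/rc.conf
would presumably silence the message until a default is provided.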

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)




Re: git: 5e04571cf3cf - main - sys/bitset.h: reduce visibility of BIT_* macros

2021-12-06 Thread Mark Millard via freebsd-current
On 2021-Dec-6, at 14:19, Mark Millard  wrote:

> This broke building lang/gcc11, so maybe an exp-run is appropriate:
> 
> In file included from /usr/include/sys/cpuset.h:39,
> from /usr/include/sched.h:36,
> from /usr/include/pthread.h:48,
> from 
> /wrkdirs/usr/ports/lang/gcc11/work/gcc-11.2.0/gcc/jit/libgccjit.c:27:
> /usr/include/sys/bitset.h:314:36: error: attempt to use poisoned "malloc"
>  314 | #define __BITSET_ALLOC(_s, mt, mf) malloc(__BITSET_SIZE((_s)), mt, 
> (mf))
>  |^
> . . .
> 
> In file included from /usr/include/sys/cpuset.h:39,
> from /usr/include/sched.h:36,
> from /usr/include/pthread.h:48,
> from 
> /wrkdirs/usr/ports/lang/gcc11/work/gcc-11.2.0/gcc/jit/jit-recording.c:28:
> /usr/include/sys/bitset.h:314:36: error: attempt to use poisoned "malloc"
>  314 | #define __BITSET_ALLOC(_s, mt, mf) malloc(__BITSET_SIZE((_s)), mt, 
> (mf))
>  |^
> . . .
> gmake[4]: *** [Makefile:1142: jit/libgccjit.o] Error 1
> gmake[4]: *** Waiting for unfinished jobs
> In file included from /usr/include/sys/cpuset.h:39,
> from /usr/include/sched.h:36,
> from /usr/include/pthread.h:48,
> from 
> /wrkdirs/usr/ports/lang/gcc11/work/gcc-11.2.0/gcc/jit/jit-playback.c:44:
> /usr/include/sys/bitset.h:314:36: error: attempt to use poisoned "malloc"
>  314 | #define __BITSET_ALLOC(_s, mt, mf) malloc(__BITSET_SIZE((_s)), mt, 
> (mf))
>  |^
> gmake[4]: *** [Makefile:1142: jit/jit-recording.o] Error 1
> gmake[4]: *** [Makefile:1142: jit/jit-playback.o] Error 1
> rm gcc.pod gfortran.pod
> gmake[4]: Leaving directory '/wrkdirs/usr/ports/lang/gcc11/work/.build/gcc'
> gmake[3]: *** [Makefile:4817: all-stage2-gcc] Error 2
> 
> 
> For reference:
> 
> # uname -apKU
> FreeBSD CA72_16Gp_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #24 
> main-n251362-9f32cb5b1c81-dirty: Sun Dec  5 21:16:30 PST 2021 
> root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
>   arm64 aarch64 1400043 1400043
> 
> ~/fbsd-based-on-what-commit.sh 
> branch: main
> merge-base: 9bed79e869721b4ca8a15c527db8d40969867c2c
> merge-base: CommitDate: 2021-12-06 03:45:47 +
> 9bed79e86972 (HEAD -> main, freebsd/main, freebsd/HEAD) www/drupal9: update 
> to 9.2.10
> n567671 (--first-parent --count for merge-base)
> 
> My test of building on amd64 is still in progress.

Just like the poudriere-devel based build on aarch64,
amd64's poudriere-devel based build got:

In file included from /usr/include/sys/cpuset.h:39,
 from /usr/include/sched.h:36,
 from /usr/include/pthread.h:48,
 from 
/wrkdirs/usr/ports/lang/gcc11/work/gcc-11.2.0/gcc/jit/libgccjit.c:27:
/usr/include/sys/bitset.h:314:36: error: attempt to use poisoned "malloc"
  314 | #define __BITSET_ALLOC(_s, mt, mf) malloc(__BITSET_SIZE((_s)), mt, (mf))
  |^
. . .
In file included from /usr/include/sys/cpuset.h:39,
 from /usr/include/sched.h:36,
 from /usr/include/pthread.h:48,
 from 
/wrkdirs/usr/ports/lang/gcc11/work/gcc-11.2.0/gcc/jit/jit-recording.c:28:
/usr/include/sys/bitset.h:314:36: error: attempt to use poisoned "malloc"
  314 | #define __BITSET_ALLOC(_s, mt, mf) malloc(__BITSET_SIZE((_s)), mt, (mf))
  |^
. . .
gmake[4]: *** [Makefile:1142: jit/libgccjit.o] Error 1
gmake[4]: *** Waiting for unfinished jobs
In file included from /usr/include/sys/cpuset.h:39,
 from /usr/include/sched.h:36,
 from /usr/include/pthread.h:48,
 from 
/wrkdirs/usr/ports/lang/gcc11/work/gcc-11.2.0/gcc/jit/jit-playback.c:44:
/usr/include/sys/bitset.h:314:36: error: attempt to use poisoned "malloc"
  314 | #define __BITSET_ALLOC(_s, mt, mf) malloc(__BITSET_SIZE((_s)), mt, (mf))
  |^
gmake[4]: *** [Makefile:1142: jit/jit-recording.o] Error 1
gmake[4]: *** [Makefile:1142: jit/jit-playback.o] Error 1
rm gcc.pod gfortran.pod
gmake[4]: Leaving directory '/wrkdirs/usr/ports/lang/gcc11/work/.build/gcc'
gmake[3]: *** [Makefile:4819: all-stage2-gcc] Error 2
. . .

For reference (same sources as used to build for aarch64):

# uname -apKU
FreeBSD amd64_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #15 
main-n251362-9f32cb5b1c81-dirty: Sun Dec  5 18:17:51 PST 2021 
root@amd64_ZFS:/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-NODBG
  amd64 amd64 1400043 1400043

# ~/fbsd-based-on-what-commit.sh
branch: main
merge-base: 9bed79e869721b4ca8a15c527db8d40969867c2c
merge-base: CommitDate: 2021-12-06 03:45:47 +
9bed79e86972 (HEAD -> main, freebsd/main, freebsd/HEAD) www/drupal9: update to 
9.2.10
n567671 (--first-parent --count for merge-base)



Re: git: 5e04571cf3cf - main - sys/bitset.h: reduce visibility of BIT_* macros

2021-12-06 Thread Mark Millard via freebsd-current
This broke building lang/gcc11, so maybe an exp-run is appropriate:

In file included from /usr/include/sys/cpuset.h:39,
 from /usr/include/sched.h:36,
 from /usr/include/pthread.h:48,
 from 
/wrkdirs/usr/ports/lang/gcc11/work/gcc-11.2.0/gcc/jit/libgccjit.c:27:
/usr/include/sys/bitset.h:314:36: error: attempt to use poisoned "malloc"
  314 | #define __BITSET_ALLOC(_s, mt, mf) malloc(__BITSET_SIZE((_s)), mt, (mf))
  |^
. . .

In file included from /usr/include/sys/cpuset.h:39,
 from /usr/include/sched.h:36,
 from /usr/include/pthread.h:48,
 from 
/wrkdirs/usr/ports/lang/gcc11/work/gcc-11.2.0/gcc/jit/jit-recording.c:28:
/usr/include/sys/bitset.h:314:36: error: attempt to use poisoned "malloc"
  314 | #define __BITSET_ALLOC(_s, mt, mf) malloc(__BITSET_SIZE((_s)), mt, (mf))
  |^
. . .
gmake[4]: *** [Makefile:1142: jit/libgccjit.o] Error 1
gmake[4]: *** Waiting for unfinished jobs
In file included from /usr/include/sys/cpuset.h:39,
 from /usr/include/sched.h:36,
 from /usr/include/pthread.h:48,
 from 
/wrkdirs/usr/ports/lang/gcc11/work/gcc-11.2.0/gcc/jit/jit-playback.c:44:
/usr/include/sys/bitset.h:314:36: error: attempt to use poisoned "malloc"
  314 | #define __BITSET_ALLOC(_s, mt, mf) malloc(__BITSET_SIZE((_s)), mt, (mf))
  |^
gmake[4]: *** [Makefile:1142: jit/jit-recording.o] Error 1
gmake[4]: *** [Makefile:1142: jit/jit-playback.o] Error 1
rm gcc.pod gfortran.pod
gmake[4]: Leaving directory '/wrkdirs/usr/ports/lang/gcc11/work/.build/gcc'
gmake[3]: *** [Makefile:4817: all-stage2-gcc] Error 2


For reference:

# uname -apKU
FreeBSD CA72_16Gp_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #24 
main-n251362-9f32cb5b1c81-dirty: Sun Dec  5 21:16:30 PST 2021 
root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
  arm64 aarch64 1400043 1400043

 ~/fbsd-based-on-what-commit.sh 
branch: main
merge-base: 9bed79e869721b4ca8a15c527db8d40969867c2c
merge-base: CommitDate: 2021-12-06 03:45:47 +
9bed79e86972 (HEAD -> main, freebsd/main, freebsd/HEAD) www/drupal9: update to 
9.2.10
n567671 (--first-parent --count for merge-base)

My test of building on amd64 is still in progress.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)




Re: lib/libufs: build regressed after b366ee486

2021-12-03 Thread Evgeniy Khramtsov via FreeBSD-CURRENT
> dropping https://github.com/DankBSD/ports/commit/56cb9dc72 and

Err, the right commit: https://github.com/DankBSD/base/commit/52ff26f21



Re: lib/libufs: build regressed after b366ee486

2021-12-03 Thread Evgeniy Khramtsov via FreeBSD-CURRENT
> It appears that your build environment does not have the b366ee486 version
> of /usr/src/sys/ufs/ffs/fs.h installed in /usr/include/ufs/ffs/fs.h.
> 
> That would normally happen after you did a buildworld and installworld.
> You should be able to fix your current compile failure by doing:
> 
>   cp /usr/src/sys/ufs/ffs/fs.h /usr/include/ufs/ffs/fs.h
> 
> Let me know if this fails to correct your problem.
> 
>   Kirk McKusick

Thanks, Kirk.

I was bisecting my local environment, and my issue was due to
dropping https://github.com/DankBSD/ports/commit/56cb9dc72 and
overriding sys.mk variables in make.conf instead, which led to the
include bootstrap being skipped for some reason.
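A quick way to confirm the stale-header diagnosis before copying anything is
to check whether each copy of fs.h defines STDSB. This is a minimal sketch;
the helper name has_stdsb is hypothetical, and the two paths are the ones
named in this thread.

```sh
#!/bin/sh
# Return 0 if the given header defines the STDSB macro, non-zero otherwise
# (including when the file does not exist or is unreadable).
has_stdsb() {
    grep -Eq '^#[[:space:]]*define[[:space:]]+STDSB([[:space:]]|$)' "$1" 2>/dev/null
}

for hdr in /usr/src/sys/ufs/ffs/fs.h /usr/include/ufs/ffs/fs.h; do
    if has_stdsb "$hdr"; then
        echo "$hdr: defines STDSB"
    else
        echo "$hdr: STDSB missing (or file not readable)"
    fi
done
```

If the source tree copy defines the macro and the installed copy does not, the
installed headers predate b366ee486.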



lib/libufs: build regressed after b366ee486

2021-12-03 Thread Evgeniy Khramtsov via FreeBSD-CURRENT
I was updating from

commit 20aa359773befc8182f6b5dcb5aad7390cab6c26
Author: Dimitry Andric 
Date:   Sat Nov 13 21:02:29 2021 +0100

Bump __FreeBSD_version for llvm-project 13.0.0 merge

PR: 258209
MFC after:  2 weeks

to

commit b366ee4868bca2b3ebe4bb29c9590a29b6cecc29 (main)
Author: Kirk McKusick 
Date:   Sun Nov 14 22:09:06 2021 -0800

Consolodate four copies of the STDSB define into a single place.

The STDSB macro is passed to the ffs_sbget() routine to fetch a
UFS/FFS superblock "from the stadard place". It was identically defined
in lib/libufs/libufs.h, stand/libsa/ufs.c, sys/ufs/ffs/ffs_extern.h,
and sys/ufs/ffs/ffs_subr.c. Delete it from these four files and
define it instead in sys/ufs/ffs/fs.h. All existing uses of this macro
already include sys/ufs/ffs/fs.h so no include changes need to be made.

No functional change intended.

Sponsored by: Netflix

$ cd lib/libufs
$ make
[...]
/usr/local/llvm13/bin/clang  -O2 -pipe -fno-common -D_LIBUFS 
-I/usr/src/lib/libufs   -g -gz=zlib -MD  -MF.depend.sblock.o -MTsblock.o 
-std=gnu99 -Wno-format-zero-length -fstack-protector-strong -Wsystem-headers 
-Werror -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign 
-Wno-empty-body -Wno-string-plus-int -Wno-unused-const-variable 
-Wno-error=unused-but-set-variable -Wno-tautological-compare -Wno-unused-value 
-Wno-parentheses-equality -Wno-unused-function -Wno-enum-conversion 
-Wno-unused-local-typedef -Wno-address-of-packed-member -Wno-switch 
-Wno-switch-enum -Wno-knr-promoted-parameter  -Qunused-arguments -c sblock.c 
-o sblock.o
sblock.c:59:38: error: use of undeclared identifier 'STDSB'
if ((errno = sbget(disk->d_fd, &fs, STDSB)) != 0) {
^
1 error generated.
*** Error code 1

Stop.
make: stopped in /usr/src/lib/libufs
[...]

My environment is custom (hard to bisect at the moment), but the most
major local modification to base that I have is devel/llvm13 as toolchain.

make.conf:

LLVM=/usr/local/llvm13/bin
ADDR2LINE=${LLVM}/llvm-addr2line
AR=${LLVM}/llvm-ar
AS=${LLVM}/llvm-as
CC=${LLVM}/clang
CPP=${LLVM}/clang-cpp
CPPFILT=${LLVM}/llvm-cxxfilt
CXX=${LLVM}/clang++
DTRACEFLAGS+=-x cpppath=${CPP} -x ldpath=${LD}
LD=${LLVM}/ld
LLVM_LINK=${LLVM}/llvm-link
NM=${LLVM}/llvm-nm
OBJC=${LLVM}/clang
OBJCOPY=${LLVM}/llvm-objcopy
OBJDUMP=${LLVM}/llvm-objdump
RANLIB=${LLVM}/llvm-ranlib
READELF=${LLVM}/llvm-readelf
SIZE=${LLVM}/llvm-size
STRINGS=${LLVM}/llvm-strings
STRIPBIN=${LLVM}/llvm-strip
STRIP_CMD=${STRIPBIN}

src.conf:

WITHOUT_CLANG=yes
WITHOUT_CLANG_BOOTSTRAP=yes
# elftoolchain utils manually deleted, todo: have a knob to turn off build
WITHOUT_ELFTOOLCHAIN_BOOTSTRAP=yes
WITHOUT_LLD=yes
WITHOUT_LLDB=yes
WITHOUT_LLD_BOOTSTRAP=yes
WITHOUT_LLVM_COV=yes



amd64 (example) main [so: 14]: delete-old check-old delete-old-libs missing a bunch of files?

2021-11-30 Thread Mark Millard via freebsd-current
/usr/obj/DESTDIRs/main-amd64-poud/ is a buildworld installation
for poudriere-devel use that I've been updating on occasion for
a while.

Despite:

>>> Checking for old files
>>> Checking for old libraries
>>> Checking for old directories
To remove old files and directories run 'make delete-old'.
To remove old libraries run 'make delete-old-libs'.

in /usr/obj/DESTDIRs/main-amd64-poud for:

# chroot /usr/obj/DESTDIRs/main-amd64-poud uname -apKU
FreeBSD amd64_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #14 
main-n250972-319e9fc642a1-dirty: Tue Nov 23 11:43:26 PST 2021 
root@amd64_ZFS:/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-NODBG
  amd64 amd64 1400042 1400042

installing a new directory tree:

# chroot /usr/obj/DESTDIRs/main-amd64-chroot uname -apKU
FreeBSD amd64_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #14 
main-n250972-319e9fc642a1-dirty: Tue Nov 23 11:43:26 PST 2021 
root@amd64_ZFS:/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-NODBG
  amd64 amd64 1400042 1400042

ends up with diff -rq between the trees finding many old
files only under /usr/obj/DESTDIRs/main-amd64-poud/ . Various
.old's and .bak's and such are likely expected but most of
the following would seem to not be expected. Checking the
dates indicates files from August and the like (dates not
shown) and the matches being Nov 23.

# diff -rq /usr/obj/DESTDIRs/main-amd64-chroot 
/usr/obj/DESTDIRs/main-amd64-poud | more
diff: /usr/obj/DESTDIRs/main-amd64-chroot/etc/os-release: No such file or 
directory
Only in /usr/obj/DESTDIRs/main-amd64-poud/boot: loader_4th.old
Only in /usr/obj/DESTDIRs/main-amd64-poud/boot: loader_lua.old
Only in /usr/obj/DESTDIRs/main-amd64-poud/boot: loader_simp.old
Only in /usr/obj/DESTDIRs/main-amd64-poud/etc/rc.d: sppp
Only in /usr/obj/DESTDIRs/main-amd64-poud/libexec: ld-elf.so.1.old
Only in /usr/obj/DESTDIRs/main-amd64-poud/libexec: ld-elf32.so.1.old
Only in /usr/obj/DESTDIRs/main-amd64-poud/rescue: spppcontrol
Only in /usr/obj/DESTDIRs/main-amd64-poud/sbin: init.bak
Only in 
/usr/obj/DESTDIRs/main-amd64-poud/usr/include/netgraph/bluetooth/include: 
ng_h4.h
Only in 
/usr/obj/DESTDIRs/main-amd64-poud/usr/lib/include/netgraph/bluetooth/include: 
ng_h4.h
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib: lib9p_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib: libicp_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib: libicp_rescue_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib: libnetmap_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib: libprivateatf-c++_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib: libprivateatf-c_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib: libprivateauditd_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib: libprivateevent1_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib: libprivategmock_main_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib: libprivategmock_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib: libprivategtest_main_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib: libprivategtest_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib: libspl_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib: libstats_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib: libtpool_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib: libzfsbootenv_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib: libzutil_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib32: lib80211_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib32: lib9p_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib32: libBlocksRuntime_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib32: libalias_dummy_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib32: libalias_ftp_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib32: libalias_irc_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib32: libalias_nbt_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib32: libalias_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib32: libalias_pptp_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib32: libalias_skinny_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib32: libalias_smedia_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib32: libarchive_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib32: libasn1_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib32: libavl_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib32: libbe_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib32: libbegemot_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib32: libblacklist_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib32: libbluetooth_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib32: libbsdxml_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib32: libbsm_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib32: libbsnmp_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib32: libbz2_p.a
Only in /usr/obj/DESTDIRs/main-amd64-poud/usr/lib32: libc++_p.a
Only in /usr/obj/DESTDIRs/m
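To separate the surprising leftovers from the .old and .bak files that the
message treats as expected, the diff -rq output can be filtered. A minimal
sketch; the function name unexpected_leftovers is hypothetical, and it assumes
unwrapped diff output with one entry per line:

```sh
#!/bin/sh
# Read `diff -rq` output on stdin, keep only the "Only in" lines, and drop
# entries whose names end in .old or .bak.
unexpected_leftovers() {
    grep '^Only in ' | grep -Ev '\.(old|bak)$'
}

# Example usage with the two trees from this message:
#   diff -rq /usr/obj/DESTDIRs/main-amd64-chroot \
#            /usr/obj/DESTDIRs/main-amd64-poud | unexpected_leftovers
```

What remains is the candidate list of files that delete-old and
delete-old-libs did not remove.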

Re: pkg sqlite database borked ( again ). How to restore?

2021-11-29 Thread Dennis Clarke via freebsd-current
On 11/29/21 06:22, Jamie Landeg-Jones wrote:
> Dennis Clarke via freebsd-current  wrote:
> 
>> europa# xz -dc /var/backups/pkg.sql.xz.3 > /var/db/pkg/local.sqlite.dump
>>
>> europa#
>> europa# pkg backup -r /var/db/pkg/local.sqlite.dump
>> Restoring database:
>> Restoring: 100%
>> pkg: sqlite error while executing backup step in file backup.c:98: not
>> an error
> 
> The backup file consists of sql statements, the pkg backup -r I think
> requires a binary db file.
> 
> I think you need to do this:
> 
> pkg shell < /var/db/pkg/local.sqlite.dump
> 
> Cheers, Jamie
> 

Ah well ... that seems to toss a ton of errors and yet works ?
 europa#
europa# pkg shell < /var/db/pkg/local.sqlite.dump
Error: near line 4: table packages already exists
Error: near line 212: UNIQUE constraint failed: packages.name
Error: near line 246: table mtree already exists
Error: near line 247: table pkg_script already exists
Error: near line 611: table script already exists
Error: near line 612: UNIQUE constraint failed: script.script_id
Error: near line 684: table option already exists
Error: near line 685: UNIQUE constraint failed: option.option_id
Error: near line 1049: table option_desc already exists
Error: near line 1050: table pkg_option already exists
Error: near line 1591: table pkg_option_desc already exists
Error: near line 1592: table pkg_option_default already exists
Error: near line 1593: table deps already exists
Error: near line 2393: table files already exists
Error: near line 61890: UNIQUE constraint failed: files.path
Error: near line 61891: UNIQUE constraint failed: files.path
Error: near line 61892: UNIQUE constraint failed: files.path
Error: near line 61893: UNIQUE constraint failed: files.path
Error: near line 61894: UNIQUE constraint failed: files.path
Error: near line 61895: UNIQUE constraint failed: files.path
Error: near line 61896: UNIQUE constraint failed: files.path
Error: near line 61897: UNIQUE constraint failed: files.path
Error: near line 61898: UNIQUE constraint failed: files.path
Error: near line 61899: UNIQUE constraint failed: files.path
Error: near line 61900: UNIQUE constraint failed: files.path
Error: near line 61901: UNIQUE constraint failed: files.path
Error: near line 61902: UNIQUE constraint failed: files.path
Error: near line 61903: UNIQUE constraint failed: files.path
Error: near line 61904: UNIQUE constraint failed: files.path
Error: near line 61905: UNIQUE constraint failed: files.path
Error: near line 61906: UNIQUE constraint failed: files.path
Error: near line 61907: UNIQUE constraint failed: files.path
Error: near line 61908: UNIQUE constraint failed: files.path
Error: near line 61909: UNIQUE constraint failed: files.path
Error: near line 61910: UNIQUE constraint failed: files.path
Error: near line 61911: UNIQUE constraint failed: files.path
Error: near line 61912: UNIQUE constraint failed: files.path
Error: near line 61913: UNIQUE constraint failed: files.path
Error: near line 61914: UNIQUE constraint failed: files.path
Error: near line 61915: UNIQUE constraint failed: files.path
Error: near line 61916: UNIQUE constraint failed: files.path
Error: near line 61917: UNIQUE constraint failed: files.path
Error: near line 61918: UNIQUE constraint failed: files.path
Error: near line 61919: UNIQUE constraint failed: files.path
Error: near line 61920: UNIQUE constraint failed: files.path
Error: near line 61921: UNIQUE constraint failed: files.path
Error: near line 61922: UNIQUE constraint failed: files.path
Error: near line 61923: UNIQUE constraint failed: files.path
Error: near line 61924: UNIQUE constraint failed: files.path
Error: near line 61925: UNIQUE constraint failed: files.path
Error: near line 61926: UNIQUE constraint failed: files.path
Error: near line 61927: UNIQUE constraint failed: files.path
Error: near line 61928: UNIQUE constraint failed: files.path
Error: near line 61929: UNIQUE constraint failed: files.path
Error: near line 61930: UNIQUE constraint failed: files.path
Error: near line 61931: UNIQUE constraint failed: files.path
Error: near line 61932: UNIQUE constraint failed: files.path
Error: near line 61933: UNIQUE constraint failed: files.path
Error: near line 61934: UNIQUE constraint failed: files.path
Error: near line 61935: UNIQUE constraint failed: files.path
Error: near line 61936: UNIQUE constraint failed: files.path
Error: near line 61937: UNIQUE constraint failed: files.path
Error: near line 61938: UNIQUE constraint failed: files.path
Error: near line 61939: UNIQUE constraint failed: files.path
Error: near line 61940: UNIQUE constraint failed: files.path
Error: near line 61941: UNIQUE constraint failed: files.path
Error: near line 61942: UNIQUE constraint failed: files.path
Error: near line 61943: UNIQUE constraint failed: files.path
Error: near line 61944: UNIQUE constraint failed: files.path
Error: near line 61945: UNIQUE constraint failed: files.path
Error: near line
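The errors above fall into only a few kinds ("already exists" and "UNIQUE
constraint failed"), which would be consistent with the dump being replayed
into a database that already held those tables and rows. That is an
interpretation, not something confirmed in this thread. A small sketch for
collapsing such output into counts per distinct message (the function name
summarize_errors is hypothetical):

```sh
#!/bin/sh
# Strip the "Error: near line N: " prefix from each error line, then count
# how many times each distinct message occurs, most frequent first.
summarize_errors() {
    sed -n 's/^Error: near line [0-9]*: //p' | sort | uniq -c | sort -rn
}

# Example usage:
#   pkg shell < /var/db/pkg/local.sqlite.dump 2>&1 | summarize_errors
```

A summary dominated by those two messages suggests the existing rows were left
in place rather than replaced.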

pkg sqlite database borked ( again ). How to restore?

2021-11-29 Thread Dennis Clarke via freebsd-current


I had another kernel panic on an AMD64 server. This has resulted in the
pkg database being messed up. Also I was running a QEMU instance for
aarch64 and that ended up with a messed up pkg database also.

I saw some docs in section 4.4.8. Restoring the Package Database here:

https://docs.freebsd.org/en/books/handbook/ports/#pkgng-intro

However that does not work and issues a truly worthless error :

europa# uname -apKU
FreeBSD europa 14.0-CURRENT FreeBSD 14.0-CURRENT #6
main-n250839-be60d8f276f: Fri Nov 19 00:02:38 GMT 2021
root@europa:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64 amd64
1400042 1400042
europa#
europa# ls -lap /var/backups/pkg*
-rw-r--r--  1 root  wheel  2714084 Nov 29 03:04 /var/backups/pkg.sql.xz
-rw-r--r--  1 root  wheel  2714084 Nov 28 03:20 /var/backups/pkg.sql.xz.1
-rw-r--r--  1 root  wheel  2714084 Nov 27 03:03 /var/backups/pkg.sql.xz.2
-rw-r--r--  1 root  wheel  2714084 Nov 26 03:03 /var/backups/pkg.sql.xz.3
-rw-r--r--  1 root  wheel  2714084 Nov 25 03:29 /var/backups/pkg.sql.xz.4
-rw-r--r--  1 root  wheel  2712568 Nov 24 03:04 /var/backups/pkg.sql.xz.5
-rw-r--r--  1 root  wheel  2712568 Nov 23 03:03 /var/backups/pkg.sql.xz.6
-rw-r--r--  1 root  wheel  2711928 Nov 22 03:54 /var/backups/pkg.sql.xz.7
europa#

So I took a backup from there that looked reasonable :

europa# xz -dc /var/backups/pkg.sql.xz.3 > /var/db/pkg/local.sqlite.dump

europa#
europa# pkg backup -r /var/db/pkg/local.sqlite.dump
Restoring database:
Restoring: 100%
pkg: sqlite error while executing backup step in file backup.c:98: not
an error
pkg: sqlite error -- (null)
europa# echo $?
1
europa#

I don't know what to make of that mess.

Can I create a new blank sqlite3 database and then restore from that
dump file or is there a method that works better?

Also is there a blank sqlite3 database for pkg on the install media?

-- 
Dennis Clarke
RISC-V/SPARC/PPC/ARM/CISC
UNIX and Linux spoken
GreyBeard and suspenders optional



Re: problem with re(4) interface

2021-11-26 Thread Anthony Jenkins via freebsd-current




On 11/22/21 12:55 PM, Warner Losh wrote:
> On Mon, Nov 22, 2021 at 10:51 AM Chuck Tuffli  wrote:
>> On Mon, Nov 22, 2021 at 9:34 AM Chris  wrote:
>>> On 2021-11-22 08:47, Chuck Tuffli wrote:
>>>> Running on a recent-ish -current
>>>> # uname -a
>>>> FreeBSD stargate.tuffli.net 14.0-CURRENT FreeBSD 14.0-CURRENT
>>>> main-81b22a9892 GENERIC  amd64
>>>>
>>>> I'm having trouble using the second NIC interface in a bridge to provide
>>>> network connectivity to bhyve VMs and need some help figuring out what is
>>>> wrong.
>>>> ...
>>>
>>> Because there's subtle differences between them; are you using the re driver
>>> from base, or from ports?
>>
>> The driver is from base. Didn't realize there was one in ports.
>
> The ports driver is tricky... It's an older, buggier version of the base
> driver... *BUT*
> a number of issues that aren't fixed in base are fixed in it (mostly
> dealing better with
> errata)...  Ideally, we'd pull in the actual fixes from this driver, but
> it's a huge patch-set
> where it's unclear which bits are for what thing fixed, so nobody (that I
> know of) has
> gone through and even come up with an ugly patch for -current.
>
> Warner

I use the Realtek BSD driver; it supports one of their newer 2.5GbE 
Ethernet chips on my motherboard.


Aug 22 19:37:29 vickie kernel: re1: Controller> port 0xc000-0xc0ff mem 0xfc20-0xfc20,0xfc21-0xfc213fff at device 0.0 on pci7
Aug 22 19:37:29 vickie kernel: re1: Using Memory Mapping!
Aug 22 19:37:29 vickie kernel: re1: attempting to allocate 1 MSI-X vectors (32 supported)
Aug 22 19:37:29 vickie kernel: msi: routing MSI-X IRQ 84 to local APIC 2 vector 51
Aug 22 19:37:29 vickie kernel: re1: using IRQ 84 for MSI-X
Aug 22 19:37:29 vickie kernel: re1: Using 1 MSI-X message
Aug 22 19:37:29 vickie kernel: re1: version:1.96.04
Aug 22 19:37:29 vickie kernel: re1: Ethernet address: 2c:f0:5d:**:**:**
Aug 22 19:37:29 vickie kernel:
Aug 22 19:37:29 vickie kernel: This product is covered by one or more of the following patents:
Aug 22 19:37:29 vickie kernel: US6,570,884, US6,115,776, and US6,327,625.
Aug 22 19:37:29 vickie kernel: re1: bpf attached
Aug 22 19:37:29 vickie kernel: re1: Ethernet address: 2c:f0:5d:**:**:**

The stock re(4) driver doesn't detect it.

The Realtek driver sources are here

https://www.realtek.com/en/component/zoo/category/network-interface-controllers-10-100-1000m-gigabit-ethernet-pci-express-software

but they're for FreeBSD 7.x and 8.0; I had to patch the driver for my 
14.0-CURRENT  box (panic on an mtx_lock(9) call; adding flag MTX_RECURSE 
to the mtx_init(9) call "fixes" it).


diff --git a/if_rereg.h b/if_rereg.h
index 18592a7..4885063 100755
--- a/if_rereg.h
+++ b/if_rereg.h
@@ -1016,7 +1016,7 @@ enum bits {

 #define RE_LOCK(_sc)   mtx_lock(&(_sc)->mtx)
 #define RE_UNLOCK(_sc) mtx_unlock(&(_sc)->mtx)
-#define RE_LOCK_INIT(_sc,_name) mtx_init(&(_sc)->mtx,_name,MTX_NETWORK_LOCK,MTX_DEF)
+#define RE_LOCK_INIT(_sc,_name) mtx_init(&(_sc)->mtx,_name,MTX_NETWORK_LOCK,MTX_DEF | MTX_RECURSE)

 #define RE_LOCK_DESTROY(_sc)   mtx_destroy(&(_sc)->mtx)
 #define RE_LOCK_ASSERT(_sc) mtx_assert(&(_sc)->mtx,MA_OWNED)
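
For anyone wondering what the extra flag changes, here is a userspace
analogy. It is only a sketch: it uses POSIX threads rather than the
kernel's mtx(9) API, and the helper name relock_result is made up for
the illustration. An error-checking pthread mutex refuses a second lock
by the thread that already owns it, while a recursive one accepts it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/*
 * Hypothetical helper: create a mutex of the given POSIX type, lock it
 * twice from the calling thread, and return the result of the second
 * lock attempt.
 */
int
relock_result(int type)
{
	pthread_mutexattr_t attr;
	pthread_mutex_t m;
	int rc, second;

	rc = pthread_mutexattr_init(&attr);
	assert(rc == 0);
	rc = pthread_mutexattr_settype(&attr, type);
	assert(rc == 0);
	rc = pthread_mutex_init(&m, &attr);
	assert(rc == 0);
	pthread_mutexattr_destroy(&attr);

	rc = pthread_mutex_lock(&m);
	assert(rc == 0);
	second = pthread_mutex_lock(&m);	/* relock by the owner */

	/* Undo every successful lock before destroying the mutex. */
	if (second == 0)
		pthread_mutex_unlock(&m);
	pthread_mutex_unlock(&m);
	pthread_mutex_destroy(&m);
	return (second);
}
```

relock_result(PTHREAD_MUTEX_ERRORCHECK) returns EDEADLK and
relock_result(PTHREAD_MUTEX_RECURSIVE) returns 0. The kernel case
differs in that the failure showed up as a panic rather than an error
return, so allowing recursion hides the symptom without showing why the
driver takes the lock twice.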

Maybe I can try making this into a port - oh great, someone beat me to it!

https://www.freshports.org/net/realtek-re-kmod

Looks like they "properly" fix the locking issue - 
https://bugs.freebsd.org/bugzilla/attachment.cgi?id=225980&action=diff


Anthony





Re: aarch64(?) poudiere-devel based builds seem to get fairly-rare corrupted files after recent system update(s?)

2021-11-23 Thread Mark Millard via freebsd-current
On 2021-Nov-21, at 07:50, Mark Millard  wrote:

> On 2021-Nov-20, at 11:54, Mark Millard  wrote:
> 
>> On 2021-Nov-19, at 22:20, Mark Millard  wrote:
>> 
>>> On 2021-Nov-18, at 12:15, Mark Millard  wrote:
>>> 
 On 2021-Nov-17, at 11:17, Mark Millard  wrote:
 
> On 2021-Nov-15, at 15:43, Mark Millard  wrote:
> 
>> On 2021-Nov-15, at 13:13, Mark Millard  wrote:
>> 
>>> On 2021-Nov-15, at 12:51, Mark Millard  wrote:
>>> 
 On 2021-Nov-15, at 11:31, Mark Millard  wrote:
 
> I updated from (shown a system that I've not updated yet):
> 
> # uname -apKU
> FreeBSD CA72_4c8G_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #18 
> main-n250455-890cae197737-dirty: Thu Nov  4 13:43:17 PDT 2021 
> root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
>   arm64 aarch64 
> 1400040 1400040
> 
> to:
> 
> # uname -apKU
> FreeBSD CA72_16Gp_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #19 
> main-n250667-20aa359773be-dirty: Sun Nov 14 02:57:32 PST 2021 
> root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
>   arm64 aarch64 1400042 1400042
> 
> and then updated /usr/ports/ and started poudriere-devel based builds of
> the ports I'd set up to use. However, my last round of port builds from
> a general update of /usr/ports/ was on 2021-10-23, before either of the
> above.
> 
> I've had at least two files that seem to be corrupted, where a later part
> of the build hits problematical file(s) from earlier build activity. For
> example:
> 
> /usr/local/include/X11/extensions/XvMC.h:1:1: warning: null character 
> ignored [-Wnull-character]
>  
> ^
> /usr/local/include/X11/extensions/XvMC.h:1:2: warning: null character 
> ignored [-Wnull-character]
> 
> ^
> /usr/local/include/X11/extensions/XvMC.h:1:3: warning: null character 
> ignored [-Wnull-character]
>  
>^   
> /usr/local/include/X11/extensions/XvMC.h:1:4: warning: null character 
> ignored [-Wnull-character]
> 
>^
> . . .
> 
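
The warnings at columns 1, 2, 3, 4 of line 1 suggest the header starts
with a run of NUL bytes. A quick way to check a suspect file is to count
them. This is a minimal sketch, not something from the original report;
the helper name leading_nul_bytes is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: count the NUL bytes at the very start of a file.
 * Returns -1 if the file cannot be opened.
 */
long
leading_nul_bytes(const char *path)
{
	FILE *fp;
	long n = 0;
	int c;

	assert(path != NULL);
	fp = fopen(path, "rb");
	if (fp == NULL)
		return (-1);
	/* Stop at the first non-NUL byte or at end of file. */
	while ((c = fgetc(fp)) == 0)
		n++;
	fclose(fp);
	return (n);
}
```

A text header should give 0. A large count, for example one matching a
whole block or record size, would point at lost writes rather than at a
bad port build, though that is only a guess until the size is known.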
> Removing the xorgproto-2021.4 package and rebuilding via
> poudriere-devel did not get a failure of any ports dependent
> on it.
> 
> This was from a use of:
> 
> # poudriere jail -j13_0R-CA7 -i
> Jail name: 13_0R-CA7
> Jail version:  13.0-RELEASE-p5
> Jail arch: arm.armv7
> Jail method:   null
> Jail mount:/usr/obj/DESTDIRs/13_0R-CA7-poud
> Jail fs:   
> Jail updated:  2021-11-04 01:48:49
> Jail pkgbase:  disabled
> 
> but another not-investigated example was from:
> 
> # poudriere jail -j13_0R-CA72 -i
> Jail name: 13_0R-CA72
> Jail version:  13.0-RELEASE-p5
> Jail arch: arm64.aarch64
> Jail method:   null
> Jail mount:/usr/obj/DESTDIRs/13_0R-CA72-poud
> Jail fs:   
> Jail updated:  2021-11-04 01:48:01
> Jail pkgbase:  disabled
> 
> (so no 32-bit COMPAT involved). The apparent corruption
> was in a different port (autoconfig, noticed by the
> build of automake failing via config reporting
> /usr/local/share/autoconf-2.69/autoconf/autoconf.m4f
> being rejected).
> 
> /usr/obj/DESTDIRs/13_0R-CA7-poud/ and
> /usr/obj/DESTDIRs/13_0R-CA72-poud/ and the like track the
> system versions.
> 
> The media is an Optane 960 in the PCIe slot of a HoneyComb
> (16 Cortex-A72's). The context is a root on ZFS one, ZFS
> used in order to have bectl, not redundancy.
> 
> The ThreadRipper 1950X (so amd64) port builds did not give
> evidence of such problems based on the updated system. (Also
> Optane media in a PCIe slot, also root on ZFS.) But the
> errors seem rare enough to not be able to conclude much.
 
 For aarch64 targeting aarch64 there was also this
 explicit corruption notice during the poudriere(-devel)
 bulk build:
 
 . . .
 [CA72_ZFS] Extracting arm-none-eabi-gcc-8.4.0_3: .
 pkg-static: Fail to extract 
 /usr/local/libexec/gcc/arm-none-eabi/8.4.0/lto1 from package: Lzma 
 library error: Corrupted input data
 [CA72_ZFS] Extracting arm-none-eabi-gcc-8.4.0_3... done
 
 Failed to install the following 1 package(s): 
 /packages/All/arm-no

Re: aarch64(?) poudiere-devel based builds seem to get fairly-rare corrupted files after recent system update(s?)

2021-11-21 Thread Mark Millard via freebsd-current



On 2021-Nov-20, at 11:54, Mark Millard  wrote:

> On 2021-Nov-19, at 22:20, Mark Millard  wrote:
> 
>> On 2021-Nov-18, at 12:15, Mark Millard  wrote:
>> 
>>> On 2021-Nov-17, at 11:17, Mark Millard  wrote:
>>> 
 On 2021-Nov-15, at 15:43, Mark Millard  wrote:
 
> On 2021-Nov-15, at 13:13, Mark Millard  wrote:
> 
>> On 2021-Nov-15, at 12:51, Mark Millard  wrote:
>> 
>>> On 2021-Nov-15, at 11:31, Mark Millard  wrote:
>>> 
 I updated from (shown a system that I've not updated yet):
 
 # uname -apKU
 FreeBSD CA72_4c8G_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #18 
 main-n250455-890cae197737-dirty: Thu Nov  4 13:43:17 PDT 2021 
 root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
   arm64 aarch64 
 1400040 1400040
 
 to:
 
 # uname -apKU
 FreeBSD CA72_16Gp_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #19 
 main-n250667-20aa359773be-dirty: Sun Nov 14 02:57:32 PST 2021 
 root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
   arm64 aarch64 1400042 1400042
 
 and then updated /usr/ports/ and started poudriere-devel based builds of
 the ports I'd set up to use. However, my last round of port builds from
 a general update of /usr/ports/ was on 2021-10-23, before either of the
 above.
 
 I've had at least two files that seem to be corrupted, where a later part
 of the build hits problematical file(s) from earlier build activity. For
 example:
 
 /usr/local/include/X11/extensions/XvMC.h:1:1: warning: null character 
 ignored [-Wnull-character]
  
 ^
 /usr/local/include/X11/extensions/XvMC.h:1:2: warning: null character 
 ignored [-Wnull-character]
 
 ^
 /usr/local/include/X11/extensions/XvMC.h:1:3: warning: null character 
 ignored [-Wnull-character]
  
 ^   
 /usr/local/include/X11/extensions/XvMC.h:1:4: warning: null character 
 ignored [-Wnull-character]
 
 ^
 . . .
 
 Removing the xorgproto-2021.4 package and rebuilding via
 poudriere-devel did not get a failure of any ports dependent
 on it.
 
 This was from a use of:
 
 # poudriere jail -j13_0R-CA7 -i
 Jail name: 13_0R-CA7
 Jail version:  13.0-RELEASE-p5
 Jail arch: arm.armv7
 Jail method:   null
 Jail mount:/usr/obj/DESTDIRs/13_0R-CA7-poud
 Jail fs:   
 Jail updated:  2021-11-04 01:48:49
 Jail pkgbase:  disabled
 
 but another not-investigated example was from:
 
 # poudriere jail -j13_0R-CA72 -i
 Jail name: 13_0R-CA72
 Jail version:  13.0-RELEASE-p5
 Jail arch: arm64.aarch64
 Jail method:   null
 Jail mount:/usr/obj/DESTDIRs/13_0R-CA72-poud
 Jail fs:   
 Jail updated:  2021-11-04 01:48:01
 Jail pkgbase:  disabled
 
 (so no 32-bit COMPAT involved). The apparent corruption
 was in a different port (autoconfig, noticed by the
 build of automake failing via config reporting
 /usr/local/share/autoconf-2.69/autoconf/autoconf.m4f
 being rejected).
 
 /usr/obj/DESTDIRs/13_0R-CA7-poud/ and
 /usr/obj/DESTDIRs/13_0R-CA72-poud/ and the like track the
 system versions.
 
 The media is an Optane 960 in the PCIe slot of a HoneyComb
 (16 Cortex-A72's). The context is a root on ZFS one, ZFS
 used in order to have bectl, not redundancy.
 
 The ThreadRipper 1950X (so amd64) port builds did not give
 evidence of such problems based on the updated system. (Also
 Optane media in a PCIe slot, also root on ZFS.) But the
 errors seem rare enough to not be able to conclude much.
>>> 
>>> For aarch64 targeting aarch64 there was also this
>>> explicit corruption notice during the poudriere(-devel)
>>> bulk build:
>>> 
>>> . . .
>>> [CA72_ZFS] Extracting arm-none-eabi-gcc-8.4.0_3: .
>>> pkg-static: Fail to extract 
>>> /usr/local/libexec/gcc/arm-none-eabi/8.4.0/lto1 from package: Lzma 
>>> library error: Corrupted input data
>>> [CA72_ZFS] Extracting arm-none-eabi-gcc-8.4.0_3... done
>>> 
>>> Failed to install the following 1 package(s): 
>>> /packages/All/arm-none-eabi-gcc-8.4.0_3.pkg
>>> *** Error code 1
>>> Stop.
>>> make: stopped in /usr/ports/sysutils/u-boot-orangepi-plus-2e
>>> 
>>> I'm not yet to the point of re

Re: 14.0-CURRENT panic in early boot

2021-11-20 Thread Daniel Morante via freebsd-current

I've seen it on an arm64 system:

KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
vpanic() at vpanic+0x174
panic() at panic+0x44
do_el1h_sync() at do_el1h_sync+0x184
handle_el1h_sync() at handle_el1h_sync+0x78
--- exception, esr 0x200
vt_conswindow() at vt_conswindow+0x10
(null)() at -0x4
(null)() at 0x0001
Uptime: 3s

The full log is here: 
http://venus.morante.net/downloads/unibia/screenshots/freebsd/R281-T91/14-CURRENT-4082b189d2c/failed-boot1.txt


On 11/18/2021 12:22 AM, Dustin Marquess wrote:

I just updated a machine from a build that was ~2 weeks old. The
latest commit when I built it was 2e946f87055.

The system boots using UEFI, if that matters. The system is panicking
pretty early in the boot, however:

real memory  = 137438953472 (131072 MB)
avail memory = 133651496960 (127460 MB)
ACPI APIC Table: 
FreeBSD/SMP: Multiprocessor System Detected: 32 CPUs
FreeBSD/SMP: 2 package(s) x 8 core(s) x 2 hardware threads
random: registering fast source Intel Secure Key RNG
random: fast provider: "Intel Secure Key RNG"
random: unblocking device.
kernel trap 12 with interrupts disabled
panic: vm_fault_lookup: fault on nofault entry, addr: 0x81e1d000
cpuid = 0
time = 1


The backtrace shows:

KDB: stack backtrace:
#0 0x806deb5b at kdb_backtrace+0x6b
#1 0x80693b44 at vpanic+0x184
#2 0x806939b3 at panic+0x43
#3 0x8091d4b3 at vm_fault+0x1423
#4 0x8091bfb0 at vm_fault_trap+0xb0
#5 0x809c0902 at trap_pfault+0x1f2
#6 0x809992b8 at calltrap+0x8
#7 0x806ebcc1 at vsscanf+0x31
#8 0x806ebc7f at sscanf+0x3f
#9 0x806bd9ab at validate_uuid+0x8b
#10 0x80655be0 at prison0_init+0x90
#11 0x80623aba at proc0_init+0x29a
#12 0x80623689 at mi_startup+0xe9
#13 0x802e3062 at btext+0x22
Uptime: 1s

Compared to a boot using the old working kernel:

real memory  = 137438953472 (131072 MB)
avail memory = 133651505152 (127460 MB)
ACPI APIC Table: 
FreeBSD/SMP: Multiprocessor System Detected: 32 CPUs
FreeBSD/SMP: 2 package(s) x 8 core(s) x 2 hardware threads
random: registering fast source Intel Secure Key RNG
random: fast provider: "Intel Secure Key RNG"
random: unblocking device.
ioapic0  irqs 0-23
ioapic1  irqs 24-47
ioapic2  irqs 48-71
Launching APs: 1 11 28 15 18 6 29 4 16 9 24 7 3 10 27 22 14 13 12 23
25 20 26 30 17 5 2 21 19 8 31
Timecounter "TSC-low" frequency 1197250876 Hz quality 1000

Has anybody else seen this?
-Dustin





Re: aarch64(?) poudiere-devel based builds seem to get fairly-rare corrupted files after recent system update(s?)

2021-11-20 Thread Mark Millard via freebsd-current
On 2021-Nov-19, at 22:20, Mark Millard  wrote:

> On 2021-Nov-18, at 12:15, Mark Millard  wrote:
> 
>> On 2021-Nov-17, at 11:17, Mark Millard  wrote:
>> 
>>> On 2021-Nov-15, at 15:43, Mark Millard  wrote:
>>> 
 On 2021-Nov-15, at 13:13, Mark Millard  wrote:
 
> On 2021-Nov-15, at 12:51, Mark Millard  wrote:
> 
>> On 2021-Nov-15, at 11:31, Mark Millard  wrote:
>> 
>>> I updated from (shown a system that I've not updated yet):
>>> 
>>> # uname -apKU
>>> FreeBSD CA72_4c8G_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #18 
>>> main-n250455-890cae197737-dirty: Thu Nov  4 13:43:17 PDT 2021 
>>> root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
>>>   arm64 aarch64 
>>> 1400040 1400040
>>> 
>>> to:
>>> 
>>> # uname -apKU
>>> FreeBSD CA72_16Gp_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #19 
>>> main-n250667-20aa359773be-dirty: Sun Nov 14 02:57:32 PST 2021 
>>> root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
>>>   arm64 aarch64 1400042 1400042
>>> 
>>> and then updated /usr/ports/ and started poudriere-devel based builds of
>>> the ports I'd set up to use. However, my last round of port builds from
>>> a general update of /usr/ports/ was on 2021-10-23, before either of the
>>> above.
>>> 
>>> I've had at least two files that seem to be corrupted, where a later part
>>> of the build hits problematical file(s) from earlier build activity. For
>>> example:
>>> 
>>> /usr/local/include/X11/extensions/XvMC.h:1:1: warning: null character 
>>> ignored [-Wnull-character]
>>>  
>>> ^
>>> /usr/local/include/X11/extensions/XvMC.h:1:2: warning: null character 
>>> ignored [-Wnull-character]
>>> 
>>>  ^
>>> /usr/local/include/X11/extensions/XvMC.h:1:3: warning: null character 
>>> ignored [-Wnull-character]
>>>  
>>>  ^   
>>> /usr/local/include/X11/extensions/XvMC.h:1:4: warning: null character 
>>> ignored [-Wnull-character]
>>> 
>>>  ^
>>> . . .
>>> 
>>> Removing the xorgproto-2021.4 package and rebuilding via
>>> poudriere-devel did not get a failure of any ports dependent
>>> on it.
>>> 
>>> This was from a use of:
>>> 
>>> # poudriere jail -j13_0R-CA7 -i
>>> Jail name: 13_0R-CA7
>>> Jail version:  13.0-RELEASE-p5
>>> Jail arch: arm.armv7
>>> Jail method:   null
>>> Jail mount:/usr/obj/DESTDIRs/13_0R-CA7-poud
>>> Jail fs:   
>>> Jail updated:  2021-11-04 01:48:49
>>> Jail pkgbase:  disabled
>>> 
>>> but another not-investigated example was from:
>>> 
>>> # poudriere jail -j13_0R-CA72 -i
>>> Jail name: 13_0R-CA72
>>> Jail version:  13.0-RELEASE-p5
>>> Jail arch: arm64.aarch64
>>> Jail method:   null
>>> Jail mount:/usr/obj/DESTDIRs/13_0R-CA72-poud
>>> Jail fs:   
>>> Jail updated:  2021-11-04 01:48:01
>>> Jail pkgbase:  disabled
>>> 
>>> (so no 32-bit COMPAT involved). The apparent corruption
>>> was in a different port (autoconfig, noticed by the
>>> build of automake failing via config reporting
>>> /usr/local/share/autoconf-2.69/autoconf/autoconf.m4f
>>> being rejected).
>>> 
>>> /usr/obj/DESTDIRs/13_0R-CA7-poud/ and
>>> /usr/obj/DESTDIRs/13_0R-CA72-poud/ and the like track the
>>> system versions.
>>> 
>>> The media is an Optane 960 in the PCIe slot of a HoneyComb
>>> (16 Cortex-A72's). The context is a root on ZFS one, ZFS
>>> used in order to have bectl, not redundancy.
>>> 
>>> The ThreadRipper 1950X (so amd64) port builds did not give
>>> evidence of such problems based on the updated system. (Also
>>> Optane media in a PCIe slot, also root on ZFS.) But the
>>> errors seem rare enough to not be able to conclude much.
>> 
>> For aarch64 targeting aarch64 there was also this
>> explicit corruption notice during the poudriere(-devel)
>> bulk build:
>> 
>> . . .
>> [CA72_ZFS] Extracting arm-none-eabi-gcc-8.4.0_3: .
>> pkg-static: Fail to extract 
>> /usr/local/libexec/gcc/arm-none-eabi/8.4.0/lto1 from package: Lzma 
>> library error: Corrupted input data
>> [CA72_ZFS] Extracting arm-none-eabi-gcc-8.4.0_3... done
>> 
>> Failed to install the following 1 package(s): 
>> /packages/All/arm-none-eabi-gcc-8.4.0_3.pkg
>> *** Error code 1
>> Stop.
>> make: stopped in /usr/ports/sysutils/u-boot-orangepi-plus-2e
>> 
>> I'm not yet to the point of retrying after removing
>> arm-none-eabi-gcc-8.4.0_3 : other things are being built.
> 
> 
> Another context with my prior general update of /usr/ports/
> and the matching port

Re: aarch64(?) poudiere-devel based builds seem to get fairly-rare corrupted files after recent system update(s?)

2021-11-19 Thread Mark Millard via freebsd-current
On 2021-Nov-18, at 12:15, Mark Millard  wrote:

> On 2021-Nov-17, at 11:17, Mark Millard  wrote:
> 
>> On 2021-Nov-15, at 15:43, Mark Millard  wrote:
>> 
>>> On 2021-Nov-15, at 13:13, Mark Millard  wrote:
>>> 
 On 2021-Nov-15, at 12:51, Mark Millard  wrote:
 
> On 2021-Nov-15, at 11:31, Mark Millard  wrote:
> 
>> I updated from (shown a system that I've not updated yet):
>> 
>> # uname -apKU
>> FreeBSD CA72_4c8G_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #18 
>> main-n250455-890cae197737-dirty: Thu Nov  4 13:43:17 PDT 2021 
>> root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
>>   arm64 aarch64 
>> 1400040 1400040
>> 
>> to:
>> 
>> # uname -apKU
>> FreeBSD CA72_16Gp_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #19 
>> main-n250667-20aa359773be-dirty: Sun Nov 14 02:57:32 PST 2021 
>> root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
>>   arm64 aarch64 1400042 1400042
>> 
>> and then updated /usr/ports/ and started poudriere-devel based builds of
>> the ports I'd set up to use. However, my last round of port builds from
>> a general update of /usr/ports/ was on 2021-10-23, before either of the
>> above.
>> 
>> I've had at least two files that seem to be corrupted, where a later part
>> of the build hits problematical file(s) from earlier build activity. For
>> example:
>> 
>> /usr/local/include/X11/extensions/XvMC.h:1:1: warning: null character 
>> ignored [-Wnull-character]
>>  
>> ^
>> /usr/local/include/X11/extensions/XvMC.h:1:2: warning: null character 
>> ignored [-Wnull-character]
>> 
>>   ^
>> /usr/local/include/X11/extensions/XvMC.h:1:3: warning: null character 
>> ignored [-Wnull-character]
>>  
>>   ^   
>> /usr/local/include/X11/extensions/XvMC.h:1:4: warning: null character 
>> ignored [-Wnull-character]
>> 
>>   ^
>> . . .
>> 
>> Removing the xorgproto-2021.4 package and rebuilding via
>> poudriere-devel did not get a failure of any ports dependent
>> on it.
>> 
>> This was from a use of:
>> 
>> # poudriere jail -j13_0R-CA7 -i
>> Jail name: 13_0R-CA7
>> Jail version:  13.0-RELEASE-p5
>> Jail arch: arm.armv7
>> Jail method:   null
>> Jail mount:/usr/obj/DESTDIRs/13_0R-CA7-poud
>> Jail fs:   
>> Jail updated:  2021-11-04 01:48:49
>> Jail pkgbase:  disabled
>> 
>> but another not-investigated example was from:
>> 
>> # poudriere jail -j13_0R-CA72 -i
>> Jail name: 13_0R-CA72
>> Jail version:  13.0-RELEASE-p5
>> Jail arch: arm64.aarch64
>> Jail method:   null
>> Jail mount:/usr/obj/DESTDIRs/13_0R-CA72-poud
>> Jail fs:   
>> Jail updated:  2021-11-04 01:48:01
>> Jail pkgbase:  disabled
>> 
>> (so no 32-bit COMPAT involved). The apparent corruption
>> was in a different port (autoconfig, noticed by the
>> build of automake failing via config reporting
>> /usr/local/share/autoconf-2.69/autoconf/autoconf.m4f
>> being rejected).
>> 
>> /usr/obj/DESTDIRs/13_0R-CA7-poud/ and
>> /usr/obj/DESTDIRs/13_0R-CA72-poud/ and the like track the
>> system versions.
>> 
>> The media is an Optane 960 in the PCIe slot of a HoneyComb
>> (16 Cortex-A72's). The context is a root on ZFS one, ZFS
>> used in order to have bectl, not redundancy.
>> 
>> The ThreadRipper 1950X (so amd64) port builds did not give
>> evidence of such problems based on the updated system. (Also
>> Optane media in a PCIe slot, also root on ZFS.) But the
>> errors seem rare enough to not be able to conclude much.
> 
> For aarch64 targeting aarch64 there was also this
> explicit corruption notice during the poudriere(-devel)
> bulk build:
> 
> . . .
> [CA72_ZFS] Extracting arm-none-eabi-gcc-8.4.0_3: .
> pkg-static: Fail to extract 
> /usr/local/libexec/gcc/arm-none-eabi/8.4.0/lto1 from package: Lzma 
> library error: Corrupted input data
> [CA72_ZFS] Extracting arm-none-eabi-gcc-8.4.0_3... done
> 
> Failed to install the following 1 package(s): 
> /packages/All/arm-none-eabi-gcc-8.4.0_3.pkg
> *** Error code 1
> Stop.
> make: stopped in /usr/ports/sysutils/u-boot-orangepi-plus-2e
> 
> I'm not yet to the point of retrying after removing
> arm-none-eabi-gcc-8.4.0_3 : other things are being built.
 
 
 Another context with my prior general update of /usr/ports/
 and the matching port builds: Back then I used USE_TMPFS=all
 but the failure is based on USE_TMPFS="data" instead. So:
 lots more I/O.
 
>>> 
>>> None of the 3 corruptions repeated during bu

Re: FYI: aarch64 main [so: 14] system hung up with a large amount of memory in use (given the RAM+SWAP configuration) but lots of swap left

2021-11-19 Thread Mark Millard via freebsd-current
On 2021-Nov-13, at 03:40, Mark Millard  wrote:

> On 2021-Nov-13, at 03:20, Mark Millard  wrote:
> 
> 
>> While attempting to see if I could repeat a bugzilla report in a
>> somewhat different context, I had the system hang up to the
>> point that ^C and ^Z did not work and ^T did not echo out what
>> would be expected for poudriere (or even the kernel backtrace).
>> I was able to escape to ddb.
>> 
>> The context was Cortex-A72 based aarch64 system using:
>> 
>> # poudriere jail -jmain-CA7 -i
>> Jail name: main-CA7
>> Jail version:  14.0-CURRENT
>> Jail arch: arm.armv7
>> Jail method:   null
>> Jail mount:/usr/obj/DESTDIRs/main-CA7-poud
>> Jail fs:   
>> Jail updated:  2021-06-27 17:58:33
>> Jail pkgbase:  disabled
>> 
>> # uname -apKU
>> FreeBSD CA72_16Gp_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #18 
>> main-n250455-890cae197737-dirty: Thu Nov  4 13:43:17 PDT 2021 
>> root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
>>   arm64 aarch64 1400040 1400040
>> 
>> It is a non-debug build (but with symbols).
>> 
>> 16 cortex-A72 cores, 64 GiBytes RAM, root on ZFS, 251904Mi swap,
>> USE_TMPFS=all in use. ALLOW_PARALLEL_JOBS= in use too.
>> (Mentioned only for context: I've no specific evidence if other
>> contexts would also have failed, say, USE+TMPFS="data" or UFS.)

Of course not a "+": USE_TMPFS="data"

>> When I looked around at the db> prompts I noticed one
>> oddity (I'm no expert at such inspections):
>> 
>> db> show allchains
>> . . .
>> chain 92:
>> thread 100671 (pid 15928, make) is blocked on lockmgr 0%A%0EXCL
>> thread 100671 (pid 15928, make) is blocked on lockmgr 0%A%0EXCL
>> thread 100671 (pid 15928, make) is blocked on lockmgr 0%A%0EXCL
>> thread 100671 (pid 15928, make) is blocked on lockmgr 0%A%0EXCL
>> thread 100671 (pid 15928, make) is blocked on lockmgr 0%A%0EXCL
>> thread 100671 (pid 15928, make) is blocked on lockmgr 0%A%0EXCL
>> . . . (thousands of more instances of that line content,
>>  I never found the last) . . .
>> 
>> My patched top (that reports some "maximum observed" (MaxObs???)
>> figures) was showing (having hung up with the system):
>> 
>> last pid: 18816;  load averages: 10.11, 16.76, 18.73 MaxObs: 115.65, 103.13, 96.36   up 8+06:52:04  20:30:57
>> 324 threads:   17 running, 305 sleeping, 2 waiting, 147 MaxObsRunning
>> CPU:  2.8% user,  0.0% nice, 97.1% system,  0.0% interrupt,  0.0% idle
>> Mem: 19044Ki Active, 331776B Inact, 73728B Laundry, 6950Mi Wired, 69632B Buf, 558860Ki Free, 47709Mi MaxObsActive, 12556Mi MaxObsWired, 59622Mi MaxObs(Act+Wir+Lndry)
>> ARC: 2005Mi Total, 623319Ki MFU, 654020Ki MRU, 2048Ki Anon, 27462Ki Header, 745685Ki Other
>>      783741Ki Compressed, 3981Mi Uncompressed, 5.20:1 Ratio
>> Swap: 251904Mi Total, 101719Mi Used, 150185Mi Free, 40% Inuse, 3432Ki In, 3064Ki Out, 101719Mi MaxObsUsed, 101737Mi MaxObs(Act+Lndry+SwapUsed), 109816Mi MaxObs(Act+Wir+Lndry+SwapUsed)
>> 
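
As a sanity check on the "lots of swap left" reading, the Swap figures
above are self-consistent: 101719Mi used plus 150185Mi free gives the
251904Mi total, and the used share comes out at about 40%. A small
sketch of the percentage arithmetic follows. The helper name is
hypothetical, and truncating toward zero is an assumption about how the
figure is derived (rounding gives the same 40 here).

```c
#include <assert.h>

/*
 * Hypothetical helper: percentage of swap in use, truncated toward
 * zero. Both arguments are in the same unit (MiB in the top output).
 */
int
swap_percent_in_use(long used_mi, long total_mi)
{
	assert(total_mi > 0 && used_mi >= 0 && used_mi <= total_mi);
	return ((int)(used_mi * 100 / total_mi));
}
```

With the values from the display, swap_percent_in_use(101719, 251904)
evaluates 10171900 / 251904, which is 40 after truncation.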
>> (Based on the 20:30:57 time shown, it had been hung up for over
>> 2 hours when I got to it.)
>> 
>> There were no console messages. /var/log/messages had its
>> last message at 18:57:52. No out-of-swap or such
>> messages.
>> 
>> 
>> I did get a dump via the db> prompt.
>> 
> 
> In retrying the poudriere-devel run experiment I'm
> getting various builds that are generating
> multi-GiByte log files (and growing) that have
> lines like:
> 
> thread 'rustc' panicked at 'capacity overflow', 
> library/alloc/src/raw_vec.rs:559:5
> stack backtrace:
> note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose 
> backtrace.
> 
> error: internal compiler error: unexpected panic
> 
> note: the compiler unexpectedly panicked. this is a bug.
> 
> note: we would appreciate a bug report: 
> https://github.com/rust-lang/rust/issues/new?labels=C-bug%2C+I-ICE%2C+T-compiler&template=ice.md
> 
> 
> note: rustc 1.55.0 running on armv7-unknown-freebsd
> 
> note: compiler flags: -C embed-bitcode=no -C debuginfo=2 -C linker=cc 
> --crate-type lib
> 
> note: some of the compiler flags provided by cargo are hidden
> 
> query stack during panic:
> #0 [trimmed_def_paths] calculating trimmed def paths
> #1 [lint_mod] linting module `transitions`
> #2 [analysis] running analysis passes on this crate
> end of query stack
> thread 'rustc' panicked at 'cannot panic during the backtrace function', 
> library/std/src/../../backtrace/src/lib.rs:147:13
> stack backtrace:
>   0: 0x4710076c -  core::fmt::Display>::fmt::h4428caffcb182c5b
>   1: 0x471c9d00 - core::fmt::write::h91f4a7678561fd61
>   2: 0x470e2180 - 
>   3: 0x470ebd40 - 
>   4: 0x470eb824 - 
>   5: 0x41ed4848 - 
>   6: 0x470ec690 - std::panicking::rust_panic_with_hook::h6bc4b7e83060df25
>   7: 0x47100f0c - 
>   8: 0x47100900 - 
>   9: 0x470ec374 - 
> . . .
>  65: 0x470ee71c - 
>  66: 0x401361bc - 
>  67: 0x40135c

Re: FYI: amd64 system: "error: fileid changed. fsid 0:0: expected fileid 0xa, got 0x2. (BROKEN NFS SERVER OR MIDDLEWARE)"?

2021-11-18 Thread Mark Millard via freebsd-current
On 2021-Nov-18, at 15:54, Mark Millard  wrote:

> The ThreadRipper 1950X FreeBSD system is reporting a message at
> 03:01:10 or so for the last 3 days. The .148 is an aarch64 system
> (HoneyComb). None of the other systems (aarch64 Small Board
> Computers and an armv7 SBC) are reporting such.
> 
> Nov 16 03:01:10 amd64_ZFS kernel: newnfs: server '192.168.1.148' error: fileid changed. fsid 0:0: expected fileid 0xa, got 0x2. (BROKEN NFS SERVER OR MIDDLEWARE)
> Nov 17 03:01:10 amd64_ZFS kernel: newnfs: server '192.168.1.148' error: fileid changed. fsid 0:0: expected fileid 0xa, got 0x2. (BROKEN NFS SERVER OR MIDDLEWARE)
> Nov 18 03:01:09 amd64_ZFS kernel: newnfs: server '192.168.1.148' error: fileid changed. fsid 0:0: expected fileid 0xa, got 0x2. (BROKEN NFS SERVER OR MIDDLEWARE)
> 
> # uname -apKU
> FreeBSD amd64_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #10 
> main-n250667-20aa359773be-dirty: Sun Nov 14 00:24:51 PST 2021 
> root@amd64_ZFS:/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-NODBG
>   amd64 amd64 1400042 1400042
> 
> The .148 :
> 
> # uname -apKU
> FreeBSD HC_CA72_UFS 14.0-CURRENT FreeBSD 14.0-CURRENT #19 
> main-n250667-20aa359773be-dirty: Sun Nov 14 02:57:32 PST 2021 
> root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
>   arm64 aarch64 1400042 1400042
> 
> (At times it is booted from a root-on-zfs media or a USB3 SSD UFS media
> instead but it is the same system build that is running for all those.)
> 


For reference:

I forgot to mention that the other aarch64 systems are running:

# uname -apKU
FreeBSD FBSDmacch 14.0-CURRENT FreeBSD 14.0-CURRENT #18 
main-n250455-890cae197737-dirty: Thu Nov  4 13:43:17 PDT 2021 
root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
  arm64 aarch64 1400040 1400040

(all installed from the same build, so the example is sufficient).
The armv7 system is running:

# uname -apKU
FreeBSD OPiP2E_RPi2v11 14.0-CURRENT FreeBSD 14.0-CURRENT #14 
main-n250455-890cae197737-dirty: Thu Nov  4 16:13:56 PDT 2021 
root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA7-nodbg-clang/usr/main-src/arm.armv7/sys/GENERIC-NODBG-CA7
  arm armv7 1400040 1400040

(At some point they will all be updated.)

This vintage difference might be part of why they do not
report anything.
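
To make the vintage difference concrete: the last two fields of the
uname -apKU output are the kernel and userland version numbers (1400040
on the older systems, 1400042 on the two updated ones). The sketch below
splits such a number into its parts. It assumes the usual layout of
major * 100000 + minor * 1000 + counter, and the struct and function
names are made up for the illustration.

```c
#include <assert.h>

/* The three parts packed into a __FreeBSD_version style number. */
struct osrel {
	int major;	/* e.g. 14 */
	int minor;	/* e.g. 0 */
	int bump;	/* counter within the branch, e.g. 42 */
};

/* Hypothetical helper: decode under the layout assumed above. */
struct osrel
decode_osrel(int v)
{
	struct osrel r;

	assert(v >= 0);
	r.major = v / 100000;
	r.minor = (v / 1000) % 100;
	r.bump = v % 1000;
	return (r);
}
```

Under that assumption, 1400040 and 1400042 both decode to 14.0 and
differ only in the counter (40 versus 42), which fits builds made ten
days apart on main.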



===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)




FYI: amd64 system: "error: fileid changed. fsid 0:0: expected fileid 0xa, got 0x2. (BROKEN NFS SERVER OR MIDDLEWARE)"?

2021-11-18 Thread Mark Millard via freebsd-current
The ThreadRipper 1950X FreeBSD system is reporting a message at
03:01:10 or so for the last 3 days. The .148 is an aarch64 system
(HoneyComb). None of the other systems (aarch64 Small Board
Computers and an armv7 SBC) are reporting such.

Nov 16 03:01:10 amd64_ZFS kernel: newnfs: server '192.168.1.148' error: fileid changed. fsid 0:0: expected fileid 0xa, got 0x2. (BROKEN NFS SERVER OR MIDDLEWARE)
Nov 17 03:01:10 amd64_ZFS kernel: newnfs: server '192.168.1.148' error: fileid changed. fsid 0:0: expected fileid 0xa, got 0x2. (BROKEN NFS SERVER OR MIDDLEWARE)
Nov 18 03:01:09 amd64_ZFS kernel: newnfs: server '192.168.1.148' error: fileid changed. fsid 0:0: expected fileid 0xa, got 0x2. (BROKEN NFS SERVER OR MIDDLEWARE)
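
For anyone who wants to tally these from /var/log/messages, the numbers
can be pulled out of each line with sscanf(3). This is a rough sketch
based only on the message text shown above; the struct and function
names are hypothetical, and reading both fsid halves as hex is an
assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Fields carried by one "fileid changed" console message. */
struct fileid_change {
	unsigned int fsid0;
	unsigned int fsid1;
	unsigned long long expected;
	unsigned long long got;
};

/*
 * Hypothetical helper: extract the four numbers from a log line.
 * Returns 1 on success and 0 if the line does not match.
 */
int
parse_fileid_changed(const char *line, struct fileid_change *out)
{
	const char *p;

	assert(line != NULL && out != NULL);
	/* Skip the timestamp, host name and server address. */
	p = strstr(line, "fileid changed.");
	if (p == NULL)
		return (0);
	if (sscanf(p,
	    "fileid changed. fsid %x:%x: expected fileid 0x%llx, got 0x%llx",
	    &out->fsid0, &out->fsid1, &out->expected, &out->got) != 4)
		return (0);
	return (1);
}
```

For the three lines above this yields fsid 0:0, expected 0xa and got
0x2 each time, so the same pair of fileids is involved on every
03:01 occurrence.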

# uname -apKU
FreeBSD amd64_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #10 
main-n250667-20aa359773be-dirty: Sun Nov 14 00:24:51 PST 2021 
root@amd64_ZFS:/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-NODBG
  amd64 amd64 1400042 1400042

The .148 :

# uname -apKU
FreeBSD HC_CA72_UFS 14.0-CURRENT FreeBSD 14.0-CURRENT #19 
main-n250667-20aa359773be-dirty: Sun Nov 14 02:57:32 PST 2021 
root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
  arm64 aarch64 1400042 1400042

(At times it is booted from a root-on-zfs media or a USB3 SSD UFS media
instead but it is the same system build that is running for all those.)


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)




Re: "Khelp module "ertt" can't unload until its refcount drops from 1 to 0." after "All buffers synced."?

2021-11-18 Thread Mark Millard via freebsd-current
> On 2021-Nov-18, at 12:31, tue...@freebsd.org  wrote:
> 
>> On 17. Nov 2021, at 21:13, Mark Millard via freebsd-current 
>>  wrote:
>> 
>> I've not noticed the ertt message before in:
>> 
>> . . .
>> Waiting (max 60 seconds) for system thread `bufspacedaemon-1' to stop... done
>> All buffers synced.
>> Uptime: 1d9h57m18s
>> Khelp module "ertt" can't unload until its refcount drops from 1 to 0.
> Hi Mark,
> 
> what kernel configuration are you using? What kernel modules are loaded?

The shutdown was of my ZFS boot media but the machine is
currently doing builds on the UFS media. (The ZFS media is
present but not mounted). For now I provide information
from the booted UFS system. The UFS context is intended
to be nearly a copy of the bectl selection for main [so: 14]
from the ZFS media. Both systems have been doing the same
poudriere builds for various comparison/contrast purposes.
The current build activity will likely take 16+ hrs.

For reference for now (from UFS context):

# uname -apKU
FreeBSD HC_CA72_UFS 14.0-CURRENT FreeBSD 14.0-CURRENT #19 
main-n250667-20aa359773be-dirty: Sun Nov 14 02:57:32 PST 2021 
root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
  arm64 aarch64 1400042 1400042

This environment was built with -mcpu=cortex-a72 in use and the
system is a 16 core Cortex-A72 system. (I've caught a FreeBSD
weak memory model error in the past with such -mcpu=cortex-a72
use on a Cortex-A72 system, not that I have evidence of such
here.)

# kldstat
Id Refs Address            Size Name
 1   18 0x              12b1660 kernel
 2    1 0x012b2000       41e070 zfs.ko
 3    1 0x016d1000        26af8 cryptodev.ko
 4    1 0x00019a40        27000 nullfs.ko
 5    1 0x00019a427000    25000 fdescfs.ko

# more /usr/main-src/sys/arm64/conf/GENERIC-NODBG-CA72
include "GENERIC-NODBG"

ident   GENERIC-NODBG-CA72

makeoptions CONF_CFLAGS="-mcpu=cortex-a72"

# more /usr/main-src/sys/arm64/conf/GENERIC-NODBG
#
# GENERIC -- Custom configuration for the arm64/aarch64
#

include "GENERIC"

ident   GENERIC-NODBG

makeoptions DEBUG=-g            # Build kernel with gdb(1) debug symbols

options     ALT_BREAK_TO_DEBUGGER

options     KDB                 # Enable kernel debugger support

# For minimum debugger support (stable branch) use:
#options    KDB_TRACE           # Print a stack trace for a panic
options     DDB                 # Enable the kernel debugger

# Extra stuff:
#options    VERBOSE_SYSINIT=0   # Enable verbose sysinit messages
#options    BOOTVERBOSE=1
#options    BOOTHOWTO=RB_VERBOSE
#options    KTR
#options    KTR_MASK=KTR_TRAP
##options   KTR_CPUMASK=0xF
#options    KTR_VERBOSE

# Disable any extra checking for. . .
nooptions   DEADLKRES           # Enable the deadlock resolver
nooptions   INVARIANTS          # Enable calls of extra sanity checking
nooptions   INVARIANT_SUPPORT   # Extra sanity checks of internal structures, required by INVARIANTS
nooptions   WITNESS             # Enable checks to detect deadlocks and cycles
nooptions   WITNESS_SKIPSPIN    # Don't run witness on spinlocks for speed
nooptions   DIAGNOSTIC
nooptions   MALLOC_DEBUG_MAXZONES   # Separate malloc(9) zones
nooptions   BUF_TRACKING
nooptions   FULL_BUF_TRACKING


The builds do include a bulk for targeting armv7
(so 32-bit compatibility) but that bulk has not
started yet in the UFS context.

# poudriere jail -jmain-CA7 -i
Jail name: main-CA7
Jail version:  14.0-CURRENT
Jail arch: arm.armv7
Jail method:   null
Jail mount: /usr/obj/DESTDIRs/main-CA7-poud
Jail fs:   
Jail updated:  2021-11-16 02:24:57
Jail pkgbase:  disabled

My networking context is simple: it does not provide
services outside the local network. The services are normal
ones, such as nfs, ssh, and whatever default things. ntpd
is in use. distfiles are downloaded during builds. It is not
a desktop environment, not even a video card.

I've noticed a 2021-Feb-22 report in:

https://forums.freebsd.org/threads/whats-the-error-khelp-module-ertt-unload.79009/

so getting the message does not appear to be unique to my
context.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)




Re: aarch64(?) poudiere-devel based builds seem to get fairly-rare corrupted files after recent system update(s?)

2021-11-18 Thread Mark Millard via freebsd-current
On 2021-Nov-17, at 11:17, Mark Millard  wrote:

> On 2021-Nov-15, at 15:43, Mark Millard  wrote:
> 
>> On 2021-Nov-15, at 13:13, Mark Millard  wrote:
>> 
>>> On 2021-Nov-15, at 12:51, Mark Millard  wrote:
>>> 
 On 2021-Nov-15, at 11:31, Mark Millard  wrote:
 
> I updated from (shown a system that I've not updated yet):
> 
> # uname -apKU
> FreeBSD CA72_4c8G_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #18 
> main-n250455-890cae197737-dirty: Thu Nov  4 13:43:17 PDT 2021 
> root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
>   arm64 aarch64 
> 1400040 1400040
> 
> to:
> 
> # uname -apKU
> FreeBSD CA72_16Gp_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #19 
> main-n250667-20aa359773be-dirty: Sun Nov 14 02:57:32 PST 2021 
> root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
>   arm64 aarch64 1400042 1400042
> 
> and then updated /usr/ports/ and started poudriere-devel based builds of
> the ports I've set up to use. However my last round of port builds from
> a general update of /usr/ports/ were on 2021-10-23 before either of the
> above.
> 
> I've had at least two files that seem to be corrupted, where a later part
> of the build hits problematical file(s) from earlier build activity. For
> example:
> 
> /usr/local/include/X11/extensions/XvMC.h:1:1: warning: null character 
> ignored [-Wnull-character]
>  
> ^
> /usr/local/include/X11/extensions/XvMC.h:1:2: warning: null character 
> ignored [-Wnull-character]
> 
>^
> /usr/local/include/X11/extensions/XvMC.h:1:3: warning: null character 
> ignored [-Wnull-character]
>  
>^   
> /usr/local/include/X11/extensions/XvMC.h:1:4: warning: null character 
> ignored [-Wnull-character]
> 
>^
> . . .
> 
> Removing the xorgproto-2021.4 package and rebuilding via
> poudriere-devel did not get a failure of any ports dependent
> on it.
> 
> This was from a use of:
> 
> # poudriere jail -j13_0R-CA7 -i
> Jail name: 13_0R-CA7
> Jail version:  13.0-RELEASE-p5
> Jail arch: arm.armv7
> Jail method:   null
> Jail mount:/usr/obj/DESTDIRs/13_0R-CA7-poud
> Jail fs:   
> Jail updated:  2021-11-04 01:48:49
> Jail pkgbase:  disabled
> 
> but another not-investigated example was from:
> 
> # poudriere jail -j13_0R-CA72 -i
> Jail name: 13_0R-CA72
> Jail version:  13.0-RELEASE-p5
> Jail arch: arm64.aarch64
> Jail method:   null
> Jail mount:/usr/obj/DESTDIRs/13_0R-CA72-poud
> Jail fs:   
> Jail updated:  2021-11-04 01:48:01
> Jail pkgbase:  disabled
> 
> (so no 32-bit COMPAT involved). The apparent corruption
> was in a different port (autoconfig, noticed by the
> build of automake failing via config reporting
> /usr/local/share/autoconf-2.69/autoconf/autoconf.m4f
> being rejected).
> 
> /usr/obj/DESTDIRs/13_0R-CA7-poud/ and
> /usr/obj/DESTDIRs/13_0R-CA72-poud/ and the like track the
> system versions.
> 
> The media is an Optane 960 in the PCIe slot of a HoneyComb
> (16 Cortex-A72's). The context is a root on ZFS one, ZFS
> used in order to have bectl, not redundancy.
> 
> The ThreadRipper 1950X (so amd64) port builds did not give
> evidence of such problems based on the updated system. (Also
> Optane media in a PCIe slot, also root on ZFS.) But the
> errors seem rare enough to not be able to conclude much.
 
 For aarch64 targeting aarch64 there was also this
 explicit corruption notice during the poudriere(-devel)
 bulk build:
 
 . . .
 [CA72_ZFS] Extracting arm-none-eabi-gcc-8.4.0_3: .
 pkg-static: Fail to extract 
 /usr/local/libexec/gcc/arm-none-eabi/8.4.0/lto1 from package: Lzma library 
 error: Corrupted input data
 [CA72_ZFS] Extracting arm-none-eabi-gcc-8.4.0_3... done
 
 Failed to install the following 1 package(s): 
 /packages/All/arm-none-eabi-gcc-8.4.0_3.pkg
 *** Error code 1
 Stop.
 make: stopped in /usr/ports/sysutils/u-boot-orangepi-plus-2e
 
 I'm not yet to the point of retrying after removing
 arm-none-eabi-gcc-8.4.0_3 : other things are being built.
>>> 
>>> 
>>> Another context with my prior general update of /usr/ports/
>>> and the matching port builds: Back then I used USE_TMPFS=all
>>> but the failure is based on USE_TMPFS="data" instead. So:
>>> lots more I/O.
>>> 
>> 
>> None of the 3 corruptions repeated during bulk builds that
>> retried the builds that generated the files. All of the
>> ports that failed by hitting the corruptions in what they
>> depended on, built fine in the retries

"Khelp module "ertt" can't unload until its refcount drops from 1 to 0." after "All buffers synced."?

2021-11-17 Thread Mark Millard via freebsd-current
I've not noticed the ertt message before in:

. . .
Waiting (max 60 seconds) for system thread `bufspacedaemon-1' to stop... done
All buffers synced.
Uptime: 1d9h57m18s
Khelp module "ertt" can't unload until its refcount drops from 1 to 0.



===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)




Re: aarch64(?) poudiere-devel based builds seem to get fairly-rare corrupted files after recent system update(s?)

2021-11-17 Thread Mark Millard via freebsd-current
On 2021-Nov-15, at 15:43, Mark Millard  wrote:

> On 2021-Nov-15, at 13:13, Mark Millard  wrote:
> 
>> On 2021-Nov-15, at 12:51, Mark Millard  wrote:
>> 
>>> On 2021-Nov-15, at 11:31, Mark Millard  wrote:
>>> 
 I updated from (shown a system that I've not updated yet):
 
 # uname -apKU
 FreeBSD CA72_4c8G_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #18 
 main-n250455-890cae197737-dirty: Thu Nov  4 13:43:17 PDT 2021 
 root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
   arm64 aarch64 
 1400040 1400040
 
 to:
 
 # uname -apKU
 FreeBSD CA72_16Gp_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #19 
 main-n250667-20aa359773be-dirty: Sun Nov 14 02:57:32 PST 2021 
 root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
   arm64 aarch64 1400042 1400042
 
 and then updated /usr/ports/ and started poudriere-devel based builds of
 the ports I've set up to use. However my last round of port builds from
 a general update of /usr/ports/ were on 2021-10-23 before either of the
 above.
 
 I've had at least two files that seem to be corrupted, where a later part
 of the build hits problematical file(s) from earlier build activity. For
 example:
 
 /usr/local/include/X11/extensions/XvMC.h:1:1: warning: null character 
 ignored [-Wnull-character]
  
 ^
 /usr/local/include/X11/extensions/XvMC.h:1:2: warning: null character 
 ignored [-Wnull-character]
 
 ^
 /usr/local/include/X11/extensions/XvMC.h:1:3: warning: null character 
 ignored [-Wnull-character]
  
 ^   
 /usr/local/include/X11/extensions/XvMC.h:1:4: warning: null character 
 ignored [-Wnull-character]
 
 ^
 . . .
 
 Removing the xorgproto-2021.4 package and rebuilding via
 poudriere-devel did not get a failure of any ports dependent
 on it.
 
 This was from a use of:
 
 # poudriere jail -j13_0R-CA7 -i
 Jail name: 13_0R-CA7
 Jail version:  13.0-RELEASE-p5
 Jail arch: arm.armv7
 Jail method:   null
 Jail mount:/usr/obj/DESTDIRs/13_0R-CA7-poud
 Jail fs:   
 Jail updated:  2021-11-04 01:48:49
 Jail pkgbase:  disabled
 
 but another not-investigated example was from:
 
 # poudriere jail -j13_0R-CA72 -i
 Jail name: 13_0R-CA72
 Jail version:  13.0-RELEASE-p5
 Jail arch: arm64.aarch64
 Jail method:   null
 Jail mount:/usr/obj/DESTDIRs/13_0R-CA72-poud
 Jail fs:   
 Jail updated:  2021-11-04 01:48:01
 Jail pkgbase:  disabled
 
 (so no 32-bit COMPAT involved). The apparent corruption
 was in a different port (autoconfig, noticed by the
 build of automake failing via config reporting
 /usr/local/share/autoconf-2.69/autoconf/autoconf.m4f
 being rejected).
 
 /usr/obj/DESTDIRs/13_0R-CA7-poud/ and
 /usr/obj/DESTDIRs/13_0R-CA72-poud/ and the like track the
 system versions.
 
 The media is an Optane 960 in the PCIe slot of a HoneyComb
 (16 Cortex-A72's). The context is a root on ZFS one, ZFS
 used in order to have bectl, not redundancy.
 
 The ThreadRipper 1950X (so amd64) port builds did not give
 evidence of such problems based on the updated system. (Also
 Optane media in a PCIe slot, also root on ZFS.) But the
 errors seem rare enough to not be able to conclude much.
>>> 
>>> For aarch64 targeting aarch64 there was also this
>>> explicit corruption notice during the poudriere(-devel)
>>> bulk build:
>>> 
>>> . . .
>>> [CA72_ZFS] Extracting arm-none-eabi-gcc-8.4.0_3: .
>>> pkg-static: Fail to extract /usr/local/libexec/gcc/arm-none-eabi/8.4.0/lto1 
>>> from package: Lzma library error: Corrupted input data
>>> [CA72_ZFS] Extracting arm-none-eabi-gcc-8.4.0_3... done
>>> 
>>> Failed to install the following 1 package(s): 
>>> /packages/All/arm-none-eabi-gcc-8.4.0_3.pkg
>>> *** Error code 1
>>> Stop.
>>> make: stopped in /usr/ports/sysutils/u-boot-orangepi-plus-2e
>>> 
>>> I'm not yet to the point of retrying after removing
>>> arm-none-eabi-gcc-8.4.0_3 : other things are being built.
>> 
>> 
>> Another context with my prior general update of /usr/ports/
>> and the matching port builds: Back then I used USE_TMPFS=all
>> but the failure is based on USE_TMPFS="data" instead. So:
>> lots more I/O.
>> 
> 
> None of the 3 corruptions repeated during bulk builds that
> retried the builds that generated the files. All of the
> ports that failed by hitting the corruptions in what they
> depended on, built fine in the retries.
> 
> For reference:
> 
> I'll note that, back when I was using USE_TMPFS=all , I also
> did some separate bulk -a test runs, both aarch64 (Cortex-A72)
> native and Cortex-A72 targ

Re: cross-compiling for i386 on amd64 fails

2021-11-16 Thread Michael Butler via freebsd-current

I should have been more specific ..

I'm observing that "/usr/src/release/release.sh -c release-i386.conf" 
fails when targeting an i386 build on an amd64 host :-(



On 11/16/21 02:33, Warner Losh wrote:

A meta-build worked for me just now...

Warner

On Mon, Nov 15, 2021 at 9:35 PM Michael Butler via freebsd-current <
freebsd-current@freebsd.org> wrote:


Haven't had time to identify which change caused this yet but I now get ..

===> lib/libsbuf (obj,all,install)
===> cddl/lib/libumem (obj,all,install)
===> cddl/lib/libnvpair (obj,all,install)
===> cddl/lib/libavl (obj,all,install)
ld: error: /usr/obj/usr/src/i386.i386/tmp/usr/lib/libspl.a(assert.o) is
incompatible with elf_i386_fbsd
===> cddl/lib/libspl (obj,all,install)
cc: error: linker command failed with exit code 1 (use -v to see
invocation)
--- libavl.so.2 ---
*** [libavl.so.2] Error code 1

make[4]: stopped in /usr/src/cddl/lib/libavl

 imb









cross-compiling for i386 on amd64 fails

2021-11-15 Thread Michael Butler via freebsd-current

Haven't had time to identify which change caused this yet but I now get ..

===> lib/libsbuf (obj,all,install)
===> cddl/lib/libumem (obj,all,install)
===> cddl/lib/libnvpair (obj,all,install)
===> cddl/lib/libavl (obj,all,install)
ld: error: /usr/obj/usr/src/i386.i386/tmp/usr/lib/libspl.a(assert.o) is 
incompatible with elf_i386_fbsd

===> cddl/lib/libspl (obj,all,install)
cc: error: linker command failed with exit code 1 (use -v to see invocation)
--- libavl.so.2 ---
*** [libavl.so.2] Error code 1

make[4]: stopped in /usr/src/cddl/lib/libavl
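
One way to narrow down a linker complaint like the one above is to look at
which ELF class and machine type each object inside the static library was
built for. The following is only a minimal sketch in Python, not part of the
FreeBSD build; the function names (elf_kind, ar_members, report) are
hypothetical, and GNU-style long member names (the "//" string table) are not
resolved. It relies only on the standard ar member header layout and the ELF
header fields EI_CLASS, EI_DATA and e_machine.

```python
import struct

AR_MAGIC = b"!<arch>\n"
# A few e_machine values from the ELF specification.
ELF_MACHINES = {3: "i386", 40: "arm", 62: "x86-64", 183: "aarch64"}


def elf_kind(data):
    """Return (bits, machine) for an ELF image, or None if data is not ELF."""
    if len(data) < 20 or data[:4] != b"\x7fELF":
        return None
    bits = {1: 32, 2: 64}.get(data[4])        # EI_CLASS
    endian = "<" if data[5] == 1 else ">"     # EI_DATA
    (machine,) = struct.unpack_from(endian + "H", data, 18)  # e_machine
    return bits, ELF_MACHINES.get(machine, f"unknown({machine})")


def ar_members(blob):
    """Yield (name, data) for each member of a Unix ar archive held in bytes."""
    if blob[:8] != AR_MAGIC:
        raise ValueError("not an ar archive")
    pos = 8
    while pos + 60 <= len(blob):
        header = blob[pos:pos + 60]
        name = header[:16].decode("ascii", "replace").rstrip()
        size = int(header[48:58].decode("ascii").strip())
        data = blob[pos + 60:pos + 60 + size]
        if name.startswith("#1/"):  # BSD long name: stored ahead of the data
            n = int(name[3:])
            name = data[:n].rstrip(b"\0").decode("ascii", "replace")
            data = data[n:]
        yield name.rstrip("/"), data
        pos += 60 + size + (size & 1)  # members are 2-byte aligned


def report(path):
    """Print the ELF class and machine of every ELF member in an archive."""
    with open(path, "rb") as f:
        blob = f.read()
    for name, data in ar_members(blob):
        kind = elf_kind(data)
        if kind:
            print(f"{name}: {kind[0]}-bit {kind[1]}")
```

Calling report() on the libspl.a path from the error message would list each
member; a 64-bit x86-64 assert.o in an i386 staging tree would suggest the
library was built for the host rather than for the target.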

imb



Re: aarch64(?) poudiere-devel based builds seem to get fairly-rare corrupted files after recent system update(s?)

2021-11-15 Thread Mark Millard via freebsd-current



On 2021-Nov-15, at 13:13, Mark Millard  wrote:

> On 2021-Nov-15, at 12:51, Mark Millard  wrote:
> 
>> On 2021-Nov-15, at 11:31, Mark Millard  wrote:
>> 
>>> I updated from (shown a system that I've not updated yet):
>>> 
>>> # uname -apKU
>>> FreeBSD CA72_4c8G_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #18 
>>> main-n250455-890cae197737-dirty: Thu Nov  4 13:43:17 PDT 2021 
>>> root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
>>>   arm64 aarch64 
>>> 1400040 1400040
>>> 
>>> to:
>>> 
>>> # uname -apKU
>>> FreeBSD CA72_16Gp_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #19 
>>> main-n250667-20aa359773be-dirty: Sun Nov 14 02:57:32 PST 2021 
>>> root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
>>>   arm64 aarch64 1400042 1400042
>>> 
>>> and then updated /usr/ports/ and started poudriere-devel based builds of
>>> the ports I've set up to use. However my last round of port builds from
>>> a general update of /usr/ports/ were on 2021-10-23 before either of the
>>> above.
>>> 
>>> I've had at least two files that seem to be corrupted, where a later part
>>> of the build hits problematical file(s) from earlier build activity. For
>>> example:
>>> 
>>> /usr/local/include/X11/extensions/XvMC.h:1:1: warning: null character 
>>> ignored [-Wnull-character]
>>>  
>>> ^
>>> /usr/local/include/X11/extensions/XvMC.h:1:2: warning: null character 
>>> ignored [-Wnull-character]
>>> 
>>>  ^
>>> /usr/local/include/X11/extensions/XvMC.h:1:3: warning: null character 
>>> ignored [-Wnull-character]
>>>  
>>>  ^   
>>> /usr/local/include/X11/extensions/XvMC.h:1:4: warning: null character 
>>> ignored [-Wnull-character]
>>> 
>>>  ^
>>> . . .
>>> 
>>> Removing the xorgproto-2021.4 package and rebuilding via
>>> poudriere-devel did not get a failure of any ports dependent
>>> on it.
>>> 
>>> This was from a use of:
>>> 
>>> # poudriere jail -j13_0R-CA7 -i
>>> Jail name: 13_0R-CA7
>>> Jail version:  13.0-RELEASE-p5
>>> Jail arch: arm.armv7
>>> Jail method:   null
>>> Jail mount:/usr/obj/DESTDIRs/13_0R-CA7-poud
>>> Jail fs:   
>>> Jail updated:  2021-11-04 01:48:49
>>> Jail pkgbase:  disabled
>>> 
>>> but another not-investigated example was from:
>>> 
>>> # poudriere jail -j13_0R-CA72 -i
>>> Jail name: 13_0R-CA72
>>> Jail version:  13.0-RELEASE-p5
>>> Jail arch: arm64.aarch64
>>> Jail method:   null
>>> Jail mount:/usr/obj/DESTDIRs/13_0R-CA72-poud
>>> Jail fs:   
>>> Jail updated:  2021-11-04 01:48:01
>>> Jail pkgbase:  disabled
>>> 
>>> (so no 32-bit COMPAT involved). The apparent corruption
>>> was in a different port (autoconfig, noticed by the
>>> build of automake failing via config reporting
>>> /usr/local/share/autoconf-2.69/autoconf/autoconf.m4f
>>> being rejected).
>>> 
>>> /usr/obj/DESTDIRs/13_0R-CA7-poud/ and
>>> /usr/obj/DESTDIRs/13_0R-CA72-poud/ and the like track the
>>> system versions.
>>> 
>>> The media is an Optane 960 in the PCIe slot of a HoneyComb
>>> (16 Cortex-A72's). The context is a root on ZFS one, ZFS
>>> used in order to have bectl, not redundancy.
>>> 
>>> The ThreadRipper 1950X (so amd64) port builds did not give
>>> evidence of such problems based on the updated system. (Also
>>> Optane media in a PCIe slot, also root on ZFS.) But the
>>> errors seem rare enough to not be able to conclude much.
>> 
>> For aarch64 targeting aarch64 there was also this
>> explicit corruption notice during the poudriere(-devel)
>> bulk build:
>> 
>> . . .
>> [CA72_ZFS] Extracting arm-none-eabi-gcc-8.4.0_3: .
>> pkg-static: Fail to extract /usr/local/libexec/gcc/arm-none-eabi/8.4.0/lto1 
>> from package: Lzma library error: Corrupted input data
>> [CA72_ZFS] Extracting arm-none-eabi-gcc-8.4.0_3... done
>> 
>> Failed to install the following 1 package(s): 
>> /packages/All/arm-none-eabi-gcc-8.4.0_3.pkg
>> *** Error code 1
>> Stop.
>> make: stopped in /usr/ports/sysutils/u-boot-orangepi-plus-2e
>> 
>> I'm not yet to the point of retrying after removing
>> arm-none-eabi-gcc-8.4.0_3 : other things are being built.
> 
> 
> Another context with my prior general update of /usr/ports/
> and the matching port builds: Back then I used USE_TMPFS=all
> but the failure is based on USE_TMPFS="data" instead. So:
> lots more I/O.
> 

None of the 3 corruptions repeated during bulk builds that
retried the builds that generated the files. All of the
ports that failed by hitting the corruptions in what they
depended on, built fine in the retries.

For reference:

I'll note that, back when I was using USE_TMPFS=all , I also
did some separate bulk -a test runs, both aarch64 (Cortex-A72)
native and Cortex-A72 targeting Cortex-A7 (armv7). None of
those showed evidence of file corruptions. In general I've
not had previous file corruptions with this system. (There
was a little more than 245 GiBytes sw

Re: aarch64(?) poudiere-devel based builds seem to get fairly-rare corrupted files after recent system update(s?)

2021-11-15 Thread Mark Millard via freebsd-current
On 2021-Nov-15, at 12:51, Mark Millard  wrote:

> On 2021-Nov-15, at 11:31, Mark Millard  wrote:
> 
>> I updated from (shown a system that I've not updated yet):
>> 
>> # uname -apKU
>> FreeBSD CA72_4c8G_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #18 
>> main-n250455-890cae197737-dirty: Thu Nov  4 13:43:17 PDT 2021 
>> root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
>>   arm64 aarch64 
>> 1400040 1400040
>> 
>> to:
>> 
>> # uname -apKU
>> FreeBSD CA72_16Gp_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #19 
>> main-n250667-20aa359773be-dirty: Sun Nov 14 02:57:32 PST 2021 
>> root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
>>   arm64 aarch64 1400042 1400042
>> 
>> and then updated /usr/ports/ and started poudriere-devel based builds of
>> the ports I've set up to use. However my last round of port builds from
>> a general update of /usr/ports/ were on 2021-10-23 before either of the
>> above.
>> 
>> I've had at least two files that seem to be corrupted, where a later part
>> of the build hits problematical file(s) from earlier build activity. For
>> example:
>> 
>> /usr/local/include/X11/extensions/XvMC.h:1:1: warning: null character 
>> ignored [-Wnull-character]
>>  
>> ^
>> /usr/local/include/X11/extensions/XvMC.h:1:2: warning: null character 
>> ignored [-Wnull-character]
>> 
>>   ^
>> /usr/local/include/X11/extensions/XvMC.h:1:3: warning: null character 
>> ignored [-Wnull-character]
>>  
>>   ^   
>> /usr/local/include/X11/extensions/XvMC.h:1:4: warning: null character 
>> ignored [-Wnull-character]
>> 
>>   ^
>> . . .
>> 
>> Removing the xorgproto-2021.4 package and rebuilding via
>> poudriere-devel did not get a failure of any ports dependent
>> on it.
>> 
>> This was from a use of:
>> 
>> # poudriere jail -j13_0R-CA7 -i
>> Jail name: 13_0R-CA7
>> Jail version:  13.0-RELEASE-p5
>> Jail arch: arm.armv7
>> Jail method:   null
>> Jail mount:/usr/obj/DESTDIRs/13_0R-CA7-poud
>> Jail fs:   
>> Jail updated:  2021-11-04 01:48:49
>> Jail pkgbase:  disabled
>> 
>> but another not-investigated example was from:
>> 
>> # poudriere jail -j13_0R-CA72 -i
>> Jail name: 13_0R-CA72
>> Jail version:  13.0-RELEASE-p5
>> Jail arch: arm64.aarch64
>> Jail method:   null
>> Jail mount:/usr/obj/DESTDIRs/13_0R-CA72-poud
>> Jail fs:   
>> Jail updated:  2021-11-04 01:48:01
>> Jail pkgbase:  disabled
>> 
>> (so no 32-bit COMPAT involved). The apparent corruption
>> was in a different port (autoconfig, noticed by the
>> build of automake failing via config reporting
>> /usr/local/share/autoconf-2.69/autoconf/autoconf.m4f
>> being rejected).
>> 
>> /usr/obj/DESTDIRs/13_0R-CA7-poud/ and
>> /usr/obj/DESTDIRs/13_0R-CA72-poud/ and the like track the
>> system versions.
>> 
>> The media is an Optane 960 in the PCIe slot of a HoneyComb
>> (16 Cortex-A72's). The context is a root on ZFS one, ZFS
>> used in order to have bectl, not redundancy.
>> 
>> The ThreadRipper 1950X (so amd64) port builds did not give
>> evidence of such problems based on the updated system. (Also
>> Optane media in a PCIe slot, also root on ZFS.) But the
>> errors seem rare enough to not be able to conclude much.
> 
> For aarch64 targeting aarch64 there was also this
> explicit corruption notice during the poudriere(-devel)
> bulk build:
> 
> . . .
> [CA72_ZFS] Extracting arm-none-eabi-gcc-8.4.0_3: .
> pkg-static: Fail to extract /usr/local/libexec/gcc/arm-none-eabi/8.4.0/lto1 
> from package: Lzma library error: Corrupted input data
> [CA72_ZFS] Extracting arm-none-eabi-gcc-8.4.0_3... done
> 
> Failed to install the following 1 package(s): 
> /packages/All/arm-none-eabi-gcc-8.4.0_3.pkg
> *** Error code 1
> Stop.
> make: stopped in /usr/ports/sysutils/u-boot-orangepi-plus-2e
> 
> I'm not yet to the point of retrying after removing
> arm-none-eabi-gcc-8.4.0_3 : other things are being built.


Another context with my prior general update of /usr/ports/
and the matching port builds: Back then I used USE_TMPFS=all
but the failure is based on USE_TMPFS="data" instead. So:
lots more I/O.


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)




Re: aarch64(?) poudiere-devel based builds seem to get fairly-rare corrupted files after recent system update(s?)

2021-11-15 Thread Mark Millard via freebsd-current



On 2021-Nov-15, at 11:31, Mark Millard  wrote:

> I updated from (shown a system that I've not updated yet):
> 
> # uname -apKU
> FreeBSD CA72_4c8G_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #18 
> main-n250455-890cae197737-dirty: Thu Nov  4 13:43:17 PDT 2021 
> root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
>   arm64 aarch64 
> 1400040 1400040
> 
> to:
> 
> # uname -apKU
> FreeBSD CA72_16Gp_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #19 
> main-n250667-20aa359773be-dirty: Sun Nov 14 02:57:32 PST 2021 
> root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
>   arm64 aarch64 1400042 1400042
> 
> and then updated /usr/ports/ and started poudriere-devel based builds of
> the ports I've set up to use. However my last round of port builds from
> a general update of /usr/ports/ were on 2021-10-23 before either of the
> above.
> 
> I've had at least two files that seem to be corrupted, where a later part
> of the build hits problematical file(s) from earlier build activity. For
> example:
> 
> /usr/local/include/X11/extensions/XvMC.h:1:1: warning: null character ignored 
> [-Wnull-character]
>  
> ^
> /usr/local/include/X11/extensions/XvMC.h:1:2: warning: null character ignored 
> [-Wnull-character]
> 
>^
> /usr/local/include/X11/extensions/XvMC.h:1:3: warning: null character ignored 
> [-Wnull-character]
>  
>^   
> /usr/local/include/X11/extensions/XvMC.h:1:4: warning: null character ignored 
> [-Wnull-character]
> 
>^
> . . .
> 
> Removing the xorgproto-2021.4 package and rebuilding via
> poudriere-devel did not get a failure of any ports dependent
> on it.
> 
> This was from a use of:
> 
> # poudriere jail -j13_0R-CA7 -i
> Jail name: 13_0R-CA7
> Jail version:  13.0-RELEASE-p5
> Jail arch: arm.armv7
> Jail method:   null
> Jail mount:/usr/obj/DESTDIRs/13_0R-CA7-poud
> Jail fs:   
> Jail updated:  2021-11-04 01:48:49
> Jail pkgbase:  disabled
> 
> but another not-investigated example was from:
> 
> # poudriere jail -j13_0R-CA72 -i
> Jail name: 13_0R-CA72
> Jail version:  13.0-RELEASE-p5
> Jail arch: arm64.aarch64
> Jail method:   null
> Jail mount:/usr/obj/DESTDIRs/13_0R-CA72-poud
> Jail fs:   
> Jail updated:  2021-11-04 01:48:01
> Jail pkgbase:  disabled
> 
> (so no 32-bit COMPAT involved). The apparent corruption
> was in a different port (autoconfig, noticed by the
> build of automake failing via config reporting
> /usr/local/share/autoconf-2.69/autoconf/autoconf.m4f
> being rejected).
> 
> /usr/obj/DESTDIRs/13_0R-CA7-poud/ and
> /usr/obj/DESTDIRs/13_0R-CA72-poud/ and the like track the
> system versions.
> 
> The media is an Optane 960 in the PCIe slot of a HoneyComb
> (16 Cortex-A72's). The context is a root on ZFS one, ZFS
> used in order to have bectl, not redundancy.
> 
> The ThreadRipper 1950X (so amd64) port builds did not give
> evidence of such problems based on the updated system. (Also
> Optane media in a PCIe slot, also root on ZFS.) But the
> errors seem rare enough to not be able to conclude much.

For aarch64 targeting aarch64 there was also this
explicit corruption notice during the poudriere(-devel)
bulk build:

. . .
[CA72_ZFS] Extracting arm-none-eabi-gcc-8.4.0_3: .
pkg-static: Fail to extract /usr/local/libexec/gcc/arm-none-eabi/8.4.0/lto1 
from package: Lzma library error: Corrupted input data
[CA72_ZFS] Extracting arm-none-eabi-gcc-8.4.0_3... done

Failed to install the following 1 package(s): 
/packages/All/arm-none-eabi-gcc-8.4.0_3.pkg
*** Error code 1
Stop.
make: stopped in /usr/ports/sysutils/u-boot-orangepi-plus-2e

I'm not yet to the point of retrying after removing
arm-none-eabi-gcc-8.4.0_3 : other things are being built.
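
A package file can be checked for this kind of damage without installing it.
The following is a minimal sketch in Python, assuming the package is an
xz-compressed stream (which the "Lzma library error" text suggests but does
not prove); the function name xz_stream_ok is hypothetical and this is not
how pkg itself performs the check. Reading the whole stream forces the
decoder to process every block and its integrity check.

```python
import lzma


def xz_stream_ok(path, chunk=1 << 20):
    """Decompress an .xz stream to the end and report whether it was clean.

    Returns (True, "ok") or (False, "<error type>: <message>").
    """
    try:
        # FORMAT_XZ makes the decoder reject anything that is not an xz stream.
        with lzma.open(path, "rb", format=lzma.FORMAT_XZ) as f:
            while f.read(chunk):
                pass  # discard the data; only decoder errors matter here
    except (lzma.LZMAError, EOFError) as exc:
        return False, f"{type(exc).__name__}: {exc}"
    return True, "ok"
```

Running such a check on the same package file several times, or on copies of
it, may help separate corruption that is stored on the media from corruption
that only shows up on some reads.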


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)




aarch64(?) poudiere-devel based builds seem to get fairly-rare corrupted files after recent system update(s?)

2021-11-15 Thread Mark Millard via freebsd-current
I updated from (shown a system that I've not updated yet):

# uname -apKU
FreeBSD CA72_4c8G_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #18 
main-n250455-890cae197737-dirty: Thu Nov  4 13:43:17 PDT 2021 
root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
  arm64 aarch64 
1400040 1400040

to:

# uname -apKU
FreeBSD CA72_16Gp_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #19 
main-n250667-20aa359773be-dirty: Sun Nov 14 02:57:32 PST 2021 
root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
  arm64 aarch64 1400042 1400042

and then updated /usr/ports/ and started poudriere-devel based builds of
the ports I've set up to use. However my last round of port builds from
a general update of /usr/ports/ were on 2021-10-23 before either of the
above.

I've had at least two files that seem to be corrupted, where a later part
of the build hits problematical file(s) from earlier build activity. For
example:

/usr/local/include/X11/extensions/XvMC.h:1:1: warning: null character ignored 
[-Wnull-character]
 
^
/usr/local/include/X11/extensions/XvMC.h:1:2: warning: null character ignored 
[-Wnull-character]

^
/usr/local/include/X11/extensions/XvMC.h:1:3: warning: null character ignored 
[-Wnull-character]
 
^   
/usr/local/include/X11/extensions/XvMC.h:1:4: warning: null character ignored 
[-Wnull-character]

^
. . .

Removing the xorgproto-2021.4 package and rebuilding via
poudriere-devel did not get a failure of any ports dependent
on it.
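
The warnings at 1:1 through 1:4 indicate that the header started with NUL
bytes. A scan for other installed files with the same symptom could look
roughly like the following. This is a minimal sketch in Python under stated
assumptions: the function names (leading_nul_count, scan_tree) and the choice
to look only at ".h" files by default are hypothetical, and only NUL bytes at
the very start of a file are detected.

```python
import os


def leading_nul_count(path, probe=4096):
    """Count NUL bytes at the very start of a file (0 means none)."""
    with open(path, "rb") as f:
        chunk = f.read(probe)
    return len(chunk) - len(chunk.lstrip(b"\0"))


def scan_tree(root, suffixes=(".h",)):
    """Yield (path, count) for files under root that begin with NUL bytes."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not name.endswith(suffixes):
                continue
            path = os.path.join(dirpath, name)
            try:
                count = leading_nul_count(path)
            except OSError:
                continue  # unreadable file or dangling symlink: skip it
            if count:
                yield path, count
```

For example, iterating over scan_tree("/usr/local/include") inside the jail
would list headers that begin with NUL bytes, with the count capped at the
probe size.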

This was from a use of:

# poudriere jail -j13_0R-CA7 -i
Jail name: 13_0R-CA7
Jail version:  13.0-RELEASE-p5
Jail arch: arm.armv7
Jail method:   null
Jail mount: /usr/obj/DESTDIRs/13_0R-CA7-poud
Jail fs:   
Jail updated:  2021-11-04 01:48:49
Jail pkgbase:  disabled

but another not-investigated example was from:

# poudriere jail -j13_0R-CA72 -i
Jail name: 13_0R-CA72
Jail version:  13.0-RELEASE-p5
Jail arch: arm64.aarch64
Jail method:   null
Jail mount: /usr/obj/DESTDIRs/13_0R-CA72-poud
Jail fs:   
Jail updated:  2021-11-04 01:48:01
Jail pkgbase:  disabled

(so no 32-bit COMPAT involved). The apparent corruption
was in a different port (autoconf, noticed by the
build of automake failing via config reporting
/usr/local/share/autoconf-2.69/autoconf/autoconf.m4f
being rejected).

/usr/obj/DESTDIRs/13_0R-CA7-poud/ and
/usr/obj/DESTDIRs/13_0R-CA72-poud/ and the like track the
system versions.

The media is an Optane 960 in the PCIe slot of a HoneyComb
(16 Cortex-A72's). The context is a root on ZFS one, ZFS
used in order to have bectl, not redundancy.

The ThreadRipper 1950X (so amd64) port builds did not give
evidence of such problems based on the updated system. (Also
Optane media in a PCIe slot, also root on ZFS.) But the
errors seem rare enough to not be able to conclude much.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)




FYI: "ELF interpreter /libexec/ld-elf.so.1 not found, error 13" installworld race?

2021-11-14 Thread Mark Millard via freebsd-current
In my update sequence to install FreeBSD with the clang 13
related materials, the first -j32 installworld attempt got:

. . .
--- realinstall_subdir_lib/libc/tests/hash ---
install  -o root  -g wheel -m 444  
/usr/main-src/contrib/netbsd-tests/lib/libc/hash/data/md5test-out 
/usr/tests/lib/libc/hash/data/md5test-out
--- maninstall ---
install  -o root -g wheel -m 444 posix_spawnattr_getschedpolicy.3.gz  
/usr/share/man/man3/
--- realinstall_subdir_lib/libc/tests ---
--- realinstall_subdir_lib/libc/tests/setjmp ---
install   -o root -g wheel -m 555   threadjmp_test 
/usr/tests/lib/libc/setjmp/threadjmp_test
ELF interpreter /libexec/ld-elf.so.1 not found, error 13
ELF interpreter /libexec/ld-elf.so.1 not found, error 13
--- realinstall_subdir_lib/libc/tests/nss ---
--- getrpc_test.install ---
--- realinstall_subdir_lib/libc/tests/gen ---
--- realinstall_subdir_lib/libc/tests/gen/posix_spawn ---
--- h_fileactions.install ---
*** [h_fileactions.install] Signal 6
--- realinstall_subdir_lib/libc/tests/nss ---
--- getproto_test.install ---
install  -o root -g wheel -m 444  getproto_test.debug 
/usr/lib/debug/usr/tests/lib/libc/nss/getproto_test.debug
--- realinstall_subdir_lib/libc/tests/gen ---
--- realinstall_subdir_lib/libc/tests/gen/execve ---
--- _proginstall ---
--- realinstall_subdir_lib/libc/tests/gen/posix_spawn ---
1 error

make[8]: stopped in /usr/main-src/lib/libc/tests/gen/posix_spawn
. . .

The retry did not get the problem.

The context was a ThreadRipper 1950X (so amd64). It was
my own build that was being installed.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)




Re: FYI: aarch64 main [so: 14] system hung up with a large amount of memory in use (given the RAM+SWAP configuration) but lots of swap left

2021-11-13 Thread Mark Millard via freebsd-current



On 2021-Nov-13, at 03:20, Mark Millard  wrote:


> While attempting to see if I could repeat a bugzilla report in a
> somewhat different context, I had the system hang up to the
> point that ^C and ^Z did not work and ^T did not echo out what
> would be expected for poudriere (or even the kernel backtrace).
> I was able to escape to ddb.
> 
> The context was Cortex-A72 based aarch64 system using:
> 
> # poudriere jail -jmain-CA7 -i
> Jail name: main-CA7
> Jail version:  14.0-CURRENT
> Jail arch: arm.armv7
> Jail method:   null
> Jail mount:/usr/obj/DESTDIRs/main-CA7-poud
> Jail fs:   
> Jail updated:  2021-06-27 17:58:33
> Jail pkgbase:  disabled
> 
> # uname -apKU
> FreeBSD CA72_16Gp_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #18 
> main-n250455-890cae197737-dirty: Thu Nov  4 13:43:17 PDT 2021 
> root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
>   arm64 aarch64 1400040 1400040
> 
> It is a non-debug build (but with symbols).
> 
> 16 cortex-A72 cores, 64 GiBytes RAM, root on ZFS, 251904Mi swap,
> USE_TMPFS=all in use. ALLOW_PARALLEL_JOBS= in use too.
> (Mentioned only for context: I've no specific evidence if other
> contexts would also have failed, say, USE_TMPFS="data" or UFS.)
> 
> When I looked around at the db> prompts I noticed one
> oddity (I'm no expert at such inspections):
> 
> db> show allchains
> . . .
> chain 92:
> thread 100671 (pid 15928, make) is blocked on lockmgr 0%A%0EXCL
> thread 100671 (pid 15928, make) is blocked on lockmgr 0%A%0EXCL
> thread 100671 (pid 15928, make) is blocked on lockmgr 0%A%0EXCL
> thread 100671 (pid 15928, make) is blocked on lockmgr 0%A%0EXCL
> thread 100671 (pid 15928, make) is blocked on lockmgr 0%A%0EXCL
> thread 100671 (pid 15928, make) is blocked on lockmgr 0%A%0EXCL
> . . . (thousands more instances of that line content,
>   I never found the last) . . .
> 
> My patched top (that reports some "maximum observed" (MaxObs???)
> figures) was showing (having hung up with the system):
> 
> last pid: 18816;  load averages: 10.11, 16.76, 18.73 MaxObs: 115.65, 103.13, 
> 96.36 
>  up 8+06:52:04  20:30:57
> 324 threads:   17 running, 305 sleeping, 2 waiting, 147 MaxObsRunning
> CPU:  2.8% user,  0.0% nice, 97.1% system,  0.0% interrupt,  0.0% idle
> Mem: 19044Ki Active, 331776B Inact, 73728B Laundry, 6950Mi Wired, 69632B Buf, 
> 558860Ki Free, 47709Mi MaxObsActive, 12556Mi MaxObsWired, 59622Mi 
> MaxObs(Act+Wir+Lndry)
> ARC: 2005Mi Total, 623319Ki MFU, 654020Ki MRU, 2048Ki Anon, 27462Ki Header, 
> 745685Ki Other
> 783741Ki Compressed, 3981Mi Uncompressed, 5.20:1 Ratio
> Swap: 251904Mi Total, 101719Mi Used, 150185Mi Free, 40% Inuse, 3432Ki In, 
> 3064Ki Out, 101719Mi MaxObsUsed, 101737Mi MaxObs(Act+Lndry+SwapUsed), 
> 109816Mi MaxObs(Act+Wir+Lndry+SwapUsed)
> 
> (Based on the 20:30:57 time shown, it had been hung up for over
> 2 hours when I got to it.)
> 
> There were no console messages. /var/log/messages had its
> last message at 18:57:52. No out-of-swap or such
> messages.
> 
> 
> I did get a dump via the db> prompt.
> 

In retrying the poudriere-devel run experiment I'm
getting various builds that are generating
multi-GiByte log files (and growing) that have
lines like:

thread 'rustc' panicked at 'capacity overflow', 
library/alloc/src/raw_vec.rs:559:5
stack backtrace:
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose 
backtrace.

error: internal compiler error: unexpected panic

note: the compiler unexpectedly panicked. this is a bug.

note: we would appreciate a bug report: 
https://github.com/rust-lang/rust/issues/new?labels=C-bug%2C+I-ICE%2C+T-compiler&template=ice.md


note: rustc 1.55.0 running on armv7-unknown-freebsd

note: compiler flags: -C embed-bitcode=no -C debuginfo=2 -C linker=cc 
--crate-type lib

note: some of the compiler flags provided by cargo are hidden

query stack during panic:
#0 [trimmed_def_paths] calculating trimmed def paths
#1 [lint_mod] linting module `transitions`
#2 [analysis] running analysis passes on this crate
end of query stack
thread 'rustc' panicked at 'cannot panic during the backtrace function', 
library/std/src/../../backtrace/src/lib.rs:147:13
stack backtrace:
   0: 0x4710076c - ::fmt::h4428caffcb182c5b
   1: 0x471c9d00 - core::fmt::write::h91f4a7678561fd61
   2: 0x470e2180 - 
   3: 0x470ebd40 - 
   4: 0x470eb824 - 
   5: 0x41ed4848 - 
   6: 0x470ec690 - std::panicking::rust_panic_with_hook::h6bc4b7e83060df25
   7: 0x47100f0c - 
   8: 0x47100900 - 
   9: 0x470ec374 - 
. . .
  65: 0x470ee71c - 
  66: 0x401361bc - 
  67: 0x40135cd8 - pthread_create
  68: 0x40138b9c - pthread_peekjoin_np
  69: 0x40138b9c - pthread_peekjoin_np
  70: 0x40138b9c - pthread_peekjoin_np
  71: 0x40138b9c - pthread_peekjoin_np
  72: 0x40138b9c - pthread_peekjoin_np
  73: 0x40138b9c - pthread_peekjoin_np
. . . mass

FYI: aarch64 main [so: 14] system hung up with a large amount of memory in use (given the RAM+SWAP configuration) but lots of swap left

2021-11-13 Thread Mark Millard via freebsd-current


While attempting to see if I could repeat a bugzilla report in a
somewhat different context, I had the system hang up to the
point that ^C and ^Z did not work and ^T did not echo out what
would be expected for poudriere (or even the kernel backtrace).
I was able to escape to ddb.

The context was Cortex-A72 based aarch64 system using:

# poudriere jail -jmain-CA7 -i
Jail name: main-CA7
Jail version:  14.0-CURRENT
Jail arch: arm.armv7
Jail method:   null
Jail mount:/usr/obj/DESTDIRs/main-CA7-poud
Jail fs:   
Jail updated:  2021-06-27 17:58:33
Jail pkgbase:  disabled

# uname -apKU
FreeBSD CA72_16Gp_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #18 
main-n250455-890cae197737-dirty: Thu Nov  4 13:43:17 PDT 2021 
root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
  arm64 aarch64 1400040 1400040

It is a non-debug build (but with symbols).

16 cortex-A72 cores, 64 GiBytes RAM, root on ZFS, 251904Mi swap,
USE_TMPFS=all in use. ALLOW_PARALLEL_JOBS= in use too.
(Mentioned only for context: I've no specific evidence if other
contexts would also have failed, say, USE_TMPFS="data" or UFS.)

When I looked around at the db> prompts I noticed one
oddity (I'm no expert at such inspections):

db> show allchains
. . .
chain 92:
 thread 100671 (pid 15928, make) is blocked on lockmgr 0%A%0EXCL
 thread 100671 (pid 15928, make) is blocked on lockmgr 0%A%0EXCL
 thread 100671 (pid 15928, make) is blocked on lockmgr 0%A%0EXCL
 thread 100671 (pid 15928, make) is blocked on lockmgr 0%A%0EXCL
 thread 100671 (pid 15928, make) is blocked on lockmgr 0%A%0EXCL
 thread 100671 (pid 15928, make) is blocked on lockmgr 0%A%0EXCL
. . . (thousands more instances of that line content,
   I never found the last) . . .

My patched top (that reports some "maximum observed" (MaxObs???)
figures) was showing (having hung up with the system):

last pid: 18816;  load averages: 10.11, 16.76, 18.73 MaxObs: 115.65, 103.13, 
96.36   
   up 8+06:52:04  20:30:57
324 threads:   17 running, 305 sleeping, 2 waiting, 147 MaxObsRunning
CPU:  2.8% user,  0.0% nice, 97.1% system,  0.0% interrupt,  0.0% idle
Mem: 19044Ki Active, 331776B Inact, 73728B Laundry, 6950Mi Wired, 69632B Buf, 
558860Ki Free, 47709Mi MaxObsActive, 12556Mi MaxObsWired, 59622Mi 
MaxObs(Act+Wir+Lndry)
ARC: 2005Mi Total, 623319Ki MFU, 654020Ki MRU, 2048Ki Anon, 27462Ki Header, 
745685Ki Other
 783741Ki Compressed, 3981Mi Uncompressed, 5.20:1 Ratio
Swap: 251904Mi Total, 101719Mi Used, 150185Mi Free, 40% Inuse, 3432Ki In, 
3064Ki Out, 101719Mi MaxObsUsed, 101737Mi MaxObs(Act+Lndry+SwapUsed), 109816Mi 
MaxObs(Act+Wir+Lndry+SwapUsed)

(Based on the 20:30:57 time shown, it had been hung up for over
2 hours when I got to it.)

There were no console messages. /var/log/messages had its
last message at 18:57:52. No out-of-swap or such
messages.


I did get a dump via the db> prompt.


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)




Re: Build of devel/ninja and lang/gcc11 fails with latest 14-CURRENT amd64

2021-11-12 Thread Evgeniy Khramtsov via freebsd-current
I confirm, the attached patch fixes ports mentioned in my previous mail.



Re: Build of devel/ninja and lang/gcc11 fails with latest 14-CURRENT amd64

2021-11-12 Thread Evgeniy Khramtsov via freebsd-current
Ports graphics/cairo, multimedia/ffmpeg, www/firefox are also affected.



Re: current now panics when starting VBox VM

2021-11-02 Thread Greg V via freebsd-current



On November 2, 2021 5:16:35 PM GMT+03:00, Michael Butler via freebsd-current 
 wrote:
>On current as of this morning (I haven't tried to bisect yet) ..
>
>  .. with either graphics/drm-devel-kmod or graphics/drm-current-kmod, 
>trying to start a VirtualBox VM triggers this panic ..
>

>#16 0x80c81fc8 at calltrap+0x8
>#17 0x808b4d69 at sysctl_kern_proc_pathname+0xc9

something something https://reviews.freebsd.org/D32738 ? 
sysctl_kern_proc_pathname was touched recently there.

(Also can someone commit https://reviews.freebsd.org/D30174 ? These 
warning-filled reports are unreadable >_<)



current now panics when starting VBox VM

2021-11-02 Thread Michael Butler via freebsd-current

On current as of this morning (I haven't tried to bisect yet) ..

FreeBSD toshi.auburn.protected-networks.net 14.0-CURRENT FreeBSD 
14.0-CURRENT #42 main-a670e1c13a: Tue Nov  2 09:29:28 EDT 2021 
r...@toshi.auburn.protected-networks.net:/usr/obj/usr/src/amd64.amd64/sys/TOSHI 
 amd64


 .. with either graphics/drm-devel-kmod or graphics/drm-current-kmod, 
trying to start a VirtualBox VM triggers this panic ..


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x0
fault code  = supervisor read data, page not present
instruction pointer = 0x20:0x80ca5564
stack pointer   = 0x28:0xfe011c036b80
frame pointer   = 0x28:0xfe011c036b80
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 1378 (plasmashell)
trap number = 12
WARNING !drm_modeset_is_locked(&crtc->mutex) failed at 
/usr/ports/graphics/drm-current-kmod/work/drm-kmod-drm_v5.4.144_2/drivers/gpu/drm/drm_atomic_helper.c:621

#0 0x81ec2c63 at linux_dump_stack+0x23
#1 0x81e403b2 at drm_atomic_helper_check_modeset+0xb2
#2 0x81d3870c at intel_atomic_check+0x8c
#3 0x81e3f383 at drm_atomic_check_only+0x423
#4 0x81e3f783 at drm_atomic_commit+0x13
#5 0x81e4c2e8 at drm_client_modeset_commit_atomic+0x148
#6 0x81e4c046 at drm_client_modeset_commit_force+0x66
#7 0x81e8bf1a at drm_fb_helper_restore_fbdev_mode_unlocked+0x7a
#8 0x81e85ef6 at vt_kms_postswitch+0x166
#9 0x807a59f0 at vt_window_switch+0x120
#10 0x807a2b4f at vtterm_cngrab+0x4f
#11 0x80865716 at cngrab+0x16
#12 0x808cbe1c at vpanic+0xec
#13 0x808cbd23 at panic+0x43
#14 0x80ca928c at trap_fatal+0x2dc
#15 0x80ca962f at trap_pfault+0x32f
#16 0x80ca8a3c at trap+0x23c
#17 0x80c81fc8 at calltrap+0x8
WARNING !drm_modeset_is_locked(&crtc->mutex) failed at 
/usr/ports/graphics/drm-current-kmod/work/drm-kmod-drm_v5.4.144_2/drivers/gpu/drm/drm_atomic_helper.c:621

#0 0x81ec2c63 at linux_dump_stack+0x23
#1 0x81e403b2 at drm_atomic_helper_check_modeset+0xb2
#2 0x81d3870c at intel_atomic_check+0x8c
#3 0x81e3f383 at drm_atomic_check_only+0x423
#4 0x81e3f783 at drm_atomic_commit+0x13
#5 0x81e4c2e8 at drm_client_modeset_commit_atomic+0x148
#6 0x81e4c046 at drm_client_modeset_commit_force+0x66
#7 0x81e8bf1a at drm_fb_helper_restore_fbdev_mode_unlocked+0x7a
#8 0x81e85ef6 at vt_kms_postswitch+0x166
#9 0x807a59f0 at vt_window_switch+0x120
#10 0x807a2b4f at vtterm_cngrab+0x4f
#11 0x80865716 at cngrab+0x16
#12 0x808cbe1c at vpanic+0xec
#13 0x808cbd23 at panic+0x43
#14 0x80ca928c at trap_fatal+0x2dc
#15 0x80ca962f at trap_pfault+0x32f
#16 0x80ca8a3c at trap+0x23c
#17 0x80c81fc8 at calltrap+0x8
WARNING !drm_modeset_is_locked(&crtc->mutex) failed at 
/usr/ports/graphics/drm-current-kmod/work/drm-kmod-drm_v5.4.144_2/drivers/gpu/drm/drm_atomic_helper.c:621

#0 0x81ec2c63 at linux_dump_stack+0x23
#1 0x81e403b2 at drm_atomic_helper_check_modeset+0xb2
#2 0x81d3870c at intel_atomic_check+0x8c
#3 0x81e3f383 at drm_atomic_check_only+0x423
#4 0x81e3f783 at drm_atomic_commit+0x13
#5 0x81e4c2e8 at drm_client_modeset_commit_atomic+0x148
#6 0x81e4c046 at drm_client_modeset_commit_force+0x66
#7 0x81e8bf1a at drm_fb_helper_restore_fbdev_mode_unlocked+0x7a
#8 0x81e85ef6 at vt_kms_postswitch+0x166
#9 0x807a59f0 at vt_window_switch+0x120
#10 0x807a2b4f at vtterm_cngrab+0x4f
#11 0x80865716 at cngrab+0x16
#12 0x808cbe1c at vpanic+0xec
#13 0x808cbd23 at panic+0x43
#14 0x80ca928c at trap_fatal+0x2dc
#15 0x80ca962f at trap_pfault+0x32f
#16 0x80ca8a3c at trap+0x23c
#17 0x80c81fc8 at calltrap+0x8
WARNING !drm_modeset_is_locked(&dev->mode_config.connection_mutex) 
failed at 
/usr/ports/graphics/drm-current-kmod/work/drm-kmod-drm_v5.4.144_2/drivers/gpu/drm/drm_atomic_helper.c:666

#0 0x81ec2c63 at linux_dump_stack+0x23
#1 0x81e40542 at drm_atomic_helper_check_modeset+0x242
#2 0x81d3870c at intel_atomic_check+0x8c
#3 0x81e3f383 at drm_atomic_check_only+0x423
#4 0x81e3f783 at drm_atomic_commit+0x13
#5 0x81e4c2e8 at drm_client_modeset_commit_atomic+0x148
#6 0x81e4c046 at drm_client_modeset_commit_force+0x66
#7 0x81e8bf1a at drm_fb_helper_restore_fbdev_mode_unlocked+0x7a
#8 0x81e85ef6 at vt_kms_postswitch+0x166
#9 0x807a59f0 at vt_window_switch+0x120
#10 0x807a2b4f at vtterm_cngrab+0x4f
#11 0x80865716 at cngrab+0x16
#12 0x808cbe1c at vpanic+0xec
#13 0x808cbd23 at pani

Re: git: 2f7f8995367b - main - libdialog: Bump shared library version to 10. [ the .so.10 is listed in mk/OptionalObsoleteFiles.inc ?]

2021-10-27 Thread Mark Millard via freebsd-current



On 2021-Oct-27, at 15:21, Mark Millard  wrote:

> Unfortunately(?) this update added the .so.10 to mk/OptionalObsoleteFiles.inc 
> :
> 
> diff --git a/tools/build/mk/OptionalObsoleteFiles.inc 
> b/tools/build/mk/OptionalObsoleteFiles.inc
> index a8b0329104c4..91822aac492a 100644
> --- a/tools/build/mk/OptionalObsoleteFiles.inc
> +++ b/tools/build/mk/OptionalObsoleteFiles.inc
> @@ -1663,11 +1663,11 @@ OLD_FILES+=usr/bin/dialog
> . . .
> OLD_FILES+=usr/lib/libdialog.so
> -OLD_FILES+=usr/lib/libdialog.so.8
> +OLD_FILES+=usr/lib/libdialog.so.10
> . . .
> 
> Looks to me like that +line should have been:
> 
> +OLD_FILES+=usr/lib/libdialog.so.9
> 
> (presuming the original .so.8 was correct during
> .so.9 's time frame).


Looks like:

+OLD_FILES+=usr/lib/libdpv.so.3

is the same sort of issue and possibly should have been:

+OLD_FILES+=usr/lib/libdpv.so.2


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)




Re: git: 2f7f8995367b - main - libdialog: Bump shared library version to 10. [ the .so.10 is listed in mk/OptionalObsoleteFiles.inc ?]

2021-10-27 Thread Mark Millard via freebsd-current
Unfortunately(?) this update added the .so.10 to mk/OptionalObsoleteFiles.inc :

diff --git a/tools/build/mk/OptionalObsoleteFiles.inc 
b/tools/build/mk/OptionalObsoleteFiles.inc
index a8b0329104c4..91822aac492a 100644
--- a/tools/build/mk/OptionalObsoleteFiles.inc
+++ b/tools/build/mk/OptionalObsoleteFiles.inc
@@ -1663,11 +1663,11 @@ OLD_FILES+=usr/bin/dialog
. . .
 OLD_FILES+=usr/lib/libdialog.so
-OLD_FILES+=usr/lib/libdialog.so.8
+OLD_FILES+=usr/lib/libdialog.so.10
. . .

Looks to me like that +line should have been:

+OLD_FILES+=usr/lib/libdialog.so.9

(presuming the original .so.8 was correct during
.so.9 's time frame).


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)




Re: what does "failed to read progbits" mean?

2021-10-25 Thread Michael Butler via freebsd-current
This seems to have gone away after 
https://freshbsd.org/freebsd/src/commit/70f51f0e474ffe1fb74cb427423a2fba3637544d


Not sure if the bug that commit fixes was the underlying cause,

Michael


On 10/25/21 11:40, Nuno Teixeira wrote:

Same here:
kldxref /boot/kernel
failed to read progbits

But kernel failed to install. I will include log tomorrow, I'm doing a 
clean build with /usr/obj/.. deleted.


Michael Butler via freebsd-current <freebsd-current@freebsd.org> wrote on Thursday, 21/10/2021 at 20:14:


Well this is different .. I did a full rebuild (after "rm -rf
/usr/obj/*") this morning and now see ..

===> linux_common (install)
install -T release -o root -g wheel -m 555   linux_common.ko
/boot/kernel/
install -T dbg -o root -g wheel -m 555   linux_common.ko.debug
/usr/lib/debug/boot/kernel/
===> linuxkpi (install)
install -T release -o root -g wheel -m 555   linuxkpi.ko /boot/kernel/
install -T dbg -o root -g wheel -m 555   linuxkpi.ko.debug
/usr/lib/debug/boot/kernel/
kldxref /boot/kernel
failed to read progbits
failed to read progbits
failed to read progbits
failed to read progbits
failed to read progbits
failed to read progbits
failed to read progbits
kldxref: /boot/kernel/kernel: cannot load DT_RELA section
--
  >>> Installing kernel TOSHI completed on Thu Oct 21 15:05:00 EDT 2021
--

Is something broken or just cosmetic noise?

         imb






main changed DIALOG_STATE, DIALOG_VARS, and DIALOG_COLORS but /usr/lib/libdialog.so.? naming was not adjusted? (crashes in releng/13 programs on main [so: 14] can result)

2021-10-22 Thread Mark Millard via freebsd-current
main [so: 14] commit a96ef450 (2021-02-26 09:16:49 +)
changed DIALOG_STATE, DIALOG_VARS, and DIALOG_COLORS .
These are publicly exposed in (ones that I noticed):

/usr/include/dialog.h:extern DIALOG_STATE dialog_state;
/usr/include/dialog.h:extern DIALOG_VARS dialog_vars;
/usr/include/dialog.h:extern DIALOG_COLORS dlg_color_table[];

and the storage ends up being from the .bss of
the likes of dialog4ports (the example I ran into).

But the .9 in /usr/lib/libdialog.so.9, whose .text references
the storage, was not increased compared to releng/13.0 and
stable/13, which predate the changes, thereby not matching
old programs built under releng/13.0 or stable/13.

Turns out that this explains the crashes I get when I attempt
to use a releng/13 based dialog4ports under main [so: 14]. For
a particular example, see:

https://lists.freebsd.org/archives/freebsd-current/2021-October/000860.html

It shows /usr/main-src/contrib/dialog/dlg_keys.c in
/usr/lib/libdialog.so.9 updating a new field:

286 } else {
287 dialog_state.had_resize = FALSE;
   0x0008002d298e <+62>:movb   $0x0,0x84(%rax)

such that the following happens:

Hardware watchpoint 1: -location __stderrp

Old value = (FILE *) 0x8004d4940
New value = (FILE *) 0x4d4940

where:

(gdb) print &__stderrp
$4 = (FILE **) 0x208568 <__stderrp>

which has that storage in the dialog4ports area:

0x00208360 - 0x00208c50 is .bss

with the older set of fields and size for:

extern DIALOG_STATE dialog_state;

That in turn later leads to a SIGSEGV from the point of
view of a releng/13 based dialog4ports build.
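
The watchpoint values above are consistent with the one-byte store `movb $0x0,0x84(%rax)` landing inside `__stderrp`. The following sketch works through that arithmetic. It assumes a little-endian 64-bit layout (amd64), and the derived `dialog_state` address is an inference from the quoted numbers, not something gdb reported; the helper name is made up for illustration.

```python
import struct

def zero_one_byte(value: int, byte_index: int) -> int:
    """Return the 64-bit little-endian value with one byte set to zero,
    mimicking what a single `movb $0x0` store does to a pointer in memory."""
    raw = bytearray(struct.pack("<Q", value))
    raw[byte_index] = 0
    return struct.unpack("<Q", bytes(raw))[0]

OLD_STDERRP = 0x8004D4940   # watchpoint "Old value"
NEW_STDERRP = 0x4D4940      # watchpoint "New value"
STDERRP_ADDR = 0x208568     # &__stderrp as printed by gdb
FIELD_OFFSET = 0x84         # displacement in movb $0x0,0x84(%rax)
BSS_START, BSS_END = 0x208360, 0x208C50  # dialog4ports .bss range

# Which single zeroed byte turns the old pointer into the new one?
hit = [i for i in range(8) if zero_one_byte(OLD_STDERRP, i) == NEW_STDERRP]

# Address of the store, and the dialog_state base that would imply
# (an inference under the assumptions stated above).
written_addr = STDERRP_ADDR + hit[0]
inferred_state_addr = written_addr - FIELD_OFFSET

print(hit, hex(written_addr), hex(inferred_state_addr))
```

Only byte 4 reproduces the change, which puts the store at 0x20856c and, under these assumptions, the start of `dialog_state` at 0x2084e8, inside the quoted .bss range.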

Should main [14] instead have:

/usr/lib/libdialog.so.10

in order to avoid some releng/13.0 and stable/13 programs
trashing their memory? I'm guessing there is no reasonable
way to "compat" this. But preventing programs from trashing
their own memory and running in a corrupted state seems
achievable if the /usr/lib/libdialog.so.? name changes.

This might be something for a freebsd-arch discussion for
relevant folks.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)




Re: Is dialog4ports built in/for releng/13.0 also supposed to work under main [so: 14]? It gets SIGSEGV in my context. (some low level failure info now)

2021-10-21 Thread Mark Millard via freebsd-current
On 2021-Oct-21, at 16:24, Mark Millard  wrote:

> On 2021-Oct-21, at 11:53, Mark Millard  wrote:
> 
>> On 2021-Oct-21, at 08:27, Tomoaki AOKI  wrote:
>> 
>>> On Thu, 21 Oct 2021 07:40:36 -0700
>>> Mark Millard via freebsd-current  wrote:
>>> 
>>>> 
>>>> 
>>>> On 2021-Oct-21, at 06:14, Gary Jennejohn  wrote:
>>>> 
>>>>> On Thu, 21 Oct 2021 01:34:47 -0700
>>>>> Mark Millard via freebsd-current  wrote:
>>>>> 
>>>>>> I get the following crash (amd64 example shown), as reported
>>>>>> via gdb afterwards. (devel/llvm13 is just an example context.)
>>>>>> 
>>>>>> gdb `which dialog4ports` devel/llvm13/dialog4ports.core
>>>>>> . . .
>>>>>> Core was generated by `/usr/local/bin/dialog4ports'.
>>>>>> Program terminated with signal SIGSEGV, Segmentation fault.
>>>>>> Address not mapped to object.
>>>>>> #0  vfprintf_l (fp=0x4d4940, locale=0x8004d4128 
>>>>>> <__xlocale_global_locale>, fmt0=0x201f64 "\"%s\"", 
>>>>>> ap=ap@entry=0x7fffcf00) at 
>>>>>> /usr/main-src/lib/libc/stdio/vfprintf.c:281
>>>>>> 281  if ((fp->_flags & (__SNBF|__SWR|__SRW)) == 
>>>>>> (__SNBF|__SWR) &&
>>>>>> (gdb) bt
>>>>>> #0  vfprintf_l (fp=0x4d4940, locale=0x8004d4128 
>>>>>> <__xlocale_global_locale>, fmt0=0x201f64 "\"%s\"", 
>>>>>> ap=ap@entry=0x7fffcf00) at 
>>>>>> /usr/main-src/lib/libc/stdio/vfprintf.c:281
>>>>>> #1  0x000800409283 in fprintf (fp=0x800411660 
>>>>>> <__stdio_cancel_cleanup>, fmt=0x7fffcdd0 "0\317\377\377\377\177") at 
>>>>>> /usr/main-src/lib/libc/stdio/fprintf.c:57
>>>>>> #2  0x0020399d in main (argc=<optimized out>, argv=<optimized out>) at dialog4ports.c:332
>>>>>> (gdb) quit
>>>>>> 
>>>>>> The crash happens after selecting OK but not after selecting Cancel. The
>>>>>> display is also odd before that (no line drawing, just odd text instead),
>>>>>> but is sufficient to be usable at that stage.
>>>>>> 
>>>>>> . . .
>>>> 
> 
> gdb's disass/s reports the failure point via:
> 
> . . .
> /usr/main-src/lib/libc/stdio/vfprintf.c:
> 279   FLOCKFILE_CANCELSAFE(fp);
>   0x000800412357 <+71>:    mov    0xbf082(%rip),%rax        # 0x8004d13e0
>   0x00080041235e <+78>:    cmpl   $0x0,(%rax)
>   0x000800412361 <+81>:    je     0x800412370
>   0x000800412363 <+83>:    mov    %rbx,%rdi
>   0x000800412366 <+86>:    call   0x8004c6730 <_flockfile@plt>
>   0x00080041236b <+91>:    mov    %rbx,%rsi
>   0x00080041236e <+94>:    jmp    0x800412372
>   0x000800412370 <+96>:    xor    %esi,%esi
>   0x000800412372 <+98>:    lea    -0xd19(%rip),%rdi        # 0x800411660 <__stdio_cancel_cleanup>
>   0x000800412379 <+105>:   lea    -0x70(%rbp),%rdx
>   0x00080041237d <+109>:   call   0x800384a90 <__pthread_cleanup_push_imp_int>
> 
> 280   /* optimise fprintf(stderr) (and other unbuffered Unix files) */
> 281   if ((fp->_flags & (__SNBF|__SWR|__SRW)) == (__SNBF|__SWR) &&
> => 0x000800412382 <+114>:  movzwl 0x10(%rbx),%eax
>   0x000800412386 <+118>:   and    $0x1a,%eax
>   0x000800412389 <+121>:   cmp    $0xa,%ax
>   0x00080041238d <+125>:   jne    0x8004123a9
> 
> 282   fp->_file >= 0)
>   0x00080041238f <+127>:   cmpw   $0x0,0x12(%rbx)
> 
> 281   if ((fp->_flags & (__SNBF|__SWR|__SRW)) == (__SNBF|__SWR) &&
>   0x000800412394 <+132>:   js     0x8004123a9
> . . .
> 
> (gdb) info reg
> rax0x0 0
> rbx0x4d49405065024
> rcx0x7fffd0e0  140737488343264
> rdx0x7fffcfb0  140737488342960
> rsi0x0 0
> rdi0x800411660 34364003936
> rbp0x7fffd020  0x7fffd020
> rsp0x7fffcfb0  0x7fffcfb0
> r8 0x0 0
> r9 0x0 0
> r100x800a330f0 34370433264
> r110x206   518
> r120x800

Re: Is dialog4ports built in/for releng/13.0 also supposed to work under main [so: 14]? It gets SIGSEGV in my context.

2021-10-21 Thread Mark Millard via freebsd-current
On 2021-Oct-21, at 11:53, Mark Millard  wrote:

> On 2021-Oct-21, at 08:27, Tomoaki AOKI  wrote:
> 
>> On Thu, 21 Oct 2021 07:40:36 -0700
>> Mark Millard via freebsd-current  wrote:
>> 
>>> 
>>> 
>>> On 2021-Oct-21, at 06:14, Gary Jennejohn  wrote:
>>> 
>>>> On Thu, 21 Oct 2021 01:34:47 -0700
>>>> Mark Millard via freebsd-current  wrote:
>>>> 
>>>>> I get the following crash (amd64 example shown), as reported
>>>>> via gdb afterwards. (devel/llvm13 is just an example context.)
>>>>> 
>>>>> gdb `which dialog4ports` devel/llvm13/dialog4ports.core
>>>>> . . .
>>>>> Core was generated by `/usr/local/bin/dialog4ports'.
>>>>> Program terminated with signal SIGSEGV, Segmentation fault.
>>>>> Address not mapped to object.
>>>>> #0  vfprintf_l (fp=0x4d4940, locale=0x8004d4128 
>>>>> <__xlocale_global_locale>, fmt0=0x201f64 "\"%s\"", 
>>>>> ap=ap@entry=0x7fffcf00) at /usr/main-src/lib/libc/stdio/vfprintf.c:281
>>>>> 281   if ((fp->_flags & (__SNBF|__SWR|__SRW)) == 
>>>>> (__SNBF|__SWR) &&
>>>>> (gdb) bt
>>>>> #0  vfprintf_l (fp=0x4d4940, locale=0x8004d4128 
>>>>> <__xlocale_global_locale>, fmt0=0x201f64 "\"%s\"", 
>>>>> ap=ap@entry=0x7fffcf00) at /usr/main-src/lib/libc/stdio/vfprintf.c:281
>>>>> #1  0x000800409283 in fprintf (fp=0x800411660 
>>>>> <__stdio_cancel_cleanup>, fmt=0x7fffcdd0 "0\317\377\377\377\177") at 
>>>>> /usr/main-src/lib/libc/stdio/fprintf.c:57
>>>>> #2  0x0020399d in main (argc=<optimized out>, argv=<optimized out>) at dialog4ports.c:332
>>>>> (gdb) quit
>>>>> 
>>>>> The crash happens after selecting OK but not after selecting Cancel. The
>>>>> display is also odd before that (no line drawing, just odd text instead),
>>>>> but is sufficient to be usable at that stage.
>>>>> 
>>>>> . . .
>>> 

gdb's disass/s reports the failure point via:

. . .
/usr/main-src/lib/libc/stdio/vfprintf.c:
279 FLOCKFILE_CANCELSAFE(fp);
   0x000800412357 <+71>:    mov    0xbf082(%rip),%rax        # 0x8004d13e0
   0x00080041235e <+78>:    cmpl   $0x0,(%rax)
   0x000800412361 <+81>:    je     0x800412370
   0x000800412363 <+83>:    mov    %rbx,%rdi
   0x000800412366 <+86>:    call   0x8004c6730 <_flockfile@plt>
   0x00080041236b <+91>:    mov    %rbx,%rsi
   0x00080041236e <+94>:    jmp    0x800412372
   0x000800412370 <+96>:    xor    %esi,%esi
   0x000800412372 <+98>:    lea    -0xd19(%rip),%rdi        # 0x800411660 <__stdio_cancel_cleanup>
   0x000800412379 <+105>:   lea    -0x70(%rbp),%rdx
   0x00080041237d <+109>:   call   0x800384a90 <__pthread_cleanup_push_imp_int>

280 /* optimise fprintf(stderr) (and other unbuffered Unix files) */
281 if ((fp->_flags & (__SNBF|__SWR|__SRW)) == (__SNBF|__SWR) &&
=> 0x000800412382 <+114>:   movzwl 0x10(%rbx),%eax
   0x000800412386 <+118>:   and    $0x1a,%eax
   0x000800412389 <+121>:   cmp    $0xa,%ax
   0x00080041238d <+125>:   jne    0x8004123a9

282 fp->_file >= 0)
   0x00080041238f <+127>:   cmpw   $0x0,0x12(%rbx)

281 if ((fp->_flags & (__SNBF|__SWR|__SRW)) == (__SNBF|__SWR) &&
   0x000800412394 <+132>:   js     0x8004123a9
. . .

(gdb) info reg
rax            0x0                 0
rbx            0x4d4940            5065024
rcx            0x7fffd0e0          140737488343264
rdx            0x7fffcfb0          140737488342960
rsi            0x0                 0
rdi            0x800411660         34364003936
rbp            0x7fffd020          0x7fffd020
rsp            0x7fffcfb0          0x7fffcfb0
r8             0x0                 0
r9             0x0                 0
r10            0x800a330f0         34370433264
r11            0x206               518
r12            0x8004d4128         34364801320
r13            0x2083a0            2130848
r14            0x7fffd0e0          140737488343264
r15            0x201f64            2105188
rip            0x800412382         0x800412382
eflags         0x10246             [ PF ZF IF RF ]
cs             0x43                67
ss             0x3b                59
ds 
es 
fs 
gs 
fs_base
gs_base

where:

(gdb) disass/s __pthread_cleanup_push_imp_int
Dump of assembler code for function __pthread_cleanup_push_imp_int:
/usr/main-src/lib/libc/gen/_pthread_stubs.c:
289 STUB_FUNC3(__pthread_cleanup_push_imp, PJT_CLEANUP_PUSH_IMP, void, void *,
   0x000800384a90 <+0>:     push   %rbp
   0x000800384a91 <+1>:     mov    %rsp,%rbp
   0x000800384a94 <+4>:     mov    0x14c94d(%rip),%rax        # 0x8004d13e8
   0x000800384a9b <+11>:    mov    0x3c8(%rax),%rax
   0x000800384aa2 <+18>:    pop    %rbp
   0x000800384aa3 <+19>:    jmp    *%rax
End of assembler dump.


It is not obvious that any of this has any relationship with
libtinfow.so.9 or libncursesw.so.9 use unless some memory is
being trashed first.
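
For reference, the faulting instruction `movzwl 0x10(%rbx),%eax` is the load of `fp->_flags` for the test on line 281, with `%rbx` holding the `fp` value 0x4d4940. A small sketch of that test and of the address being read follows. The flag constants are assumed to be the values from FreeBSD's `<stdio.h>` (they agree with the 0x1a mask and the 0xa comparison in the disassembly), and the helper name is invented here.

```python
# Flag bits assumed from FreeBSD's <stdio.h>; consistent with the
# `and $0x1a` / `cmp $0xa` pair in the disassembly above.
SNBF = 0x0002  # unbuffered
SWR = 0x0008   # OK to write
SRW = 0x0010   # open for reading and writing

FLAGS_OFFSET = 0x10  # displacement used by movzwl 0x10(%rbx),%eax

def is_unbuffered_write_only(flags: int) -> bool:
    """Mirror of (fp->_flags & (__SNBF|__SWR|__SRW)) == (__SNBF|__SWR)."""
    return (flags & (SNBF | SWR | SRW)) == (SNBF | SWR)

fp_value = 0x4D4940                        # rbx at the time of the fault
faulting_read = fp_value + FLAGS_OFFSET    # address the load touches

print(hex(SNBF | SWR | SRW), hex(SNBF | SWR), hex(faulting_read))
```

The load therefore targets 0x4d4950, which fits the "Address not mapped to object" report for an `fp` of 0x4d4940.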

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)




what does "failed to read progbits" mean?

2021-10-21 Thread Michael Butler via freebsd-current
Well this is different .. I did a full rebuild (after "rm -rf 
/usr/obj/*") this morning and now see ..


===> linux_common (install)
install -T release -o root -g wheel -m 555   linux_common.ko /boot/kernel/
install -T dbg -o root -g wheel -m 555   linux_common.ko.debug 
/usr/lib/debug/boot/kernel/

===> linuxkpi (install)
install -T release -o root -g wheel -m 555   linuxkpi.ko /boot/kernel/
install -T dbg -o root -g wheel -m 555   linuxkpi.ko.debug 
/usr/lib/debug/boot/kernel/

kldxref /boot/kernel
failed to read progbits
failed to read progbits
failed to read progbits
failed to read progbits
failed to read progbits
failed to read progbits
failed to read progbits
kldxref: /boot/kernel/kernel: cannot load DT_RELA section
--
>>> Installing kernel TOSHI completed on Thu Oct 21 15:05:00 EDT 2021
--

Is something broken or just cosmetic noise?

imb



Re: Is dialog4ports built in/for releng/13.0 also supposed to work under main [so: 14]? It gets SIGSEGV in my context.

2021-10-21 Thread Mark Millard via freebsd-current



On 2021-Oct-21, at 08:27, Tomoaki AOKI  wrote:

> On Thu, 21 Oct 2021 07:40:36 -0700
> Mark Millard via freebsd-current  wrote:
> 
>> 
>> 
>> On 2021-Oct-21, at 06:14, Gary Jennejohn  wrote:
>> 
>>> On Thu, 21 Oct 2021 01:34:47 -0700
>>> Mark Millard via freebsd-current  wrote:
>>> 
>>>> I get the following crash (amd64 example shown), as reported
>>>> via gdb afterwards. (devel/llvm13 is just an example context.)
>>>> 
>>>> gdb `which dialog4ports` devel/llvm13/dialog4ports.core
>>>> . . .
>>>> Core was generated by `/usr/local/bin/dialog4ports'.
>>>> Program terminated with signal SIGSEGV, Segmentation fault.
>>>> Address not mapped to object.
>>>> #0  vfprintf_l (fp=0x4d4940, locale=0x8004d4128 <__xlocale_global_locale>, 
>>>> fmt0=0x201f64 "\"%s\"", ap=ap@entry=0x7fffcf00) at 
>>>> /usr/main-src/lib/libc/stdio/vfprintf.c:281
>>>> 281if ((fp->_flags & (__SNBF|__SWR|__SRW)) == 
>>>> (__SNBF|__SWR) &&
>>>> (gdb) bt
>>>> #0  vfprintf_l (fp=0x4d4940, locale=0x8004d4128 <__xlocale_global_locale>, 
>>>> fmt0=0x201f64 "\"%s\"", ap=ap@entry=0x7fffcf00) at 
>>>> /usr/main-src/lib/libc/stdio/vfprintf.c:281
>>>> #1  0x000800409283 in fprintf (fp=0x800411660 
>>>> <__stdio_cancel_cleanup>, fmt=0x7fffcdd0 "0\317\377\377\377\177") at 
>>>> /usr/main-src/lib/libc/stdio/fprintf.c:57
>>>> #2  0x0020399d in main (argc=<optimized out>, argv=<optimized out>) at dialog4ports.c:332
>>>> (gdb) quit
>>>> 
>>>> The crash happens after selecting OK but not after selecting Cancel. The
>>>> display is also odd before that (no line drawing, just odd text instead),
>>>> but is sufficient to be usable at that stage.
>>>> 
>>> 
>>> This is an indication that something is missing in dialog4ports which
>>> is required by FBSD-14 but not FBSD-13.  I had a similar problem with
>>> dialog4ports under FBSD-14 some weeks ago, because i had a really old
>>> version installed.  After upgrading it using the pkg repositories for
>>> FBSD-14 all problems, in particular garbled text, disappeared.
>>> 
>>> IIRC there were updates to ncurses in FBSD-14 fairly recently which
>>> would explain the problem with old versions of dialog4ports.
>> 
>> I do (and did) my own port builds with poudriere-devel. See the
>> version of ports below. In summary: my dialog4ports is 
>> based on 4116dc2f of ports (CommitDate: 2021-10-17 21:52:37 +).
>> 
>> However it was deliberately built in/for a releng/13.0 based
>> context then also used under main [so:14].
>> 
>> For ports not requiring kernel vintage matching, newer systems
>> versions generally allow running software built for older FreeBSD
>> systems (going back a fair distance, anyway). dialog4ports does
>> not appear to require kernel vintage matching. I do not install
>> any ports requiring kernel vintage matching.
> 
> IIRC, the dialog4ports case wouldn't be kernel-related.
> For ncurses libraries, main (aka 14-current) fully switched to *w ones
> and deleted non-*w ones. And dialog4ports built with 13 and earlier
> crashed on 14.

So I did a chroot into a bectl mount of my stable/13 13S-amd64-nodbg and
looked:

# ldd `which dialog4ports`
/usr/local/bin/dialog4ports:
libncursesw.so.9 => /lib/libncursesw.so.9 (0x800248000)
libm.so.5 => /lib/libm.so.5 (0x8002bc000)
libdialog.so.9 => /usr/lib/libdialog.so.9 (0x8002f3000)
libc.so.7 => /lib/libc.so.7 (0x80032d000)
# ldd /usr/lib/libdialog.so.9
/usr/lib/libdialog.so.9:
libncursesw.so.9 => /lib/libncursesw.so.9 (0x8006a7000)
libm.so.5 => /lib/libm.so.5 (0x80071b000)
libc.so.7 => /lib/libc.so.7 (0x800261000)

This context worked fine for OK selection but note that there
is libncursesw.so.9 use (so: *w in use). The problem is not

libncursesw.so vs. libncurses.so

use. Instead it seems to be the split between:

libncursesw.so.9
and:
libtinfow.so.9

in main [so: 14] that looks to be the difference that matters.
Somehow the binding to libtinfow.so.9 in main is insufficient to
allow full use of the releng/13.0 based dialog4ports build.
(For all I know, this might be expected.)

For reference for the stable/13 test:

# uname -apKU
FreeBSD amd64_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #3 
main-n249978-032448cd2c52-dirty: Fri Oct  8 23:57:23 PDT 2021 
root@amd64_ZFS:/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-

Re: Is dialog4ports built in/for releng/13.0 also supposed to work under main [so: 14]? It gets SIGSEGV in my context.

2021-10-21 Thread Mark Millard via freebsd-current



On 2021-Oct-21, at 06:14, Gary Jennejohn  wrote:

> On Thu, 21 Oct 2021 01:34:47 -0700
> Mark Millard via freebsd-current  wrote:
> 
>> I get the following crash (amd64 example shown), as reported
>> via gdb afterwards. (devel/llvm13 is just an example context.)
>> 
>> gdb `which dialog4ports` devel/llvm13/dialog4ports.core
>> . . .
>> Core was generated by `/usr/local/bin/dialog4ports'.
>> Program terminated with signal SIGSEGV, Segmentation fault.
>> Address not mapped to object.
>> #0  vfprintf_l (fp=0x4d4940, locale=0x8004d4128 <__xlocale_global_locale>, 
>> fmt0=0x201f64 "\"%s\"", ap=ap@entry=0x7fffcf00) at 
>> /usr/main-src/lib/libc/stdio/vfprintf.c:281
>> 281  if ((fp->_flags & (__SNBF|__SWR|__SRW)) == (__SNBF|__SWR) &&
>> (gdb) bt
>> #0  vfprintf_l (fp=0x4d4940, locale=0x8004d4128 <__xlocale_global_locale>, 
>> fmt0=0x201f64 "\"%s\"", ap=ap@entry=0x7fffcf00) at 
>> /usr/main-src/lib/libc/stdio/vfprintf.c:281
>> #1  0x000800409283 in fprintf (fp=0x800411660 <__stdio_cancel_cleanup>, 
>> fmt=0x7fffcdd0 "0\317\377\377\377\177") at 
>> /usr/main-src/lib/libc/stdio/fprintf.c:57
>> #2  0x0020399d in main (argc=, argv=) 
>> at dialog4ports.c:332
>> (gdb) quit
>> 
>> The crash happens after selecting OK but not after selecting Cancel. The
>> display is also odd before that (no line drawing, just odd text instead),
>> but is sufficient to be usable at that stage.
>> 
> 
> This is an indication that something is missing in dialog4ports which
> is required by FBSD-14 but not FBSD-13.  I had a similar problem with
> dialog4ports under FBSD-14 some weeks ago, because i had a really old
> version installed.  After upgrading it using the pkg repositories for
> FBSD-14 all problems, in particular garbled text, disappeared.
> 
> IIRC there were updates to ncurses in FBSD-14 fairly recently which
> would explain the problem with old versions of dialog4ports.

I do (and did) my own port builds with poudriere-devel. See the
version of ports below. In summary: my dialog4ports is 
based on 4116dc2f of ports (CommitDate: 2021-10-17 21:52:37 +).

However it was deliberately built in/for a releng/13.0 based
context then also used under main [so:14].

For ports not requiring kernel vintage matching, newer system
versions generally allow running software built for older FreeBSD
systems (going back a fair distance, anyway). dialog4ports does
not appear to require kernel vintage matching. I do not install
any ports requiring kernel vintage matching.

>> I've not had any other of the ports that I built in/for releng/13.0
>> (and have used) fail to operate under main [so: under 14]. (But the
>> variety used is not wide.)
>> 
>> For reference . . . 
>> 
>> # uname -apKU
>> FreeBSD amd64_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #3 
>> main-n249978-032448cd2c52-dirty: Fri Oct  8 23:57:23 PDT 2021 
>> root@amd64_ZFS:/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-NODBG
>>   amd64 amd64 1400036 1400036
>> 
>> (Not a debug build but has debug symbols enabled.)
>> 
>> # pwd
>> /usr/ports
>> # ~/fbsd-based-on-what-commit.sh 
>> branch: main
>> merge-base: 4116dc2f1f6385b42fb668badb6b4c1cbb195f9d
>> merge-base: CommitDate: 2021-10-17 21:52:37 +
>> 4116dc2f1f63 (HEAD -> main, freebsd/main, freebsd/HEAD) 
>> ports-mgmt/poudriere-devel: Update to 3.3.0-1022-g964cf327f
>> n562472 (--first-parent --count for merge-base)

The above indicates the vintage of ports that my dialog4ports
build is based on (in detail): Not all that old at this point.

>> # file `which dialog4ports`
>> /usr/local/bin/dialog4ports: ELF 64-bit LSB executable, x86-64, version 1 
>> (FreeBSD), dynamically linked, interpreter /libexec/ld-elf.so.1, for FreeBSD 
>> 13.0 (1300139), FreeBSD-style, with debug_info, not stripped
>> 
>> # ldd `which dialog4ports`
>> /usr/local/bin/dialog4ports:
>>  libncursesw.so.9 => /lib/libncursesw.so.9 (0x800248000)
>>  libm.so.5 => /lib/libm.so.5 (0x800281000)
>>  libdialog.so.9 => /usr/lib/libdialog.so.9 (0x8002b8000)
>>  libc.so.7 => /lib/libc.so.7 (0x8002f6000)
>>  libtinfow.so.9 => /lib/libtinfow.so.9 (0x800703000)
>> 
>> Note: The dialog4ports is a non-debug build but with debug symbols,
>> as is normal for my port builds via poudriere-devel .
>> 
>> As for the poudriere-devel build context for the ports:
>> 
>> # chroot /usr/obj/DESTDIRs/13_0R-amd64-poud/
>> # uname -apKU
>> Free

Is dialog4ports built in/for releng/13.0 also supposed to work under main [so: 14]? It gets SIGSEGV in my context.

2021-10-21 Thread Mark Millard via freebsd-current
I get the following crash (amd64 example shown), as reported
via gdb afterwards. (devel/llvm13 is just an example context.)

gdb `which dialog4ports` devel/llvm13/dialog4ports.core
. . .
Core was generated by `/usr/local/bin/dialog4ports'.
Program terminated with signal SIGSEGV, Segmentation fault.
Address not mapped to object.
#0  vfprintf_l (fp=0x4d4940, locale=0x8004d4128 <__xlocale_global_locale>, 
fmt0=0x201f64 "\"%s\"", ap=ap@entry=0x7fffcf00) at 
/usr/main-src/lib/libc/stdio/vfprintf.c:281
281 if ((fp->_flags & (__SNBF|__SWR|__SRW)) == (__SNBF|__SWR) &&
(gdb) bt
#0  vfprintf_l (fp=0x4d4940, locale=0x8004d4128 <__xlocale_global_locale>, 
fmt0=0x201f64 "\"%s\"", ap=ap@entry=0x7fffcf00) at 
/usr/main-src/lib/libc/stdio/vfprintf.c:281
#1  0x000800409283 in fprintf (fp=0x800411660 <__stdio_cancel_cleanup>, 
fmt=0x7fffcdd0 "0\317\377\377\377\177") at 
/usr/main-src/lib/libc/stdio/fprintf.c:57
#2  0x0020399d in main (argc=, argv=) at 
dialog4ports.c:332
(gdb) quit

The crash happens after selecting OK but not after selecting Cancel. The
display is also odd before that (no line drawing, just odd text instead),
but is sufficient to be usable at that stage.

I've not had any other of the ports that I built in/for releng/13.0
(and have used) fail to operate under main [so: under 14]. (But the
variety used is not wide.)

For reference . . . 

# uname -apKU
FreeBSD amd64_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #3 
main-n249978-032448cd2c52-dirty: Fri Oct  8 23:57:23 PDT 2021 
root@amd64_ZFS:/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-NODBG
  amd64 amd64 1400036 1400036

(Not a debug build but has debug symbols enabled.)

# pwd
/usr/ports
# ~/fbsd-based-on-what-commit.sh 
branch: main
merge-base: 4116dc2f1f6385b42fb668badb6b4c1cbb195f9d
merge-base: CommitDate: 2021-10-17 21:52:37 +
4116dc2f1f63 (HEAD -> main, freebsd/main, freebsd/HEAD) 
ports-mgmt/poudriere-devel: Update to 3.3.0-1022-g964cf327f
n562472 (--first-parent --count for merge-base)

# file `which dialog4ports`
/usr/local/bin/dialog4ports: ELF 64-bit LSB executable, x86-64, version 1 
(FreeBSD), dynamically linked, interpreter /libexec/ld-elf.so.1, for FreeBSD 
13.0 (1300139), FreeBSD-style, with debug_info, not stripped

# ldd `which dialog4ports`
/usr/local/bin/dialog4ports:
libncursesw.so.9 => /lib/libncursesw.so.9 (0x800248000)
libm.so.5 => /lib/libm.so.5 (0x800281000)
libdialog.so.9 => /usr/lib/libdialog.so.9 (0x8002b8000)
libc.so.7 => /lib/libc.so.7 (0x8002f6000)
libtinfow.so.9 => /lib/libtinfow.so.9 (0x800703000)

Note: The dialog4ports is a non-debug build but with debug symbols,
as is normal for my port builds via poudriere-devel .

As for the poudriere-devel build context for the ports:

# chroot /usr/obj/DESTDIRs/13_0R-amd64-poud/
# uname -apKU
FreeBSD amd64_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #3 
main-n249978-032448cd2c52-dirty: Fri Oct  8 23:57:23 PDT 2021 
root@amd64_ZFS:/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-NODBG
  amd64 amd64 1400036 1300139

# cd /usr/13_0R-src/
# ~/fbsd-based-on-what-commit.sh 
branch: releng/13.0
merge-base: 940681634ee17d12225ecd722c07fef1a0bde813
merge-base: CommitDate: 2021-08-24 18:23:29 +
940681634ee1 (HEAD -> releng/13.0, freebsd/releng/13.0) Add UPDATING entries 
and bump version.
n244760 (--first-parent --count for merge-base)



===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)




Re: [HEADSUP] making /bin/sh the default shell for root

2021-10-12 Thread Guido Falsi via freebsd-current

On 12/10/21 14:21, Gary Jennejohn wrote:

On Tue, 12 Oct 2021 06:59:00 -0400
grarpamp  wrote:


No. The system shell is supposed to make the system usable
by the users. Actually, the real problem is that the easiest way
to shoot one's own foot is by changing the language (say, the
shell) spoken by default by FreeBSD.


Well, the FreeBSD system speaks sh for its own use; this is clearly
documented as the shell called by init(8), and later by rc(8), so
it should probably be the root:0 entry, at least for consistency.
No other shell is called by the FreeBSD system there.
Whatever the users want for their own shells is really up
to them to decide after that.

"Default" is a bit of a low-context word, as there is no falling
back to some shell occurring, no filling in for some missing
option, etc. Maybe use the word "shipped" or "root" instead.

Everyone said they already do, and will continue to,
exec whatever shell they like, whether after login,
or by changing the entry. So in addition to the user
being ultimately responsible for their own box and usage,
this well announced entry for UPDATING cannot therein
really be responsible for any user self-shooting.


This is nonsense.


Well, FreeBSD does not add every shell in base,
does not add every app to base, etc.
Some reasons for those limits should be obvious.
This update gives further distilling clarity by
limiting the number of shipped uid 0 entries to 1,
with that 1 being sh.


Every Unix user should know that it's
possible to change the shell in use with
chsh, and this includes root.


Then for every user, this update is not a problem.



I've been using UNIX both privately and professionally since 1984
and I must admit that I never heard of chsh before seeing this
e-mail.  I simply use vipw; it's the logical way to do this sort
of thing IMHO.  But I suppose that this is the way to go for users
who don't have root access (which I always have).


AFAIK only root can use vipw, while chsh is usable by all system users.

Guess you've been root since 1984 :)

--
Guido Falsi 



Re: drm-devel-kmod build failures

2021-10-11 Thread Michael Butler via freebsd-current

Thanks - that works :-)

On 10/11/21 13:31, Mateusz Guzik wrote:

This should do it (untested):

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 37b268afa..f05de73fa 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -117,9 +117,15 @@ dma_buf_close(struct file *fp, struct thread *td)
 return (0);
  }

+#if __FreeBSD_version >= 1400037
+static int
+dma_buf_stat(struct file *fp, struct stat *sb,
+struct ucred *active_cred __unused)
+#else
  static int
  dma_buf_stat(struct file *fp, struct stat *sb,
  struct ucred *active_cred __unused, struct thread *td __unused)
+#endif
  {

 /* XXX need to define flags for st_mode */


On 10/11/21, Michael Butler via freebsd-current
 wrote:

After the latest freebsd version bump in param.h, I tried to rebuild the
DRM modules. It failed with ..

--- dma-buf.o ---
/usr/ports/graphics/drm-devel-kmod/work/drm-kmod-drm_v5.5.19_4/drivers/dma-buf//dma-buf.c:121:1:

error: conflicting types for 'dma_buf_stat'
dma_buf_stat(struct file *fp, struct stat *sb,
^
/usr/ports/graphics/drm-devel-kmod/work/drm-kmod-drm_v5.5.19_4/drivers/dma-buf//dma-buf.c:70:18:

note: previous declaration is here
static fo_stat_t dma_buf_stat;
   ^
1 error generated.
*** [dma-buf.o] Error code 1

make[3]: stopped in
/usr/ports/graphics/drm-devel-kmod/work/drm-kmod-drm_v5.5.19_4/linuxkpi
1 error

make[3]: stopped in
/usr/ports/graphics/drm-devel-kmod/work/drm-kmod-drm_v5.5.19_4/linuxkpi

I get a similar error with drm-current-kmod. What changed?

imb










drm-devel-kmod build failures

2021-10-11 Thread Michael Butler via freebsd-current
After the latest freebsd version bump in param.h, I tried to rebuild the 
DRM modules. It failed with ..


--- dma-buf.o ---
/usr/ports/graphics/drm-devel-kmod/work/drm-kmod-drm_v5.5.19_4/drivers/dma-buf//dma-buf.c:121:1: 
error: conflicting types for 'dma_buf_stat'

dma_buf_stat(struct file *fp, struct stat *sb,
^
/usr/ports/graphics/drm-devel-kmod/work/drm-kmod-drm_v5.5.19_4/drivers/dma-buf//dma-buf.c:70:18: 
note: previous declaration is here

static fo_stat_t dma_buf_stat;
 ^
1 error generated.
*** [dma-buf.o] Error code 1

make[3]: stopped in 
/usr/ports/graphics/drm-devel-kmod/work/drm-kmod-drm_v5.5.19_4/linuxkpi

1 error

make[3]: stopped in 
/usr/ports/graphics/drm-devel-kmod/work/drm-kmod-drm_v5.5.19_4/linuxkpi


I get a similar error with drm-current-kmod. What changed?

imb



Re: RFC: Use of VOP_ALLOCATE() by NFSV4.2 nfsd

2021-10-10 Thread Willem Jan Withagen via freebsd-current

On 10-10-2021 07:57, Rick Macklem wrote:



This leads me to a couple of questions:
- Is there a good reason for not using vop_stdallocate() for ZFS?

Yes.  posix_fallocate is supposed to guarantee that subsequent writes
to the file will not fail with ENOSPC.  But ZFS, being a copy-on-write
file system, cannot possibly guarantee that.  See SVN r325320.

However, vop_stdallocate() just does VOP_WRITE()s to the area (with
bytes of data all zeros). Wouldn't that satisfy the criteria?

I had the same problem in Ceph, where guaranteed writable space is
required for keeping a log of modifications to the system. Not having
this space might cause loss of data.

Writing all zeros is probably even worse on filesystems that have
compression set: almost nothing is allocated, and so there is no
guarantee at all.

The next trick was to write random data, but then you run into the
problem signaled by Alan and Warner: new writes will need free space,
because of the CoW nature.

The solution was to actually create a specific zpool just for this.
But that will not help you with NFS 4.2, I guess.

--WjW




Re: intermittent bsdtar/jemalloc failures

2021-10-08 Thread Michael Butler via freebsd-current

On 10/7/21 20:19, Konstantin Belousov wrote:

On Thu, Oct 07, 2021 at 05:43:14PM -0400, Michael Butler wrote:

On 10/7/21 16:52, Mark Johnston wrote:

On Thu, Oct 07, 2021 at 04:18:28PM -0400, Michael Butler via freebsd-current 
wrote:

On 10/7/21 15:39, Konstantin Belousov wrote:

On Thu, Oct 07, 2021 at 03:28:44PM -0400, Michael Butler via freebsd-current 
wrote:

While building a local release bundle, I sometimes get bsdtar failing (and
dumping core) as follows below. Worse, as can be seen below, it doesn't stop
the build unless I happen to notice and it yields an incomplete package.

a usr/src/sys/netgraph/ng_checksum.h
a usr/src/sys/netgraph/ng_message.h
a usr/src/sys/netgraph/ng_echo.c
a usr/src/sys/netgraph/ng_gif.h
: jemalloc_arena.c:747: Failed assertion:
"nstime_compare(&decay->epoch, &time) <= 0"
Abort trap (core dumped)
sh /usr/src/release/scripts/make-manifest.sh *.txz > MANIFEST

What causes this? Build machine is a 2x4-core Intel box with ZFS
file-systems all around. I tried stopping NTPD temporarily but the failures
persist .. sometimes :-(

I've seen this at different points in the archiving process so it doesn't
seem specific to building kernel.txz.


What timecounter do you use? Perhaps show the whole output from
sysctl kern.timecounter.


imb@vm01:/home/imb> sysctl kern.timecounter
kern.timecounter.tsc_shift: 1
kern.timecounter.smp_tsc_adjust: 0
kern.timecounter.smp_tsc: 1
kern.timecounter.invariant_tsc: 1
kern.timecounter.fast_gettime: 1
kern.timecounter.tick: 1
kern.timecounter.choice: ACPI-fast(900) HPET(950) i8254(0) TSC-low(1000)
dummy(-100)
kern.timecounter.hardware: HPET
kern.timecounter.alloweddeviation: 5
kern.timecounter.timehands_count: 2
kern.timecounter.stepwarnings: 0
kern.timecounter.tc.ACPI-fast.quality: 900
kern.timecounter.tc.ACPI-fast.frequency: 3579545
kern.timecounter.tc.ACPI-fast.counter: 16124892
kern.timecounter.tc.ACPI-fast.mask: 16777215
kern.timecounter.tc.HPET.quality: 950
kern.timecounter.tc.HPET.frequency: 14318180
kern.timecounter.tc.HPET.counter: 1883995229
kern.timecounter.tc.HPET.mask: 4294967295
kern.timecounter.tc.i8254.quality: 0
kern.timecounter.tc.i8254.frequency: 1193182
kern.timecounter.tc.i8254.counter: 57
kern.timecounter.tc.i8254.mask: 65535
kern.timecounter.tc.TSC-low.quality: 1000
kern.timecounter.tc.TSC-low.frequency: 1413153007
kern.timecounter.tc.TSC-low.counter: 2352002295
kern.timecounter.tc.TSC-low.mask: 4294967295

I overrode the default selection of counter-type as NTPD drifted so
badly as to require stepping almost hourly :-(


If you return to TSC, does the problem go away?
Same question if you leave HPET on, but set fast_gettime to 0.


While I've only done one build (still) with HPET with 
kern.timecounter.fast_gettime=0, I didn't see a core-dump.

I'll test more over the weekend,

imb





Re: intermittent bsdtar/jemalloc failures

2021-10-07 Thread Michael Butler via freebsd-current

On 10/7/21 16:52, Mark Johnston wrote:

On Thu, Oct 07, 2021 at 04:18:28PM -0400, Michael Butler via freebsd-current 
wrote:

On 10/7/21 15:39, Konstantin Belousov wrote:

On Thu, Oct 07, 2021 at 03:28:44PM -0400, Michael Butler via freebsd-current 
wrote:

While building a local release bundle, I sometimes get bsdtar failing (and
dumping core) as follows below. Worse, as can be seen below, it doesn't stop
the build unless I happen to notice and it yields an incomplete package.

a usr/src/sys/netgraph/ng_checksum.h
a usr/src/sys/netgraph/ng_message.h
a usr/src/sys/netgraph/ng_echo.c
a usr/src/sys/netgraph/ng_gif.h
: jemalloc_arena.c:747: Failed assertion:
"nstime_compare(&decay->epoch, &time) <= 0"
Abort trap (core dumped)
sh /usr/src/release/scripts/make-manifest.sh *.txz > MANIFEST

What causes this? Build machine is a 2x4-core Intel box with ZFS
file-systems all around. I tried stopping NTPD temporarily but the failures
persist .. sometimes :-(

I've seen this at different points in the archiving process so it doesn't
seem specific to building kernel.txz.


What timecounter do you use? Perhaps show the whole output from
sysctl kern.timecounter.


imb@vm01:/home/imb> sysctl kern.timecounter
kern.timecounter.tsc_shift: 1
kern.timecounter.smp_tsc_adjust: 0
kern.timecounter.smp_tsc: 1
kern.timecounter.invariant_tsc: 1
kern.timecounter.fast_gettime: 1
kern.timecounter.tick: 1
kern.timecounter.choice: ACPI-fast(900) HPET(950) i8254(0) TSC-low(1000)
dummy(-100)
kern.timecounter.hardware: HPET
kern.timecounter.alloweddeviation: 5
kern.timecounter.timehands_count: 2
kern.timecounter.stepwarnings: 0
kern.timecounter.tc.ACPI-fast.quality: 900
kern.timecounter.tc.ACPI-fast.frequency: 3579545
kern.timecounter.tc.ACPI-fast.counter: 16124892
kern.timecounter.tc.ACPI-fast.mask: 16777215
kern.timecounter.tc.HPET.quality: 950
kern.timecounter.tc.HPET.frequency: 14318180
kern.timecounter.tc.HPET.counter: 1883995229
kern.timecounter.tc.HPET.mask: 4294967295
kern.timecounter.tc.i8254.quality: 0
kern.timecounter.tc.i8254.frequency: 1193182
kern.timecounter.tc.i8254.counter: 57
kern.timecounter.tc.i8254.mask: 65535
kern.timecounter.tc.TSC-low.quality: 1000
kern.timecounter.tc.TSC-low.frequency: 1413153007
kern.timecounter.tc.TSC-low.counter: 2352002295
kern.timecounter.tc.TSC-low.mask: 4294967295

I overrode the default selection of counter-type as NTPD drifted so
badly as to require stepping almost hourly :-(


Could you show output from

# kldload cpuctl
# cpucontrol -i 0x15 /dev/cpuctl0
# cpucontrol -i 0x16 /dev/cpuctl0

as well as a copy of the dmesg after a boot?  I am looking at a similar
problem currently.


root@vm01:/usr/home/imb # cpucontrol -i 0x15 /dev/cpuctl0
cpuid level 0x15: 0x07280202 0x 0x 0x0503
root@vm01:/usr/home/imb # cpucontrol -i 0x16 /dev/cpuctl0
cpuid level 0x16: 0x07280202 0x 0x 0x0503

This is a Dell-1950 1-U box with a SAS drive-box attached ..

root@vm01:/usr/home/imb # less /var/log/dmesg.today
---<>---
VERBOSE_SYSINIT: DDB not enabled, symbol lookups disabled.
Copyright (c) 1992-2021 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 14.0-CURRENT #256 main-42dfad2ef1: Sat Oct  2 09:41:36 EDT 2021

r...@vm01.auburn.protected-networks.net:/usr/obj/usr/src/amd64.amd64/sys/VM01 
amd64
FreeBSD clang version 12.0.1 (g...@github.com:llvm/llvm-project.git 
llvmorg-12.0.1-0-gfed41342a82f)

VT(vga): resolution 640x480
CPU: Intel(R) Xeon(R) CPU   E5440  @ 2.83GHz (2826.31-MHz 
K8-class CPU)

  Origin="GenuineIntel"  Id=0x10676  Family=0x6  Model=0x17  Stepping=6

Features=0xbfebfbff

Features2=0xce3bd
  AMD Features=0x20100800
  AMD Features2=0x1
  VT-x: HLT,PAUSE
  TSC: P-state invariant, performance statistics
real memory  = 68719476736 (65536 MB)
avail memory = 65811677184 (62762 MB)
Event timer "LAPIC" quality 100
ACPI APIC Table: 
FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs
FreeBSD/SMP: 2 package(s) x 4 core(s)
random: unblocking device.
Security policy loaded: MAC/ntpd (mac_ntpd)
ioapic0: MADT APIC ID 8 != hw id 0
ioapic0  irqs 0-23
Launching APs: 3 4 1 6 2 5 7
Timecounter "TSC-low" frequency 1413155409 Hz quality 1000
random: entropy device external interface
kbd1 at kbdmux0
vtvga0: 
smbios0:  at iomem 0xfcdf0-0xfce0e
smbios0: Version: 2.5, BCD Revision: 2.5
acpi0: 
acpi0: Power Button (fixed)
Firmware Error (ACPI): Could not resolve symbol [\134_SB._OSC.CDW1], 
AE_NOT_FOUND (20210930/psargs-503)
ACPI Error: Aborting method \134_SB._OSC due to previous error 
(AE_NOT_FOUND) (20210930/psparse-689)

apei0:  on acpi0
ipmi0:  port 0xca8,0xcac on acpi0
ipmi0: KCS mode found at io 0xca8 on acpi
cpu0:  on acpi0
atrtc0:  port 0x70-0x7f irq 8 on acpi0
atrtc0: registered as a time-

Re: intermittent bsdtar/jemalloc failures

2021-10-07 Thread Michael Butler via freebsd-current

On 10/7/21 15:39, Konstantin Belousov wrote:

On Thu, Oct 07, 2021 at 03:28:44PM -0400, Michael Butler via freebsd-current 
wrote:

While building a local release bundle, I sometimes get bsdtar failing (and
dumping core) as follows below. Worse, as can be seen below, it doesn't stop
the build unless I happen to notice and it yields an incomplete package.

a usr/src/sys/netgraph/ng_checksum.h
a usr/src/sys/netgraph/ng_message.h
a usr/src/sys/netgraph/ng_echo.c
a usr/src/sys/netgraph/ng_gif.h
: jemalloc_arena.c:747: Failed assertion:
"nstime_compare(&decay->epoch, &time) <= 0"
Abort trap (core dumped)
sh /usr/src/release/scripts/make-manifest.sh *.txz > MANIFEST

What causes this? Build machine is a 2x4-core Intel box with ZFS
file-systems all around. I tried stopping NTPD temporarily but the failures
persist .. sometimes :-(

I've seen this at different points in the archiving process so it doesn't
seem specific to building kernel.txz.


What timecounter do you use? Perhaps show the whole output from
sysctl kern.timecounter.


imb@vm01:/home/imb> sysctl kern.timecounter
kern.timecounter.tsc_shift: 1
kern.timecounter.smp_tsc_adjust: 0
kern.timecounter.smp_tsc: 1
kern.timecounter.invariant_tsc: 1
kern.timecounter.fast_gettime: 1
kern.timecounter.tick: 1
kern.timecounter.choice: ACPI-fast(900) HPET(950) i8254(0) TSC-low(1000) 
dummy(-100)

kern.timecounter.hardware: HPET
kern.timecounter.alloweddeviation: 5
kern.timecounter.timehands_count: 2
kern.timecounter.stepwarnings: 0
kern.timecounter.tc.ACPI-fast.quality: 900
kern.timecounter.tc.ACPI-fast.frequency: 3579545
kern.timecounter.tc.ACPI-fast.counter: 16124892
kern.timecounter.tc.ACPI-fast.mask: 16777215
kern.timecounter.tc.HPET.quality: 950
kern.timecounter.tc.HPET.frequency: 14318180
kern.timecounter.tc.HPET.counter: 1883995229
kern.timecounter.tc.HPET.mask: 4294967295
kern.timecounter.tc.i8254.quality: 0
kern.timecounter.tc.i8254.frequency: 1193182
kern.timecounter.tc.i8254.counter: 57
kern.timecounter.tc.i8254.mask: 65535
kern.timecounter.tc.TSC-low.quality: 1000
kern.timecounter.tc.TSC-low.frequency: 1413153007
kern.timecounter.tc.TSC-low.counter: 2352002295
kern.timecounter.tc.TSC-low.mask: 4294967295

I overrode the default selection of counter-type as NTPD drifted so 
badly as to require stepping almost hourly :-(


So .. I have this in /etc/sysctl.conf ..

kern.timecounter.hardware=HPET

While I hope it wouldn't make a difference, I also have powerd enabled 
in /etc/rc.conf to (marginally) reduce the power-consumption when the 
machine is near-idle. sysctl -a | grep ^dev.cpu | grep freq shows ..


dev.cpu.7.freq_levels: 2834/103000 2333/9 2000/79000
dev.cpu.7.freq: 2834
dev.cpu.3.freq_levels: 2834/103000 2333/94000 2000/86000
dev.cpu.3.freq: 2834
dev.cpu.5.freq_levels: 2834/103000 2333/9 2000/79000
dev.cpu.5.freq: 2834
dev.cpu.1.freq_levels: 2834/103000 2333/94000 2000/86000
dev.cpu.1.freq: 2834
dev.cpu.6.freq_levels: 2834/103000 2333/9 2000/79000
dev.cpu.6.freq: 2834
dev.cpu.2.freq_levels: 2834/103000 2333/94000 2000/86000
dev.cpu.2.freq: 2834
dev.cpu.4.freq_levels: 2834/103000 2333/9 2000/79000
dev.cpu.4.freq: 2834
dev.cpu.0.freq_levels: 2834/103000 2333/94000 2000/86000
dev.cpu.0.freq: 2834

imb




intermittent bsdtar/jemalloc failures

2021-10-07 Thread Michael Butler via freebsd-current
While building a local release bundle, I sometimes get bsdtar failing 
(and dumping core) as follows below. Worse, as can be seen below, it 
doesn't stop the build unless I happen to notice and it yields an 
incomplete package.


a usr/src/sys/netgraph/ng_checksum.h
a usr/src/sys/netgraph/ng_message.h
a usr/src/sys/netgraph/ng_echo.c
a usr/src/sys/netgraph/ng_gif.h
: jemalloc_arena.c:747: Failed assertion: 
"nstime_compare(&decay->epoch, &time) <= 0"

Abort trap (core dumped)
sh /usr/src/release/scripts/make-manifest.sh *.txz > MANIFEST

What causes this? Build machine is a 2x4-core Intel box with ZFS 
file-systems all around. I tried stopping NTPD temporarily but the 
failures persist .. sometimes :-(


I've seen this at different points in the archiving process so it 
doesn't seem specific to building kernel.txz.


Any thoughts?

imb




Re: I get odd time reports from poudriere on armv7 system, under a (non-debug) main [so: 14] FreeBSD.

2021-09-26 Thread Mark Millard via freebsd-current
On 2021-Sep-25, at 23:25, Mark Millard  wrote:

> I get odd time reports from poudriere on an armv7 under main [so: 14]:
> 
> 
> 
> # poudriere bulk -jmain-CA7 lang/rust
> [00:00:00] Creating the reference jail... done
> . . .
> [00:00:00] Balancing pool
> [main-CA7-default] [2021-09-25_23h11m13s] [balancing_pool:] Queued: 70 Built: 
> 0  Failed: 0  Skipped: 0  Ignored: 0  Fetched: 0  Tobuild: 70  Time: 
> -258342:-3:-36
> [00:00:00] Recording filesystem state for prepkg... done
> . . .
> 
> 
> # poudriere bulk -j13_0R-CA7 lang/rust
> [00:00:00] Creating the reference jail... done
> . . .
> [00:00:00] Balancing pool
> [13_0R-CA7-default] [2021-09-25_18h06m23s] [balancing_pool:] Queued: 1  
> Built: 0  Failed: 0  Skipped: 0  Ignored: 0  Fetched: 0  Tobuild: 1   Time: 
> -9522:-38:-44
> [00:00:00] Recording filesystem state for prepkg... done
> . . .
> 
> 
> # poudriere bulk -j13_0R-CA7 lang/rust
> [00:00:00] Creating the reference jail... done
> . . .
> [00:00:00] Balancing pool
> [13_0R-CA7-default] [2021-09-25_22h52m58s] [balancing_pool:] Queued: 1  
> Built: 0  Failed: 0  Skipped: 0  Ignored: 0  Fetched: 0  Tobuild: 1   Time: 
> -666894:-15:-9
> [00:00:00] Recording filesystem state for prepkg... done
> . . .
> 
> 
> For reference:
> 
> # poudriere version
> poudriere-git-3.3.99.20210907_1
> 
> # uname -apKU
> FreeBSD OPiP2E_RPi2v11 14.0-CURRENT FreeBSD 14.0-CURRENT #9 
> main-n249019-0637070b5bca-dirty: Sat Sep  4 03:15:41 PDT 2021 
> root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA7-nodbg-clang/usr/main-src/arm.armv7/sys/GENERIC-NODBG-CA7
>   arm armv7 1400032 1400032
> 
> # poudriere jail -jmain-CA7 -i
> Jail name: main-CA7
> Jail version:  14.0-CURRENT
> Jail arch: arm.armv7
> Jail method:   null
> Jail mount:/usr/obj/DESTDIRs/main-CA7-poud
> Jail fs:   
> Jail updated:  2021-06-27 17:58:33
> Jail pkgbase:  disabled
> 
> # poudriere jail -j13_0R-CA7 -i
> Jail name: 13_0R-CA7
> Jail version:  13.0-RELEASE-p4
> Jail arch: arm.armv7
> Jail method:   null
> Jail mount:/usr/obj/DESTDIRs/13_0R-CA7-poud
> Jail fs:   
> Jail updated:  2021-09-06 19:10:46
> Jail pkgbase:  disabled
> 
> # chroot /usr/obj/DESTDIRs/main-CA7-poud/
> # uname -apKU
> FreeBSD OPiP2E_RPi2v11 14.0-CURRENT FreeBSD 14.0-CURRENT #9 
> main-n249019-0637070b5bca-dirty: Sat Sep  4 03:15:41 PDT 2021 
> root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA7-nodbg-clang/usr/main-src/arm.armv7/sys/GENERIC-NODBG-CA7
>   arm armv7 1400032 1400032
> 
> # chroot /usr/obj/DESTDIRs/13_0R-CA7-poud/
> # uname -apKU
> FreeBSD OPiP2E_RPi2v11 14.0-CURRENT FreeBSD 14.0-CURRENT #9 
> main-n249019-0637070b5bca-dirty: Sat Sep  4 03:15:41 PDT 2021 
> root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA7-nodbg-clang/usr/main-src/arm.armv7/sys/GENERIC-NODBG-CA7
>   arm armv7 1400032 1300139
> 


This looks to be poudriere's problem . . .

poudriere/src/libexec/poudriere/clock/clock.c

has:

	if (argc == 3 && strcmp(argv[2], "-nsec") == 0)
		printf("%ld.%ld\n", ts.tv_sec, ts.tv_nsec);
	else
		printf("%ld\n", ts.tv_sec);

where:

     struct timespec {
             time_t  tv_sec;         /* seconds */
             long    tv_nsec;        /* and nanoseconds */
     };


but for tv_sec the type is for armv7:

/usr/include/machine/_types.h:typedef   __int64_t   __time_t;   /* time()... */

From man arch:

 Machine-dependent type sizes:

   Architecture    void *    long double    time_t
   aarch64         8         16             8
   amd64           8         16             8
   armv6           4         8              8
   armv7           4         8              8
   i386            4         12             4
   mips            4         8              8
   mipsel          4         8              8
   mipselhf        4         8              8
   mipshf          4         8              8
   mipsn32         4         8              8
   mips64          8         8              8
   mips64el        8         8              8
   mips64elhf      8         8              8
   mips64hf        8         8              8
   powerpc         4         8              8
   powerpcspe      4         8              8
   powerpc64       8         8              8
   powerpc64le     8         8              8
   riscv64         8         16             8
   riscv64sf       8         16             8

%ld is for long arguments, 32-bits in an ILP32 context, not __int64_t
(long long) arguments. Applies to armv6, armv7, mips, mipsel, mipselhf,
mipshf, mipsn32, powerpc, and powerpcspe.

Note: on FreeBSD, i386 can use %ld for time_t despite being ILP32,
because its time_t is 4 bytes (see the table above).
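
A minimal sketch of one portable way to print the two fields, assuming the usual cast-to-intmax_t idiom is acceptable here. This is not the actual poudriere fix, and the helper name format_timespec is made up for illustration. Casting tv_sec to intmax_t and printing it with %jd matches the argument size whether time_t is 4 or 8 bytes. Zero-padding the nanoseconds to nine digits is a further assumption about the desired output; the original code prints them unpadded.

```c
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/*
 * Hypothetical helper (not from poudriere): format a timespec into buf.
 * tv_sec is cast to intmax_t so that %jd matches the argument on every
 * ABI, including ILP32 targets where time_t is 64 bits wide.
 */
static int
format_timespec(char *buf, size_t len, const struct timespec *ts, int nsec)
{
	if (nsec)
		return (snprintf(buf, len, "%jd.%09ld",
		    (intmax_t)ts->tv_sec, ts->tv_nsec));
	return (snprintf(buf, len, "%jd", (intmax_t)ts->tv_sec));
}
```

The same cast works directly in a printf() call, so a two-line change to the quoted code would be enough if this approach were chosen.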


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)




Re: [HEADSUP] making /bin/sh the default shell for root

2021-09-22 Thread Daniel Morante via freebsd-current
Will history/completion continue to work the same way? (for example 
typing part of the command, pressing UP and having it complete based on 
history)


On 9/22/2021 4:36 AM, Baptiste Daroussin wrote:

Hello,

TL;DR: this is not a proposal to deorbit csh from base!!!

For years now, csh has been the default root shell for FreeBSD. csh can be
confusing as a default shell for many, as all other Unix-like systems have
settled on a Bourne-compatible interactive shell: zsh, bash, or a variant of ksh.

Recently our sh(1) has received updates to make it more user friendly in
interactive mode:
* command completion (thanks pstef@)
* improvement in the emacs mode, to make it behave by default like other shells
* improvement in the vi mode (in particular the vi edit to respect $EDITOR)
* support for history as described by POSIX.

This makes it a usable shell by default, which is why I would like to propose
making it the default shell for root starting with FreeBSD 14.0-RELEASE (not MFCed).

If no strong arguments have been raised by October 15th, I will make this
proposal happen.

Again just in case: THIS IS NOT A PROPOSAL TO REMOVE CSH FROM BASE!

Best regards,
Baptiste







Re: zpool import: "The pool cannot be imported due to damaged devices or data" but zpool status -x: "all pools are healthy" and zpool destroy: "no such pool"

2021-09-16 Thread Mark Millard via freebsd-current



On 2021-Sep-16, at 16:27, Freddie Cash  wrote:
> 
> [message chopped and butchered, don't follow the quotes, it's just to show 
> some bits together from different messages]
> 
> On Thu, Sep 16, 2021 at 3:54 PM Mark Millard via freebsd-current 
>  wrote:
> > > For reference, as things now are:
> > > 
> > > # gpart show
> > > =>       40  937703008  nda0  GPT  (447G)
> > >          40     532480     1  efi  (260M)
> > >      532520       2008        - free -  (1.0M)
> > >      534528  937166848     2  freebsd-zfs  (447G)
> > >   937701376       1672        - free -  (836K)
> > > . . .
>  
> > > So you just want to clean nda0p2 in order to reuse it?  Do "zpool 
> > > labelclear -f /dev/nda0p2"
> > > 
> >> 
> >> I did not extract and show everything that I'd tried but
> >> there were examples of:
> >> 
> >> # zpool labelclear -f /dev/nda0p2
> >> failed to clear label for /dev/nda0p2
> 
> The start of the problem looked like (console context,
> so messages interlaced):
> 
> # zpool create -O compress=lz4 -O atime=off -f -tzopt0 zpopt0 /dev/nvd0
> GEOM: nda0: the primary GPT table is corrupt or invalid.
> GEOM: nda0: using the secondary instead -- recovery strongly advised.
> cannot create 'zpopt0': no such pool or dataset
> # Sep 16 12:19:31 CA72_4c8G_ZFS ZFS[]: vdev problem, zpool=zopt0 
> path=/dev/nvd0 type=ereport.fs.zfs.vdev.open_failed
> 
> The GPT table was okay just prior to the command.
> So I recovered it.
> 
> It looks like you're trying to use a disk partition for a ZFS pool (nda0p2), 
> but then you turn around and use the entire drive (nvd0) for the pool which 
> clobbers the GPT.

I'd not noticed my lack of a "p2" suffix. Thanks. Explains how
I got things messed up, with GPT and zfs conflicting. (Too many
distractions at the time, I guess.)

> You need to be consistent in using partitions for all commands.

Yep: dumb typo that I'd not noticed.

> You're also mixing up your disk device nodes for the different commands; 
> while they are just different names for the same thing, it's best to be 
> consistent.

Once I had commands failing, I explicitly tried alternatives that
I thought should be equivalent, in case they were not in some way.
Not my normal procedure.

> GEOM is built out of layers (or more precisely, "containers" as it specifies 
> a new start and end point on the disk), which is very powerful.  But it's 
> also very easy to make a mess of things when you start accessing things 
> outside of the layers.  :)  And ZFS labelclear option is the nuclear option 
> that tends to remove everything ZFS-related, and everything GPT-related; 
> although I've never seen it used on a partition before, usually just the disk.

> Best bet in this situation is to just zero out the entire disk (dd 
> if=/dev/zero of=/dev/nda0 bs=1M), and start over from scratch.  Create a new 
> GPT.  Create new partitions.  Use the specific partition with the "zpool 
> create" command.

I ended up doing somewhat less than a full 480 GiByte of writes. It
preserved /dev/nda0p1.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)




Re: zpool import: "The pool cannot be imported due to damaged devices or data" but zpool status -x: "all pools are healthy" and zpool destroy: "no such pool"

2021-09-16 Thread Mark Millard via freebsd-current



On 2021-Sep-16, at 15:16, Alan Somers  wrote:

> On Thu, Sep 16, 2021 at 4:02 PM Mark Millard  wrote:
> 
> 
> On 2021-Sep-16, at 13:39, Alan Somers  wrote:
> 
> > On Thu, Sep 16, 2021 at 2:04 PM Mark Millard via freebsd-current 
> >  wrote:
> > What do I do about:
> > 
> > QUOTE
> > # zpool import
> >    pool: zopt0
> >      id: 18166787938870325966
> >   state: FAULTED
> >  status: One or more devices contains corrupted data.
> >  action: The pool cannot be imported due to damaged devices or data.
> >     see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-5E
> >  config:
> > 
> >         zopt0       FAULTED  corrupted data
> >           nda0p2    UNAVAIL  corrupted data
> > 
> > # zpool status -x
> > all pools are healthy
> > 
> > # zpool destroy zopt0
> > cannot open 'zopt0': no such pool
> > END QUOTE
> > 
> > (I had attempted to clean out the old zfs context on
> > the media and delete/replace the 2 freebsd swap
> > partitions and 1 freebsd-zfs partition, leaving the
> > efi partition in place. Clearly I did not do everything
> > required [or something is very wrong]. zopt0 had been
> > a root-on-ZFS context and would be again. I have a
> > backup of the context to send/receive once the pool
> > in the partition is established.)
> > 
> > For reference, as things now are:
> > 
> > # gpart show
> > =>       40  937703008  nda0  GPT  (447G)
> >          40     532480     1  efi  (260M)
> >      532520       2008        - free -  (1.0M)
> >      534528  937166848     2  freebsd-zfs  (447G)
> >   937701376       1672        - free -  (836K)
> > . . .
> > 
> > (That is not how it looked before I started.)
> > 
> > # uname -apKU
> > FreeBSD CA72_4c8G_ZFS 13.0-RELEASE-p4 FreeBSD 13.0-RELEASE-p4 #4 
> > releng/13.0-n244760-940681634ee1-dirty: Mon Aug 30 11:35:45 PDT 2021 
> > root@CA72_16Gp_ZFS:/usr/obj/BUILDs/13_0R-CA72-nodbg-clang/usr/13_0R-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
> >   arm64 aarch64 1300139 1300139
> > 
> > I have also tried under:
> > 
> > # uname -apKU
> > FreeBSD CA72_4c8G_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #12 
> > main-n249019-0637070b5bca-dirty: Tue Aug 31 02:24:20 PDT 2021 
> > root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
> >   arm64 aarch64 1400032 1400032
> > 
> > after reaching this state. It behaves the same.
> > 
> > The text presented by:
> > 
> > https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-5E
> > 
> > does not deal with what is happening overall.
> > 
> > So you just want to clean nda0p2 in order to reuse it?  Do "zpool 
> > labelclear -f /dev/nda0p2"
> > 
>> 
>> I did not extract and show everything that I'd tried but
>> there were examples of:
>> 
>> # zpool labelclear -f /dev/nda0p2
>> failed to clear label for /dev/nda0p2
>> 
>> from when I'd tried such. So far I've not
>> identified anything with official commands
>> to deal with the issue.
>> 
> That is the correct command to run.  However, the OpenZFS import in FreeBSD 
> 13.0 brought in a regression in that command.  It wasn't a code bug really, 
> more like a UI bug.  OpenZFS just had a less useful labelclear command than 
> FreeBSD did.  The regression has now been fixed upstream.
> https://github.com/openzfs/zfs/pull/12511

Cool.

>> Ultimately I zeroed out areas of the media that
>> happened to span the zfs related labels. After
>> that things returned to normal. I'd still like
>> to know a supported way of dealing with the
>> issue.
>> 
>> The page at the URL it listed just says:
>> 
>> QUOTE
>> The pool must be destroyed and recreated from an appropriate backup source
>> END QUOTE
> 
> It advised you to "destroy and recreate" the pool because you ran "zpool 
> import", so ZFS thought that you actually wanted to import the pool.  The 
> error message would be appropriate if that had been the case.

The start of the problem looked like (console context,
so messages interlaced):

# zpool create -O compress=lz4 -O atime=off -f -tzopt0 zpopt0 /dev/nvd0
GEOM: nda0: the primary GPT table is corrupt or invalid.
GEOM: nda0: using the secondary instead -- recovery strongly advised.
cannot create 'zpopt0': no such pool or dataset
# Sep 16 12:19:31 CA72_4c8G_ZFS ZFS[]: vdev problem, zpool=zopt0 
path=/dev/nvd0 type=ereport.fs.zfs.vdev.open_failed

The GPT table was okay just

Re: zpool import: "The pool cannot be imported due to damaged devices or data" but zpool status -x: "all pools are healthy" and zpool destroy: "no such pool"

2021-09-16 Thread Mark Millard via freebsd-current



On 2021-Sep-16, at 13:26, joe mcguckin  wrote:

> I experienced the same yesterday. I grabbed an old disk that was previously 
> part of a pool. Stuck it in the chassis and did ‘zpool import’ and got the 
> same output you did.

Mine was a single-disk pool. I use zfs just in order to
use bectl, not for redundancy or other such. So my
configuration is very simple.

> Since the other drives of the pool were missing, the pool could not be 
> imported.
> 
> zpool status reports ‘everything ok’ because all the existing pools are ok. 
> zpool destroy can’t destroy the pool because it has not been imported.

Yea, but the material at the URL it listed just says:

QUOTE
The pool must be destroyed and recreated from an appropriate backup source
END QUOTE

so it says to do something that in my context could not
be done via the normal zfs-related commands as far as I
can tell.

> I simply created a new pool specifying the drive address of the disk - zfs 
> happily overwrote the old incomplete pool info.

Ultimately, I zeroed out an area of the media that
had the zfs related labels and after that things
operated normally and I could recreate the pool in
the partition, send/receive the backup to it, and
use the restored state. I did not find a way to
use the zpool/zfs related commands to deal with
fixing the messed-up status. (I did not report
everything that I'd tried.)

> joe
> 
> 
> Joe McGuckin
> ViaNet Communications
> 
> j...@via.net
> 650-207-0372 cell
> 650-213-1302 office
> 650-969-2124 fax
> 
> 
> 
>> On Sep 16, 2021, at 1:01 PM, Mark Millard via freebsd-current 
>>  wrote:
>> 
>> What do I do about:
>> 
>> QUOTE
>> # zpool import
>>    pool: zopt0
>>      id: 18166787938870325966
>>   state: FAULTED
>>  status: One or more devices contains corrupted data.
>>  action: The pool cannot be imported due to damaged devices or data.
>>     see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-5E
>>  config:
>> 
>>         zopt0       FAULTED  corrupted data
>>           nda0p2    UNAVAIL  corrupted data
>> 
>> # zpool status -x
>> all pools are healthy
>> 
>> # zpool destroy zopt0
>> cannot open 'zopt0': no such pool
>> END QUOTE
>> 
>> (I had attempted to clean out the old zfs context on
>> the media and delete/replace the 2 freebsd swap
>> partitions and 1 freebsd-zfs partition, leaving the
>> efi partition in place. Clearly I did not do everything
>> required [or something is very wrong]. zopt0 had been
>> a root-on-ZFS context and would be again. I have a
>> backup of the context to send/receive once the pool
>> in the partition is established.)
>> 
>> For reference, as things now are:
>> 
>> # gpart show
>> =>       40  937703008  nda0  GPT  (447G)
>>          40     532480     1  efi  (260M)
>>      532520       2008        - free -  (1.0M)
>>      534528  937166848     2  freebsd-zfs  (447G)
>>   937701376       1672        - free -  (836K)
>> . . .
>> 
>> (That is not how it looked before I started.)
>> 
>> # uname -apKU
>> FreeBSD CA72_4c8G_ZFS 13.0-RELEASE-p4 FreeBSD 13.0-RELEASE-p4 #4 
>> releng/13.0-n244760-940681634ee1-dirty: Mon Aug 30 11:35:45 PDT 2021 
>> root@CA72_16Gp_ZFS:/usr/obj/BUILDs/13_0R-CA72-nodbg-clang/usr/13_0R-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
>>   arm64 aarch64 1300139 1300139
>> 
>> I have also tried under:
>> 
>> # uname -apKU
>> FreeBSD CA72_4c8G_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #12 
>> main-n249019-0637070b5bca-dirty: Tue Aug 31 02:24:20 PDT 2021 
>> root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
>>   arm64 aarch64 1400032 1400032
>> 
>> after reaching this state. It behaves the same.
>> 
>> The text presented by:
>> 
>> https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-5E
>> 
>> does not deal with what is happening overall.
>> 
> 


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)




Re: zpool import: "The pool cannot be imported due to damaged devices or data" but zpool status -x: "all pools are healthy" and zpool destroy: "no such pool"

2021-09-16 Thread Mark Millard via freebsd-current



On 2021-Sep-16, at 13:39, Alan Somers  wrote:

> On Thu, Sep 16, 2021 at 2:04 PM Mark Millard via freebsd-current 
>  wrote:
> What do I do about:
> 
> QUOTE
> # zpool import
>    pool: zopt0
>      id: 18166787938870325966
>   state: FAULTED
>  status: One or more devices contains corrupted data.
>  action: The pool cannot be imported due to damaged devices or data.
>     see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-5E
>  config:
> 
>         zopt0       FAULTED  corrupted data
>           nda0p2    UNAVAIL  corrupted data
> 
> # zpool status -x
> all pools are healthy
> 
> # zpool destroy zopt0
> cannot open 'zopt0': no such pool
> END QUOTE
> 
> (I had attempted to clean out the old zfs context on
> the media and delete/replace the 2 freebsd swap
> partitions and 1 freebsd-zfs partition, leaving the
> efi partition in place. Clearly I did not do everything
> required [or something is very wrong]. zopt0 had been
> a root-on-ZFS context and would be again. I have a
> backup of the context to send/receive once the pool
> in the partition is established.)
> 
> For reference, as things now are:
> 
> # gpart show
> =>       40  937703008  nda0  GPT  (447G)
>          40     532480     1  efi  (260M)
>      532520       2008        - free -  (1.0M)
>      534528  937166848     2  freebsd-zfs  (447G)
>   937701376       1672        - free -  (836K)
> . . .
> 
> (That is not how it looked before I started.)
> 
> # uname -apKU
> FreeBSD CA72_4c8G_ZFS 13.0-RELEASE-p4 FreeBSD 13.0-RELEASE-p4 #4 
> releng/13.0-n244760-940681634ee1-dirty: Mon Aug 30 11:35:45 PDT 2021 
> root@CA72_16Gp_ZFS:/usr/obj/BUILDs/13_0R-CA72-nodbg-clang/usr/13_0R-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
>   arm64 aarch64 1300139 1300139
> 
> I have also tried under:
> 
> # uname -apKU
> FreeBSD CA72_4c8G_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #12 
> main-n249019-0637070b5bca-dirty: Tue Aug 31 02:24:20 PDT 2021 
> root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
>   arm64 aarch64 1400032 1400032
> 
> after reaching this state. It behaves the same.
> 
> The text presented by:
> 
> https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-5E
> 
> does not deal with what is happening overall.
> 
> So you just want to clean nda0p2 in order to reuse it?  Do "zpool labelclear 
> -f /dev/nda0p2"
> 

I did not extract and show everything that I'd tried but
there were examples of:

# zpool labelclear -f /dev/nda0p2
failed to clear label for /dev/nda0p2

from when I'd tried such. So far I've not
identified anything with official commands
to deal with the issue.

Ultimately I zeroed out areas of the media that
happened to span the zfs related labels. After
that things returned to normal. I'd still like
to know a supported way of dealing with the
issue.

The page at the URL it listed just says:

QUOTE
The pool must be destroyed and recreated from an appropriate backup source
END QUOTE

But the official destroy commands did not work:
same sort of issue of reporting that nothing
appropriate was found to destroy and no way to
import the problematical pool.


Note: I use ZFS because of wanting to use bectl, not
for redundancy or such. So the configuration is very
simple.


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)




Re: zpool import: "The pool cannot be imported due to damaged devices or data" but zpool status -x: "all pools are healthy" and zpool destroy: "no such pool"

2021-09-16 Thread Mark Millard via freebsd-current
On 2021-Sep-16, at 13:01, Mark Millard  wrote:

> What do I do about:
> 
> QUOTE
> # zpool import
>    pool: zopt0
>      id: 18166787938870325966
>   state: FAULTED
>  status: One or more devices contains corrupted data.
>  action: The pool cannot be imported due to damaged devices or data.
>     see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-5E
>  config:
> 
>         zopt0       FAULTED  corrupted data
>           nda0p2    UNAVAIL  corrupted data
> 
> # zpool status -x
> all pools are healthy
> 
> # zpool destroy zopt0
> cannot open 'zopt0': no such pool
> END QUOTE
> 
> (I had attempted to clean out the old zfs context on
> the media and delete/replace the 2 freebsd swap
> partitions and 1 freebsd-zfs partition, leaving the
> efi partition in place. Clearly I did not do everything
> required [or something is very wrong]. zopt0 had been
> a root-on-ZFS context and would be again. I have a
> backup of the context to send/receive once the pool
> in the partition is established.)
> 
> For reference, as things now are:
> 
> # gpart show
> =>       40  937703008  nda0  GPT  (447G)
>          40     532480     1  efi  (260M)
>      532520       2008        - free -  (1.0M)
>      534528  937166848     2  freebsd-zfs  (447G)
>   937701376       1672        - free -  (836K)
> . . .
> 
> (That is not how it looked before I started.)
> 
> # uname -apKU
> FreeBSD CA72_4c8G_ZFS 13.0-RELEASE-p4 FreeBSD 13.0-RELEASE-p4 #4 
> releng/13.0-n244760-940681634ee1-dirty: Mon Aug 30 11:35:45 PDT 2021 
> root@CA72_16Gp_ZFS:/usr/obj/BUILDs/13_0R-CA72-nodbg-clang/usr/13_0R-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
>   arm64 aarch64 1300139 1300139
> 
> I have also tried under:
> 
> # uname -apKU
> FreeBSD CA72_4c8G_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #12 
> main-n249019-0637070b5bca-dirty: Tue Aug 31 02:24:20 PDT 2021 
> root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
>   arm64 aarch64 1400032 1400032
> 
> after reaching this state. It behaves the same.
> 
> The text presented by:
> 
> https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-5E
> 
> does not deal with what is happening overall.
> 

I finally seem to have stomped on enough to have gotten
past the issue (last actions):

# gpart add -tfreebsd-swap -s440g /dev/nda0
nda0p2 added

# gpart add -tfreebsd-swap /dev/nda0
nda0p3 added

7384907776 bytes transferred in 5.326024 secs (1386570546 bytes/sec)
# dd if=/dev/zero of=/dev/nda0p3 bs=4k conv=sync status=progress
dd: /dev/nda0p3: end of device972 MiB) transferred 55.001s, 133 MB/s

1802957+0 records in
1802956+0 records out
7384907776 bytes transferred in 55.559644 secs (132918559 bytes/sec)

# gpart delete -i3 /dev/nda0
nda0p3 deleted

# gpart delete -i2 /dev/nda0
nda0p2 deleted

# gpart add -tfreebsd-zfs -a1m /dev/nda0
nda0p2 added

# zpool import
no pools available to import

# gpart show
. . .

=>       40  937703008  nda0  GPT  (447G)
         40     532480     1  efi  (260M)
     532520       2008        - free -  (1.0M)
     534528  937166848     2  freebsd-zfs  (447G)
  937701376       1672        - free -  (836K)

# zpool create -O compress=lz4 -O atime=off -f -tzpopt0 zopt0 /dev/nda0p2

# zpool list
NAME     SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
zpopt0   444G   420K   444G        -         -     0%     0%  1.00x    ONLINE  -
zroot    824G   105G   719G        -         -     1%    12%  1.00x    ONLINE  -


I've no clue what made my original zpool labelclear -f attempt
leave material behind before repartitioning. Still could have
been operator error of some kind.



===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)




zpool import: "The pool cannot be imported due to damaged devices or data" but zpool status -x: "all pools are healthy" and zpool destroy: "no such pool"

2021-09-16 Thread Mark Millard via freebsd-current
What do I do about:

QUOTE
# zpool import
   pool: zopt0
     id: 18166787938870325966
  state: FAULTED
 status: One or more devices contains corrupted data.
 action: The pool cannot be imported due to damaged devices or data.
    see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-5E
 config:

        zopt0       FAULTED  corrupted data
          nda0p2    UNAVAIL  corrupted data

# zpool status -x
all pools are healthy

# zpool destroy zopt0
cannot open 'zopt0': no such pool
END QUOTE

(I had attempted to clean out the old zfs context on
the media and delete/replace the 2 freebsd swap
partitions and 1 freebsd-zfs partition, leaving the
efi partition in place. Clearly I did not do everything
required [or something is very wrong]. zopt0 had been
a root-on-ZFS context and would be again. I have a
backup of the context to send/receive once the pool
in the partition is established.)

For reference, as things now are:

# gpart show
=>       40  937703008  nda0  GPT  (447G)
         40     532480     1  efi  (260M)
     532520       2008        - free -  (1.0M)
     534528  937166848     2  freebsd-zfs  (447G)
  937701376       1672        - free -  (836K)
. . .

(That is not how it looked before I started.)

# uname -apKU
FreeBSD CA72_4c8G_ZFS 13.0-RELEASE-p4 FreeBSD 13.0-RELEASE-p4 #4 
releng/13.0-n244760-940681634ee1-dirty: Mon Aug 30 11:35:45 PDT 2021 
root@CA72_16Gp_ZFS:/usr/obj/BUILDs/13_0R-CA72-nodbg-clang/usr/13_0R-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
  arm64 aarch64 1300139 1300139

I have also tried under:

# uname -apKU
FreeBSD CA72_4c8G_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #12 
main-n249019-0637070b5bca-dirty: Tue Aug 31 02:24:20 PDT 2021 
root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
  arm64 aarch64 1400032 1400032

after reaching this state. It behaves the same.

The text presented by:

https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-5E

does not deal with what is happening overall.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)




git commit for WITH_DETECT_TZ_CHANGES breaks date, et al

2021-09-13 Thread Michael Butler via freebsd-current
After commit ddedf2a11eb20af1ee52cb3da70a57c21904af8f date fails to 
recognize any configured timezone when WITH_DETECT_TZ_CHANGES is not set.


For example ..

imb@vm01:/home/imb> date
Tue Sep 14 01:25:57  2021

Every other daemon also thinks it's running in UTC+0 :-(

When libc is recompiled with WITH_DETECT_TZ_CHANGES=yes in 
/etc/src.conf, the output is (for me) ..


imb@vm01:/usr/src/lib/libc> date
Mon Sep 13 21:28:29 EDT 2021

imb






Re: recent head having significantly less "avail memory"

2021-09-13 Thread Guido Falsi via freebsd-current

On 14/09/21 00:12, Konstantin Belousov wrote:

On Mon, Sep 13, 2021 at 10:07:46PM +0200, Guido Falsi wrote:

On 13/09/21 19:08, Konstantin Belousov wrote:

On Mon, Sep 13, 2021 at 02:59:25PM +0200, Guido Falsi via freebsd-current wrote:

Hi,

I updated head recently and today I noticed a difference which looks wrong.

At boot the new head shows significantly less avail memory than before,
around 3 GiB less.

I moved from commit 71fbc6faed6 [1] where I got:

Aug 28 22:03:03 marvin kernel: real memory  = 17179869184 (16384 MB)
Aug 28 22:03:03 marvin kernel: avail memory = 16590352384 (15821 MB)

to commit 7955efd574b [2] where I get:

Sep 13 10:44:40 marvin kernel: real memory  = 17179869184 (16384 MB)
Sep 13 10:44:40 marvin kernel: avail memory = 13298876416 (12682 MB)

I'm seeing this on multiple machines.

Unfortunately, bisecting and trying an older loader.efi in separate tests did
not give me any more insight.

The recent changes to efi loader, starting with commit 6032b6ba9596 [3] look
like a possible trigger to this, but I have been unable to confirm it.

Any suggestions on how to proceed to debug this? Any idea what a fix could
be?


Is this UEFI or bios boot?
Provide verbose dmesg for old and new boots on the same machine.
For UEFI boot, show output of 'sysctl machdep.efi_map', again for old
and new boots.



I shared the data you requested here:

https://www.madpilot.net/cloud/s/ENW5zF7jfmrmFeG


Thanks.

If you do on the loader prompt for the new (AKA bad) kernel
copy_staging enable
and then boot, does the report of avail memory become good?



Yes, it works as expected; that is, it reports the amount of memory I expect:

Sep 14 00:24:50 marvin kernel: real memory  = 17179869184 (16384 MB)
Sep 14 00:24:50 marvin kernel: avail memory = 16590356480 (15821 MB)
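
For scale, the size of the drop reported in this thread can be checked
with a little integer arithmetic. The sketch below assumes the
parenthesised "MB" figures in the boot messages are the byte counts
divided by 1024*1024 and rounded down, which matches the numbers quoted
here; the helper names are hypothetical.

```c
#include <stdint.h>

/* Divisor assumed for the "(NNNNN MB)" figures in the boot messages. */
#define BYTES_PER_MIB (1024ULL * 1024ULL)

/* Convert a byte count to whole MiB, rounding down. */
uint64_t
bytes_to_mib(uint64_t bytes)
{
    return (bytes / BYTES_PER_MIB);
}

/*
 * Whole MiB of "avail memory" lost between two boots.
 * Returns 0 if the second figure is not smaller than the first.
 */
uint64_t
avail_drop_mib(uint64_t before_bytes, uint64_t after_bytes)
{
    if (after_bytes >= before_bytes)
        return (0);
    return ((before_bytes - after_bytes) / BYTES_PER_MIB);
}
```

With the figures from the two boots, avail_drop_mib(16590352384,
13298876416) gives 3138, i.e. roughly 3 GiB, which is consistent with
the description of the problem.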

--
Guido Falsi 



Re: recent head having significantly less "avail memory"

2021-09-13 Thread Guido Falsi via freebsd-current

On 13/09/21 19:08, Konstantin Belousov wrote:

On Mon, Sep 13, 2021 at 02:59:25PM +0200, Guido Falsi via freebsd-current wrote:

Hi,

I updated head recently and today I noticed a difference which looks wrong.

At boot the new head shows significantly less avail memory than before,
around 3 GiB less.

I moved from commit 71fbc6faed6 [1] where I got:

Aug 28 22:03:03 marvin kernel: real memory  = 17179869184 (16384 MB)
Aug 28 22:03:03 marvin kernel: avail memory = 16590352384 (15821 MB)

to commit 7955efd574b [2] where I get:

Sep 13 10:44:40 marvin kernel: real memory  = 17179869184 (16384 MB)
Sep 13 10:44:40 marvin kernel: avail memory = 13298876416 (12682 MB)

I'm seeing this on multiple machines.

Unfortunately, bisecting and trying an older loader.efi in separate tests did
not give me any more insight.

The recent changes to efi loader, starting with commit 6032b6ba9596 [3] look
like a possible trigger to this, but I have been unable to confirm it.

Any suggestions on how to proceed to debug this? Any idea what a fix could
be?


Is this UEFI or bios boot?
Provide verbose dmesg for old and new boots on the same machine.
For UEFI boot, show output of 'sysctl machdep.efi_map', again for old
and new boots.



I shared the data you requested here:

https://www.madpilot.net/cloud/s/ENW5zF7jfmrmFeG

--
Guido Falsi 



Re: recent head having significantly less "avail memory"

2021-09-13 Thread Guido Falsi via freebsd-current

On 13/09/21 20:17, Ryan Stone wrote:

On Mon, Sep 13, 2021 at 2:13 PM Guido Falsi via freebsd-current
 wrote:

I'm not sure how to get the verbose data for the old boot, since I've
been unable to revert the machine to the old state. I'll try anyway though.


Do you have physical access to the machine?  It might be easiest to
grab a snapshot image, stick it on a USB drive and boot from that.



I definitely have physical access; these are my desktop, laptop and build 
machines.


The first thing I'm doing is disabling the cron job that removes old ZFS 
snapshots, so state is not lost.


Since this involves only UEFI, the loader and the kernel, I've recovered the 
old parts and now I have the machine reporting the usual amount of 
memory, so I should be able to extract the requested data and post it 
shortly.


Thanks for your help anyway!

--
Guido Falsi 



Re: recent head having significantly less "avail memory"

2021-09-13 Thread Guido Falsi via freebsd-current

On 13/09/21 19:08, Konstantin Belousov wrote:

On Mon, Sep 13, 2021 at 02:59:25PM +0200, Guido Falsi via freebsd-current wrote:

Hi,

I updated head recently and today I noticed a difference which looks wrong.

At boot the new head shows significantly less avail memory than before,
around 3 GiB less.

I moved from commit 71fbc6faed6 [1] where I got:

Aug 28 22:03:03 marvin kernel: real memory  = 17179869184 (16384 MB)
Aug 28 22:03:03 marvin kernel: avail memory = 16590352384 (15821 MB)

to commit 7955efd574b [2] where I get:

Sep 13 10:44:40 marvin kernel: real memory  = 17179869184 (16384 MB)
Sep 13 10:44:40 marvin kernel: avail memory = 13298876416 (12682 MB)

I'm seeing this on multiple machines.

Unfortunately, bisecting and trying an older loader.efi in separate tests did
not give me any more insight.

The recent changes to efi loader, starting with commit 6032b6ba9596 [3] look
like a possible trigger to this, but I have been unable to confirm it.

Any suggestions on how to proceed to debug this? Any idea what a fix could
be?


Is this UEFI or bios boot?


Machine is UEFI


Provide verbose dmesg for old and new boots on the same machine.
For UEFI boot, show output of 'sysctl machdep.efi_map', again for old
and new boots.



I'm not sure how to get the verbose data for the old boot, since I've 
been unable to revert the machine to the old state. I'll try anyway though.


Anyway, this is happening on three different machines. I forgot to mention 
they are using a custom kernel. I don't think it makes a difference but 
I'll also test GENERIC, just in case.


--
Guido Falsi 


