Re: CURRENT: exec_machdep.c:80:2: error: KDB must be enabled in order for DDB

2024-05-30 Thread Gary Jennejohn
On Thu, 30 May 2024 05:12:01 +0200
FreeBSD User  wrote:

> Hello,
>
> for customising my world and kernel, I try to "overlay" GENERIC via included 
> files containing
> "nodevice" and "nooptions" tags starting from a top level config file like
>
> include GENERIC
> include NODEVICE-GENERIC
> include   SPECIAL
>
> Within "NODEVICE-GENERIC" I utilize
> [...]
> # Debugging support.  Always need this:
> nooptions   KDB # Enable kernel debugger support.
> nooptions   KDB_TRACE   # Print a stack trace for a panic.
> # For full debugger support use (turn off in stable branch):
> include "std.nodebug"
> [...]
>
> to disable KDB. The include "std.debug" in GENERIC is new; prior to its
> occurrence the sketched scheme worked fine for me, but now I get this error
> while performing "make -jX buildworld buildkernel":
>
> [...]
> /usr/src/sys/amd64/amd64/exec_machdep.c:80:2: error: KDB must be enabled in order for DDB to work!
>    80 | #error KDB must be enabled in order for DDB to work!
>       |  ^
> [...]
>
> Apart from the recommendation not to disable KDB in CURRENT, is there a way
> to disable debugging features and mimic a stable branch?
>
> Thanks in advance,
>

GENERIC contains options DDB_CTF, which results in opt_ddb.h being created.

/sys/conf/kern.pre.mk:DDB_ENABLED!= grep DDB opt_ddb.h || true ; echo
will result in DDB_ENABLED being true, since #define DDB_CTF 1 will be
present in opt_ddb.h.

So adding nooptions DDB and nooptions DDB_CTF to your NODEVICE-GENERIC
might solve your problem.
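
To illustrate why DDB_CTF alone is enough to trip the check: the test is a
plain substring match on "DDB". What follows is only a minimal sketch, assuming
a scratch directory and a hand-written opt_ddb.h standing in for the generated
header; the variable names ddb_enabled and ddb_disabled are illustrative and
not part of the build system.

```shell
#!/bin/sh
# Stand-in for the generated header: only DDB_CTF is defined, DDB itself is not.
tmpdir=$(mktemp -d)
printf '#define DDB_CTF 1\n' > "$tmpdir/opt_ddb.h"

# Same kind of test as the kern.pre.mk line quoted above: substring grep for "DDB".
ddb_enabled=$(grep DDB "$tmpdir/opt_ddb.h" || true)

if [ -n "$ddb_enabled" ]; then
    echo "DDB treated as enabled because of: $ddb_enabled"
else
    echo "DDB treated as disabled"
fi

# Assumption: with both options removed the header carries no DDB define,
# modelled here as an empty file. The match is then empty.
: > "$tmpdir/opt_ddb.h"
ddb_disabled=$(grep DDB "$tmpdir/opt_ddb.h" || true)

rm -rf "$tmpdir"
```

Whether the real opt_ddb.h ends up empty after the two nooptions lines is
something to verify in the kernel build directory.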

--
Gary Jennejohn



CURRENT: exec_machdep.c:80:2: error: KDB must be enabled in order for DDB

2024-05-29 Thread FreeBSD User
Hello,

for customising my world and kernel, I try to "overlay" GENERIC via included 
files containing
"nodevice" and "nooptions" tags starting from a top level config file like

include GENERIC
include NODEVICE-GENERIC
include SPECIAL

Within "NODEVICE-GENERIC" I utilize
[...]
# Debugging support.  Always need this:
nooptions   KDB # Enable kernel debugger support.
nooptions   KDB_TRACE   # Print a stack trace for a panic.
# For full debugger support use (turn off in stable branch):
include "std.nodebug"
[...]

to disable KDB. The include "std.debug" in GENERIC is new; prior to its
occurrence the sketched scheme worked fine for me, but now I get this error
while performing "make -jX buildworld buildkernel":

[...]
/usr/src/sys/amd64/amd64/exec_machdep.c:80:2: error: KDB must be enabled in order for DDB to work!
   80 | #error KDB must be enabled in order for DDB to work!
      |  ^
[...]

Apart from the recommendation not to disable KDB in CURRENT, is there a way to
disable debugging features and mimic a stable branch?

Thanks in advance,

oh



-- 
O. Hartmann



Re: May 2024 stabilization week

2024-05-29 Thread Gleb Smirnoff
On Mon, May 27, 2024 at 09:24:14PM -0700, Gleb Smirnoff wrote:
T> On Mon, May 27, 2024 at 01:00:24AM -0700, Gleb Smirnoff wrote:
T> T> This is an automated email to inform you that the May 2024 stabilization week
T> T> started with FreeBSD/main at main-n270422-cca0ce62f367, which was tagged as
T> T> main-stabweek-2024-May.
T> 
T> Monday night status update:
T> 
T> - Updated my personal desktop and home router, no issues noticed.
T> - Testing at Netflix is delayed due to several issues: the test cluster
T>   busy with other stuff, some small difficulties with merging, etc.
T>   Usually we run the test Monday night to Tuesday, but this time we
T>   plan to do it Tuesday to Wednesday.

This time at Netflix we had limited testing capacity.  The test was run
on 3 pairs of machines (normally we have > 20).  Anyway, neither stability
issues nor performance regressions were found by our testing.

Since I didn't receive any negative feedback on the stabilization snapshot,
the stabilization week is declared completed.

I created a branch stabweek-2024-May that has two bugfixes cherry-picked:

4c053c17f2c8a715988f215d16284879857ca376 (affects 32-bit ZFS users)
2780e5f43d5b0e8b155472300ee63816a660780e (affects users of linuxulator)

The branch is published at https://github.com/glebius/FreeBSD.

For those who want to recreate the branch without using my repo:

git checkout -b stabweek-2024-May cca0ce62f367d03ed429bf99e41e6aca8cb7f2ac
git cherry-pick -x 4c053c17f2c8a715988f215d16284879857ca376
git cherry-pick -x 2780e5f43d5b0e8b155472300ee63816a660780e

-- 
Gleb Smirnoff



Re: May 2024 stabilization week

2024-05-28 Thread void

Hi Gleb,

On Mon, May 27, 2024 at 09:24:14PM -0700, Gleb Smirnoff wrote:

> Replying to this email thread with your success reports as well
> as reporting any regressions is very much appreciated. Thanks!


Works fine, no issues on arm64.aarch64 where it's running
nginx, monit and a poudriere instance.

--



Re: [Bug 269133] bnxt(4): BCM57416 - HWRM_CFA_L2_SET_RX_MASK command returned RESOURCE_ALLOC_ERROR error

2024-05-28 Thread Gerrit Kühn
On Tue, 28 May 2024 11:25:09 +0200
Santiago Martinez wrote:

> *"The latest I have is 214.0.286.18"*
> Indeed, the firmware on my box is older, I cannot upgrade it right now, 
> but it is on my to-do list.

Same here, I guess (pkgver). It says
dev.bnxt.0.ver.fw_ver: 214.4.9.10/pkg 214.0.286.18
on both systems.
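
For comparing several boxes, the two version fields can be split out of that
sysctl line. This is only a small sketch, assuming the output format shown
above; the sample string is copied from this mail, and the variable names are
illustrative.

```shell
#!/bin/sh
# Sample line copied from the sysctl output above; on a live system it
# would come from: sysctl dev.bnxt.0.ver.fw_ver
line='dev.bnxt.0.ver.fw_ver: 214.4.9.10/pkg 214.0.286.18'

# Drop the "name: " prefix, then split the value at "/pkg ".
value=${line#*: }
fw_ver=${value%%/pkg *}
pkg_ver=${value##*/pkg }

echo "firmware: $fw_ver"
echo "package:  $pkg_ver"
```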

Also, I don't think I can upgrade the firmware separately, it comes with
the mainboard's bios (which is the latest available).


cu
  Gerrit




Re: [Bug 269133] bnxt(4): BCM57416 - HWRM_CFA_L2_SET_RX_MASK command returned RESOURCE_ALLOC_ERROR error

2024-05-28 Thread Santiago Martinez

Hi!

*"The latest I have is 214.0.286.18"*
Indeed, the firmware on my box is older, I cannot upgrade it right now, 
but it is on my to-do list.


I'm also trying to apply the patch recommended by @Warner.

I will keep you posted.

Santi


On 5/28/24 11:19, Gerrit Kühn wrote:

> On Tue, 28 May 2024 10:59:00 +0200
> Santiago Martinez wrote:
>
> > Not sure if it will break your setup, but this already happened with
> > 13.2 (I can't recall the exact release).
>
> I have two machines with onboard NICs (Supermicro H12SSL-CT mainboards)
> running just fine. One is 13.3, the other is 14.0.
>
> > Drivers used to be ok, before 13.X and then I started to see many errors.
>
> No errors at all on my side here. Do you have onboard NICs or PCIe cards?
> From the bug report linked in earlier mails I can also see that the
> firmware I have here appears to be much older than what other people use.
> The latest I have is 214.0.286.18.
>
> > Is it possible for you to test on that machine and see what happens, or
> > is it prod?
>
> Well, the 13.3 is production, the 14.0 is configured and loaded with
> data, should actually go into production this week (that's why I was
> asking... :-). However, I do have a third system of the same hardware that
> is unused right now. I could do tests there (given I find some time).
>
>
> cu
>    Gerrit

Re: [Bug 269133] bnxt(4): BCM57416 - HWRM_CFA_L2_SET_RX_MASK command returned RESOURCE_ALLOC_ERROR error

2024-05-28 Thread Gerrit Kühn
On Tue, 28 May 2024 10:59:00 +0200
Santiago Martinez wrote:

> Not sure if it will break your setup, but this already happened with
> 13.2 (I can't recall the exact release).

I have two machines with onboard NICs (Supermicro H12SSL-CT mainboards)
running just fine. One is 13.3, the other is 14.0.

> Drivers used to be ok, before 13.X and then I started to see many errors.

No errors at all on my side here. Do you have onboard NICs or PCIe cards?
From the bug report linked in earlier mails I can also see that the
firmware I have here appears to be much older than what other people use.
The latest I have is 214.0.286.18.

> Is it possible for you to test on that machine and see what happens, or
> is it prod?

Well, the 13.3 is production, the 14.0 is configured and loaded with
data, should actually go into production this week (that's why I was
asking... :-). However, I do have a third system of the same hardware that
is unused right now. I could do tests there (given I find some time).


cu
  Gerrit




Re: [Bug 269133] bnxt(4): BCM57416 - HWRM_CFA_L2_SET_RX_MASK command returned RESOURCE_ALLOC_ERROR error

2024-05-28 Thread Santiago Martinez

Hi Gerrit,

Not sure if it will break your setup, but this already happened with
13.2 (I can't recall the exact release).


Drivers used to be ok, before 13.X and then I started to see many errors.

That's why I was suggesting to have a pkg with if_bnxt to get releases 
as required without needing a P1/2/++ release.


Is it possible for you to test on that machine and see what happens, or
is it prod?


Santi

On 5/28/24 08:40, Gerrit Kühn wrote:

> On Mon, 27 May 2024 15:05:31 -0600
> Warner Losh wrote:
>
> > I'd like it there, but I think this will need to be an EN to get it into
> > 14.1 given the late date of this commit. Unless we slip 14.1 for
> > other reasons...
>
> I have systems running 14.0 that use onboard bnxt chipsets, seen no issues
> so far. Does this mean I'll have to stick with 14.0 as 14.1 will probably
> break the interfaces?
>
>
> cu
>    Gerrit

Re: May 2024 stabilization week

2024-05-28 Thread Alexander Leidinger

On 2024-05-28 06:24, Gleb Smirnoff wrote:

> On Mon, May 27, 2024 at 01:00:24AM -0700, Gleb Smirnoff wrote:
> T> This is an automated email to inform you that the May 2024 stabilization week
> T> started with FreeBSD/main at main-n270422-cca0ce62f367, which was tagged as
> T> main-stabweek-2024-May.
>
> Monday night status update:
>
> - Updated my personal desktop and home router, no issues noticed.
> - Testing at Netflix is delayed due to several issues: the test cluster
>   busy with other stuff, some small difficulties with merging, etc.
>   Usually we run the test Monday night to Tuesday, but this time we
>   plan to do it Tuesday to Wednesday.
>
> Regressions I am aware of and tracking:
>
> - Linuxulator too strict on Netlink (PR 279012)
>
> Replying to this email thread with your success reports as well
> as reporting any regressions is very much appreciated. Thanks!


Intel 32-bit users who use ZFS may want to have

https://cgit.FreeBSD.org/src/commit/?id=4c053c17f2c8a715988f215d16284879857ca376


Apart from that, main is much more stable on my 30-jails + poudriere host than
the src from the middle of the month.


Bye,
Alexander.

--
http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.orgnetch...@freebsd.org  : PGP 0x8F31830F9F2772BF




Re: [Bug 269133] bnxt(4): BCM57416 - HWRM_CFA_L2_SET_RX_MASK command returned RESOURCE_ALLOC_ERROR error

2024-05-28 Thread Gerrit Kühn
On Mon, 27 May 2024 15:05:31 -0600
Warner Losh wrote:

> I'd like it there, but I think this will need to be an EN to get it into
> 14.1 given the late date of this commit. Unless we slip 14.1 for
> other reasons...

I have systems running 14.0 that use onboard bnxt chipsets, seen no issues
so far. Does this mean I'll have to stick with 14.0 as 14.1 will probably
break the interfaces?


cu
  Gerrit




Re: May 2024 stabilization week

2024-05-27 Thread Gleb Smirnoff
On Mon, May 27, 2024 at 01:00:24AM -0700, Gleb Smirnoff wrote:
T> This is an automated email to inform you that the May 2024 stabilization week
T> started with FreeBSD/main at main-n270422-cca0ce62f367, which was tagged as
T> main-stabweek-2024-May.

Monday night status update:

- Updated my personal desktop and home router, no issues noticed.
- Testing at Netflix is delayed due to several issues: the test cluster
  busy with other stuff, some small difficulties with merging, etc.
  Usually we run the test Monday night to Tuesday, but this time we
  plan to do it Tuesday to Wednesday.

Regressions I am aware of and tracking:

- Linuxulator too strict on Netlink (PR 279012)

Replying to this email thread with your success reports as well
as reporting any regressions is very much appreciated. Thanks!

-- 
Gleb Smirnoff



Re: [Bug 269133] bnxt(4): BCM57416 - HWRM_CFA_L2_SET_RX_MASK command returned RESOURCE_ALLOC_ERROR error

2024-05-27 Thread Warner Losh
On Mon, May 27, 2024 at 3:01 PM Colin Percival  wrote:

> On 5/27/24 13:51, Warner Losh wrote:
> > On Mon, May 27, 2024 at 10:16 AM Santiago Martinez wrote:
> > > Just wondering if anyone has any contact at broadcom.
> > >
> > > The bnxt drivers on 14.1BETA1 are unusable.
> > >
> > > Cards stop working randomly, LRO cannot be disabled (fail FILTER_ALLOT),
> > > even changing mtu renders the card unusable.
> > >
> > > The card is the same one that was used to open the original PR.
> >
> > There's a series of reviews that I've reviewed, but haven't yet been
> > committed.
> >
> > I think they start at https://reviews.freebsd.org/D45005.
> >
> > I'll see if I can prevail upon them to commit them to -current soon.
>
> Just to be clear, you're not expecting this to get into 14.1-RELEASE,
> right?
>

I'd like it there, but I think this will need to be an EN to get it into
14.1 given the late date of this commit. Unless we slip 14.1 for other
reasons...

Warner


Re: [Bug 269133] bnxt(4): BCM57416 - HWRM_CFA_L2_SET_RX_MASK command returned RESOURCE_ALLOC_ERROR error

2024-05-27 Thread Warner Losh
On Mon, May 27, 2024 at 10:16 AM Santiago Martinez 
wrote:

> Hi Everyone,
>
> Just wondering if anyone has any contact at broadcom.
>
> The bnxt drivers on 14.1BETA1 are unusable.
>
> Cards stop working randomly, LRO cannot be disabled (fail FILTER_ALLOT),
> even changing mtu renders the card unusable.
>
> The card is the same one that was used to open the original PR.
>

There's a series of changes that I've reviewed but that haven't yet been
committed.

I think they start at https://reviews.freebsd.org/D45005.

I'll see if I can prevail upon them to commit them to -current soon.

Warner


> Best regards.
>
> Santiago
>
>
> On 5/4/23 14:20, bugzilla-nore...@freebsd.org wrote:
> > https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=269133
> >
> > --- Comment #8 from geoffroy desvernay  ---
> > Since upgrade from 12.3p1x to 13.2-RELEASE, we have the same error message here
> > with bnxt (not tested with 13.1):
> >
> > dmesg:
> > bnxt0:  mem
> > 0xb9a1-0xb9a1,0xb910-0xb91f,0xb9aa2000-0xb9aa3fff irq 48 at
> > device 0.0 numa-domain 0 on pci9
> > bnxt0: Using 256 TX descriptors and 256 RX descriptors
> > bnxt0: Using 12 RX queues 12 TX queues
> > bnxt0: Using MSI-X interrupts with 13 vectors
> > bnxt0: Ethernet address: d0:94:66:81:60:e3
> > bnxt0: netmap queues/slots: TX 12/256, RX 12/256
> > bnxt1:  mem
> > 0xb9a0-0xb9a0,0xb880-0xb88f,0xb9aa-0xb9aa1fff irq 52 at
> > device 0.1 numa-domain 0 on pci9
> > bnxt1: Using 256 TX descriptors and 256 RX descriptors
> > bnxt1: Using 12 RX queues 12 TX queues
> > bnxt1: Using MSI-X interrupts with 13 vectors
> > bnxt1: Ethernet address: d0:94:66:81:60:e4
> > bnxt1: netmap queues/slots: TX 12/256, RX 12/256
> > bnxt0: Link is UP full duplex, FC - none - 1 Mbps
> > bnxt0: link state changed to UP
> > bnxt1: Link is UP full duplex, FC - none - 1 Mbps
> > bnxt1: link state changed to UP
> > bnxt0: Attempt to re-allocate l2 ctx filter (fid: 0x1170204)
> > bnxt1: Attempt to re-allocate l2 ctx filter (fid: 0x11c0003f004)
> > bnxt0: Attempt to re-allocate l2 ctx filter (fid: 0x1250204)
> > bnxt1: Attempt to re-allocate l2 ctx filter (fid: 0x1280003f004)
> > bnxt0: HWRM_CFA_L2_SET_RX_MASK command returned RESOURCE_ALLOC_ERROR error.
> > bnxt0: set_multi: rx_mask set failed
> > bnxt0: HWRM_CFA_L2_SET_RX_MASK command returned RESOURCE_ALLOC_ERROR error.
> > bnxt0: set_multi: rx_mask set failed
> > [same messages x 100's]
> >
> >
> > sysctl:
> >
> > dev.bnxt.0.%domain: 0
> > dev.bnxt.0.%parent: pci9
> > dev.bnxt.0.%pnpinfo: vendor=0x14e4 device=0x16d8 subvendor=0x1028
> > subdevice=0x1feb class=0x02
> > dev.bnxt.0.%location: slot=0 function=0 dbsf=pci0:94:0:0
> > dev.bnxt.0.%driver: bnxt
> > dev.bnxt.0.%desc: Broadcom BCM57416 NetXtreme-E 10GBase-T Ethernet
> > dev.bnxt.0.ver.hwrm_min_ver: 1.10.2
> > dev.bnxt.0.ver.package_ver: 
> > dev.bnxt.0.ver.chip_type: ASIC
> > dev.bnxt.0.ver.chip_bond_id: 0
> > dev.bnxt.0.ver.chip_metal: 1
> > dev.bnxt.0.ver.chip_rev: 1
> > dev.bnxt.0.ver.chip_num: 5848
> > dev.bnxt.0.ver.phy_partnumber: 616740003
> > dev.bnxt.0.ver.phy_vendor: Amphenol
> > dev.bnxt.0.ver.roce_fw_name: BONO_FW
> > dev.bnxt.0.ver.netctrl_fw_name: KONG_FW
> > dev.bnxt.0.ver.mgmt_fw_name: AFW_223.0.205.0
> > dev.bnxt.0.ver.hwrm_fw_name: CHIMP_FW
> > dev.bnxt.0.ver.phy: 13.1.11
> > dev.bnxt.0.ver.fw_ver: 223.0.205.0/pkg 22.31.13.70
> > dev.bnxt.0.ver.roce_fw: 223.0.205
> > dev.bnxt.0.ver.netctrl_fw: 223.0.205
> > dev.bnxt.0.ver.mgmt_fw: 223.0.205
> > dev.bnxt.0.ver.hwrm_fw: 223.0.205
> > dev.bnxt.0.ver.driver_hwrm_if: 1.10.2.34
> > dev.bnxt.0.ver.hwrm_if: 1.10.2
> >
>
>


Re: [Bug 269133] bnxt(4): BCM57416 - HWRM_CFA_L2_SET_RX_MASK command returned RESOURCE_ALLOC_ERROR error

2024-05-27 Thread Santiago Martinez

Hi Everyone,

Just wondering if anyone has any contact at Broadcom.

The bnxt drivers on 14.1BETA1 are unusable.

Cards stop working randomly, LRO cannot be disabled (fail FILTER_ALLOT),
even changing the MTU renders the card unusable.


The card is the same one that was used to open the original PR.

Best regards.

Santiago


On 5/4/23 14:20, bugzilla-nore...@freebsd.org wrote:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=269133

--- Comment #8 from geoffroy desvernay  ---
Since upgrade from 12.3p1x to 13.2-RELEASE, we have the same error message here
with bnxt (not tested with 13.1):

dmesg:
bnxt0:  mem
0xb9a1-0xb9a1,0xb910-0xb91f,0xb9aa2000-0xb9aa3fff irq 48 at
device 0.0 numa-domain 0 on pci9
bnxt0: Using 256 TX descriptors and 256 RX descriptors
bnxt0: Using 12 RX queues 12 TX queues
bnxt0: Using MSI-X interrupts with 13 vectors
bnxt0: Ethernet address: d0:94:66:81:60:e3
bnxt0: netmap queues/slots: TX 12/256, RX 12/256
bnxt1:  mem
0xb9a0-0xb9a0,0xb880-0xb88f,0xb9aa-0xb9aa1fff irq 52 at
device 0.1 numa-domain 0 on pci9
bnxt1: Using 256 TX descriptors and 256 RX descriptors
bnxt1: Using 12 RX queues 12 TX queues
bnxt1: Using MSI-X interrupts with 13 vectors
bnxt1: Ethernet address: d0:94:66:81:60:e4
bnxt1: netmap queues/slots: TX 12/256, RX 12/256
bnxt0: Link is UP full duplex, FC - none - 1 Mbps
bnxt0: link state changed to UP
bnxt1: Link is UP full duplex, FC - none - 1 Mbps
bnxt1: link state changed to UP
bnxt0: Attempt to re-allocate l2 ctx filter (fid: 0x1170204)
bnxt1: Attempt to re-allocate l2 ctx filter (fid: 0x11c0003f004)
bnxt0: Attempt to re-allocate l2 ctx filter (fid: 0x1250204)
bnxt1: Attempt to re-allocate l2 ctx filter (fid: 0x1280003f004)
bnxt0: HWRM_CFA_L2_SET_RX_MASK command returned RESOURCE_ALLOC_ERROR error.
bnxt0: set_multi: rx_mask set failed
bnxt0: HWRM_CFA_L2_SET_RX_MASK command returned RESOURCE_ALLOC_ERROR error.
bnxt0: set_multi: rx_mask set failed
[same messages x 100's]


sysctl:

dev.bnxt.0.%domain: 0
dev.bnxt.0.%parent: pci9
dev.bnxt.0.%pnpinfo: vendor=0x14e4 device=0x16d8 subvendor=0x1028
subdevice=0x1feb class=0x02
dev.bnxt.0.%location: slot=0 function=0 dbsf=pci0:94:0:0
dev.bnxt.0.%driver: bnxt
dev.bnxt.0.%desc: Broadcom BCM57416 NetXtreme-E 10GBase-T Ethernet
dev.bnxt.0.ver.hwrm_min_ver: 1.10.2
dev.bnxt.0.ver.package_ver: 
dev.bnxt.0.ver.chip_type: ASIC
dev.bnxt.0.ver.chip_bond_id: 0
dev.bnxt.0.ver.chip_metal: 1
dev.bnxt.0.ver.chip_rev: 1
dev.bnxt.0.ver.chip_num: 5848
dev.bnxt.0.ver.phy_partnumber: 616740003
dev.bnxt.0.ver.phy_vendor: Amphenol
dev.bnxt.0.ver.roce_fw_name: BONO_FW
dev.bnxt.0.ver.netctrl_fw_name: KONG_FW
dev.bnxt.0.ver.mgmt_fw_name: AFW_223.0.205.0
dev.bnxt.0.ver.hwrm_fw_name: CHIMP_FW
dev.bnxt.0.ver.phy: 13.1.11
dev.bnxt.0.ver.fw_ver: 223.0.205.0/pkg 22.31.13.70
dev.bnxt.0.ver.roce_fw: 223.0.205
dev.bnxt.0.ver.netctrl_fw: 223.0.205
dev.bnxt.0.ver.mgmt_fw: 223.0.205
dev.bnxt.0.ver.hwrm_fw: 223.0.205
dev.bnxt.0.ver.driver_hwrm_if: 1.10.2.34
dev.bnxt.0.ver.hwrm_if: 1.10.2





[HEADSUP] broken kernels for head this week for pkgbase

2024-05-27 Thread Baptiste Daroussin
Hello,

For people running current: this week the kernel was broken from
da76d349b6b104f4e70562304c800a0793dea18d to
73eb53813fe3a2245edbeb670902e4bb9d41e288.

The kernel built and published as part of this weekly snapshot is impacted.
I have launched the publication of a new snapshot, which should replace the
current weekly in a couple of hours.

If you have installed 15.snap20240525190352, make sure to upgrade again before
rebooting.

Best regards,
Bapt



May 2024 stabilization week

2024-05-27 Thread Gleb Smirnoff
  Hi FreeBSD/main users & developers:

This is an automated email to inform you that the May 2024 stabilization week
started with FreeBSD/main at main-n270422-cca0ce62f367, which was tagged as
main-stabweek-2024-May.

The tag main-stabweek-2024-May has been published at
https://github.com/glebius/FreeBSD/tags.  Those who want to participate
in the stabilization week are encouraged to update to the above
revision/tag and test their systems.

Developers are encouraged to avoid pushing new features to FreeBSD/main,
but focus on bugfixes instead.  The stabilization week runs up to
Friday 18:00 UTC, but if there is consensus that any regressions
discovered by participants have been fixed, it will end early.

Once that happens, the advisory freeze of FreeBSD/main branch is thawed.

--
Gleb Smirnoff



Re: CURRENT kernel crash beyond git: 02d15215cef2

2024-05-26 Thread Graham Perrin

On 26/05/2024 13:45, Herbert J. Skuhra wrote:

… No idea why the fix hasn't been committed yet: …


A few hours earlier:

uma: Fix improper uses of UMA_MD_SMALL_ALLOC · freebsd/freebsd-src@d25ed65


HTH




Re: main cadd2ca217 doesn't boot

2024-05-26 Thread FreeBSD User
On Sun, 26 May 2024 09:29:08 +0200
Bojan Novković wrote:

> Hi,
> 
> da76d349b6b1 replaced a UMA-related symbol but missed three instances 
> where the old one was used, ultimately causing the wrong UMA page 
> allocator to get selected and crashing the machine.
> 
> I tested this patch as a part of a bigger series where it works fine, so 
> this slipped through cracks without getting noticed.
> 
> I've attached a patch with a fix, I can boot an amd64 VM with it applied.
> Could you please give it a try and let me know if it fixes the issue?
> 
> Bojan

The patch fixes the problem on amd64 here ...

-- 
O. Hartmann



Re: main cadd2ca217 doesn't boot

2024-05-26 Thread David Wolfskill
On Sun, May 26, 2024 at 09:29:08AM +0200, Bojan Novković wrote:
> Hi,
> 
> da76d349b6b1 replaced a UMA-related symbol but missed three instances where
> the old one was used, ultimately causing the wrong UMA page allocator to get
> selected and crashing the machine.
> 
> I tested this patch as a part of a bigger series where it works fine, so
> this slipped through cracks without getting noticed.
> 
> I've attached a patch with a fix, I can boot an amd64 VM with it applied.
> Could you please give it a try and let me know if it fixes the issue?

TL;DR: Yes, it fixes it (for 2 of my laptops, at least).

Details:
Laptops were running (e.g.):

FreeBSD 15.0-CURRENT #155 main-n270400-02d15215cef2: Sat May 25 14:19:27 UTC 
2024 
r...@g1-70.catwhisker.org:/common/S4/obj/usr/src/amd64.amd64/sys/CANARY amd64 
1500018 1500018

After updating sources to main-n270407-73eb53813fe3 and doing a normal
in-place source-based update (which worked, as such), I (attempted)
rebooting, which exhibited the previously-documented failures.

I rebooted using the kernel from main-n270400-02d15215cef2, applied the
patch, rebuilt the kernel, and ... the reboot this time was successful:

FreeBSD 15.0-CURRENT #157 main-n270407-73eb53813fe3-dirty: Sun May 26 13:02:07 
UTC 2024 
r...@g1-51.catwhisker.org:/common/S4/obj/usr/src/amd64.amd64/sys/CANARY amd64 
1500018 1500018

(My 3rd "development" machine -- the fastest one -- is still bogged down
with Yet Another Chromium Rebuild on behalf of production machines that
are due to be updated once that completes.)

Thanks!

Peace,
david
-- 
David H. Wolfskill  da...@catwhisker.org
I will not be voting for a "unified reich" in the US.

See https://www.catwhisker.org/~david/publickey.gpg for my public key.




Re: CURRENT kernel crash beyond git: 02d15215cef2

2024-05-26 Thread FreeBSD User
On Sun, 26 May 2024 14:45:37 +0200
"Herbert J. Skuhra" wrote:

> On Sun, May 26, 2024 at 02:35:16PM +0200, FreeBSD User wrote:
> > Hello,
> > 
> > boxes running CURRENT are last good with FreeBSD 15.0-CURRENT #44
> > main-n270400-02d15215cef2: Sat May 25 10:56:09 CEST 2024 amd64. Customized kernel.
> >
> > After that commit, booting the kernel dies silently without any trace/core
> > or similar and resetting the system.
> >
> > I tried to enable at least the standard debugging features, but that
> > doesn't improve the fact the machine resets/dies silently.
> 
> Check the archive! Known issue:
> 
> https://lists.freebsd.org/archives/freebsd-current/2024-May/005990.html
> 
> No idea why the fix hasn't been committed yet:
> 
> https://lists.freebsd.org/archives/freebsd-current/2024-May/005993.html
> 

My apologies.
Nuno Teixeira's posting was right below mine :-(

Sorry for the noise,

Thanks and kind regards
oh

-- 
O. Hartmann



Re: CURRENT kernel crash beyond git: 02d15215cef2

2024-05-26 Thread Herbert J. Skuhra
On Sun, May 26, 2024 at 02:35:16PM +0200, FreeBSD User wrote:
> Hello,
> 
> boxes running CURRENT are last good with FreeBSD 15.0-CURRENT #44
> main-n270400-02d15215cef2: Sat May 25 10:56:09 CEST 2024 amd64. Customized kernel.
>
> After that commit, booting the kernel dies silently without any trace/core or
> similar and resetting the system.
>
> I tried to enable at least the standard debugging features, but that doesn't
> improve the fact the machine resets/dies silently.

Check the archive! Known issue:

https://lists.freebsd.org/archives/freebsd-current/2024-May/005990.html

No idea why the fix hasn't been committed yet:

https://lists.freebsd.org/archives/freebsd-current/2024-May/005993.html

-- 
Herbert 



Re: main cadd2ca217 doesn't boot

2024-05-26 Thread Oleg Nauman
Hello,

I can confirm that your patch fixes this issue (amd64 CURRENT cadd2ca21765).

Thank you

On Sun, May 26, 2024 at 10:29 AM Bojan Novković  wrote:
>
> Hi,
>
> da76d349b6b1 replaced a UMA-related symbol but missed three instances
> where the old one was used, ultimately causing the wrong UMA page
> allocator to get selected and crashing the machine.
>
> I tested this patch as a part of a bigger series where it works fine, so
> this slipped through cracks without getting noticed.
>
> I've attached a patch with a fix, I can boot an amd64 VM with it applied.
> Could you please give it a try and let me know if it fixes the issue?
>
> Bojan



Re: main cadd2ca217 doesn't boot

2024-05-26 Thread tuexen
> On 26. May 2024, at 09:29, Bojan Novković  wrote:
> 
> Hi,
> 
> da76d349b6b1 replaced a UMA-related symbol but missed three instances where 
> the old one was used, ultimately causing the wrong UMA page allocator to get 
> selected and crashing the machine.
> 
> I tested this patch as a part of a bigger series where it works fine, so this 
> slipped through cracks without getting noticed.
> 
> I've attached a patch with a fix, I can boot an amd64 VM with it applied.
> Could you please give it a try and let me know if it fixes the issue?
Hi Bojan,

this fixes the issue for me using an arm64 VM (VMware Fusion).

Best regards
Michael
> 
> Bojan
> 




Re: main cadd2ca217 doesn't boot

2024-05-26 Thread Bojan Novković

Hi,

da76d349b6b1 replaced a UMA-related symbol but missed three instances 
where the old one was used, ultimately causing the wrong UMA page 
allocator to get selected and crashing the machine.


I tested this patch as a part of a bigger series where it works fine, so
this slipped through the cracks without getting noticed.


I've attached a patch with a fix, I can boot an amd64 VM with it applied.
Could you please give it a try and let me know if it fixes the issue?

Bojan
diff --git a/sys/vm/uma_core.c b/sys/vm/uma_core.c
index 59066eb96ae9..516ac2c2965a 100644
--- a/sys/vm/uma_core.c
+++ b/sys/vm/uma_core.c
@@ -2523,7 +2523,7 @@ keg_ctor(void *mem, int size, void *udata, int flags)
 	 * If we haven't booted yet we need allocations to go through the
 	 * startup cache until the vm is ready.
 	 */
-#ifdef UMA_MD_SMALL_ALLOC
+#ifdef UMA_USE_DMAP
 	if (keg->uk_ppera == 1)
 		keg->uk_allocf = uma_small_alloc;
 	else
@@ -2536,7 +2536,7 @@ keg_ctor(void *mem, int size, void *udata, int flags)
 		keg->uk_allocf = contig_alloc;
 	else
 		keg->uk_allocf = page_alloc;
-#ifdef UMA_MD_SMALL_ALLOC
+#ifdef UMA_USE_DMAP
 	if (keg->uk_ppera == 1)
 		keg->uk_freef = uma_small_free;
 	else
@@ -5221,7 +5221,7 @@ uma_zone_reserve_kva(uma_zone_t zone, int count)
 	keg->uk_kva = kva;
 	keg->uk_offset = 0;
 	zone->uz_max_items = pages * keg->uk_ipers;
-#ifdef UMA_MD_SMALL_ALLOC
+#ifdef UMA_USE_DMAP
 	keg->uk_allocf = (keg->uk_ppera > 1) ? noobj_alloc : uma_small_alloc;
 #else
 	keg->uk_allocf = noobj_alloc;


Re: main cadd2ca217 doesn't boot

2024-05-25 Thread Ryan Libby
On Sat, May 25, 2024 at 5:47 PM Tomoaki AOKI  wrote:
>
> On Sun, 26 May 2024 00:21:31 +0100
> Nuno Teixeira  wrote:
>
> > Hello,
> >
> > Just upgraded to latest main at cadd2ca217
> >
> > Boot menu shows up and then it stops earlier around:
> > ..
> > FreeBSD clang version ...
> >
> > No crash dump.
> >
> > Thanks,
> >
> > --
> > Nuno Teixeira
> > FreeBSD UNIX: Web:  https://FreeBSD.org
>
> Just a FYI:
> commit 40d951bc5932deb87635f5c1780a6706d0c7c012, amd64 boots fine for
> me. So commits after 02d15215cef2a28f1865e6ad5b19f18af1398b8b caused
> the problem, maybe.
>
> Regards.
>
> --
> Tomoaki AOKI
>

I'm on amd64 running GENERIC.

This boots:
9b1de7e4844d vt/sc: retire logic to select vt(4) by default for UEFI boot

This doesn't:
da76d349b6b1 uma: Deduplicate uma_small_alloc

On one failed boot I saw a failed MPASS
panic: Assertion size != 0 && qsize != 0 failed at
/usr/src/freebsd/sys/kern/subr_vmem.c:427

Followed by about 1000 stack frames with 7 frames repeating, presumably
deep/infinite recursion.

Probably want to back this out until stabilized.

Ryan



Re: main cadd2ca217 doesn't boot

2024-05-25 Thread Tomoaki AOKI
On Sun, 26 May 2024 00:21:31 +0100
Nuno Teixeira  wrote:

> Hello,
> 
> Just upgraded to latest main at cadd2ca217
> 
> Boot menu shows up and then it stops earlier around:
> ..
> FreeBSD clang version ...
> 
> No crash dump.
> 
> Thanks,
> 
> -- 
> Nuno Teixeira
> FreeBSD UNIX: Web:  https://FreeBSD.org

Just an FYI:
with commit 40d951bc5932deb87635f5c1780a6706d0c7c012, amd64 boots fine for
me. So maybe commits after 02d15215cef2a28f1865e6ad5b19f18af1398b8b caused
the problem.

Regards.

-- 
Tomoaki AOKI



main cadd2ca217 doesn't boot

2024-05-25 Thread Nuno Teixeira
Hello,

Just upgraded to latest main at cadd2ca217

Boot menu shows up and then it stops early around:
..
FreeBSD clang version ...

No crash dump.

Thanks,

-- 
Nuno Teixeira
FreeBSD UNIX: Web:  https://FreeBSD.org


Re: panic: lock "tmpfsni" 0xfffff80721307090 already initialized

2024-05-25 Thread Ryan Libby
On Sat, May 25, 2024 at 1:32 AM Alexander Leidinger
 wrote:
>
> Hi,
>
> [123095] panic: lock "tmpfsni" 0xfffff80721307090 already initialized
> [123095] cpuid = 8
> [123095] time = 1716597585
> [123095] KDB: stack backtrace:
> [123095] db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
> 0xfe08285c9690
> [123095] vpanic() at vpanic+0x13f/frame 0xfe08285c97c0
> [123095] panic() at panic+0x43/frame 0xfe08285c9820
> [123095] lock_init() at lock_init+0x155/frame 0xfe08285c9830
> [123095] _mtx_init() at _mtx_init+0x89/frame 0xfe08285c9850
> [123095] tmpfs_node_init() at tmpfs_node_init+0x28/frame
> 0xfe08285c9870
> [123095] keg_alloc_slab() at keg_alloc_slab+0x28d/frame
> 0xfe08285c98c0
> [123095] zone_import() at zone_import+0xec/frame 0xfe08285c9950
> [123095] cache_alloc() at cache_alloc+0x3b3/frame 0xfe08285c99b0
> [123095] cache_alloc_retry() at cache_alloc_retry+0x23/frame
> 0xfe08285c99f0
> [123095] tmpfs_alloc_node() at tmpfs_alloc_node+0x108/frame
> 0xfe08285c9a40
> [123095] tmpfs_alloc_file() at tmpfs_alloc_file+0xbf/frame
> 0xfe08285c9ad0
> [123095] tmpfs_create() at tmpfs_create+0x38/frame 0xfe08285c9b00
> [123095] VOP_CREATE_APV() at VOP_CREATE_APV+0x3c/frame
> 0xfe08285c9b20
> [123095] vn_open_cred() at vn_open_cred+0x2e2/frame 0xfe08285c9c80
> [123095] openatfp() at openatfp+0x268/frame 0xfe08285c9dc0
> [123095] sys_openat() at sys_openat+0x28/frame 0xfe08285c9de0
> [123095] filemon_wrapper_openat() at filemon_wrapper_openat+0x12/frame
> 0xfe08285c9e00
> [123095] amd64_syscall() at amd64_syscall+0x15b/frame 0xfe08285c9f30
> [123095] fast_syscall_common() at fast_syscall_common+0xf8/frame
> 0xfe08285c9f30
> [123095] --- syscall (499, FreeBSD ELF64, openat), rip = 0xab82ba, rsp =
> 0x8217439e8, rbp = 0x821743a20 ---
> [123095] Uptime: 1d10h11m35s
>
> This is with a world from 2024-05-17-084543.
>
> Full logs available at https://wiki.leidinger.net/core.txt.7 (1.1 MB).
>
> This was in the middle of the night, poudriere was running.
>
> Bye,
> Alexander.
>
> --
> http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF
> http://www.FreeBSD.org    netch...@freebsd.org  : PGP 0x8F31830F9F2772BF

It looks like tmpfs_node_init ought to pass the MTX_NEW flag, but I am
not seeing what may have changed recently that would explain why this
hasn't been hit before or more often on INVARIANTS kernels.

For future debugging, maybe uma should do an initial trashing of memory
even for zones that have an init procedure.

Ryan
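
The MTX_NEW remark above can be illustrated with a small userland model.
This is only a sketch under stated assumptions: toy_lock, toy_lock_init,
toy_fill_stale, TOY_INITIALIZED and TOY_NEW are hypothetical names made up
for this example, not kernel interfaces. It assumes, as the backtrace
suggests, that the init routine runs on freshly allocated slab memory whose
old contents may look like an initialized lock, and that a "new" flag tells
the init code to skip that check.

```c
#include <string.h>

/* Hypothetical flag bits for this model only; not the kernel's values. */
#define TOY_INITIALIZED	0x1u	/* set once a lock has been initialized */
#define TOY_NEW		0x2u	/* caller asserts no live lock is stored here */

struct toy_lock {
	unsigned int flags;
};

/*
 * Returns 0 on success and -1 where a debug kernel would panic with
 * "already initialized".
 */
static int
toy_lock_init(struct toy_lock *lk, unsigned int opts)
{
	if ((opts & TOY_NEW) == 0 && (lk->flags & TOY_INITIALIZED) != 0)
		return (-1);
	lk->flags = TOY_INITIALIZED;
	return (0);
}

/* Simulate memory that still holds bytes from a previous use. */
static void
toy_fill_stale(struct toy_lock *lk)
{
	memset(lk, 0xff, sizeof(*lk));
}
```

In this model, calling toy_lock_init() on stale memory without TOY_NEW
fails, while the same call with TOY_NEW succeeds. Zeroed memory passes
either way, which is one possible reason such a bug can stay hidden for a
long time.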



Re: _mtx_lock_sleep: recursed on non-recursive mutex CAM device lock @ /..../sys/cam/nvme/nvme_da.c:469

2024-05-25 Thread Alexander Leidinger

On 2024-05-22 22:45, Alexander Leidinger wrote:
> On 2024-05-22 20:53, Warner Losh wrote:
>> First order:
>>
>> Looks like we're trying to schedule a trim, but that fails due to a
>> malloc issue. So then, since it's a malloc issue, we wind up trying
>> to automatically reschedule this I/O, which recurses into the driver
>> with a bad lock held and boop.
>>
>> Can you reproduce this?
>
> So far I had it once. At least I have only one crashdump. I had one
> more reboot/crash, but no dump. I also have a watchdog running on this
> system, so not sure what caused the (unusual) reboot. I had a poudriere
> build running at both times. Since the crashdump I didn't run poudriere
> anymore.
>
>> If so, can you test this patch?
>
> I give it a try tomorrow anyway, and I will try to stress the system
> again with poudriere.
>
> The nvme is a cache and also a log device for a zpool, so not really a
> deterministic way to trigger access to it.

I've run a lot of poudriere builds together with other load (about 30
jails with mysql, postgresql, redis, webmail, postfix, imap, java stuff,
..) on this system since Thursday. So far no panic in the nvme part.


Bye,
Alexander.

--
http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org    netch...@freebsd.org  : PGP 0x8F31830F9F2772BF



panic: lock "tmpfsni" 0xfffff80721307090 already initialized

2024-05-25 Thread Alexander Leidinger

Hi,

[123095] panic: lock "tmpfsni" 0xf80721307090 already initialized
[123095] cpuid = 8
[123095] time = 1716597585
[123095] KDB: stack backtrace:
[123095] db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe08285c9690
[123095] vpanic() at vpanic+0x13f/frame 0xfe08285c97c0
[123095] panic() at panic+0x43/frame 0xfe08285c9820
[123095] lock_init() at lock_init+0x155/frame 0xfe08285c9830
[123095] _mtx_init() at _mtx_init+0x89/frame 0xfe08285c9850
[123095] tmpfs_node_init() at tmpfs_node_init+0x28/frame 0xfe08285c9870
[123095] keg_alloc_slab() at keg_alloc_slab+0x28d/frame 0xfe08285c98c0
[123095] zone_import() at zone_import+0xec/frame 0xfe08285c9950
[123095] cache_alloc() at cache_alloc+0x3b3/frame 0xfe08285c99b0
[123095] cache_alloc_retry() at cache_alloc_retry+0x23/frame 0xfe08285c99f0
[123095] tmpfs_alloc_node() at tmpfs_alloc_node+0x108/frame 0xfe08285c9a40
[123095] tmpfs_alloc_file() at tmpfs_alloc_file+0xbf/frame 0xfe08285c9ad0
[123095] tmpfs_create() at tmpfs_create+0x38/frame 0xfe08285c9b00
[123095] VOP_CREATE_APV() at VOP_CREATE_APV+0x3c/frame 0xfe08285c9b20
[123095] vn_open_cred() at vn_open_cred+0x2e2/frame 0xfe08285c9c80
[123095] openatfp() at openatfp+0x268/frame 0xfe08285c9dc0
[123095] sys_openat() at sys_openat+0x28/frame 0xfe08285c9de0
[123095] filemon_wrapper_openat() at filemon_wrapper_openat+0x12/frame 0xfe08285c9e00
[123095] amd64_syscall() at amd64_syscall+0x15b/frame 0xfe08285c9f30
[123095] fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfe08285c9f30
[123095] --- syscall (499, FreeBSD ELF64, openat), rip = 0xab82ba, rsp = 0x8217439e8, rbp = 0x821743a20 ---
[123095] Uptime: 1d10h11m35s

This is with a world from 2024-05-17-084543.

Full logs available at https://wiki.leidinger.net/core.txt.7 (1.1 MB).

This was in the middle of the night, poudriere was running.

Bye,
Alexander.

--
http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org    netch...@freebsd.org  : PGP 0x8F31830F9F2772BF




Re: build of main broken? (ld: error: version script assignment of 'FBSD_1.0' to symbol 'xdr_...' failed: symbol not defined)

2024-05-24 Thread Ed Maste
On Fri, 24 May 2024 at 11:28, Matteo Riondato  wrote:
>
> > In lib/libc/rpc/Symbol.map there is:
> >
> >/* From yp_xdr.c (generated by rpcgen - include/rpcsvc/yp.x) */
> >xdr_domainname;
> >xdr_keydat;
> >
> > so maybe the rpcgen step went wrong somehow? Do you have WITHOUT_NIS 
> > enabled?
>
> Yes, I do have WITHOUT_NIS=y in src.conf

peterj reported this in PR279270 as well and I've opened a review in
https://reviews.freebsd.org/D45347 to move these symbols to
lib/libc/yp/Symbol.map. Can you give that a try?

I originally proposed augmenting Version.map generation to pass CFLAGS
to CPP in D45346 and adding #ifdef YP in D45345, before finding Peter's
PR and discovering that lib/libc/yp/Symbol.map already exists.
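
The failure mode discussed in this thread can be sketched as a small
lookup. This is a toy model, not how ld is implemented:
toy_count_undefined is a hypothetical helper, and the symbol lists a caller
passes in are illustrative. It assumes that a symbol map keeps listing
names whose defining objects are no longer built (here, with WITHOUT_NIS
set), so the linker finds version assignments for symbols that do not
exist.

```c
#include <stddef.h>
#include <string.h>

/*
 * Count names listed in a version script that have no definition among
 * the objects being linked. Both arrays are plain string lists here.
 */
static size_t
toy_count_undefined(const char *const *listed, size_t nlisted,
    const char *const *defined, size_t ndefined)
{
	size_t i, j, missing = 0;

	for (i = 0; i < nlisted; i++) {
		for (j = 0; j < ndefined; j++) {
			if (strcmp(listed[i], defined[j]) == 0)
				break;
		}
		if (j == ndefined)
			missing++;	/* ld would report "symbol not defined" */
	}
	return (missing);
}
```

Moving the names into a map that is only used when the defining code is
built, as proposed in the review, keeps the two lists in step.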



Re: build of main broken? (ld: error: version script assignment of 'FBSD_1.0' to symbol 'xdr_...' failed: symbol not defined)

2024-05-24 Thread Matteo Riondato



> On May 24, 2024, at 10:54 AM, Dimitry Andric  wrote:
> 
> On 24 May 2024, at 15:19, Matteo Riondato  wrote:
>> 
>> I’m trying to build 59aa64914aeb1b20d4fc39ead2ee159a1e5b from 
>> main-62adeb92df, and got the error below.
>> 
>> I cannot immediately trace it back to any recent commit, so I’m a bit 
>> surprised by it.
>> 
>> Any hint?
>> 
>> --
>> >>> stage 4.2: building libraries
>> --
>> cd /usr/src;  time env MACHINE_ARCH=amd64  MACHINE=amd64  
>> CPUTYPE=skylake-avx512 BUILD_TOOLS_META=.NOMETA CC="/usr/local/bin/ccache cc 
>> -target x86_64-unknown-freebsd15.0 
>> --sysroot=/usr/obj/usr/src/amd64.amd64/tmp 
>> -B/usr/obj/usr/src/amd64.amd64/tmp/usr/bin" CXX="/usr/local/bin/ccache c++  
>> -target x86_64-unknown-freebsd15.0 
>> --sysroot=/usr/obj/usr/src/amd64.amd64/tmp 
>> -B/usr/obj/usr/src/amd64.amd64/tmp/usr/bin"  CPP="cpp -target 
>> x86_64-unknown-freebsd15.0 --sysroot=/usr/obj/usr/src/amd64.amd64/tmp 
>> -B/usr/obj/usr/src/amd64.amd64/tmp/usr/bin"  AS="as" AR="ar" ELFCTL="elfctl" 
>> LD="ld"  LLVM_LINK="" NM=nm OBJCOPY="objcopy"  RANLIB=ranlib STRINGS=  
>> SIZE="size" STRIPBIN="strip"  INSTALL="install -U"  
>> PATH=/usr/obj/usr/src/amd64.amd64/tmp/bin:/usr/obj/usr/src/amd64.amd64/tmp/usr/sbin:/usr/obj/usr/src/amd64.amd64/tmp/usr/bin:/usr/obj/usr/src/amd64.amd64/tmp/legacy/usr/sbin:/usr/obj/usr/src/amd64.amd64/tmp/legacy/usr/bin:/usr/obj/usr/src/amd64.amd64/tmp/legacy/bin:/usr/obj/usr/src/amd64.amd64/tmp/legacy/usr/libexec::/sbin:/bin:/usr/sbin:/usr/bin
>>   SYSROOT=/usr/obj/usr/src/amd64.amd64/tmp make  -f Makefile.inc1  
>> BWPHASE=libraries  DESTDIR=/usr/obj/usr/src/amd64.amd64/tmp -DNO_FSCHG 
>> MK_HTML=no -DNO_LINT MK_MAN=no  MK_PROFILE=no MK_TESTS=no 
>> MK_TESTS_SUPPORT=no  libraries
>> cd /usr/src;  make -f Makefile.inc1 _prereq_libs;  make -f Makefile.inc1 
>> _startup_libs;  make -f Makefile.inc1 _prebuild_libs 
>> -DLIBCRYPTO_WITHOUT_SUBDIRS;  make -f Makefile.inc1 _generic_libs
>> Building /usr/obj/usr/src/amd64.amd64/lib/libcompiler_rt/_libinstall
>> Building /usr/obj/usr/src/amd64.amd64/lib/libcompiler_rt/_installlinks
>> Building /usr/obj/usr/src/amd64.amd64/lib/libssp_nonshared/_libinstall
>> Building /usr/obj/usr/src/amd64.amd64/lib/libgcc_eh/_libinstall
>> Building /usr/obj/usr/src/amd64.amd64/lib/libgcc_eh/_INCSINS
>> installing DIRS FILESDIR
>> install -U  -d -m 0755 -o root  -g wheel  
>> /usr/obj/usr/src/amd64.amd64/tmp/usr/lib
>> Building /usr/obj/usr/src/amd64.amd64/lib/csu/amd64/_FILESINS_Scrt1.o
>> Building /usr/obj/usr/src/amd64.amd64/lib/csu/amd64/_FILESINS_crt1.o
>> Building /usr/obj/usr/src/amd64.amd64/lib/csu/amd64/_FILESINS_gcrt1.o
>> Building /usr/obj/usr/src/amd64.amd64/lib/csu/amd64/_FILESINS_crtbegin.o
>> Building /usr/obj/usr/src/amd64.amd64/lib/csu/amd64/_FILESINS_crtbeginS.o
>> Building /usr/obj/usr/src/amd64.amd64/lib/csu/amd64/_FILESINS_crtbeginT.o
>> Building /usr/obj/usr/src/amd64.amd64/lib/csu/amd64/_FILESINS_crtend.o
>> Building /usr/obj/usr/src/amd64.amd64/lib/csu/amd64/_FILESINS_crtendS.o
>> Building /usr/obj/usr/src/amd64.amd64/lib/csu/amd64/_FILESINS_crti.o
>> Building /usr/obj/usr/src/amd64.amd64/lib/csu/amd64/_FILESINS_crtn.o
>> Building /usr/obj/usr/src/amd64.amd64/lib/libsys/_libinstall
>> Building /usr/obj/usr/src/amd64.amd64/lib/libsys/_INCSINS
>> Building /usr/obj/usr/src/amd64.amd64/lib/libc/libc.so.7
>> building shared library libc.so.7
>> ld: error: version script assignment of 'FBSD_1.0' to symbol 
>> 'xdr_domainname' failed: symbol not defined
>> ld: error: version script assignment of 'FBSD_1.0' to symbol 'xdr_keydat' 
>> failed: symbol not defined
> 
> In lib/libc/rpc/Symbol.map there is:
> 
>/* From yp_xdr.c (generated by rpcgen - include/rpcsvc/yp.x) */
>xdr_domainname;
>xdr_keydat;
> 
> so maybe the rpcgen step went wrong somehow? Do you have WITHOUT_NIS enabled?

Yes, I do have WITHOUT_NIS=y in src.conf

Thanks,
Matteo




Re: bsdinstall wifi setup is broken on CURRENT

2024-05-24 Thread Renato Botelho

On 18/05/24 11:33, Alfonso S. Siciliano wrote:
> On 5/16/24 20:40, Renato Botelho wrote:
>> I saw some users on a .br group complaining bsdinstall was failing to
>> set up the wifi network on 15.0 snapshots and tried it myself.  I was
>> able to reproduce the problem and also noticed another one.
>
> Thank you for your report, the video is highly appreciated to understand
> the problem quickly and exactly.
>
>> I noticed the Network Selection screen only shows one line, it's not
>> beautiful to navigate through items this way.  On 14.1-BETA2 it shows
>> multiple lines so it seems to be a regression.
>
> Problem 1. Looking at wlanconfig it seems related to $height $width
> $rows for the selection menu. Please could you open a PR adding me, so
> we can test and solve.
>
>> The problem users reported was: after selecting the desired network it
>> just starts over instead of asking for a password.  I made a video [1]
>> showing the problem.
>
> Problem 2. I know this issue about --mixedform, my last import 2 days
> ago should solve it: a6d8be451f62d425b71a4874f7d4e133b9fb393c.
> You could try the last main snapshot (yesterday 17 May), please let me
> know any problem.

I confirmed it is fixed with bsddialog 1.0.2 but I found another issue
while testing.

Instead of the password, it was adding the SSID to the psk field of
wpa_supplicant.conf.  I've created the following review to address that:

https://reviews.freebsd.org/D45344

Thanks!
--
Renato Botelho



Re: build of main broken? (ld: error: version script assignment of 'FBSD_1.0' to symbol 'xdr_...' failed: symbol not defined)

2024-05-24 Thread Dimitry Andric
On 24 May 2024, at 15:19, Matteo Riondato  wrote:
> 
> I’m trying to build 59aa64914aeb1b20d4fc39ead2ee159a1e5b from 
> main-62adeb92df, and got the error below.
> 
> I cannot immediately trace it back to any recent commit, so I’m a bit 
> surprised by it.
> 
> Any hint?
> 
> --
> >>> stage 4.2: building libraries
> --
> cd /usr/src;  time env MACHINE_ARCH=amd64  MACHINE=amd64  
> CPUTYPE=skylake-avx512 BUILD_TOOLS_META=.NOMETA CC="/usr/local/bin/ccache cc 
> -target x86_64-unknown-freebsd15.0 --sysroot=/usr/obj/usr/src/amd64.amd64/tmp 
> -B/usr/obj/usr/src/amd64.amd64/tmp/usr/bin" CXX="/usr/local/bin/ccache c++  
> -target x86_64-unknown-freebsd15.0 --sysroot=/usr/obj/usr/src/amd64.amd64/tmp 
> -B/usr/obj/usr/src/amd64.amd64/tmp/usr/bin"  CPP="cpp -target 
> x86_64-unknown-freebsd15.0 --sysroot=/usr/obj/usr/src/amd64.amd64/tmp 
> -B/usr/obj/usr/src/amd64.amd64/tmp/usr/bin"  AS="as" AR="ar" ELFCTL="elfctl" 
> LD="ld"  LLVM_LINK="" NM=nm OBJCOPY="objcopy"  RANLIB=ranlib STRINGS=  
> SIZE="size" STRIPBIN="strip"  INSTALL="install -U"  
> PATH=/usr/obj/usr/src/amd64.amd64/tmp/bin:/usr/obj/usr/src/amd64.amd64/tmp/usr/sbin:/usr/obj/usr/src/amd64.amd64/tmp/usr/bin:/usr/obj/usr/src/amd64.amd64/tmp/legacy/usr/sbin:/usr/obj/usr/src/amd64.amd64/tmp/legacy/usr/bin:/usr/obj/usr/src/amd64.amd64/tmp/legacy/bin:/usr/obj/usr/src/amd64.amd64/tmp/legacy/usr/libexec::/sbin:/bin:/usr/sbin:/usr/bin
>   SYSROOT=/usr/obj/usr/src/amd64.amd64/tmp make  -f Makefile.inc1  
> BWPHASE=libraries  DESTDIR=/usr/obj/usr/src/amd64.amd64/tmp -DNO_FSCHG 
> MK_HTML=no -DNO_LINT MK_MAN=no  MK_PROFILE=no MK_TESTS=no MK_TESTS_SUPPORT=no 
>  libraries
> cd /usr/src;  make -f Makefile.inc1 _prereq_libs;  make -f Makefile.inc1 
> _startup_libs;  make -f Makefile.inc1 _prebuild_libs 
> -DLIBCRYPTO_WITHOUT_SUBDIRS;  make -f Makefile.inc1 _generic_libs
> Building /usr/obj/usr/src/amd64.amd64/lib/libcompiler_rt/_libinstall
> Building /usr/obj/usr/src/amd64.amd64/lib/libcompiler_rt/_installlinks
> Building /usr/obj/usr/src/amd64.amd64/lib/libssp_nonshared/_libinstall
> Building /usr/obj/usr/src/amd64.amd64/lib/libgcc_eh/_libinstall
> Building /usr/obj/usr/src/amd64.amd64/lib/libgcc_eh/_INCSINS
> installing DIRS FILESDIR
> install -U  -d -m 0755 -o root  -g wheel  
> /usr/obj/usr/src/amd64.amd64/tmp/usr/lib
> Building /usr/obj/usr/src/amd64.amd64/lib/csu/amd64/_FILESINS_Scrt1.o
> Building /usr/obj/usr/src/amd64.amd64/lib/csu/amd64/_FILESINS_crt1.o
> Building /usr/obj/usr/src/amd64.amd64/lib/csu/amd64/_FILESINS_gcrt1.o
> Building /usr/obj/usr/src/amd64.amd64/lib/csu/amd64/_FILESINS_crtbegin.o
> Building /usr/obj/usr/src/amd64.amd64/lib/csu/amd64/_FILESINS_crtbeginS.o
> Building /usr/obj/usr/src/amd64.amd64/lib/csu/amd64/_FILESINS_crtbeginT.o
> Building /usr/obj/usr/src/amd64.amd64/lib/csu/amd64/_FILESINS_crtend.o
> Building /usr/obj/usr/src/amd64.amd64/lib/csu/amd64/_FILESINS_crtendS.o
> Building /usr/obj/usr/src/amd64.amd64/lib/csu/amd64/_FILESINS_crti.o
> Building /usr/obj/usr/src/amd64.amd64/lib/csu/amd64/_FILESINS_crtn.o
> Building /usr/obj/usr/src/amd64.amd64/lib/libsys/_libinstall
> Building /usr/obj/usr/src/amd64.amd64/lib/libsys/_INCSINS
> Building /usr/obj/usr/src/amd64.amd64/lib/libc/libc.so.7
> building shared library libc.so.7
> ld: error: version script assignment of 'FBSD_1.0' to symbol 'xdr_domainname' 
> failed: symbol not defined
> ld: error: version script assignment of 'FBSD_1.0' to symbol 'xdr_keydat' 
> failed: symbol not defined

In lib/libc/rpc/Symbol.map there is:

/* From yp_xdr.c (generated by rpcgen - include/rpcsvc/yp.x) */
xdr_domainname;
xdr_keydat;

so maybe the rpcgen step went wrong somehow? Do you have WITHOUT_NIS enabled?

-Dimitry




build of main broken? (ld: error: version script assignment of 'FBSD_1.0' to symbol 'xdr_...' failed: symbol not defined)

2024-05-24 Thread Matteo Riondato
Hi All,

I’m trying to build 59aa64914aeb1b20d4fc39ead2ee159a1e5b from 
main-62adeb92df, and got the error below.

I cannot immediately trace it back to any recent commit, so I’m a bit surprised 
by it.

Any hint?

--
>>> stage 4.2: building libraries
--
cd /usr/src;  time env MACHINE_ARCH=amd64  MACHINE=amd64  
CPUTYPE=skylake-avx512 BUILD_TOOLS_META=.NOMETA CC="/usr/local/bin/ccache cc 
-target x86_64-unknown-freebsd15.0 --sysroot=/usr/obj/usr/src/amd64.amd64/tmp 
-B/usr/obj/usr/src/amd64.amd64/tmp/usr/bin" CXX="/usr/local/bin/ccache c++  
-target x86_64-unknown-freebsd15.0 --sysroot=/usr/obj/usr/src/amd64.amd64/tmp 
-B/usr/obj/usr/src/amd64.amd64/tmp/usr/bin"  CPP="cpp -target 
x86_64-unknown-freebsd15.0 --sysroot=/usr/obj/usr/src/amd64.amd64/tmp 
-B/usr/obj/usr/src/amd64.amd64/tmp/usr/bin"  AS="as" AR="ar" ELFCTL="elfctl" 
LD="ld"  LLVM_LINK="" NM=nm OBJCOPY="objcopy"  RANLIB=ranlib STRINGS=  
SIZE="size" STRIPBIN="strip"  INSTALL="install -U"  
PATH=/usr/obj/usr/src/amd64.amd64/tmp/bin:/usr/obj/usr/src/amd64.amd64/tmp/usr/sbin:/usr/obj/usr/src/amd64.amd64/tmp/usr/bin:/usr/obj/usr/src/amd64.amd64/tmp/legacy/usr/sbin:/usr/obj/usr/src/amd64.amd64/tmp/legacy/usr/bin:/usr/obj/usr/src/amd64.amd64/tmp/legacy/bin:/usr/obj/usr/src/amd64.amd64/tmp/legacy/usr/libexec::/sbin:/bin:/usr/sbin:/usr/bin
  SYSROOT=/usr/obj/usr/src/amd64.amd64/tmp make  -f Makefile.inc1  
BWPHASE=libraries  DESTDIR=/usr/obj/usr/src/amd64.amd64/tmp -DNO_FSCHG 
MK_HTML=no -DNO_LINT MK_MAN=no  MK_PROFILE=no MK_TESTS=no MK_TESTS_SUPPORT=no  
libraries
cd /usr/src;  make -f Makefile.inc1 _prereq_libs;  make -f Makefile.inc1 
_startup_libs;  make -f Makefile.inc1 _prebuild_libs 
-DLIBCRYPTO_WITHOUT_SUBDIRS;  make -f Makefile.inc1 _generic_libs
Building /usr/obj/usr/src/amd64.amd64/lib/libcompiler_rt/_libinstall
Building /usr/obj/usr/src/amd64.amd64/lib/libcompiler_rt/_installlinks
Building /usr/obj/usr/src/amd64.amd64/lib/libssp_nonshared/_libinstall
Building /usr/obj/usr/src/amd64.amd64/lib/libgcc_eh/_libinstall
Building /usr/obj/usr/src/amd64.amd64/lib/libgcc_eh/_INCSINS
installing DIRS FILESDIR
install -U  -d -m 0755 -o root  -g wheel  
/usr/obj/usr/src/amd64.amd64/tmp/usr/lib
Building /usr/obj/usr/src/amd64.amd64/lib/csu/amd64/_FILESINS_Scrt1.o
Building /usr/obj/usr/src/amd64.amd64/lib/csu/amd64/_FILESINS_crt1.o
Building /usr/obj/usr/src/amd64.amd64/lib/csu/amd64/_FILESINS_gcrt1.o
Building /usr/obj/usr/src/amd64.amd64/lib/csu/amd64/_FILESINS_crtbegin.o
Building /usr/obj/usr/src/amd64.amd64/lib/csu/amd64/_FILESINS_crtbeginS.o
Building /usr/obj/usr/src/amd64.amd64/lib/csu/amd64/_FILESINS_crtbeginT.o
Building /usr/obj/usr/src/amd64.amd64/lib/csu/amd64/_FILESINS_crtend.o
Building /usr/obj/usr/src/amd64.amd64/lib/csu/amd64/_FILESINS_crtendS.o
Building /usr/obj/usr/src/amd64.amd64/lib/csu/amd64/_FILESINS_crti.o
Building /usr/obj/usr/src/amd64.amd64/lib/csu/amd64/_FILESINS_crtn.o
Building /usr/obj/usr/src/amd64.amd64/lib/libsys/_libinstall
Building /usr/obj/usr/src/amd64.amd64/lib/libsys/_INCSINS
Building /usr/obj/usr/src/amd64.amd64/lib/libc/libc.so.7
building shared library libc.so.7
ld: error: version script assignment of 'FBSD_1.0' to symbol 'xdr_domainname' 
failed: symbol not defined
ld: error: version script assignment of 'FBSD_1.0' to symbol 'xdr_keydat' 
failed: symbol not defined
ld: error: version script assignment of 'FBSD_1.0' to symbol 'xdr_mapname' 
failed: symbol not defined
ld: error: version script assignment of 'FBSD_1.0' to symbol 'xdr_peername' 
failed: symbol not defined
ld: error: version script assignment of 'FBSD_1.0' to symbol 'xdr_valdat' 
failed: symbol not defined
ld: error: version script assignment of 'FBSD_1.0' to symbol 
'xdr_ypbind_binding' failed: symbol not defined
ld: error: version script assignment of 'FBSD_1.0' to symbol 'xdr_ypbind_resp' 
failed: symbol not defined
ld: error: version script assignment of 'FBSD_1.0' to symbol 
'xdr_ypbind_resptype' failed: symbol not defined
ld: error: version script assignment of 'FBSD_1.0' to symbol 
'xdr_ypbind_setdom' failed: symbol not defined
ld: error: version script assignment of 'FBSD_1.0' to symbol 'xdr_ypmap_parms' 
failed: symbol not defined
ld: error: version script assignment of 'FBSD_1.0' to symbol 'xdr_ypmaplist' 
failed: symbol not defined
ld: error: version script assignment of 'FBSD_1.0' to symbol 
'xdr_yppush_status' failed: symbol not defined
ld: error: version script assignment of 'FBSD_1.0' to symbol 
'xdr_yppushresp_xfr' failed: symbol not defined
ld: error: version script assignment of 'FBSD_1.0' to symbol 'xdr_ypreq_key' 
failed: symbol not defined
ld: error: version script assignment of 'FBSD_1.0' to symbol 'xdr_ypreq_nokey' 
failed: symbol not defined
ld: error: version script assignment of 'FBSD_1.0' to symbol 'xdr_ypreq_xfr' 
failed: symbol not defined
ld: error: version script assignment of 'FBSD_1.0' to symbol 

Re: _mtx_lock_sleep: recursed on non-recursive mutex CAM device lock @ /..../sys/cam/nvme/nvme_da.c:469

2024-05-22 Thread Alexander Leidinger

On 2024-05-22 20:53, Warner Losh wrote:
> First order:
>
> Looks like we're trying to schedule a trim, but that fails due to a
> malloc issue. So then, since it's a malloc issue, we wind up trying
> to automatically reschedule this I/O, which recurses into the driver
> with a bad lock held and boop.
>
> Can you reproduce this?

So far I had it once. At least I have only one crashdump. I had one more
reboot/crash, but no dump. I also have a watchdog running on this
system, so not sure what caused the (unusual) reboot. I had a poudriere
build running at both times. Since the crashdump I didn't run poudriere
anymore.

> If so, can you test this patch?

I give it a try tomorrow anyway, and I will try to stress the system
again with poudriere.

The nvme is a cache and also a log device for a zpool, so not really a
deterministic way to trigger access to it.


Bye,
Alexander.

--
http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org    netch...@freebsd.org  : PGP 0x8F31830F9F2772BF



Chromium with Widevine: kernel panic: condition vp->v_type == VDIR || VN_IS_DOOMED(vp) not met ⋯vfs_cache.c:3452 (vn_fullpath_dir)

2024-05-22 Thread Graham Perrin
Reproducible at .





Re: _mtx_lock_sleep: recursed on non-recursive mutex CAM device lock @ /..../sys/cam/nvme/nvme_da.c:469

2024-05-22 Thread Warner Losh
First order:

Looks like we're trying to schedule a trim, but that fails due to a malloc
issue. So then, since it's a malloc issue, we wind up trying to automatically
reschedule this I/O, which recurses into the driver with a bad lock held and
boop.

Can you reproduce this?

If so, can you test this patch?

diff --git a/sys/cam/nvme/nvme_da.c b/sys/cam/nvme/nvme_da.c
index 3f6cf8702870..357e612200e9 100644
--- a/sys/cam/nvme/nvme_da.c
+++ b/sys/cam/nvme/nvme_da.c
@@ -1077,7 +1077,9 @@ ndastart(struct cam_periph *periph, union ccb *start_ccb)
 
 			trim = malloc(sizeof(*trim), M_NVMEDA, M_ZERO | M_NOWAIT);
 			if (trim == NULL) {
+				cam_periph_unlock(periph);
 				biofinish(bp, NULL, ENOMEM);
+				cam_periph_lock(periph);
 				xpt_release_ccb(start_ccb);
 				ndaschedule(periph);
 				return;

(the mailer may mangle it, so I've also attached it in case people want to
comment on this).

The logic here is that we have special logic in the ENOMEM case that will
recursively call the start routine, which calls the scheduler, which expects
to be able to take out the periph lock. But it can't, because it's already
locked. It also invokes a pacing protocol that slows down things even more.

What I'm now not sure about is whether or not we need to just release
start_ccb or if we also need to call ndaschedule too here. Seems like we
might not want to (but it's a safe nop if not). I've cc'd mav@ to see if he
has opinions on what's going on.

Warner
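
The reasoning above can be modelled in userland. This is a minimal sketch,
not the CAM code: toy_mtx, toy_strategy, toy_finish_with_error and
toy_start_alloc_failed are hypothetical stand-ins for the periph lock, the
strategy routine, the error completion, and the start routine's ENOMEM
branch. It assumes only what the backtrace shows, namely that completing
the request with an error ends up back in the strategy routine, which takes
the same non-recursive lock.

```c
#include <stdbool.h>

/* Hypothetical non-recursive lock: a second acquire by the holder fails. */
struct toy_mtx {
	bool held;
};

static int
toy_mtx_lock(struct toy_mtx *m)
{
	if (m->held)
		return (-1);	/* a debug kernel would panic here */
	m->held = true;
	return (0);
}

static void
toy_mtx_unlock(struct toy_mtx *m)
{
	m->held = false;
}

/* Stand-in for the strategy routine: it takes the lock itself. */
static int
toy_strategy(struct toy_mtx *m)
{
	if (toy_mtx_lock(m) != 0)
		return (-1);
	/* ... queue the request ... */
	toy_mtx_unlock(m);
	return (0);
}

/* Stand-in for an error completion that re-enters the strategy routine. */
static int
toy_finish_with_error(struct toy_mtx *m)
{
	return (toy_strategy(m));
}

/*
 * Stand-in for the start routine's allocation-failure branch, entered
 * with the lock held. With drop_lock set, the lock is released around
 * the completion call and retaken afterwards, as in the proposed patch.
 */
static int
toy_start_alloc_failed(struct toy_mtx *m, bool drop_lock)
{
	int error;

	if (drop_lock)
		toy_mtx_unlock(m);
	error = toy_finish_with_error(m);
	if (drop_lock)
		(void)toy_mtx_lock(m);
	return (error);
}
```

With the lock held on entry, toy_start_alloc_failed(&m, false) fails
because the nested acquire finds the lock taken, while
toy_start_alloc_failed(&m, true) succeeds and returns with the lock held
again. The model says nothing about whether dropping the lock at that
point is otherwise safe in the real driver.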


On Wed, May 22, 2024 at 11:22 AM Alexander Leidinger <
alexan...@leidinger.net> wrote:

> Hi,
>
> I got the panic in $Subject. Anyone an idea?
>
> Complete crashlog available at https://wiki.leidinger.net/core.txt.6
> (1.2 MB)
>
> Short version:
> ---snip---
> [11417] KDB: stack backtrace:
> [11417] db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
> 0xfe043133f830
> [11417] vpanic() at vpanic+0x13f/frame 0xfe043133f960
> [11417] panic() at panic+0x43/frame 0xfe043133f9c0
> [11417] __mtx_lock_sleep() at __mtx_lock_sleep+0x491/frame
> 0xfe043133fa50
> [11417] __mtx_lock_flags() at __mtx_lock_flags+0x9c/frame
> 0xfe043133fa70
> [11417] ndastrategy() at ndastrategy+0x3c/frame 0xfe043133faa0
> [11417] g_disk_start() at g_disk_start+0x569/frame 0xfe043133fb00
> [11417] g_io_request() at g_io_request+0x2b6/frame 0xfe043133fb30
> [11417] g_io_deliver() at g_io_deliver+0x1cc/frame 0xfe043133fb80
> [11417] g_disk_done() at g_disk_done+0xee/frame 0xfe043133fbc0
> [11417] ndastart() at ndastart+0x4a3/frame 0xfe043133fc20
> [11417] xpt_run_allocq() at xpt_run_allocq+0xa5/frame 0xfe043133fc70
> [11417] ndastrategy() at ndastrategy+0x6d/frame 0xfe043133fca0
> [11417] g_disk_start() at g_disk_start+0x569/frame 0xfe043133fd00
> [11417] g_io_request() at g_io_request+0x2b6/frame 0xfe043133fd30
> [11417] g_io_request() at g_io_request+0x2b6/frame 0xfe043133fd60
> [11417] g_io_request() at g_io_request+0x2b6/frame 0xfe043133fd90
> [11417] vdev_geom_io_start() at vdev_geom_io_start+0x257/frame
> 0xfe043133fdc0
> [11417] zio_vdev_io_start() at zio_vdev_io_start+0x321/frame
> 0xfe043133fe10
> [11417] zio_execute() at zio_execute+0x78/frame 0xfe043133fe40
> [11417] taskqueue_run_locked() at taskqueue_run_locked+0x1c7/frame
> 0xfe043133fec0
> [11417] taskqueue_thread_loop() at taskqueue_thread_loop+0xd3/frame
> 0xfe043133fef0
> ---snip---
>
> This is with a world from 2024-05-17-084543.
>
> Bye,
> Alexander.
>
> --
> http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF
> http://www.FreeBSD.org    netch...@freebsd.org  : PGP 0x8F31830F9F2772BF
>
diff --git a/sys/cam/nvme/nvme_da.c b/sys/cam/nvme/nvme_da.c
index 3f6cf8702870..357e612200e9 100644
--- a/sys/cam/nvme/nvme_da.c
+++ b/sys/cam/nvme/nvme_da.c
@@ -1077,7 +1077,9 @@ ndastart(struct cam_periph *periph, union ccb *start_ccb)
 
 			trim = malloc(sizeof(*trim), M_NVMEDA, M_ZERO | M_NOWAIT);
 			if (trim == NULL) {
+				cam_periph_unlock(periph);
 				biofinish(bp, NULL, ENOMEM);
+				cam_periph_lock(periph);
 				xpt_release_ccb(start_ccb);
 				ndaschedule(periph);
 				return;


_mtx_lock_sleep: recursed on non-recursive mutex CAM device lock @ /..../sys/cam/nvme/nvme_da.c:469

2024-05-22 Thread Alexander Leidinger

Hi,

I got the panic in $Subject. Anyone an idea?

Complete crashlog available at https://wiki.leidinger.net/core.txt.6 
(1.2 MB)


Short version:
---snip---
[11417] KDB: stack backtrace:
[11417] db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe043133f830
[11417] vpanic() at vpanic+0x13f/frame 0xfe043133f960
[11417] panic() at panic+0x43/frame 0xfe043133f9c0
[11417] __mtx_lock_sleep() at __mtx_lock_sleep+0x491/frame 0xfe043133fa50
[11417] __mtx_lock_flags() at __mtx_lock_flags+0x9c/frame 0xfe043133fa70
[11417] ndastrategy() at ndastrategy+0x3c/frame 0xfe043133faa0
[11417] g_disk_start() at g_disk_start+0x569/frame 0xfe043133fb00
[11417] g_io_request() at g_io_request+0x2b6/frame 0xfe043133fb30
[11417] g_io_deliver() at g_io_deliver+0x1cc/frame 0xfe043133fb80
[11417] g_disk_done() at g_disk_done+0xee/frame 0xfe043133fbc0
[11417] ndastart() at ndastart+0x4a3/frame 0xfe043133fc20
[11417] xpt_run_allocq() at xpt_run_allocq+0xa5/frame 0xfe043133fc70
[11417] ndastrategy() at ndastrategy+0x6d/frame 0xfe043133fca0
[11417] g_disk_start() at g_disk_start+0x569/frame 0xfe043133fd00
[11417] g_io_request() at g_io_request+0x2b6/frame 0xfe043133fd30
[11417] g_io_request() at g_io_request+0x2b6/frame 0xfe043133fd60
[11417] g_io_request() at g_io_request+0x2b6/frame 0xfe043133fd90
[11417] vdev_geom_io_start() at vdev_geom_io_start+0x257/frame 0xfe043133fdc0
[11417] zio_vdev_io_start() at zio_vdev_io_start+0x321/frame 0xfe043133fe10
[11417] zio_execute() at zio_execute+0x78/frame 0xfe043133fe40
[11417] taskqueue_run_locked() at taskqueue_run_locked+0x1c7/frame 0xfe043133fec0
[11417] taskqueue_thread_loop() at taskqueue_thread_loop+0xd3/frame 0xfe043133fef0
---snip---

This is with a world from 2024-05-17-084543.

Bye,
Alexander.

--
http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org    netch...@freebsd.org  : PGP 0x8F31830F9F2772BF




Re: __memcpy_chk family of functions

2024-05-21 Thread Dag-Erling Smørgrav
Marcin Cieslak  writes:
> Dag-Erling Smørgrav  writes:
> > Marcin Cieslak  writes:
> > > I think this (useful) change should go into the future release
> > > notes as a new feature.
> > Which change?
> https://reviews.freebsd.org/D32306 Import _FORTIFY_SOURCE
> implementation from NetBSD which introduced __memcpy_chk and friends to
> our libc.

See commit 9bfd3b4076a7 which has “Relnotes: yes”.  It will be added to
the 15.0 release notes in due time, possibly 14.2 as well if Kyle
decides to merge it.

DES
-- 
Dag-Erling Smørgrav - d...@freebsd.org
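
For readers wondering why newer binaries reference a __memcpy_chk symbol
at all, the idea behind a fortified copy can be sketched as follows. This
is a hedged illustration, not the libc implementation: toy_memcpy_chk is a
hypothetical function, and it returns NULL on overflow so the behaviour can
be observed, whereas a fortified libc would typically abort the process
instead.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical checked copy. dstlen is the destination size that the
 * compiler could determine at the call site; the copy is refused when
 * the requested length does not fit.
 */
static void *
toy_memcpy_chk(void *dst, const void *src, size_t len, size_t dstlen)
{
	if (len > dstlen)
		return (NULL);	/* model only: real code would not return */
	return (memcpy(dst, src, len));
}
```

A binary built against headers that route memcpy() through such a checked
variant needs the checked symbol at run time, which is presumably why
packages built on a newer world fail to start on a libc that predates it.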



Re: __memcpy_chk family of functions

2024-05-21 Thread Marcin Cieslak

On Tue, 21 May 2024, Dag-Erling Smørgrav wrote:
> Marcin Cieslak  writes:
>> I think this (useful) change should go into the future release notes
>> as a new feature.
>
> Which change?

https://reviews.freebsd.org/D32306 "Import _FORTIFY_SOURCE implementation
from NetBSD", which introduced __memcpy_chk and friends to our libc.



Re: __memcpy_chk family of functions

2024-05-21 Thread Dag-Erling Smørgrav
Marcin Cieslak  writes:
> I think this (useful) change should go into the future release notes
> as a new feature.

Which change?

DES
-- 
Dag-Erling Smørgrav - d...@freebsd.org



Re: __memcpy_chk family of functions

2024-05-21 Thread Marcin Cieslak

On Tue, May 21, 2024 at 12:16 AM Dag-Erling Smørgrav wrote:
> The purpose of UPDATING is to document changes that break backward
> compatibility, i.e. running old binaries on a newer world.  What
> happened here is that you tried to run newer binaries on an older world,

On Tue, 21 May 2024, Warner Losh wrote:
> Also, our forward compatibility guarantees are extremely weak.

Just for clarification: I am not complaining about my breakage.
It was my stunt and it didn't work out, I got punished by
sitting there and waiting for llvm to compile.

I think this (useful) change should go into the future release notes
as a new feature. Where and how should this be documented?
Shall FreeBSD_version be bumped for this one?

For example, when libsys got introduced, we could learn about
this from the UPDATING file. I do not want UPDATING to become
unreadable or a copy of git log, though.

Marcin



Re: __memcpy_chk family of functions

2024-05-21 Thread Warner Losh
On Tue, May 21, 2024 at 12:16 AM Dag-Erling Smørgrav 
wrote:

> Marcin Cieslak  writes:
> > I have updated some binary packages using pkg(8) but neglected to
> > rebuild the world and my favourite web browsers no longer started
> > complaining about the undefined symbol __memcpy_chk@FBSD_1.8
> >
> > Would that be a good idea to add that information to the Handbook and
> > possibly bump FreeBSD_version and add this info to UPDATING?
>
> The purpose of UPDATING is to document changes that break backward
> compatibility, i.e. running old binaries on a newer world.  What
> happened here is that you tried to run newer binaries on an older world,
> an issue of _forward_ compatibility, which we've never promised.
> Besides, an entry in UPDATING wouldn't have helped you since your source
> tree predated the change that would have added it.
>

Also, our forward compatibility guarantees are extremely weak.  At most the
outer bounds are around a sliding window to upgrade from source, using root
in single user on the console. So having to revert to an old kernel to build
a new kernel when there's a problem, or having to revert to an old kernel to
rebuild old sources. And even then it's not something we test, so it's likely
broken or broken once you get a hair's width away from that path. Plus, with
BEs and the easy ability to roll back to the prior BE, even this level of
forward compat is likely to decay further in the future.

Warner


Re: devd nomatch does not load uslcom anymore

2024-05-21 Thread Warner Losh
On Tue, May 21, 2024, 1:38 AM Ronald Klop  wrote:

> Hi,
>
> On May 16th I upgraded the kernel of my RPI4. The previous kernel was from
> April 10th.
>
> From:
> FreeBSD 15.0-CURRENT #35 main-5716d902ae1: Wed Apr 10 22:59:37 CEST 2024
>
> To:
> FreeBSD 15.0-CURRENT #36 main-42b28f81521: Thu May 16 07:54:05 CEST 2024
>
> Today I noticed my USB serial cable to my RPI3 was not available anymore.
> It hadn't loaded 'uslcom' at boot.
> Adding 'hw.bus.devctl_nomatch_enabled=1' to /boot/loader.conf resolves the
> issue for now.
>
> The proper output during boot is:
> Starting devd.
> Autoloading module: uslcom
> uslcom0 on uhub1
> uslcom0:  rev 1.10/1.00, addr 2> on usbus0
>
> Does anybody need more information about this?
>

I committed a fix a couple of days ago that defaults to always generating
the events.

Or were you wanting a deep dive on usb?

Warner

> Regards,
> Ronald.


devd nomatch does not load uslcom anymore

2024-05-21 Thread Ronald Klop

Hi,

On May 16th I upgraded the kernel of my RPI4. The previous kernel was from April 10th.

From:
FreeBSD 15.0-CURRENT #35 main-5716d902ae1: Wed Apr 10 22:59:37 CEST 2024

To:
FreeBSD 15.0-CURRENT #36 main-42b28f81521: Thu May 16 07:54:05 CEST 2024

Today I noticed my USB serial cable to my RPI3 was not available anymore. It 
hadn't loaded 'uslcom' at boot.
Adding 'hw.bus.devctl_nomatch_enabled=1' to /boot/loader.conf resolves the 
issue for now.

The proper output during boot is:
Starting devd.
Autoloading module: uslcom
uslcom0 on uhub1
uslcom0:  on usbus0

Does anybody need more information about this?

Regards,
Ronald.


Re: __memcpy_chk family of functions

2024-05-21 Thread Dag-Erling Smørgrav
Marcin Cieslak  writes:
> I have updated some binary packages using pkg(8) but neglected to
> rebuild the world and my favourite web browsers no longer started
> complaining about the undefined symbol __memcpy_chk@FBSD_1.8
>
> Would that be a good idea to add that information to the Handbook and
> possible bump FreeBSD_version and add this info to UPDATING?

The purpose of UPDATING is to document changes that break backward
compatibility, i.e. running old binaries on a newer world.  What
happened here is that you tried to run newer binaries on an older world,
an issue of _forward_ compatibility, which we've never promised.
Besides, an entry in UPDATING wouldn't have helped you since your source
tree predated the change that would have added it.

DES
-- 
Dag-Erling Smørgrav - d...@freebsd.org



__memcpy_chk family of functions

2024-05-20 Thread Marcin Cieslak

Hello,

I have updated some binary packages using pkg(8)
but neglected to rebuild the world and my favourite
web browsers no longer started complaining
about the undefined symbol __memcpy_chk@FBSD_1.8

Would it be a good idea to add that information
to the Handbook, and possibly bump FreeBSD_version
and add this info to UPDATING?

I fully accept that running -CURRENT as a daily
driver leads to surprises like this, but it took
me a while to figure out which change
has caused this [1].

I was also thinking, would it be OK to add
the synopsis of __memcpy_chk and other
functions to the relevant memcpy(3) etc. manpages?

Only when viewing the diff I found out I could
learn about those from ssp(3).
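
For anyone hitting a similar undefined-symbol error, one way to narrow it down is to check whether the installed libc exports the symbol at all. The following is a minimal sketch, not an official procedure: has_symbol is a hypothetical helper, the libc path in the usage comment is an assumption, and the exact output format of nm(1) differs between implementations.

```shell
#!/bin/sh
# Sketch only: report whether a symbol name appears in a dynamic symbol
# listing read from stdin (for example, the output of `nm -D`).
# has_symbol is a hypothetical helper, not an existing tool.
has_symbol() {
    # $1: symbol name, e.g. __memcpy_chk
    # Match the name as a whole field, optionally followed by a
    # version suffix such as @FBSD_1.8.
    grep -Eq "(^|[[:space:]])$1(@|\$)"
}

# Usage on a real system (the libc path is an assumption):
#   nm -D /lib/libc.so.7 | has_symbol __memcpy_chk && echo present
```

If the symbol is missing, the world is older than the packages expect, which matches the situation described above.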

Thanks for keeping FreeBSD alive,
Marcin

[1] spoiler alert: https://reviews.freebsd.org/D32306



Re: bsdinstall wifi setup is broken on CURRENT

2024-05-20 Thread Renato Botelho

On 18/05/24 11:33, Alfonso S. Siciliano wrote:

On 5/16/24 20:40, Renato Botelho wrote:
I saw some users on a .br group complaining bsdinstall was failing to 
setup wifi network on 15.0 snapshots and tried it myself.  I was able 
to reproduce the problem and also noticed another one.




Thank you for your report, the video is highly appreciated to understand 
the problem quickly and exactly.


I noticed Network Selection screen only shows one line, it's not 
beautiful to navigate through items this way.  On 14.1-BETA2 it shows 
multiple lines so it seems to be a regression.


Problem 1. Looking at wlanconfig it seems related to $height $width 
$rows for the selecting menu. Please could you open a PR adding me, so 
we can test and solve.


I've fixed it locally and submitted a fix for review

https://reviews.freebsd.org/D45271



The problem users reported was: after selecting desired network it 
just starts over instead of asking for password.  I made a video [1] 
showing the problem.


Problem 2. I know about this --mixedform issue; my last import, two days ago
(a6d8be451f62d425b71a4874f7d4e133b9fb393c), should solve it.
You could try the last main snapshot (yesterday 17 May), please let me 
know any problem.


Last snapshot still contains bsddialog 1.0 so I'll wait for the next one 
and give it a try.




Jessica, I've cc'd you because git shows you were the last person 
making changes in this area.  If it's not related and I made a 
mistake, just ignore me.


[1] https://youtube.com/shorts/Gmeckokw2a0


Again thanks for the video.

Best Regards,
Alfonso




--
Renato Botelho



RES: RES: RES: usb mouse not work on boot

2024-05-20 Thread Ivan Quitschal


> -----Original Message-----
> From: owner-freebsd-curr...@freebsd.org On behalf of Dag-Erling Smørgrav
> Sent: Monday, May 20, 2024 06:01
> To: Ivan Quitschal
> Cc: Vladimir Kondratyev; Warner Losh; Oleksandr Kryvulia;
> FreeBSD Current
> Subject: Re: RES: RES: usb mouse not work on boot
> 
> Ivan Quitschal  writes:
> > > Ivan Quitschal  writes:
> > > > diff --git a/sys/dev/usb/input/usbhid.c
> > > > b/sys/dev/usb/input/usbhid.c index 174e1c28ae96..7b19d713c943
> > > > 100644
> > > > --- a/sys/dev/usb/input/usbhid.c
> > > > +++ b/sys/dev/usb/input/usbhid.c
> > > > @@ -802,6 +802,7 @@ usbhid_probe(device_t dev)
> > > > if (hid_test_quirk(&sc->sc_hw, HQ_HID_IGNORE))
> > > > return (ENXIO);
> > > > +// return (BUS_PROBE_GENERIC + 1);
> > > > return (BUS_PROBE_DEFAULT + 1);
> > > >  }
> > > You realize this diff does nothing at all, right?
> > Yeap, i also said it worked in 14-current old code only ,and has more
> > than 2 years already
> 
> No, I mean all this does is add a comment.  It has no effect on the code.
> 
> DES
> --
> Dag-Erling Smørgrav - d...@freebsd.org


Oh OK, sorry.

But actually it did change one return for another.

usbhid.ko used to return this:
return (BUS_PROBE_GENERIC + 1);

and ums.ko used to attach instead, messing up our multimedia keyboards and all.
It was a priority issue when it shouldn't matter.

Then Vladimir changed it to this:
return (BUS_PROBE_DEFAULT + 1);

and everything went "voilà".


Sorry for the miscommunication.
Regards,

tzk


Re: RES: RES: usb mouse not work on boot

2024-05-20 Thread Dag-Erling Smørgrav
Ivan Quitschal  writes:
> > Ivan Quitschal  writes:
> > > diff --git a/sys/dev/usb/input/usbhid.c b/sys/dev/usb/input/usbhid.c
> > > index 174e1c28ae96..7b19d713c943 100644
> > > --- a/sys/dev/usb/input/usbhid.c
> > > +++ b/sys/dev/usb/input/usbhid.c
> > > @@ -802,6 +802,7 @@ usbhid_probe(device_t dev)
> > > if (hid_test_quirk(&sc->sc_hw, HQ_HID_IGNORE))
> > > return (ENXIO);
> > > +// return (BUS_PROBE_GENERIC + 1);
> > > return (BUS_PROBE_DEFAULT + 1);
> > >  }
> > You realize this diff does nothing at all, right?
> Yeap, i also said it worked in 14-current old code only ,and has more
> than 2 years already

No, I mean all this does is add a comment.  It has no effect on the
code.

DES
-- 
Dag-Erling Smørgrav - d...@freebsd.org



RES: RES: usb mouse not work on boot

2024-05-19 Thread Ivan Quitschal
Hans participated in that, and in that one too. He was the last person I spoke
to on this forum; then I found out the tragic news and lost some interest, not
in BSD, but in those things I remember he was directly involved in, like
that one for example ..

But yes, I know it's not a proper patch, he told me that too .. with all the
9 and such ...

The last USB thing I spoke about on this list:
https://lists.freebsd.org/archives/freebsd-current/2022-September/002580.html
so please, not necessary.


Thanks
tzk


> -----Original Message-----
> From: owner-freebsd-curr...@freebsd.org On behalf of Ivan Quitschal
> Sent: Sunday, May 19, 2024 19:49
> To: Dag-Erling Smørgrav
> Cc: Vladimir Kondratyev; Warner Losh; Oleksandr Kryvulia;
> FreeBSD Current
> Subject: RES: RES: usb mouse not work on boot
> 
> Yeap, i also said it worked in 14-current old code only ,and has more than  2 
> years
> already
> 
> Only point was whether freebsd had
> this
> return (BUS_PROBE_DEFAULT + 1); }
> or that
> return (BUS_PROBE_GENERIC + 1);
> 
> glad we have the first one , aka the right return
> 
> We have an entire email chain about this day back in the day august 2022 don’t
> remember correctly
> 
> 
> 
> > -----Original Message-----
> > From: Dag-Erling Smørgrav
> > Sent: Sunday, May 19, 2024 08:04
> > To: Ivan Quitschal
> > Cc: Vladimir Kondratyev; Warner Losh; Oleksandr Kryvulia;
> > FreeBSD Current
> > Subject: Re: RES: usb mouse not work on boot
> >
> > Ivan Quitschal  writes:
> > > diff --git a/sys/dev/usb/input/usbhid.c b/sys/dev/usb/input/usbhid.c
> > > index 174e1c28ae96..7b19d713c943 100644
> > > --- a/sys/dev/usb/input/usbhid.c
> > > +++ b/sys/dev/usb/input/usbhid.c
> > > @@ -802,6 +802,7 @@ usbhid_probe(device_t dev)
> > > if (hid_test_quirk(&sc->sc_hw, HQ_HID_IGNORE))
> > > return (ENXIO);
> > > +// return (BUS_PROBE_GENERIC + 1);
> > > return (BUS_PROBE_DEFAULT + 1);
> > >  }
> >
> > You realize this diff does nothing at all, right?
> >
> > DES
> > --
> > Dag-Erling Smørgrav - d...@freebsd.org


RES: RES: usb mouse not work on boot

2024-05-19 Thread Ivan Quitschal
Yeap, I also said it worked in the old 14-current code only, and that is more
than 2 years old already.

The only point was whether FreeBSD had
this
return (BUS_PROBE_DEFAULT + 1);
or that
return (BUS_PROBE_GENERIC + 1);

Glad we have the first one, i.e. the right return.

We had an entire email chain about this back in the day, August 2022 if I
remember correctly.



> -----Original Message-----
> From: Dag-Erling Smørgrav
> Sent: Sunday, May 19, 2024 08:04
> To: Ivan Quitschal
> Cc: Vladimir Kondratyev; Warner Losh; Oleksandr Kryvulia;
> FreeBSD Current
> Subject: Re: RES: usb mouse not work on boot
> 
> Ivan Quitschal  writes:
> > diff --git a/sys/dev/usb/input/usbhid.c b/sys/dev/usb/input/usbhid.c
> > index 174e1c28ae96..7b19d713c943 100644
> > --- a/sys/dev/usb/input/usbhid.c
> > +++ b/sys/dev/usb/input/usbhid.c
> > @@ -802,6 +802,7 @@ usbhid_probe(device_t dev)
> > if (hid_test_quirk(&sc->sc_hw, HQ_HID_IGNORE))
> > return (ENXIO);
> > +// return (BUS_PROBE_GENERIC + 1);
> > return (BUS_PROBE_DEFAULT + 1);
> > }
> 
> You realize this diff does nothing at all, right?
> 
> DES
> --
> Dag-Erling Smørgrav - d...@freebsd.org


Re: RES: usb mouse not work on boot

2024-05-19 Thread Dag-Erling Smørgrav
Ivan Quitschal  writes:
> diff --git a/sys/dev/usb/input/usbhid.c b/sys/dev/usb/input/usbhid.c
> index 174e1c28ae96..7b19d713c943 100644
> --- a/sys/dev/usb/input/usbhid.c
> +++ b/sys/dev/usb/input/usbhid.c
> @@ -802,6 +802,7 @@ usbhid_probe(device_t dev)
> if (hid_test_quirk(&sc->sc_hw, HQ_HID_IGNORE))
> return (ENXIO);
> +// return (BUS_PROBE_GENERIC + 1);
> return (BUS_PROBE_DEFAULT + 1);
> }

You realize this diff does nothing at all, right?

DES
-- 
Dag-Erling Smørgrav - d...@freebsd.org



Re: usb mouse not work on boot

2024-05-18 Thread Chris

On 2024-05-18 08:33, Warner Losh wrote:

On Sat, May 18, 2024, 9:22 AM Oleksandr Kryvulia 
wrote:


18.05.24 16:06, Warner Losh:



On Sat, May 18, 2024, 6:51 AM Oleksandr Kryvulia 
wrote:


18.05.24 12:59, Oleksandr Kryvulia:

18.05.24 12:55, Dag-Erling Smørgrav:

Oleksandr Kryvulia   
writes:


Gary Jennejohn   writes:

Try adding uhid_load="YES" to your /boot/loader.conf.  With that
added the module should be automatically loaded during the kernel
boot.

As workaround I already have kld_list+="uhid" in /etc/rc.conf.

I hope you don't mean that literally, because /etc/rc.conf is a shell
script and += is not valid shell syntax.  On the other hand, something
like

kld_list="${kld_list} uhid"

Yes, you are right. I mean
sysrc kld_list+="uhid"


One more correction. Via kld_list I need load ums(4), loading only
uhid(4) does not solve a problem.




You don't need to change kld_list. In fact, you should undo any changes
you've made there. Undo everything in loader.conf you've done.

This is a bug in the boot optimization stuff. Or rather, this exposes a
long standing bug in the USB code where there's an asymmetry between the
nomatch events and the bus tree it presents to devctl causing devmatch to
fail when the nomatch events aren't present on boot.

Just set hw.bus.devctl_nomatch_enabled=1 in /boot/loader.conf and reboot.
Or update to the change I'm about to make.


Thanks for the detailed explanation, Warner. Interesting that on my system
hw.bus.devctl_nomatch_enabled=1 is set by /etc/rc.d/devmatch but only
explicit set it in /boot/loader.conf did the trick. That is why I think
this sysctl don't work in my case.



Yea. That's the optimization. We don't start generating events until it is
one. Setting it in the bootloader causes all events to come through.
Setting it in devmatch turns them on after we run devmatch the first time,
omitting all of the ones generated on boot.

Why is sysctl.conf(5) not the best location for this?



Warner




--Chris



Re: RES: RES: usb mouse not work on boot

2024-05-18 Thread Oleksandr Kryvulia

18.05.24 21:39, Ivan Quitschal:


Not sure, I'm on 14-current because my synergy insists on crashing
after version synergy-1.14.0.4,3


But that's pretty simple to check

Just do a
# grep 'return (BUS_PROBE_' /usr/src/sys/dev/usb/input/usbhid.c in
your own kernel source tree to see what line is there




That's from my source tree:

root@thinkpad:/usr/src # grep 'return (BUS_PROBE_' 
/usr/src/sys/dev/usb/input/usbhid.c

   return (BUS_PROBE_DEFAULT + 1);

RES: RES: usb mouse not work on boot

2024-05-18 Thread Ivan Quitschal
Not sure, I'm on 14-current because my synergy insists on crashing after
version synergy-1.14.0.4,3
But that's pretty simple to check

Just do a
# grep 'return (BUS_PROBE_' /usr/src/sys/dev/usb/input/usbhid.c in your own
kernel source tree to see what line is there

Thanks

Ivan


From: owner-freebsd-curr...@freebsd.org On behalf of Oleksandr Kryvulia
Sent: Saturday, May 18, 2024 15:29
To: freebsd-current@freebsd.org
Subject: Re: RES: usb mouse not work on boot

18.05.24 19:29, Ivan Quitschal:

Hi Warner /  WBR / Oleksandr

Im not sure if that's the case with this uhid.ko, but you guys remember I had a 
priority issue with this module and Vladimir made me a patch to fix the attach 
priority?

Warner, was it fixed since then?


Let me show the patch I use to this very day important line is this, the patch 
might be wrong , because im still on 14-current

+// return (BUS_PROBE_GENERIC + 1);
return (BUS_PROBE_DEFAULT + 1);



diff --git a/sys/dev/usb/input/usbhid.c b/sys/dev/usb/input/usbhid.c
index 174e1c28ae96..7b19d713c943 100644
--- a/sys/dev/usb/input/usbhid.c
+++ b/sys/dev/usb/input/usbhid.c
@@ -802,6 +802,7 @@ usbhid_probe(device_t dev)
if (hid_test_quirk(&sc->sc_hw, HQ_HID_IGNORE))
return (ENXIO);
+// return (BUS_PROBE_GENERIC + 1);
return (BUS_PROBE_DEFAULT + 1);
}


If I understand correctly, this patch is already in main as
975407b1d8dcceac2b54e2c4df96aadec7dc4c3a



Re: RES: usb mouse not work on boot

2024-05-18 Thread Oleksandr Kryvulia

18.05.24 19:29, Ivan Quitschal:


Hi Warner /  WBR / Oleksandr

Im not sure if that’s the case with this uhid.ko, but you guys 
remember I had a priority issue with this module and Vladimir made me 
a patch to fix the attach priority?


Warner, was it fixed since then?

Let me show the patch I use to this very day important line is this, 
the patch might be wrong , because im still on 14-current


+// return (BUS_PROBE_GENERIC + 1);
    return (BUS_PROBE_DEFAULT + 1);

diff --git a/sys/dev/usb/input/usbhid.c b/sys/dev/usb/input/usbhid.c
index 174e1c28ae96..7b19d713c943 100644
--- a/sys/dev/usb/input/usbhid.c
+++ b/sys/dev/usb/input/usbhid.c
@@ -802,6 +802,7 @@ usbhid_probe(device_t dev)
    if (hid_test_quirk(&sc->sc_hw, HQ_HID_IGNORE))
    return (ENXIO);
+// return (BUS_PROBE_GENERIC + 1);
    return (BUS_PROBE_DEFAULT + 1);
}



If I understand correctly, this patch is already in main as
975407b1d8dcceac2b54e2c4df96aadec7dc4c3a


RES: usb mouse not work on boot

2024-05-18 Thread Ivan Quitschal
Hi Warner /  WBR / Oleksandr

I'm not sure if that's the case with this uhid.ko, but do you guys remember I
had a priority issue with this module and Vladimir made me a patch to fix the
attach priority?

Warner, was it fixed since then?


Let me show the patch I use to this very day. The important line is this; the
patch might be wrong, because I'm still on 14-current

+// return (BUS_PROBE_GENERIC + 1);
return (BUS_PROBE_DEFAULT + 1);



diff --git a/sys/dev/usb/input/usbhid.c b/sys/dev/usb/input/usbhid.c
index 174e1c28ae96..7b19d713c943 100644
--- a/sys/dev/usb/input/usbhid.c
+++ b/sys/dev/usb/input/usbhid.c
@@ -802,6 +802,7 @@ usbhid_probe(device_t dev)
if (hid_test_quirk(&sc->sc_hw, HQ_HID_IGNORE))
return (ENXIO);
+// return (BUS_PROBE_GENERIC + 1);
return (BUS_PROBE_DEFAULT + 1);
}

Thanks

--tzk

From: owner-freebsd-curr...@freebsd.org On behalf of Warner Losh
Sent: Saturday, May 18, 2024 12:33
To: Oleksandr Kryvulia
Cc: FreeBSD Current
Subject: Re: usb mouse not work on boot


On Sat, May 18, 2024, 9:22 AM Oleksandr Kryvulia wrote:
18.05.24 16:06, Warner Losh:


On Sat, May 18, 2024, 6:51 AM Oleksandr Kryvulia wrote:
18.05.24 12:59, Oleksandr Kryvulia:

18.05.24 12:55, Dag-Erling Smørgrav:


Oleksandr Kryvulia  
writes:

Gary Jennejohn  writes:

Try adding uhid_load="YES" to your /boot/loader.conf.  With that
added the module should be automatically loaded during the kernel
boot.

As workaround I already have kld_list+="uhid" in /etc/rc.conf.

I hope you don't mean that literally, because /etc/rc.conf is a shell
script and += is not valid shell syntax.  On the other hand, something
like

kld_list="${kld_list} uhid"

Yes, you are right. I mean
sysrc kld_list+="uhid"

One more correction. Via kld_list I need load ums(4), loading only uhid(4) does 
not solve a problem.


You don't need to change kld_list. In fact, you should undo any changes you've 
made there. Undo everything in loader.conf you've done.

This is a bug in the boot optimization stuff. Or rather, this exposes a long 
standing bug in the USB code where there's an asymmetry between the nomatch 
events and the bus tree it presents to devctl causing devmatch to fail when the 
nomatch events aren't present on boot.

Just set hw.bus.devctl_nomatch_enabled=1 in /boot/loader.conf and reboot. Or 
update to the change I'm about to make.


Thanks for the detailed explanation, Warner. Interesting that on my system 
hw.bus.devctl_nomatch_enabled=1 is set by /etc/rc.d/devmatch but only explicit 
set it in /boot/loader.conf did the trick. That is why I think this sysctl 
don't work in my case.

Yea. That's the optimization. We don't start generating events until it is one.
Setting it in the bootloader causes all events to come through. Setting it in
devmatch turns them on after we run devmatch the first time, omitting all of
the ones generated on boot.

Warner


Re: usb mouse not work on boot

2024-05-18 Thread Warner Losh
On Sat, May 18, 2024, 9:22 AM Oleksandr Kryvulia 
wrote:

> 18.05.24 16:06, Warner Losh:
>
>
>
> On Sat, May 18, 2024, 6:51 AM Oleksandr Kryvulia 
> wrote:
>
>> 18.05.24 12:59, Oleksandr Kryvulia:
>>
>> 18.05.24 12:55, Dag-Erling Smørgrav:
>>
>> Oleksandr Kryvulia   writes:
>>
>> Gary Jennejohn   writes:
>>
>> Try adding uhid_load="YES" to your /boot/loader.conf.  With that
>> added the module should be automatically loaded during the kernel
>> boot.
>>
>> As workaround I already have kld_list+="uhid" in /etc/rc.conf.
>>
>> I hope you don't mean that literally, because /etc/rc.conf is a shell
>> script and += is not valid shell syntax.  On the other hand, something
>> like
>>
>> kld_list="${kld_list} uhid"
>>
>> Yes, you are right. I mean
>> sysrc kld_list+="uhid"
>>
>>
>> One more correction. Via kld_list I need load ums(4), loading only
>> uhid(4) does not solve a problem.
>>
>
>
> You don't need to change kld_list. In fact, you should undo any changes
> you've made there. Undo everything in loader.conf you've done.
>
> This is a bug in the boot optimization stuff. Or rather, this exposes a
> long standing bug in the USB code where there's an asymmetry between the
> nomatch events and the bus tree it presents to devctl causing devmatch to
> fail when the nomatch events aren't present on boot.
>
> Just set hw.bus.devctl_nomatch_enabled=1 in /boot/loader.conf and reboot.
> Or update to the change I'm about to make.
>
>
> Thanks for the detailed explanation, Warner. Interesting that on my system
> hw.bus.devctl_nomatch_enabled=1 is set by /etc/rc.d/devmatch but only
> explicit set it in /boot/loader.conf did the trick. That is why I think
> this sysctl don't work in my case.
>

Yea. That's the optimization. We don't start generating events until it is
one. Setting it in the bootloader causes all events to come through.
Setting it in devmatch turns them on after we run devmatch the first time,
omitting all of the ones generated on boot.

Warner

>


Re: usb mouse not work on boot

2024-05-18 Thread Oleksandr Kryvulia

18.05.24 16:06, Warner Losh:



On Sat, May 18, 2024, 6:51 AM Oleksandr Kryvulia 
 wrote:


18.05.24 12:59, Oleksandr Kryvulia:

18.05.24 12:55, Dag-Erling Smørgrav:

Oleksandr Kryvulia   
 writes:

Gary Jennejohn    writes:

Try adding uhid_load="YES" to your /boot/loader.conf.  With that
added the module should be automatically loaded during the kernel
boot.

As workaround I already have kld_list+="uhid" in /etc/rc.conf.

I hope you don't mean that literally, because /etc/rc.conf is a shell
script and += is not valid shell syntax.  On the other hand, something
like

kld_list="${kld_list} uhid"

Yes, you are right. I mean
sysrc kld_list+="uhid"


One more correction. Via kld_list I need load ums(4), loading only
uhid(4) does not solve a problem.



You don't need to change kld_list. In fact, you should undo any 
changes you've made there. Undo everything in loader.conf you've done.


This is a bug in the boot optimization stuff. Or rather, this exposes 
a long standing bug in the USB code where there's an asymmetry between 
the nomatch events and the bus tree it presents to devctl causing 
devmatch to fail when the nomatch events aren't present on boot.


Just set hw.bus.devctl_nomatch_enabled=1 in /boot/loader.conf and 
reboot. Or update to the change I'm about to make.




Thanks for the detailed explanation, Warner. Interesting that on my 
system hw.bus.devctl_nomatch_enabled=1 is set by /etc/rc.d/devmatch but 
only explicit set it in /boot/loader.conf did the trick. That is why I 
think this sysctl don't work in my case.

Re: bsdinstall wifi setup is broken on CURRENT

2024-05-18 Thread Alfonso S. Siciliano

On 5/16/24 20:40, Renato Botelho wrote:
I saw some users on a .br group complaining bsdinstall was failing to 
setup wifi network on 15.0 snapshots and tried it myself.  I was able to 
reproduce the problem and also noticed another one.




Thank you for your report, the video is highly appreciated to understand 
the problem quickly and exactly.


I noticed Network Selection screen only shows one line, it's not 
beautiful to navigate through items this way.  On 14.1-BETA2 it shows 
multiple lines so it seems to be a regression.


Problem 1. Looking at wlanconfig it seems related to $height $width 
$rows for the selecting menu. Please could you open a PR adding me, so 
we can test and solve.




The problem users reported was: after selecting desired network it just 
starts over instead of asking for password.  I made a video [1] showing 
the problem.


Problem 2. I know about this --mixedform issue; my last import, two days ago
(a6d8be451f62d425b71a4874f7d4e133b9fb393c), should solve it.
You could try the last main snapshot (yesterday 17 May), please let me 
know any problem.




Jessica, I've cc'd you because git shows you were the last person making 
changes in this area.  If it's not related and I made a mistake, just 
ignore me.


[1] https://youtube.com/shorts/Gmeckokw2a0


Again thanks for the video.

Best Regards,
Alfonso




Re: usb mouse not work on boot

2024-05-18 Thread Warner Losh
On Sat, May 18, 2024 at 6:51 AM Oleksandr Kryvulia 
wrote:

> 18.05.24 12:59, Oleksandr Kryvulia:
>
> 18.05.24 12:55, Dag-Erling Smørgrav:
>
> Oleksandr Kryvulia   writes:
>
> Gary Jennejohn   writes:
>
> Try adding uhid_load="YES" to your /boot/loader.conf.  With that
> added the module should be automatically loaded during the kernel
> boot.
>
> As workaround I already have kld_list+="uhid" in /etc/rc.conf.
>
> I hope you don't mean that literally, because /etc/rc.conf is a shell
> script and += is not valid shell syntax.  On the other hand, something
> like
>
> kld_list="${kld_list} uhid"
>
> Yes, you are right. I mean
> sysrc kld_list+="uhid"
>
>
> One more correction. Via kld_list I need load ums(4), loading only uhid(4)
> does not solve a problem.
>

Also, in this case, kld_list is a terrible place to load the files. You're
better off loading them with xxx_load=YES in loader.conf. The reason is
that both uhid and ums will match your mouse. kld_list loads these in a
random order (effectively) and the first one to load will claim the device,
since there's no re-probe when the next one loads. You should never use it,
unless the module you're loading isn't supported by the boot loader (like
drm-kmod). The old advice was to put everything in kld_list and it would
speed up boot, but all the performance bugs in the boot loader have been
fixed by a combination of moving to UEFI (which is generally faster),
BIOSes with performance bugs disappearing 10 years ago and block caching
being added to the boot loader. It should almost always be empty or just
drm-kmod these days (unless you somehow have special needs).

By adding uhid last to this list in this way, you're guaranteeing you'll
hit this bug because it's not after ums, and that things won't work.
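
As a rough illustration of the loader.conf approach described above, the sketch below turns a space-separated module list into xxx_load="YES" lines. The helper name to_loader_lines and the module names are made up for the example; modules the boot loader cannot load (drm-kmod is the case named above) would have to stay in kld_list.

```shell
#!/bin/sh
# Sketch only: print loader.conf-style lines for a space-separated module list.
# to_loader_lines is a hypothetical helper; the module names are examples.
to_loader_lines() {
    # $1: module names separated by spaces, e.g. "ums snd_uaudio"
    for mod in $1; do
        printf '%s_load="YES"\n' "$mod"
    done
}

# Print the lines that would go into /boot/loader.conf.
to_loader_lines "ums snd_uaudio"
```

The output is meant to be reviewed and pasted by hand; it assumes plain module names rather than full paths.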

Warner


Re: usb mouse not work on boot

2024-05-18 Thread Warner Losh
On Sat, May 18, 2024, 6:51 AM Oleksandr Kryvulia 
wrote:

> 18.05.24 12:59, Oleksandr Kryvulia:
>
> 18.05.24 12:55, Dag-Erling Smørgrav:
>
> Oleksandr Kryvulia   writes:
>
> Gary Jennejohn   writes:
>
> Try adding uhid_load="YES" to your /boot/loader.conf.  With that
> added the module should be automatically loaded during the kernel
> boot.
>
> As workaround I already have kld_list+="uhid" in /etc/rc.conf.
>
> I hope you don't mean that literally, because /etc/rc.conf is a shell
> script and += is not valid shell syntax.  On the other hand, something
> like
>
> kld_list="${kld_list} uhid"
>
> Yes, you are right. I mean
> sysrc kld_list+="uhid"
>
>
> One more correction. Via kld_list I need load ums(4), loading only uhid(4)
> does not solve a problem.
>


You don't need to change kld_list. In fact, you should undo any changes
you've made there. Undo everything in loader.conf you've done.

This is a bug in the boot optimization stuff. Or rather, this exposes a
long standing bug in the USB code where there's an asymmetry between the
nomatch events and the bus tree it presents to devctl causing devmatch to
fail when the nomatch events aren't present on boot.

Just set hw.bus.devctl_nomatch_enabled=1 in /boot/loader.conf and reboot.
Or update to the change I'm about to make.
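
A minimal sketch of applying that workaround without adding the line twice is shown here. set_tunable is a hypothetical helper, the demonstration runs on a scratch file rather than the real /boot/loader.conf, and the unquoted value follows the form used in this thread.

```shell
#!/bin/sh
# Sketch only: add a "name=value" line to a loader.conf-style file unless
# a line for that name is already present. set_tunable is a hypothetical helper.
set_tunable() {
    file=$1 name=$2 value=$3
    # Look for an existing line that starts with "name=".
    if grep -q "^${name}=" "$file" 2>/dev/null; then
        echo "${name} already set in ${file}"
    else
        printf '%s=%s\n' "$name" "$value" >> "$file"
    fi
}

# Demonstrate on a scratch file; on a real system the target would be
# /boot/loader.conf and the change takes effect at the next boot.
conf=$(mktemp)
set_tunable "$conf" hw.bus.devctl_nomatch_enabled 1
set_tunable "$conf" hw.bus.devctl_nomatch_enabled 1
cat "$conf"
```

The second call leaves the file unchanged, so the helper can be rerun safely.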

Warner


Re: usb mouse not work on boot

2024-05-18 Thread Oleksandr Kryvulia

18.05.24 12:59, Oleksandr Kryvulia:

18.05.24 12:55, Dag-Erling Smørgrav:

Oleksandr Kryvulia  writes:

Gary Jennejohn  writes:

Try adding uhid_load="YES" to your /boot/loader.conf.  With that
added the module should be automatically loaded during the kernel
boot.

As workaround I already have kld_list+="uhid" in /etc/rc.conf.

I hope you don't mean that literally, because /etc/rc.conf is a shell
script and += is not valid shell syntax.  On the other hand, something
like

kld_list="${kld_list} uhid"

Yes, you are right. I mean
sysrc kld_list+="uhid"


One more correction. Via kld_list I need load ums(4), loading only 
uhid(4) does not solve a problem.


Re: usb mouse not work on boot

2024-05-18 Thread Oleksandr Kryvulia

18.05.24 12:55, Dag-Erling Smørgrav:

Oleksandr Kryvulia  writes:

Gary Jennejohn  writes:

Try adding uhid_load="YES" to your /boot/loader.conf.  With that
added the module should be automatically loaded during the kernel
boot.

As workaround I already have kld_list+="uhid" in /etc/rc.conf.

I hope you don't mean that literally, because /etc/rc.conf is a shell
script and += is not valid shell syntax.  On the other hand, something
like

kld_list="${kld_list} uhid"

Yes, you are right. I mean
sysrc kld_list+="uhid"

Re: usb mouse not work on boot

2024-05-18 Thread Dag-Erling Smørgrav
Oleksandr Kryvulia  writes:
> Gary Jennejohn  writes:
> > Try adding uhid_load="YES" to your /boot/loader.conf.  With that
> > added the module should be automatically loaded during the kernel
> > boot.
> As workaround I already have kld_list+="uhid" in /etc/rc.conf.

I hope you don't mean that literally, because /etc/rc.conf is a shell
script and += is not valid shell syntax.  On the other hand, something
like

kld_list="${kld_list} uhid"

should work, and is preferable to Gary's suggestion since loading
modules pre-boot is significantly slower and should only be done for
modules which are required to boot or mount the root filesystem, such as
zfs.
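
The append form above can be tried in any POSIX shell. This is only a sketch: the starting value "zfs" and the variable other_list are assumptions made for the demonstration.

```shell
#!/bin/sh
# Sketch only: append a module name to a space-separated list using plain
# POSIX parameter expansion, which works in /bin/sh without "+=".
kld_list="zfs"                      # example starting value (an assumption)
kld_list="${kld_list} uhid"         # the form suggested above
echo "$kld_list"

# Variant that avoids a leading space when the list starts out empty or unset.
other_list=""
other_list="${other_list:+${other_list} }uhid"
echo "$other_list"
```

Note that sysrc(8) interprets name+=value itself, so the "+=" spelling is fine on a sysrc command line even though it is not portable inside rc.conf.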

> But IMHO it some regression.

I agree, and 6437872c1d66 should be reverted until devmatch is capable
of loading uhid.

DES
-- 
Dag-Erling Smørgrav - d...@freebsd.org



Re: usb mouse not work on boot

2024-05-18 Thread Oleksandr Kryvulia

18.05.24 12:42, Tomek CEDRO:

does it also affect usb keyboard in single boot mode?


Good question. I don't have a USB keyboard right now and will check it a
bit later.





Re: usb mouse not work on boot

2024-05-18 Thread Nuno Teixeira
Hello,

To fix my setup with usb mouse and audio dac on both amd64 (laptop) and
rpi4:

/boot/loader.conf.local:
snd_uaudio_load="YES"
ums_load="YES"

This restores the previous behaviour: the mouse is detected before the login
prompt, and the audio DAC is detected in time for its sysctl settings to be
applied correctly.

Cheers,

Oleksandr Kryvulia wrote (Saturday, 18/05/2024
at 09:24):

> 18.05.24 10:26, Gary Jennejohn:
> > On Sat, 18 May 2024 09:20:24 +0300
> > Oleksandr Kryvulia  wrote:
> >
> >> After 6437872c1d665c2605f54e8ff040b0ba41edad07 my usb mouse no longer
> >> works on boot because uhid(4) is not autoloaded. To make it work I need
> >> to manually load uhid or replug my usb mouse.
> >>
> > Try adding uhid_load="YES" to your /boot/loader.conf.  With that added
> > the module should be automatically loaded during the kernel boot.
>
> As workaround I already have kld_list+="uhid" in /etc/rc.conf. But IMHO
> it some regression.
>
>
>

-- 
Nuno Teixeira
FreeBSD UNIX: Web:  https://FreeBSD.org


Re: usb mouse not work on boot

2024-05-18 Thread Oleksandr Kryvulia

18.05.24 10:26, Gary Jennejohn:

On Sat, 18 May 2024 09:20:24 +0300
Oleksandr Kryvulia  wrote:


After 6437872c1d665c2605f54e8ff040b0ba41edad07 my usb mouse no longer
works on boot because uhid(4) is not autoloaded. To make it work I need
to manually load uhid or replug my usb mouse.


Try adding uhid_load="YES" to your /boot/loader.conf.  With that added
the module should be automatically loaded during the kernel boot.


As a workaround I already have kld_list+="uhid" in /etc/rc.conf. But IMHO
it is a regression.





Re: usb mouse not work on boot

2024-05-18 Thread Gary Jennejohn
On Sat, 18 May 2024 09:20:24 +0300
Oleksandr Kryvulia  wrote:

> After 6437872c1d665c2605f54e8ff040b0ba41edad07 my usb mouse no longer
> works on boot because uhid(4) is not autoloaded. To make it work I need
> to manually load uhid or replug my usb mouse.
>

Try adding uhid_load="YES" to your /boot/loader.conf.  With that added
the module should be automatically loaded during the kernel boot.

--
Gary Jennejohn



usb mouse not work on boot

2024-05-18 Thread Oleksandr Kryvulia
After 6437872c1d665c2605f54e8ff040b0ba41edad07 my usb mouse no longer 
works on boot because uhid(4) is not autoloaded. To make it work I need 
to manually load uhid or replug my usb mouse.




usb devices discovery delay

2024-05-17 Thread Nuno Teixeira
Hello all,

On recent main-n270203-2790ff21452f, my USB mouse and audio DAC are only
detected about 30 seconds after the login prompt appears.

I don't see anything relevant in dmesg, but I do see that this entry:

sysctl.conf
dev.pcm.4.play.vchanmode=passthrough

gives an error about a nonexistent dev.pcm.4 (the USB audio DAC), which means
that the USB devices had not been detected at that point.

Has anyone else experienced this?

Thanks

-- 
Nuno Teixeira
FreeBSD UNIX: Web:  https://FreeBSD.org


Re: kldload tpm: Fail to load: "link_elf_obj: symbol tpm_bus_driver undefined"

2024-05-17 Thread Nuno Teixeira
Working fine!

Thanks for the fast fix.

Justin Hibbits  wrote (Friday, 17/05/2024 at
13:57):

> On Fri, 17 May 2024 11:09:00 +0100
> Nuno Teixeira  wrote:
>
> > Hello,
> >
> > tpm kernel module fails to load starting on main from May 9.
> > Updated today and same error:
> >
> > ```
> > $ kldload tpm
> > kldload: an error occurred while loading module tpm. Please check
> > dmesg(8) for more details.
> >
> > (dmesg)
> > link_elf_obj: symbol tpm_bus_driver undefined
> > linker_load_file: /boot/kernel/tpm.ko - unsupported file type
> > ```
> >
> > I believe it is related to:
> >
> > ---
> > commit 10eea8dc8c4f3d2a3495e7fb08837d91adf465e9
> > Author: Justin Hibbits 
> > Date:   Thu May 9 15:27:35 2024 -0400
> >
> > tpm20: Support partial reads
> >
> > Summary:
> > In some cases the TPM utilities may read only a partial block,
> > instead of a full block.  If a new command starts while in the middle
> > of a read it may cause the TPM to go catatonic and no longer respond
> > to SPI.
> >
> > Reviewed by:kd
> > Obtained from:  Juniper Networks, Inc.
> > Differential Revision: https://reviews.freebsd.org/D45140
> > ---
> >
> > I use tpm for bhyve/Win11.
> >
> > Thanks,
>
> Sorry for the breakage.  Should be fixed by 62adeb92.
>
> - Justin
>


-- 
Nuno Teixeira
FreeBSD UNIX: Web:  https://FreeBSD.org


Re: kldload tpm: Fail to load: "link_elf_obj: symbol tpm_bus_driver undefined"

2024-05-17 Thread Justin Hibbits
On Fri, 17 May 2024 11:09:00 +0100
Nuno Teixeira  wrote:

> Hello,
> 
> tpm kernel module fails to load starting on main from May 9.
> Updated today and same error:
> 
> ```
> $ kldload tpm
> kldload: an error occurred while loading module tpm. Please check
> dmesg(8) for more details.
> 
> (dmesg)
> link_elf_obj: symbol tpm_bus_driver undefined
> linker_load_file: /boot/kernel/tpm.ko - unsupported file type
> ```
> 
> I believe it is related to:
> 
> ---
> commit 10eea8dc8c4f3d2a3495e7fb08837d91adf465e9
> Author: Justin Hibbits 
> Date:   Thu May 9 15:27:35 2024 -0400
> 
> tpm20: Support partial reads
> 
> Summary:
> In some cases the TPM utilities may read only a partial block,
> instead of a full block.  If a new command starts while in the middle
> of a read it may cause the TPM to go catatonic and no longer respond
> to SPI.
> 
> Reviewed by:kd
> Obtained from:  Juniper Networks, Inc.
> Differential Revision: https://reviews.freebsd.org/D45140
> ---
> 
> I use tpm for bhyve/Win11.
> 
> Thanks,

Sorry for the breakage.  Should be fixed by 62adeb92.

- Justin



Re: Panic: lock "lnxspin" 0xfffff800176c0730 already initialized

2024-05-17 Thread David Wolfskill
On Fri, May 17, 2024 at 08:00:05AM +0200, Emmanuel Vadot wrote:
> ...
>  Indeed, even if I know that I tested with GENERIC and amdgpu I think
> that I've only tested GENERIC-NODEBUG with i915kms.
>  Anyway, I've pushed both patches now. Sorry for the breakage.
> 
>  Cheers,
> 

Success:

g1-70(15.0-C)[1] uname -aUK
FreeBSD g1-70.catwhisker.org 15.0-CURRENT FreeBSD 15.0-CURRENT #147 
main-n270199-cd3681011001: Fri May 17 11:10:47 UTC 2024 
r...@g1-70.catwhisker.org:/common/S4/obj/usr/src/amd64.amd64/sys/CANARY amd64 
1500018 1500018

Thank you! :-)

Peace,
david
-- 
David H. Wolfskill  da...@catwhisker.org
Please do not mistake "authoritarian" for "conservative" -- or vice versa.

See https://www.catwhisker.org/~david/publickey.gpg for my public key.




kldload tpm: Fail to load: "link_elf_obj: symbol tpm_bus_driver undefined"

2024-05-17 Thread Nuno Teixeira
Hello,

The tpm kernel module has failed to load on main since May 9.
I updated today and get the same error:

```
$ kldload tpm
kldload: an error occurred while loading module tpm. Please check dmesg(8)
for more details.

(dmesg)
link_elf_obj: symbol tpm_bus_driver undefined
linker_load_file: /boot/kernel/tpm.ko - unsupported file type
```

I believe it is related to:

---
commit 10eea8dc8c4f3d2a3495e7fb08837d91adf465e9
Author: Justin Hibbits 
Date:   Thu May 9 15:27:35 2024 -0400

tpm20: Support partial reads

Summary:
In some cases the TPM utilities may read only a partial block, instead
of a full block.  If a new command starts while in the middle of a read
it may cause the TPM to go catatonic and no longer respond to SPI.

Reviewed by:kd
Obtained from:  Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D45140
---

I use tpm for bhyve/Win11.

Thanks,
-- 
Nuno Teixeira
FreeBSD UNIX: Web:  https://FreeBSD.org


Re: Panic: lock "lnxspin" 0xfffff800176c0730 already initialized

2024-05-17 Thread Emmanuel Vadot
On Thu, 16 May 2024 22:10:16 -0700
Ryan Libby  wrote:

> On Thu, May 16, 2024 at 9:56 PM Emmanuel Vadot  wrote:
> >
> > On Thu, 16 May 2024 10:27:40 -0700
> > Ryan Libby  wrote:
> >
> > > On Thu, May 16, 2024 at 6:00 AM David Wolfskill  wrote:
> > > >
> > > > This is running main-n270174-abb1a1340e3f (built in-place from
> > > > main-n270163-154ad8e0f88f), with ports at main-n663685-3f732745ab06;
> > > > the ports-resident kernel modules were rebuilt with the kernel,
> > > > courtesy (e.g.):
> > > >
> > > > g1-70(14.1-S)[4] grep '^PORT' /etc/src.conf
> > > > PORTS_MODULES+=graphics/drm-61-kmod
> > > >
> > > > And since I dislike "sample sizes of one," I have this result on
> > > > two different laptops, each of which has both Nvidia & Intel graphics
> > > > (but for the older one (M4800), I stopped using (& building) the
> > > > Nvidia driver, since enabling it appears to disable GLX).
> > > >
> > > > Anyway: photos of the backtraces are at
> > > > https://www.catwhisker.org/~david/FreeBSD/head/n270174/
> > > > as are copies of the build typescripts.
> > > >
> > > > Unfortunately, the panic message itself had (just) scrolled off the
> > > > top at the time I took the photos, but I hand-typed it (from the
> > > > M4800) in the Subject.
> > > >
> > > > Peace,
> > > > david
> > > > --
> > > > David H. Wolfskill  da...@catwhisker.org
> > > > Please do not mistake "authoritarian" for "conservative" -- or vice 
> > > > versa.
> > > >
> > > > See https://www.catwhisker.org/~david/publickey.gpg for my public key.
> > >
> > > Maybe regression from ae38a1a1bfdf320089c254e4dbffdf4769d89110 by manu.
> > >
> > > It looks like spin_lock_init was changed to no longer zero out the
> > > mutex before calling mtx_init, but the MTX_NEW flag was not added.
> > >
> > > Ryan
> > >
> >
> > Could be, I cannot reproduce this here (either with i915kms or amdgpu)
> > but I guess that depending on the hardware version or number of screens
> > etc ... code path is different and might trigger this.
> >  David can you test with
> > https://people.freebsd.org/~manu/0001-linuxkpi-Fix-spin_lock_init.patch
> > just to be sure that it fixes this issue ?
> >
> >  Cheers,
> >
> > --
> > Emmanuel Vadot  
> 
> It may depend on getting lucky with the uninitialized junk too, and you would
> need a kernel with KASSERTs enabled.
> 
> manu, I think the rwlock patch 5c0a1923486e65cd47398e52c03cb289d6120a78
> may need the same treatment with RW_NEW.
> 
> Ryan
> 

 Indeed, even if I know that I tested with GENERIC and amdgpu I think
that I've only tested GENERIC-NODEBUG with i915kms.
 Anyway, I've pushed both patches now. Sorry for the breakage.

 Cheers,

-- 
Emmanuel Vadot  



Re: Panic: lock "lnxspin" 0xfffff800176c0730 already initialized

2024-05-16 Thread Ryan Libby
On Thu, May 16, 2024 at 9:56 PM Emmanuel Vadot  wrote:
>
> On Thu, 16 May 2024 10:27:40 -0700
> Ryan Libby  wrote:
>
> > On Thu, May 16, 2024 at 6:00 AM David Wolfskill  wrote:
> > >
> > > This is running main-n270174-abb1a1340e3f (built in-place from
> > > main-n270163-154ad8e0f88f), with ports at main-n663685-3f732745ab06;
> > > the ports-resident kernel modules were rebuilt with the kernel,
> > > courtesy (e.g.):
> > >
> > > g1-70(14.1-S)[4] grep '^PORT' /etc/src.conf
> > > PORTS_MODULES+=graphics/drm-61-kmod
> > >
> > > And since I dislike "sample sizes of one," I have this result on
> > > two different laptops, each of which has both Nvidia & Intel graphics
> > > (but for the older one (M4800), I stopped using (& building) the
> > > Nvidia driver, since enabling it appears to disable GLX).
> > >
> > > Anyway: photos of the backtraces are at
> > > https://www.catwhisker.org/~david/FreeBSD/head/n270174/
> > > as are copies of the build typescripts.
> > >
> > > Unfortunately, the panic message itself had (just) scrolled off the
> > > top at the time I took the photos, but I hand-typed it (from the
> > > M4800) in the Subject.
> > >
> > > Peace,
> > > david
> > > --
> > > David H. Wolfskill  da...@catwhisker.org
> > > Please do not mistake "authoritarian" for "conservative" -- or vice versa.
> > >
> > > See https://www.catwhisker.org/~david/publickey.gpg for my public key.
> >
> > Maybe regression from ae38a1a1bfdf320089c254e4dbffdf4769d89110 by manu.
> >
> > It looks like spin_lock_init was changed to no longer zero out the
> > mutex before calling mtx_init, but the MTX_NEW flag was not added.
> >
> > Ryan
> >
>
> Could be, I cannot reproduce this here (either with i915kms or amdgpu)
> but I guess that depending on the hardware version or number of screens
> etc ... code path is different and might trigger this.
>  David can you test with
> https://people.freebsd.org/~manu/0001-linuxkpi-Fix-spin_lock_init.patch
> just to be sure that it fixes this issue ?
>
>  Cheers,
>
> --
> Emmanuel Vadot  

It may depend on getting lucky with the uninitialized junk too, and you would
need a kernel with KASSERTs enabled.

manu, I think the rwlock patch 5c0a1923486e65cd47398e52c03cb289d6120a78
may need the same treatment with RW_NEW.

Ryan



Re: Panic: lock "lnxspin" 0xfffff800176c0730 already initialized

2024-05-16 Thread Emmanuel Vadot
On Thu, 16 May 2024 10:27:40 -0700
Ryan Libby  wrote:

> On Thu, May 16, 2024 at 6:00 AM David Wolfskill  wrote:
> >
> > This is running main-n270174-abb1a1340e3f (built in-place from
> > main-n270163-154ad8e0f88f), with ports at main-n663685-3f732745ab06;
> > the ports-resident kernel modules were rebuilt with the kernel,
> > courtesy (e.g.):
> >
> > g1-70(14.1-S)[4] grep '^PORT' /etc/src.conf
> > PORTS_MODULES+=graphics/drm-61-kmod
> >
> > And since I dislike "sample sizes of one," I have this result on
> > two different laptops, each of which has both Nvidia & Intel graphics
> > (but for the older one (M4800), I stopped using (& building) the
> > Nvidia driver, since enabling it appears to disable GLX).
> >
> > Anyway: photos of the backtraces are at
> > https://www.catwhisker.org/~david/FreeBSD/head/n270174/
> > as are copies of the build typescripts.
> >
> > Unfortunately, the panic message itself had (just) scrolled off the
> > top at the time I took the photos, but I hand-typed it (from the
> > M4800) in the Subject.
> >
> > Peace,
> > david
> > --
> > David H. Wolfskill  da...@catwhisker.org
> > Please do not mistake "authoritarian" for "conservative" -- or vice versa.
> >
> > See https://www.catwhisker.org/~david/publickey.gpg for my public key.
> 
> Maybe regression from ae38a1a1bfdf320089c254e4dbffdf4769d89110 by manu.
> 
> It looks like spin_lock_init was changed to no longer zero out the
> mutex before calling mtx_init, but the MTX_NEW flag was not added.
> 
> Ryan
> 

Could be. I cannot reproduce this here (with either i915kms or amdgpu),
but I guess that, depending on the hardware version, number of screens,
etc., the code path is different and might trigger this.
 David, can you test with
https://people.freebsd.org/~manu/0001-linuxkpi-Fix-spin_lock_init.patch
just to be sure that it fixes this issue?

 Cheers,

-- 
Emmanuel Vadot  



Re: gcc behavior of init priority of .ctors and .dtors section

2024-05-16 Thread Zhenlei Huang


> On May 17, 2024, at 2:26 AM, Konstantin Belousov  wrote:
> 
> On Thu, May 16, 2024 at 08:06:46PM +0800, Zhenlei Huang wrote:
>> Hi,
>> 
>> I'm recently working on https://reviews.freebsd.org/D45194 and got noticed
>> that gcc behaves weirdly.
>> 
>> A simple source file to demonstrate that.
>> 
>> ```
>> # cat ctors.c
>> 
>> #include <stdio.h>
>> 
>> __attribute__((constructor(101))) void init_101() { puts("init 1"); }
>> __attribute__((constructor(65535))) void init_65535() { puts("init 3"); }
>> __attribute__((constructor)) void init() { puts("init 4"); }
>> __attribute__((constructor(65535))) void init_65535_2() { puts("init 5"); }
>> __attribute__((constructor(65534))) void init_65534() { puts("init 2"); }
>> 
>> int main() { puts("main"); }
>> 
>> __attribute__((destructor(65534))) void fini_65534() { puts("fini 2"); }
>> __attribute__((destructor(65535))) void fini_65535() { puts("fini 3"); }
>> __attribute__((destructor)) void fini() { puts("fini 4"); }
>> __attribute__((destructor(65535))) void fini_65535_2() { puts("fini 5"); }
>> __attribute__((destructor(101))) void fini_101() { puts("fini 1"); }
>> 
>> # clang ctors.c && ./a.out
>> init 1
>> init 2
>> init 3
>> init 4
>> init 5
>> main
>> fini 5
>> fini 4
>> fini 3
>> fini 2
>> fini 1
>> ```
>> 
>> clang with the option -fno-use-init-array and run will produce the same
>> result, which is what I expected.
> Why do you add that switch?

gcc13 in ports is not configured with --enable-initfini-array, so it only
produces .ctors / .dtors sections and not .init_array / .fini_array sections.
I add that switch so that clang also produces `.ctors` sections, to serve as
a baseline (the .ctors produced by clang does work as expected, the same as
.init_array).

> 
>> 
>> gcc13 from ports
>> ```
>> # gcc ctors.c && ./a.out
>> init 1
>> init 2
>> init 5
>> init 4
>> init 3
>> main
>> fini 3
>> fini 4
>> fini 5
>> fini 2
>> fini 1
>> ```
>> 
>> The above order is not expected. I think clang's one is correct.
>> 
>> Further hacking with readelf shows that clang produces the right order of
>> section .rela.ctors but gcc does not.
>> 
>> ```
>> # clang -fno-use-init-array -c ctors.c && readelf -r ctors.o | grep 'Relocation section with addend (.rela.ctors)' -A5 > clang.txt
>> # gcc -c ctors.c && readelf -r ctors.o | grep 'Relocation section with addend (.rela.ctors)' -A5 > gcc.txt
>> # diff clang.txt gcc.txt
>> 3,5c3,5
>> <  00080001 R_X86_64_64 0060 init_65535_2 + 0
>> < 0008 00070001 R_X86_64_64 0040 init + 0
>> < 0010 00060001 R_X86_64_64 0020 init_65535 + 0
>> ---
>> >  00060001 R_X86_64_64 0011 init_65535 + 0
>> > 0008 00070001 R_X86_64_64 0022 init + 0
>> > 0010 00080001 R_X86_64_64 0033 init_65535_2 + 0
>> ```
>> 
>> The above show clearly gcc produces the wrong order of section `.rela.ctors`.
>> 
>> Is that expected behavior ?
>> 
>> I have not tried Linux version of gcc.
> Note that init array vs. init function behavior is encoded by a note added
> by crt1.o.  I suspect that the problem is that gcc port is built without
> --enable-initfini-array configure option.





Re: bsdinstall wifi setup is broken on CURRENT

2024-05-16 Thread Dag-Erling Smørgrav
Renato Botelho  writes:
> I'm not sure about a good way to test it on a running system instead.

Update your source tree, build and install world, run `sudo bsdconfig`,
scroll down and select “Network Management”, then select “Wireless
Networks”.

DES
-- 
Dag-Erling Smørgrav - d...@freebsd.org



Re: gcc behavior of init priority of .ctors and .dtors section

2024-05-16 Thread Konstantin Belousov
On Thu, May 16, 2024 at 08:05:57PM +, Lorenzo Salvadore wrote:
> On Thursday, May 16th, 2024 at 20:26, Konstantin Belousov 
>  wrote:
> > > gcc13 from ports
> > > `# gcc ctors.c && ./a.out init 1 init 2 init 5 init 4 init 3 main fini 3 
> > > fini 4 fini 5 fini 2 fini 1`
> > > 
> > > The above order is not expected. I think clang's one is correct.
> > > 
> > > Further hacking with readelf shows that clang produces the right order of
> > > section .rela.ctors but gcc does not.
> > > 
> > > ```
> > > # clang -fno-use-init-array -c ctors.c && readelf -r ctors.o | grep 'Relocation section with addend (.rela.ctors)' -A5 > clang.txt
> > > # gcc -c ctors.c && readelf -r ctors.o | grep 'Relocation section with addend (.rela.ctors)' -A5 > gcc.txt
> > > # diff clang.txt gcc.txt
> > > 3,5c3,5
> > > <  00080001 R_X86_64_64 0060 init_65535_2 + 0
> > > < 0008 00070001 R_X86_64_64 0040 init + 0
> > > < 0010 00060001 R_X86_64_64 0020 init_65535 + 0
> > > ---
> > > >  00060001 R_X86_64_64 0011 init_65535 + 0
> > > > 0008 00070001 R_X86_64_64 0022 init + 0
> > > > 0010 00080001 R_X86_64_64 0033 init_65535_2 + 0
> > > ```
> > > 
> > > The above show clearly gcc produces the wrong order of section 
> > > `.rela.ctors`.
> > > 
> > > Is that expected behavior ?
> > > 
> > > I have not tried Linux version of gcc.
> > 
> > Note that init array vs. init function behavior is encoded by a note added
> > by crt1.o. I suspect that the problem is that gcc port is built without
> > --enable-initfini-array configure option.
> 
> Indeed, support for .init_array and .fini_array has been added to the GCC
> ports, but for now it is present in the *-devel ports only. I will
> soon enable it for the standard GCC ports too. lang/gcc14 will soon
> be added to the ports tree and will have it from the beginning.
This is not a matter of 'support'; it is a bug.  For a very long time, crt1.o
has instructed rtld to use initarray instead of initfunc.  gcc generates
broken binaries by trying to use initfunc.

> 
> If this is indeed the issue, switching to a -devel GCC port should fix it.
> 
> Cheers,
> 
> Lorenzo Salvadore



Re: gcc behavior of init priority of .ctors and .dtors section

2024-05-16 Thread Lorenzo Salvadore
On Thursday, May 16th, 2024 at 20:26, Konstantin Belousov  
wrote:
> > gcc13 from ports
> > `# gcc ctors.c && ./a.out init 1 init 2 init 5 init 4 init 3 main fini 3 
> > fini 4 fini 5 fini 2 fini 1`
> > 
> > The above order is not expected. I think clang's one is correct.
> > 
> > Further hacking with readelf shows that clang produces the right order of
> > section .rela.ctors but gcc does not.
> > 
> > ```
> > # clang -fno-use-init-array -c ctors.c && readelf -r ctors.o | grep 'Relocation section with addend (.rela.ctors)' -A5 > clang.txt
> > # gcc -c ctors.c && readelf -r ctors.o | grep 'Relocation section with addend (.rela.ctors)' -A5 > gcc.txt
> > # diff clang.txt gcc.txt
> > 3,5c3,5
> > <  00080001 R_X86_64_64 0060 init_65535_2 + 0
> > < 0008 00070001 R_X86_64_64 0040 init + 0
> > < 0010 00060001 R_X86_64_64 0020 init_65535 + 0
> > ---
> > >  00060001 R_X86_64_64 0011 init_65535 + 0
> > > 0008 00070001 R_X86_64_64 0022 init + 0
> > > 0010 00080001 R_X86_64_64 0033 init_65535_2 + 0
> > ```
> > 
> > The above show clearly gcc produces the wrong order of section 
> > `.rela.ctors`.
> > 
> > Is that expected behavior ?
> > 
> > I have not tried Linux version of gcc.
> 
> Note that init array vs. init function behavior is encoded by a note added
> by crt1.o. I suspect that the problem is that gcc port is built without
> --enable-initfini-array configure option.

Indeed, support for .init_array and .fini_array has been added to the GCC ports,
but for now it is present in the *-devel ports only. I will
soon enable it for the standard GCC ports too. lang/gcc14 will soon
be added to the ports tree and will have it from the beginning.

If this is indeed the issue, switching to a -devel GCC port should fix it.

Cheers,

Lorenzo Salvadore



Re: bsdinstall wifi setup is broken on CURRENT

2024-05-16 Thread Nuno Teixeira
Hello Renato,

I will give it a try this weekend with bhyve, since I have passthru set up
for my iwlwifi card.

Cheers,

Renato Botelho  wrote (Thursday, 16/05/2024 at 19:56):

> On 16/05/24 15:47, Jessica Clarke wrote:
> > On 16 May 2024, at 19:40, Renato Botelho  wrote:
> >>
> >> I saw some users on a .br group complaining bsdinstall was failing to
> setup wifi network on 15.0 snapshots and tried it myself.  I was able to
> reproduce the problem and also noticed another one.
> >>
> >> I noticed Network Selection screen only shows one line, it's not
> beautiful to navigate through items this way.  On 14.1-BETA2 it shows
> multiple lines so it seems to be a regression.
> >>
> >> The problem users reported was: after selecting desired network it just
> starts over instead of asking for password.  I made a video [1] showing the
> problem.
> >>
> >> Jessica, I've cc'd you because git shows you were the last person
> making changes in this area.  If it's not related and I made a mistake,
> just ignore me.
> >
> > Hi Renato,
> > I touched the code that lets you select the wireless interface in the
> > first place, but not the script that then gets called to set it up and
> > is responsible for the dialogs you see. Given the behaviour, I wonder
> > if this is what today’s import of bsddialog[1] fixes? From reading the
> > script the next dialog uses --mixedform, and restarts the script on
> > error, which it looks like is what you observe.
>
> Thanks for pointing that out, Jessica.  I'll wait for the next 15
> snapshot and will check.
>
> I'm not sure about a good way to test it on a running system instead.
>
> --
> Renato Botelho
>
>

-- 
Nuno Teixeira
FreeBSD UNIX: Web:  https://FreeBSD.org


bsdinstall wifi setup is broken on CURRENT

2024-05-16 Thread SAH



Thank you for the information. The right email address is

i...@aktionheizung.de

Pay information exclusively to this email address. Thanks


-
On 16 May 2024, at 19:40, Renato Botelho  wrote:

I saw some users on a .br group complaining bsdinstall was failing to setup 
wifi network on 15.0 snapshots and tried it myself.  I was able to reproduce 
the problem and also noticed another one.

I noticed Network Selection screen only shows one line, it's not beautiful to 
navigate through items this way.  On 14.1-BETA2 it shows multiple lines so it 
seems to be a regression.

The problem users reported was: after selecting desired network it just starts 
over instead of asking for password.  I made a video [1] showing the problem.

Jessica, I've cc'd you because git shows you were the last person making 
changes in this area.  If it's not related and I made a mistake, just ignore me.


Hi Renato,
I touched the code that lets you select the wireless interface in the
first place, but not the script that then gets called to set it up and
is responsible for the dialogs you see. Given the behaviour, I wonder
if this is what today’s import of bsddialog[1] fixes? From reading the
script the next dialog uses --mixedform, and restarts the script on
error, which it looks like is what you observe.

Jess

[1]https://cgit.freebsd.org/src/commit/?id=a6d8be451f62d425b71a4874f7d4e133b9fb393c


[1]https://youtube.com/shorts/Gmeckokw2a0
--
Renato Botelho


Re: bsdinstall wifi setup is broken on CURRENT

2024-05-16 Thread Renato Botelho

On 16/05/24 15:47, Jessica Clarke wrote:

On 16 May 2024, at 19:40, Renato Botelho  wrote:


I saw some users on a .br group complaining bsdinstall was failing to setup 
wifi network on 15.0 snapshots and tried it myself.  I was able to reproduce 
the problem and also noticed another one.

I noticed Network Selection screen only shows one line, it's not beautiful to 
navigate through items this way.  On 14.1-BETA2 it shows multiple lines so it 
seems to be a regression.

The problem users reported was: after selecting desired network it just starts 
over instead of asking for password.  I made a video [1] showing the problem.

Jessica, I've cc'd you because git shows you were the last person making 
changes in this area.  If it's not related and I made a mistake, just ignore me.


Hi Renato,
I touched the code that lets you select the wireless interface in the
first place, but not the script that then gets called to set it up and
is responsible for the dialogs you see. Given the behaviour, I wonder
if this is what today’s import of bsddialog[1] fixes? From reading the
script the next dialog uses --mixedform, and restarts the script on
error, which it looks like is what you observe.


Thanks for pointing that out, Jessica.  I'll wait for the next 15 
snapshot and will check.


I'm not sure about a good way to test it on a running system instead.

--
Renato Botelho



Re: bsdinstall wifi setup is broken on CURRENT

2024-05-16 Thread Jessica Clarke
On 16 May 2024, at 19:40, Renato Botelho  wrote:
> 
> I saw some users on a .br group complaining bsdinstall was failing to setup 
> wifi network on 15.0 snapshots and tried it myself.  I was able to reproduce 
> the problem and also noticed another one.
> 
> I noticed Network Selection screen only shows one line, it's not beautiful to 
> navigate through items this way.  On 14.1-BETA2 it shows multiple lines so it 
> seems to be a regression.
> 
> The problem users reported was: after selecting desired network it just 
> starts over instead of asking for password.  I made a video [1] showing the 
> problem.
> 
> Jessica, I've cc'd you because git shows you were the last person making 
> changes in this area.  If it's not related and I made a mistake, just ignore 
> me.

Hi Renato,
I touched the code that lets you select the wireless interface in the
first place, but not the script that then gets called to set it up and
is responsible for the dialogs you see. Given the behaviour, I wonder
if this is what today’s import of bsddialog[1] fixes? From reading the
script the next dialog uses --mixedform, and restarts the script on
error, which it looks like is what you observe.

Jess

[1] 
https://cgit.freebsd.org/src/commit/?id=a6d8be451f62d425b71a4874f7d4e133b9fb393c

> [1] https://youtube.com/shorts/Gmeckokw2a0
> -- 
> Renato Botelho




bsdinstall wifi setup is broken on CURRENT

2024-05-16 Thread Renato Botelho
I saw some users in a .br group complaining that bsdinstall was failing to
set up a wifi network on 15.0 snapshots, so I tried it myself.  I was able to
reproduce the problem and also noticed another one.

I noticed that the Network Selection screen only shows one line, which makes
navigating through the items awkward.  On 14.1-BETA2 it shows multiple
lines, so it seems to be a regression.

The problem users reported was: after selecting the desired network, the
setup just starts over instead of asking for the password.  I made a
video [1] showing the problem.

Jessica, I've cc'd you because git shows you were the last person making 
changes in this area.  If it's not related and I made a mistake, just 
ignore me.


[1] https://youtube.com/shorts/Gmeckokw2a0
--
Renato Botelho



Re: gcc behavior of init priority of .ctors and .dtors section

2024-05-16 Thread Konstantin Belousov
On Thu, May 16, 2024 at 08:06:46PM +0800, Zhenlei Huang wrote:
> Hi,
> 
> I'm recently working on https://reviews.freebsd.org/D45194 and got noticed
> that gcc behaves weirdly.
> 
> A simple source file to demonstrate that.
> 
> ```
> # cat ctors.c
> 
> #include <stdio.h>
> 
> __attribute__((constructor(101))) void init_101() { puts("init 1"); }
> __attribute__((constructor(65535))) void init_65535() { puts("init 3"); }
> __attribute__((constructor)) void init() { puts("init 4"); }
> __attribute__((constructor(65535))) void init_65535_2() { puts("init 5"); }
> __attribute__((constructor(65534))) void init_65534() { puts("init 2"); }
> 
> int main() { puts("main"); }
> 
> __attribute__((destructor(65534))) void fini_65534() { puts("fini 2"); }
> __attribute__((destructor(65535))) void fini_65535() { puts("fini 3"); }
> __attribute__((destructor)) void fini() { puts("fini 4"); }
> __attribute__((destructor(65535))) void fini_65535_2() { puts("fini 5"); }
> __attribute__((destructor(101))) void fini_101() { puts("fini 1"); }
> 
> # clang ctors.c && ./a.out
> init 1
> init 2
> init 3
> init 4
> init 5
> main
> fini 5
> fini 4
> fini 3
> fini 2
> fini 1
> ```
> 
> clang with the option -fno-use-init-array and run will produce the same 
> result, which
> is what I expected.
Why do you add that switch?

> 
> gcc13 from ports
> ```
> # gcc ctors.c && ./a.out
> init 1
> init 2
> init 5
> init 4
> init 3
> main
> fini 3
> fini 4
> fini 5
> fini 2
> fini 1
> ```
> 
> The above order is not expected. I think clang's one is correct.
> 
> Further hacking with readelf shows that clang produces the right order of
> section .rela.ctors but gcc does not.
> 
> ```
> # clang -fno-use-init-array -c ctors.c && readelf -r ctors.o | grep 'Relocation section with addend (.rela.ctors)' -A5 > clang.txt
> # gcc -c ctors.c && readelf -r ctors.o | grep 'Relocation section with addend (.rela.ctors)' -A5 > gcc.txt
> # diff clang.txt gcc.txt
> 3,5c3,5
> <  00080001 R_X86_64_64 0060 init_65535_2 + 0
> < 0008 00070001 R_X86_64_64 0040 init + 0
> < 0010 00060001 R_X86_64_64 0020 init_65535 + 0
> ---
> >  00060001 R_X86_64_64 0011 init_65535 + 0
> > 0008 00070001 R_X86_64_64 0022 init + 0
> > 0010 00080001 R_X86_64_64 0033 init_65535_2 + 0
> ```
> 
> The above show clearly gcc produces the wrong order of section `.rela.ctors`.
> 
> Is that expected behavior ?
> 
> I have not tried Linux version of gcc.
Note that init array vs. init function behavior is encoded by a note added
by crt1.o.  I suspect that the problem is that gcc port is built without
--enable-initfini-array configure option.



Re: Panic: lock "lnxspin" 0xfffff800176c0730 already initialized

2024-05-16 Thread Ryan Libby
On Thu, May 16, 2024 at 6:00 AM David Wolfskill  wrote:
>
> This is running main-n270174-abb1a1340e3f (built in-place from
> main-n270163-154ad8e0f88f), with ports at main-n663685-3f732745ab06;
> the ports-resident kernel modules were rebuilt with the kernel,
> courtesy (e.g.):
>
> g1-70(14.1-S)[4] grep '^PORT' /etc/src.conf
> PORTS_MODULES+=graphics/drm-61-kmod
>
> And since I dislike "sample sizes of one," I have this result on
> two different laptops, each of which has both Nvidia & Intel graphics
> (but for the older one (M4800), I stopped using (& building) the
> Nvidia driver, since enabling it appears to disable GLX).
>
> Anyway: photos of the backtraces are at
> https://www.catwhisker.org/~david/FreeBSD/head/n270174/
> as are copies of the build typescripts.
>
> Unfortunately, the panic message itself had (just) scrolled off the
> top at the time I took the photos, but I hand-typed it (from the
> M4800) in the Subject.
>
> Peace,
> david
> --
> David H. Wolfskill  da...@catwhisker.org
> Please do not mistake "authoritarian" for "conservative" -- or vice versa.
>
> See https://www.catwhisker.org/~david/publickey.gpg for my public key.

Maybe a regression from ae38a1a1bfdf320089c254e4dbffdf4769d89110 by manu.

It looks like spin_lock_init was changed to no longer zero out the
mutex before calling mtx_init, but the MTX_NEW flag was not added.

Ryan



Panic: lock "lnxspin" 0xfffff800176c0730 already initialized

2024-05-16 Thread David Wolfskill
This is running main-n270174-abb1a1340e3f (built in-place from
main-n270163-154ad8e0f88f), with ports at main-n663685-3f732745ab06;
the ports-resident kernel modules were rebuilt with the kernel,
courtesy (e.g.):

g1-70(14.1-S)[4] grep '^PORT' /etc/src.conf
PORTS_MODULES+=graphics/drm-61-kmod

And since I dislike "sample sizes of one," I have this result on
two different laptops, each of which has both Nvidia & Intel graphics
(but for the older one (M4800), I stopped using (& building) the
Nvidia driver, since enabling it appears to disable GLX).

Anyway: photos of the backtraces are at
https://www.catwhisker.org/~david/FreeBSD/head/n270174/
as are copies of the build typescripts.

Unfortunately, the panic message itself had (just) scrolled off the
top at the time I took the photos, but I hand-typed it (from the
M4800) in the Subject.

Peace,
david
-- 
David H. Wolfskill  da...@catwhisker.org
Please do not mistake "authoritarian" for "conservative" -- or vice versa.

See https://www.catwhisker.org/~david/publickey.gpg for my public key.




gcc behavior of init priority of .ctors and .dtors section

2024-05-16 Thread Zhenlei Huang
Hi,

I have recently been working on https://reviews.freebsd.org/D45194 and noticed
that gcc behaves weirdly.

Here is a simple source file that demonstrates it.

```
# cat ctors.c

#include <stdio.h>

__attribute__((constructor(101))) void init_101() { puts("init 1"); }
__attribute__((constructor(65535))) void init_65535() { puts("init 3"); }
__attribute__((constructor)) void init() { puts("init 4"); }
__attribute__((constructor(65535))) void init_65535_2() { puts("init 5"); }
__attribute__((constructor(65534))) void init_65534() { puts("init 2"); }

int main() { puts("main"); }

__attribute__((destructor(65534))) void fini_65534() { puts("fini 2"); }
__attribute__((destructor(65535))) void fini_65535() { puts("fini 3"); }
__attribute__((destructor)) void fini() { puts("fini 4"); }
__attribute__((destructor(65535))) void fini_65535_2() { puts("fini 5"); }
__attribute__((destructor(101))) void fini_101() { puts("fini 1"); }

# clang ctors.c && ./a.out
init 1
init 2
init 3
init 4
init 5
main
fini 5
fini 4
fini 3
fini 2
fini 1
```

Compiling with clang's -fno-use-init-array option and running the result
produces the same output, which is what I expected.

With gcc13 from ports:
```
# gcc ctors.c && ./a.out
init 1
init 2
init 5
init 4
init 3
main
fini 3
fini 4
fini 5
fini 2
fini 1
```

This order is not what I expected. I think clang's is correct.

Further hacking with readelf shows that clang produces the right order of
section .rela.ctors but gcc does not.

```
# clang -fno-use-init-array -c ctors.c && readelf -r ctors.o | grep 'Relocation section with addend (.rela.ctors)' -A5 > clang.txt
# gcc -c ctors.c && readelf -r ctors.o | grep 'Relocation section with addend (.rela.ctors)' -A5 > gcc.txt
# diff clang.txt gcc.txt
3,5c3,5
<  00080001 R_X86_64_64 0060 init_65535_2 + 0
< 0008 00070001 R_X86_64_64 0040 init + 0
< 0010 00060001 R_X86_64_64 0020 init_65535 + 0
---
>  00060001 R_X86_64_64 0011 init_65535 + 0
> 0008 00070001 R_X86_64_64 0022 init + 0
> 0010 00080001 R_X86_64_64 0033 init_65535_2 + 0
```

The above clearly shows that gcc produces the wrong order in section `.rela.ctors`.

Is that expected behavior?

I have not tried gcc on Linux.


Best regards,
Zhenlei




Re: pkg scripts need updating

2024-05-15 Thread Stefan Esser




On 15.05.24 at 02:21, Enji Cooper wrote:



On May 14, 2024, at 7:19 AM, Michael Butler  wrote:

After commit aa48259f337100e79933d660fec8856371f761ed to src which removed 
security_daily_compat_var, I get these warnings daily..

aaron.protected-networks.net login failures:

aaron.protected-networks.net refused connections:
/usr/local/etc/periodic/security/405.pkg-base-audit: security_daily_compat_var: not found
/usr/local/etc/periodic/security/405.pkg-base-audit: security_daily_compat_var: not found
/usr/local/etc/periodic/security/405.pkg-base-audit: security_daily_compat_var: not found
/usr/local/etc/periodic/security/405.pkg-base-audit: security_daily_compat_var: not found
/usr/local/etc/periodic/security/405.pkg-base-audit: security_daily_compat_var: not found

Checking for security vulnerabilities in base (userland & kernel):
Database fetched: 2024-05-12T14:16-04:00
0 problem(s) in 0 installed package(s) found.
0 problem(s) in 0 installed package(s) found.
/usr/local/etc/periodic/security/410.pkg-audit: security_daily_compat_var: not found
/usr/local/etc/periodic/security/410.pkg-audit: security_daily_compat_var: not found
/usr/local/etc/periodic/security/410.pkg-audit: security_daily_compat_var: not found
/usr/local/etc/periodic/security/410.pkg-audit: security_daily_compat_var: not found
/usr/local/etc/periodic/security/410.pkg-audit: security_daily_compat_var: not found

Checking for packages with security vulnerabilities:
Database fetched: 2024-05-12T14:16-04:00
/usr/local/etc/periodic/security/460.pkg-checksum: security_daily_compat_var: not found
/usr/local/etc/periodic/security/460.pkg-checksum: security_daily_compat_var: not found
/usr/local/etc/periodic/security/460.pkg-checksum: security_daily_compat_var: not found

Checking for packages with mismatched checksums:


Have you tried emailing the issue to the committer/filing a bug report to bring 
this to their attention?
Cheers,


The messages are caused by running:

/usr/local/etc/periodic/security/405.pkg-base-audit
/usr/local/etc/periodic/security/460.pkg-checksum
/usr/local/etc/periodic/security/410.pkg-audit

These scripts have been installed by pkg-1.12.2 on my system ...

Best regards, Stefan



Re: Unfamiliar console message: in prompt_tty(): caught signal 2

2024-05-14 Thread Enji Cooper

> On Apr 21, 2024, at 1:48 PM, bob prohaska  wrote:
> 
> On Sun, Apr 21, 2024 at 10:16:55PM +0200, Dag-Erling Smørgrav wrote:
>> bob prohaska  writes:
>>> Apr 20 22:14:37 www su[30398]: in prompt_tty(): caught signal 2
>> 
>> This means someone ran `su` and pressed Ctrl-C instead of entering a
>> password when prompted.
> 
> Ahh, that would have been me. Thank you!

Logging SIGINT seems kind of odd, given that it would probably be a regular 
occurrence (to me at least)…
-Enji



Re: pkg scripts need updating

2024-05-14 Thread Enji Cooper

> On May 14, 2024, at 7:19 AM, Michael Butler wrote:
> 
> After commit aa48259f337100e79933d660fec8856371f761ed to src which removed 
> security_daily_compat_var, I get these warnings daily..
> 
> aaron.protected-networks.net login failures:
> 
> aaron.protected-networks.net refused connections:
> /usr/local/etc/periodic/security/405.pkg-base-audit: security_daily_compat_var: not found
> /usr/local/etc/periodic/security/405.pkg-base-audit: security_daily_compat_var: not found
> /usr/local/etc/periodic/security/405.pkg-base-audit: security_daily_compat_var: not found
> /usr/local/etc/periodic/security/405.pkg-base-audit: security_daily_compat_var: not found
> /usr/local/etc/periodic/security/405.pkg-base-audit: security_daily_compat_var: not found
>
> Checking for security vulnerabilities in base (userland & kernel):
> Database fetched: 2024-05-12T14:16-04:00
> 0 problem(s) in 0 installed package(s) found.
> 0 problem(s) in 0 installed package(s) found.
> /usr/local/etc/periodic/security/410.pkg-audit: security_daily_compat_var: not found
> /usr/local/etc/periodic/security/410.pkg-audit: security_daily_compat_var: not found
> /usr/local/etc/periodic/security/410.pkg-audit: security_daily_compat_var: not found
> /usr/local/etc/periodic/security/410.pkg-audit: security_daily_compat_var: not found
> /usr/local/etc/periodic/security/410.pkg-audit: security_daily_compat_var: not found
>
> Checking for packages with security vulnerabilities:
> Database fetched: 2024-05-12T14:16-04:00
> /usr/local/etc/periodic/security/460.pkg-checksum: security_daily_compat_var: not found
> /usr/local/etc/periodic/security/460.pkg-checksum: security_daily_compat_var: not found
> /usr/local/etc/periodic/security/460.pkg-checksum: security_daily_compat_var: not found
> 
> Checking for packages with mismatched checksums:

Have you tried emailing the issue to the committer/filing a bug report to bring 
this to their attention?
Cheers,
-Enji



Re: Graph of the FreeBSD memory fragmentation

2024-05-14 Thread Ryan Libby
On Tue, May 14, 2024 at 9:09 AM Ryan Libby  wrote:
>
> On Tue, May 14, 2024 at 1:14 AM Alexander Leidinger
>  wrote:
> >
> > On 2024-05-14 03:54, Ryan Libby wrote:
> > > That was a long winded way of saying: the "UMA bucket" axis is
> > > actually "vm phys free list order".
> > >
> > > That said, I find that dimension confusing because in fact there's
> > > just one piece of information there, the average size of a free list
> > > entry, and it doesn't actually depend on the free list order.  The
> > > graph could be 2D.
> >
> > It evolved into that...
> > At first I had a 3 dimensional dataset and the first try was to plot it
> > as is (3D). The outcome (as points) was not as good as I wanted it to
> > be, and plotting as lines gave the wrong direction of lines. I massaged
> > the plotting instructions until it looked good enough. I did not try a
> > 2D plot. I agree, with different colors for each free list order a 2D
> > plot may work too. Whether a 2D plot is better than a 3D plot in this
> > case depends on the mental model the viewer has of the topic. One size may
> > not fit all. Feel free to experiment with other plotting styles.
> >
>
> What I mean is that the 13 values in the depth dimension (now "freelist
> size") are actually all showing the same information -- except for
> integer truncation issues and having clamped the negative values at
> -1000.  Each index value for a given order completely determines the
> values for the other orders at a given time point.
>
> In the patch (D40575) this is
> return (1000 -
> ((info.free_pages * 1000) / (1 << order) / info.free_blocks));
> but notice that free_pages and free_blocks don't depend on order, they
> are computed across all free list entries, of all orders, and are the
> same for a calculation for any order.  So for example we could solve for
> the average free list entry size by taking the value from order of 0:
> index_0 = 1000 - 1000 / 1 * free_pages / free_blocks
> avg_pages = free_pages / free_blocks = -(index_0 - 1000) / 1000
> and from that you can calculate all the other values.  Or just display
> it directly.  I'd suggest trying to plot log2(avg_pages).
>
> In other words, I think just considering one value per time point is
> simpler and doesn't lose any information.
>
> > > The paper that defines this fragmentation index also says that "the
> > > fragmentation index is only meaningful when an allocation fails".  Are
> > > you actually seeing any contiguous allocation failures in your
> > > measurements?
> >
> > I'm not aware of such.
> > The index may only be meaningful for the purposes of the goal of the
> > paper when there are such failures, but if you look at the graph and how
> > it changed when Bojan changed the guard pages, I see value in the graph
> > for more than what the paper suggests.
> >
> > > Without that context, it seems like what the proposed sysctl reports
> > > is indirectly just the average size of free list entries.  We could
> > > just report that.
> >
> > The calculation of the value is part of a bigger picture. The value
> > returned is used by some other code to make decisions.
> >
> > Bye,
> > Alexander.
> >
> > --
> > http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF
> > http://www.FreeBSD.org    netch...@freebsd.org  : PGP 0x8F31830F9F2772BF
>
> Okay I see that D40772 uses it, but always passes order=9, and compares
> it with threshold=300.

I see that it is not "always" as the order is actually arch dependent.

> Rearranging, it asks if the average free list
> entry size is at least 1.4 MiB.

..on amd64.

>
> Personally I'd prefer to consider values that are easy to interpret
> rather than an arbitrary index value.
>
> Ryan



Re: Graph of the FreeBSD memory fragmentation

2024-05-14 Thread Ryan Libby
On Tue, May 14, 2024 at 1:14 AM Alexander Leidinger
 wrote:
>
> On 2024-05-14 03:54, Ryan Libby wrote:
> > That was a long winded way of saying: the "UMA bucket" axis is
> > actually "vm phys free list order".
> >
> > That said, I find that dimension confusing because in fact there's
> > just one piece of information there, the average size of a free list
> > entry, and it doesn't actually depend on the free list order.  The
> > graph could be 2D.
>
> It evolved into that...
> At first I had a 3 dimensional dataset and the first try was to plot it
> as is (3D). The outcome (as points) was not as good as I wanted it to
> be, and plotting as lines gave the wrong direction of lines. I massaged
> the plotting instructions until it looked good enough. I did not try a
> 2D plot. I agree, with different colors for each free list order a 2D
> plot may work too. Whether a 2D plot is better than a 3D plot in this
> case depends on the mental model the viewer has of the topic. One size may
> not fit all. Feel free to experiment with other plotting styles.
>

What I mean is that the 13 values in the depth dimension (now "freelist
size") are actually all showing the same information -- except for
integer truncation issues and having clamped the negative values at
-1000.  Each index value for a given order completely determines the
values for the other orders at a given time point.

In the patch (D40575) this is
return (1000 -
((info.free_pages * 1000) / (1 << order) / info.free_blocks));
but notice that free_pages and free_blocks don't depend on order, they
are computed across all free list entries, of all orders, and are the
same for a calculation for any order.  So for example we could solve for
the average free list entry size by taking the value from order of 0:
index_0 = 1000 - 1000 / 1 * free_pages / free_blocks
avg_pages = free_pages / free_blocks = -(index_0 - 1000) / 1000
and from that you can calculate all the other values.  Or just display
it directly.  I'd suggest trying to plot log2(avg_pages).

In other words, I think just considering one value per time point is
simpler and doesn't lose any information.

> > The paper that defines this fragmentation index also says that "the
> > fragmentation index is only meaningful when an allocation fails".  Are
> > you actually seeing any contiguous allocation failures in your
> > measurements?
>
> I'm not aware of such.
> The index may only be meaningful for the purposes of the goal of the
> paper when there are such failures, but if you look at the graph and how
> it changed when Bojan changed the guard pages, I see value in the graph
> for more than what the paper suggests.
>
> > Without that context, it seems like what the proposed sysctl reports
> > is indirectly just the average size of free list entries.  We could
> > just report that.
>
> The calculation of the value is part of a bigger picture. The value
> returned is used by some other code to make decisions.
>
> Bye,
> Alexander.
>
> --
> http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF
> http://www.FreeBSD.org    netch...@freebsd.org  : PGP 0x8F31830F9F2772BF

Okay I see that D40772 uses it, but always passes order=9, and compares
it with threshold=300.  Rearranging, it asks if the average free list
entry size is at least 1.4 MiB.

Personally I'd prefer to consider values that are easy to interpret
rather than an arbitrary index value.

Ryan



pkg scripts need updating

2024-05-14 Thread Michael Butler
After commit aa48259f337100e79933d660fec8856371f761ed to src which 
removed security_daily_compat_var, I get these warnings daily..


aaron.protected-networks.net login failures:

aaron.protected-networks.net refused connections:
/usr/local/etc/periodic/security/405.pkg-base-audit: security_daily_compat_var: not found
/usr/local/etc/periodic/security/405.pkg-base-audit: security_daily_compat_var: not found
/usr/local/etc/periodic/security/405.pkg-base-audit: security_daily_compat_var: not found
/usr/local/etc/periodic/security/405.pkg-base-audit: security_daily_compat_var: not found
/usr/local/etc/periodic/security/405.pkg-base-audit: security_daily_compat_var: not found


Checking for security vulnerabilities in base (userland & kernel):
Database fetched: 2024-05-12T14:16-04:00
0 problem(s) in 0 installed package(s) found.
0 problem(s) in 0 installed package(s) found.
/usr/local/etc/periodic/security/410.pkg-audit: security_daily_compat_var: not found
/usr/local/etc/periodic/security/410.pkg-audit: security_daily_compat_var: not found
/usr/local/etc/periodic/security/410.pkg-audit: security_daily_compat_var: not found
/usr/local/etc/periodic/security/410.pkg-audit: security_daily_compat_var: not found
/usr/local/etc/periodic/security/410.pkg-audit: security_daily_compat_var: not found


Checking for packages with security vulnerabilities:
Database fetched: 2024-05-12T14:16-04:00
/usr/local/etc/periodic/security/460.pkg-checksum: security_daily_compat_var: not found
/usr/local/etc/periodic/security/460.pkg-checksum: security_daily_compat_var: not found
/usr/local/etc/periodic/security/460.pkg-checksum: security_daily_compat_var: not found


Checking for packages with mismatched checksums:




Re: Graph of the FreeBSD memory fragmentation

2024-05-14 Thread Alexander Leidinger

On 2024-05-14 03:54, Ryan Libby wrote:

> That was a long winded way of saying: the "UMA bucket" axis is
> actually "vm phys free list order".
>
> That said, I find that dimension confusing because in fact there's
> just one piece of information there, the average size of a free list
> entry, and it doesn't actually depend on the free list order.  The
> graph could be 2D.


It evolved into that...
At first I had a 3 dimensional dataset and the first try was to plot it 
as is (3D). The outcome (as points) was not as good as I wanted it to 
be, and plotting as lines gave the wrong direction of lines. I massaged 
the plotting instructions until it looked good enough. I did not try a 
2D plot. I agree, with different colors for each free list order a 2D 
plot may work too. Whether a 2D plot is better than a 3D plot in this
case depends on the mental model the viewer has of the topic. One size may
not fit all. Feel free to experiment with other plotting styles.



> The paper that defines this fragmentation index also says that "the
> fragmentation index is only meaningful when an allocation fails".  Are
> you actually seeing any contiguous allocation failures in your
> measurements?


I'm not aware of such.
The index may only be meaningful for the purposes of the goal of the 
paper when there are such failures, but if you look at the graph and how 
it changed when Bojan changed the guard pages, I see value in the graph 
for more than what the paper suggests.



> Without that context, it seems like what the proposed sysctl reports
> is indirectly just the average size of free list entries.  We could
> just report that.


The calculation of the value is part of a bigger picture. The value 
returned is used by some other code to make decisions.


Bye,
Alexander.

--
http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org    netch...@freebsd.org  : PGP 0x8F31830F9F2772BF



