On Mon, 30 Oct 2023, Thomas Klausner wrote:
RVP looked at this some more and it seems related to time-after-booting or perhaps RAM churn. It starts happening on RVP's machine too after some uptime.
OK. I found some time this weekend to look into this, and I think I see what's responsible (and it's neither time-after-booting nor RAM churn).

Compare the NetBSD copyinstr() in sys/arch/amd64/amd64/copy.S with dtrace_copystr() in external/cddl/osnet/dev/dtrace/amd64/dtrace_asm.S. The NetBSD copyinstr() _disables_ SMAP before copying data from userspace; the dtrace version _does not_. I think this is what fails on some CPUs.

My Intel CPU is more than 10 years old, so it doesn't support SMAP (only SMEP), and dtrace works for me. If you and bch tell me that your CPUs support SMAP, then that would be the smoking gun. I also checked dtrace_copystr() in both FreeBSD-13.2 and Illumos: they _both_ disable SMAP now.
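For anyone who wants to check whether a given CPU advertises SMAP, here is a minimal sketch in C. It is only an illustration and is not taken from any of the source trees mentioned above. It assumes an x86 machine and a GCC or Clang toolchain whose <cpuid.h> provides __get_cpuid_count(); the helper names (ebx_has_smep, ebx_has_smap, read_cpuid7_ebx, report_smap) are made up for this example. SMEP and SMAP are reported in CPUID leaf 7, subleaf 0, EBX bits 7 and 20 respectively.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>	/* assumption: GCC/Clang header with __get_cpuid_count() */
#endif

/* CPUID.(EAX=7,ECX=0):EBX feature bits. */
#define CPUID7_EBX_SMEP	(UINT32_C(1) << 7)
#define CPUID7_EBX_SMAP	(UINT32_C(1) << 20)

/* Hypothetical helpers: decode the feature bits from a raw EBX value. */
static int
ebx_has_smep(uint32_t ebx)
{
	return (ebx & CPUID7_EBX_SMEP) != 0;
}

static int
ebx_has_smap(uint32_t ebx)
{
	return (ebx & CPUID7_EBX_SMAP) != 0;
}

/*
 * Read EBX from CPUID leaf 7, subleaf 0.
 * Returns 1 on success, 0 if the leaf is unavailable (or not on x86).
 */
static int
read_cpuid7_ebx(uint32_t *ebx)
{
	assert(ebx != NULL);
#if defined(__x86_64__) || defined(__i386__)
	unsigned int a = 0, b = 0, c = 0, d = 0;

	if (!__get_cpuid_count(7, 0, &a, &b, &c, &d))
		return 0;
	*ebx = b;
	return 1;
#else
	return 0;
#endif
}

/* Print what the CPU advertises; call this from main(). */
static void
report_smap(void)
{
	uint32_t ebx;

	if (!read_cpuid7_ebx(&ebx)) {
		printf("CPUID leaf 7 not available: no SMEP, no SMAP\n");
		return;
	}
	printf("SMEP: %s\n", ebx_has_smep(ebx) ? "yes" : "no");
	printf("SMAP: %s\n", ebx_has_smap(ebx) ? "yes" : "no");
}
```

Calling report_smap() from main() on the affected machines shows what the CPU advertises. Note that this only reports CPU support, not whether the running kernel has actually enabled SMAP.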
Still looking for a dtrace guru to help out here :)
Yep. A kernel guru should be able to come up with a patch for the SMAP bug pronto (my assembly is rusty), but dtrace failing after a while until it is restarted appears to be a separate issue.

HTH,
-RVP