Top posting, since this goes in a different direction: I have established a way to control the behavior in my context . . .

I changed USE_TMPFS=all to USE_TMPFS=no :

USE_TMPFS=all gets the failure
vs.
USE_TMPFS=no works just fine

So it is a FreeBSD system error associated with use of tmpfs.

Now back to what I looked at before trying the above . . .

On Nov 25, 2024, at 17:05, Mark Millard <mark...@yahoo.com> wrote:

> On Nov 25, 2024, at 15:21, Mark Millard <mark...@yahoo.com> wrote:
>
>> On Nov 25, 2024, at 14:23, Guido Falsi <m...@madpilot.net> wrote:
>>
>>> On 25/11/24 23:15, Dimitry Andric wrote:
>>>> On 25 Nov 2024, at 23:12, Mark Millard <mark...@yahoo.com> wrote:
>>>>>
>>>>> On Nov 25, 2024, at 13:27, Guido Falsi <m...@madpilot.net> wrote:
>>>>>
>>>>>> On 25/11/24 22:18, Dag-Erling Smørgrav wrote:
>>>>>>> Mark Millard <mark...@yahoo.com> writes:
>>>>>>>> Guido Falsi <m...@madpilot.net> writes:
>>>>>>>>> On 25/11/24 09:17, Dag-Erling Smørgrav wrote:
>>>>>>>>>> Dimitry Andric <d...@freebsd.org> writes:
>>>>>>>>>>> Probably best to create a bugzilla ticket, but as I said before, I
>>>>>>>>>>> cannot reproduce this.
>>>>>>>>>> I can. My builder is running 15 and sees segfaults while building
>>>>>>>>>> packages for 14 and 15 but not for 13.
>>>>>>>>> BTW removing optimizations (CPUTYPE) for only the affected ports made
>>>>>>>>> guile2 work again. Did not solve the issue with sassc though. [...]
>>>>>>>>> I'm also using ccache, but that does not look relevant.
>>>>>>>> I've never used ccache or analogous and get the libsass.so.1.0.0
>>>>>>>> .got.plt corruption that I've reported on the lists anyway.
>>>>>>> I don't use ccache or optimizations.
>>>>>>> Here's an example of sassc segfaulting in a 14.1-RELEASE-p6 jail:
>>>>>>> https://pkg.des.dev/logs/data/14amd64-default/2024-11-24_19h29m04s/logs/errors/plasma5-breeze-gtk-5.27.11.log
>>>>>>> which matches the following entry from `/var/log/messages`:
>>>>>>> Nov 24 21:23:06 pkg kernel: pid 71277 (sassc), jid 253, uid 65534:
>>>>>>> exited on signal 11 (core dumped)
>>>>>>> The poudriere host is a bhyve VM with 48 cores and 192 GB RAM on a
>>>>>>> 32c/64t AMD EPYC 7502P with 256 GB RAM.
>>>>>>
>>>>>> I sincerely hope this is not relevant but my CPU is also AMD: AMD Ryzen
>>>>>> 5 5600G
>>>>>
>>>>> The amd64 system type that I have access to and used
>>>>> for my testing:
>>>>>
>>>>> AMD 7950X3D (16 core, 32 thread, so 32 FreeBSD-cpus) with 192 GiBytes of
>>>>> RAM
>>>> I'm on Intel, and I don't see any crashes at all. So, are we looking at
>>>> some CPU specific issue here?
>>>
>>> We can't say for sure, but we definitely have all people reporting the
>>> issue on the same CPU brand, so it's some indication I guess.
>>>
>>> I was hoping it would not come to this because I suspect such issues are
>>> quite difficult to diagnose.
>>
>> Unfortunately, for amd64 I only have access to:
>>
>> ) An old ThreadRipper 1950X system (untested so far)
>> ) The 7950X3D system
>>
>> No Intel systems.
>>
>> If someone had both AMD and Intel and could have
>> boot&operate media that should work for both, say
>> USB that can be simply moved between machines,
>> running the test on both would be appropriate.
>> (Implication: the media not being tailored to the
>> cpu specifics so the same system software is
>> tested in both places.)
>>
>> I'll note that the media in my context is PCIe Optane,
>> ZFS based. I could try a U.2 Optane in a PCIe adaptor
>> that has UFS instead for building textproc/libsass .
>> (The U.2 content is basically an rsync of the ZFS
>> Optane media's live directory tree, with node naming
>> and such adjusted afterwards.)
>>
>> What do other folks have for the file system(s)
>> involved?
>
> I get the sassc failure from a pure UFS live-context as
> well.
>
> Interestingly, there is variation in the .got.plt oddity.
>
> Earlier:
>
> Bad .got.plt:
>
> Contents of section .got.plt:
> 2bed60 00000000 00000000 00000000 00000000 ................
> . . .
> 2befc0 00000000 00000000 00000000 00000000 ................
> 2befd0 00000000 00000000 00000000 00000000 ................
> 2befe0 00000000 00000000 00000000 00000000 ................
> 2beff0 00000000 00000000 00000000 00000000 ................
> 2bf000 96ab2a00 00000000 a6ab2a00 00000000 ..*.......*.....
> 2bf010 b6ab2a00 00000000 c6ab2a00 00000000 ..*.......*.....
> 2bf020 d6ab2a00 00000000 e6ab2a00 00000000 ..*.......*.....
> 2bf030 f6ab2a00 00000000 06ac2a00 00000000 ..*.......*.....
> . . .

Interestingly, a later retest of the ZFS context did not get
the above. Instead it ended up like the below bad case.

I'll also note that scrubbing reports:

# zpool status
  pool: zoptb
 state: ONLINE
  scan: scrub repaired 0B in 00:00:47 with 0 errors on Mon Nov 25 17:50:44 2024
config:

        NAME           STATE     READ WRITE CKSUM
        zoptb          ONLINE       0     0     0
          gpt/OptBzfs  ONLINE       0     0     0

errors: No known data errors

This should mean that the unexpected zeros were present before
zfs did its checksum prior to writing the data.

> The new bad .got.plt ended up with a bigger zero area,
> the nonzero area again being nicely aligned for where
> it starts. (The .got.plt starts at the same address
> as above.)
>
> Contents of section .got.plt:
> 2bed60 00000000 00000000 00000000 00000000 ................
> . . .
> 2befc0 00000000 00000000 00000000 00000000 ................
> 2befd0 00000000 00000000 00000000 00000000 ................
> 2befe0 00000000 00000000 00000000 00000000 ................
> 2beff0 00000000 00000000 00000000 00000000 ................
> 2bf000 00000000 00000000 00000000 00000000 ................
> 2bf010 00000000 00000000 00000000 00000000 ................
> 2bf020 00000000 00000000 00000000 00000000 ................
> 2bf030 00000000 00000000 00000000 00000000 ................
> . . .
> 2bffc0 00000000 00000000 00000000 00000000 ................
> 2bffd0 00000000 00000000 00000000 00000000 ................
> 2bffe0 00000000 00000000 00000000 00000000 ................
> 2bfff0 00000000 00000000 00000000 00000000 ................
> 2c0000 96cb2a00 00000000 a6cb2a00 00000000 ..*.......*.....
> 2c0010 b6cb2a00 00000000 c6cb2a00 00000000 ..*.......*.....
> 2c0020 d6cb2a00 00000000 e6cb2a00 00000000 ..*.......*.....
> 2c0030 f6cb2a00 00000000 06cc2a00 00000000 ..*.......*.....
> . . .
>

Adding the comparison of the good .got.plt from the PkgBase
based chroot with the official packages installed:

Contents of section .got.plt:
2bed60 78ba2b00 00000000 00000000 00000000 x.+.............
2bed70 00000000 00000000 86a62a00 00000000 ..........*.....
2bed80 96a62a00 00000000 a6a62a00 00000000 ..*.......*.....
2bed90 b6a62a00 00000000 c6a62a00 00000000 ..*.......*.....
. . .
2befc0 16ab2a00 00000000 26ab2a00 00000000 ..*.....&.*.....
2befd0 36ab2a00 00000000 46ab2a00 00000000 6.*.....F.*.....
2befe0 56ab2a00 00000000 66ab2a00 00000000 V.*.....f.*.....
2beff0 76ab2a00 00000000 86ab2a00 00000000 v.*.......*.....
2bf000 96ab2a00 00000000 a6ab2a00 00000000 ..*.......*.....
2bf010 b6ab2a00 00000000 c6ab2a00 00000000 ..*.......*.....
2bf020 d6ab2a00 00000000 e6ab2a00 00000000 ..*.......*.....
2bf030 f6ab2a00 00000000 06ac2a00 00000000 ..*.......*.....
. . .
2bffc0 16cb2a00 00000000 26cb2a00 00000000 ..*.....&.*.....
2bffd0 36cb2a00 00000000 46cb2a00 00000000 6.*.....F.*.....
2bffe0 56cb2a00 00000000 66cb2a00 00000000 V.*.....f.*.....
2bfff0 76cb2a00 00000000 86cb2a00 00000000 v.*.......*.....
2c0000 96cb2a00 00000000 a6cb2a00 00000000 ..*.......*.....
2c0010 b6cb2a00 00000000 c6cb2a00 00000000 ..*.......*.....
2c0020 d6cb2a00 00000000 e6cb2a00 00000000 ..*.......*.....
2c0030 f6cb2a00 00000000 06cc2a00 00000000 ..*.......*.....
. . .
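
For anyone wanting to repeat this kind of comparison on their own builds, it can be scripted rather than done by eye. What follows is only a minimal sketch in Python, assuming each dump has been saved as text in the row layout shown above (an address followed by up to four 8-hex-digit words; it resembles `objdump -s` output, but the tool is an assumption on my part). The names parse_dump, first_nonzero_row and nonzero_rows_agree are hypothetical helpers, not part of any existing tool.

```python
import re

# One dump row: optional '>' quote markers, a hex address, then up to
# four 8-hex-digit words. The trailing ASCII column is ignored.
ROW = re.compile(r"^[>\s]*([0-9a-f]+)((?:\s[0-9a-f]{8}){1,4})\s")

def parse_dump(text):
    """Map each row address to its raw bytes; non-row lines are skipped."""
    rows = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            hex_words = "".join(m.group(2).split())
            rows[int(m.group(1), 16)] = bytes.fromhex(hex_words)
    return rows

def first_nonzero_row(rows):
    """Lowest row address holding any nonzero byte, or None if all are zero."""
    for addr in sorted(rows):
        if any(rows[addr]):
            return addr
    return None

def nonzero_rows_agree(a, b):
    """True if rows present and nonzero in both dumps hold identical bytes."""
    shared = a.keys() & b.keys()
    return all(a[k] == b[k] for k in shared if any(a[k]) and any(b[k]))

# Two rows from the earlier bad dump and the same two from the good dump.
bad = parse_dump("""
> 2beff0 00000000 00000000 00000000 00000000 ................
> 2bf000 96ab2a00 00000000 a6ab2a00 00000000 ..*.......*.....
""")
good = parse_dump("""
2beff0 76ab2a00 00000000 86ab2a00 00000000 v.*.......*.....
2bf000 96ab2a00 00000000 a6ab2a00 00000000 ..*.......*.....
""")

print(hex(first_nonzero_row(bad)))    # 0x2bf000
print(nonzero_rows_agree(bad, good))  # True
```

The comparison works per 16-byte row, so a row that is only partly zeroed would count as nonzero and be reported as a mismatch; that is a limitation of the sketch, not something observed in the dumps here, where the zeroed area ends on a row boundary.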

The contents of the non-zero parts of any pair of the examples agree.

===
Mark Millard
marklmi at yahoo.com