Top posting, because this goes in a different direction:
I have established a way to control the behavior in my
context . . .

I changed USE_TMPFS=all to USE_TMPFS=no :

USE_TMPFS=all gets the failure
vs.
USE_TMPFS=no works just fine

So it is a FreeBSD system error associated with
the use of tmpfs.


Now back to what I looked at before trying the
above . . .

On Nov 25, 2024, at 17:05, Mark Millard <mark...@yahoo.com> wrote:

> On Nov 25, 2024, at 15:21, Mark Millard <mark...@yahoo.com> wrote:
> 
>> On Nov 25, 2024, at 14:23, Guido Falsi <m...@madpilot.net> wrote:
>> 
>>> On 25/11/24 23:15, Dimitry Andric wrote:
>>>> On 25 Nov 2024, at 23:12, Mark Millard <mark...@yahoo.com> wrote:
>>>>> 
>>>>> On Nov 25, 2024, at 13:27, Guido Falsi <m...@madpilot.net> wrote:
>>>>> 
>>>>>> On 25/11/24 22:18, Dag-Erling Smørgrav wrote:
>>>>>>> Mark Millard <mark...@yahoo.com> writes:
>>>>>>>> Guido Falsi <m...@madpilot.net> writes:
>>>>>>>>> On 25/11/24 09:17, Dag-Erling Smørgrav wrote:
>>>>>>>>>> Dimitry Andric <d...@freebsd.org> writes:
>>>>>>>>>>> Probably best to create a bugzilla ticket, but as I said before, I
>>>>>>>>>>> cannot reproduce this.
>>>>>>>>>> I can.  My builder is running 15 and sees segfaults while building
>>>>>>>>>> packages for 14 and 15 but not for 13.
>>>>>>>>> BTW removing optimizations (CPUTYPE) for only the affected ports made
>>>>>>>>> guile2 work again. Did not solve the issue with sassc though.  [...]
>>>>>>>>> I'm also using ccache, but that does not look relevant.
>>>>>>>> I've never used ccache or analogous and get the libsass.so.1.0.0
>>>>>>>> .got.plt corruption that I've reported on the lists anyway.
>>>>>>> I don't use ccache or optimizations.  Here's an example of sassc
>>>>>>> segfaulting in a 14.1-RELEASE-p6 jail:
>>>>>>> https://pkg.des.dev/logs/data/14amd64-default/2024-11-24_19h29m04s/logs/errors/plasma5-breeze-gtk-5.27.11.log
>>>>>>> which matches the following entry from `/var/log/messages`:
>>>>>>> Nov 24 21:23:06 pkg kernel: pid 71277 (sassc), jid 253, uid 65534: 
>>>>>>> exited on signal 11 (core dumped)
>>>>>>> The poudriere host is a bhyve VM with 48 cores and 192 GB RAM on a
>>>>>>> 32c/64t AMD EPYC 7502P with 256 GB RAM.
>>>>>> 
>>>>>> I sincerely hope this is not relevant but my CPU is also AMD: AMD Ryzen 
>>>>>> 5 5600G
>>>>> 
>>>>> The amd64 system type that I have access to and used
>>>>> for my testing:
>>>>> 
>>>>> AMD 7950X3D (16 core, 32 thread, so 32 FreeBSD-cpus) with 192 GiBytes of 
>>>>> RAM
>>>> I'm on Intel, and I don't see any crashes at all. So, are we looking at 
>>>> some CPU specific issue here?
>>> 
>>> We can't say for sure, but we definitely have all people reporting the 
>>> issue on the same CPU brand, so it's some indication I guess.
>>> 
>>> I was hoping it would not come to this because I suspect such issues are 
>>> quite difficult to diagnose.
>> 
>> Unfortunately, for amd64 I only have access to:
>> 
>> ) An old ThreadRipper 1950X system (untested so far)
>> ) The 7950X3D system
>> 
>> No Intel systems.
>> 
>> If someone had both AMD and Intel and could have
>> boot&operate media that should work for both, say
>> USB that can be simply moved between machines,
>> running tests on both would be appropriate.
>> (Implication: the media not being tailored to the
>> cpu specifics so the same system software is
>> tested in both places.)
>> 
>> I'll note that the media in my context is PCIe Optane,
>> ZFS based. I could try a U.2 Optane in a PCIe adaptor
>> that has UFS instead for building textproc/libsass .
>> (The U.2 content is basically an rsync of the ZFS
>> Optane media's live directory tree, with node naming
>> and such adjusted afterwards.)
>> 
>> What do other folks have for the file system(s)
>> involved?
> 
> I get the sassc failure from a pure UFS live-context as
> well.
> 
> Interestingly, there is variation in the .got.plt oddity.
> 
> Earlier:
> 
> Bad .got.plt:
> 
> Contents of section .got.plt:
> 2bed60 00000000 00000000 00000000 00000000  ................
> . . .
> 2befc0 00000000 00000000 00000000 00000000  ................
> 2befd0 00000000 00000000 00000000 00000000  ................
> 2befe0 00000000 00000000 00000000 00000000  ................
> 2beff0 00000000 00000000 00000000 00000000  ................
> 2bf000 96ab2a00 00000000 a6ab2a00 00000000  ..*.......*.....
> 2bf010 b6ab2a00 00000000 c6ab2a00 00000000  ..*.......*.....
> 2bf020 d6ab2a00 00000000 e6ab2a00 00000000  ..*.......*.....
> 2bf030 f6ab2a00 00000000 06ac2a00 00000000  ..*.......*.....
> . . .

Interestingly, a later retest of the ZFS
context did not get the above. Instead it
ended up like the below bad case.

I'll also note that scrubbing reports:

# zpool status
  pool: zoptb
 state: ONLINE
  scan: scrub repaired 0B in 00:00:47 with 0 errors on Mon Nov 25 17:50:44 2024
config:

        NAME           STATE     READ WRITE CKSUM
        zoptb          ONLINE       0     0     0
          gpt/OptBzfs  ONLINE       0     0     0

errors: No known data errors

Since the scrub found no checksum errors, this should
mean that the unexpected zeros were already present when
zfs computed its checksums, prior to writing the data.

> The new bad .got.plt ended up with a bigger zero area,
> the nonzero area again being nicely aligned for where
> it starts. (The .got.plt starts at the same address
> as above.)
> 
> Contents of section .got.plt:
> 2bed60 00000000 00000000 00000000 00000000  ................
> . . .
> 2befc0 00000000 00000000 00000000 00000000  ................
> 2befd0 00000000 00000000 00000000 00000000  ................
> 2befe0 00000000 00000000 00000000 00000000  ................
> 2beff0 00000000 00000000 00000000 00000000  ................
> 2bf000 00000000 00000000 00000000 00000000  ................
> 2bf010 00000000 00000000 00000000 00000000  ................
> 2bf020 00000000 00000000 00000000 00000000  ................
> 2bf030 00000000 00000000 00000000 00000000  ................
> . . .
> 2bffc0 00000000 00000000 00000000 00000000  ................
> 2bffd0 00000000 00000000 00000000 00000000  ................
> 2bffe0 00000000 00000000 00000000 00000000  ................
> 2bfff0 00000000 00000000 00000000 00000000  ................
> 2c0000 96cb2a00 00000000 a6cb2a00 00000000  ..*.......*.....
> 2c0010 b6cb2a00 00000000 c6cb2a00 00000000  ..*.......*.....
> 2c0020 d6cb2a00 00000000 e6cb2a00 00000000  ..*.......*.....
> 2c0030 f6cb2a00 00000000 06cc2a00 00000000  ..*.......*.....
> . . .
> 
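
The "nicely aligned" observation can be checked mechanically.
The following is only a minimal sketch, assuming dumps in the
`objdump -s` style shown above and 4 KiB pages. The helper name
first_nonzero_address is hypothetical, and the addresses are the
ones printed in the dump, so file offsets are not checked here.

```python
import re

PAGE_SIZE = 4096  # assumption: 4 KiB base pages, as is usual on amd64

HEX_ADDR = re.compile(r"[0-9a-f]+")
HEX_WORD = re.compile(r"[0-9a-f]{8}")


def first_nonzero_address(dump_lines):
    """Return the address of the first dump row with a non-zero word, or None."""
    for line in dump_lines:
        # Drop mail quoting ("> ") and split into address, words, ASCII column.
        fields = line.lstrip("> ").split()
        # A data row is an address followed by four 8-hex-digit words.
        if len(fields) < 5 or not HEX_ADDR.fullmatch(fields[0]):
            continue
        if not all(HEX_WORD.fullmatch(w) for w in fields[1:5]):
            continue
        if any(int(w, 16) for w in fields[1:5]):
            return int(fields[0], 16)
    return None


# Rows copied from the two bad dumps quoted above.
earlier_bad = [
    "> 2beff0 00000000 00000000 00000000 00000000  ................",
    "> 2bf000 96ab2a00 00000000 a6ab2a00 00000000  ..*.......*.....",
]
later_bad = [
    "> 2bfff0 00000000 00000000 00000000 00000000  ................",
    "> 2c0000 96cb2a00 00000000 a6cb2a00 00000000  ..*.......*.....",
]

for name, rows in (("earlier", earlier_bad), ("later", later_bad)):
    addr = first_nonzero_address(rows)
    aligned = "page-aligned" if addr % PAGE_SIZE == 0 else "not page-aligned"
    print(name, hex(addr), aligned)
```

For the rows above, the non-zero data starts at 0x2bf000 in the
earlier case and at 0x2c0000 in the later one. Both are multiples
of 4096, and the two differ by exactly 0x1000, that is, one such
page.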

For comparison, here is the good .got.plt from
the PkgBase based chroot with the official packages
installed:

Contents of section .got.plt:
 2bed60 78ba2b00 00000000 00000000 00000000  x.+.............
 2bed70 00000000 00000000 86a62a00 00000000  ..........*.....
 2bed80 96a62a00 00000000 a6a62a00 00000000  ..*.......*.....
 2bed90 b6a62a00 00000000 c6a62a00 00000000  ..*.......*.....
. . .
 2befc0 16ab2a00 00000000 26ab2a00 00000000  ..*.....&.*.....
 2befd0 36ab2a00 00000000 46ab2a00 00000000  6.*.....F.*.....
 2befe0 56ab2a00 00000000 66ab2a00 00000000  V.*.....f.*.....
 2beff0 76ab2a00 00000000 86ab2a00 00000000  v.*.......*.....
 2bf000 96ab2a00 00000000 a6ab2a00 00000000  ..*.......*.....
 2bf010 b6ab2a00 00000000 c6ab2a00 00000000  ..*.......*.....
 2bf020 d6ab2a00 00000000 e6ab2a00 00000000  ..*.......*.....
 2bf030 f6ab2a00 00000000 06ac2a00 00000000  ..*.......*.....
. . .
 2bffc0 16cb2a00 00000000 26cb2a00 00000000  ..*.....&.*.....
 2bffd0 36cb2a00 00000000 46cb2a00 00000000  6.*.....F.*.....
 2bffe0 56cb2a00 00000000 66cb2a00 00000000  V.*.....f.*.....
 2bfff0 76cb2a00 00000000 86cb2a00 00000000  v.*.......*.....
 2c0000 96cb2a00 00000000 a6cb2a00 00000000  ..*.......*.....
 2c0010 b6cb2a00 00000000 c6cb2a00 00000000  ..*.......*.....
 2c0020 d6cb2a00 00000000 e6cb2a00 00000000  ..*.......*.....
 2c0030 f6cb2a00 00000000 06cc2a00 00000000  ..*.......*.....
. . .

The contents of the non-zero parts of any pair
of the examples agree.
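
The good dump also shows a regular pattern that makes the
comparison easy to automate. The following is a rough sketch,
not a definitive tool: it decodes rows as 64-bit little-endian
slots and checks them against a simple prediction. The 0x10 step
between consecutive slot values is read off the good dump above.
That it reflects one 16-byte PLT stub per slot is my assumption.
The names got_entries and predicted_entry are hypothetical.

```python
ENTRY_SIZE = 8      # bytes per .got.plt slot on amd64
STUB_STRIDE = 0x10  # observed step between consecutive slot values (good dump)


def got_entries(row):
    """Decode one dump row into its two 64-bit little-endian slot values."""
    words = row.split()[1:5]
    raw = bytes.fromhex("".join(words))
    return [int.from_bytes(raw[i:i + ENTRY_SIZE], "little") for i in (0, ENTRY_SIZE)]


def predicted_entry(slot_addr, ref_addr=0x2bed78, ref_value=0x2aa686):
    """Predict a slot value from the first function slot seen in the good dump.

    The reference pair (slot 0x2bed78 holding 0x2aa686) is taken from the
    good dump's 2bed70 row; the linear rule itself is an assumption.
    """
    return ref_value + (slot_addr - ref_addr) // ENTRY_SIZE * STUB_STRIDE


# Rows copied from the good dump above.
good_rows = [
    " 2bed80 96a62a00 00000000 a6a62a00 00000000  ..*.......*.....",
    " 2bf000 96ab2a00 00000000 a6ab2a00 00000000  ..*.......*.....",
    " 2c0000 96cb2a00 00000000 a6cb2a00 00000000  ..*.......*.....",
]

for row in good_rows:
    addr = int(row.split()[0], 16)
    for i, value in enumerate(got_entries(row)):
        slot = addr + i * ENTRY_SIZE
        print(hex(slot), hex(value), value == predicted_entry(slot))
```

If the rule holds for the whole section, the zeroed slots in the
bad cases are exactly the ones that should have held such
values, while the surviving non-zero slots match the prediction.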

===
Mark Millard
marklmi at yahoo.com

