On 9/17/23 18:03, Alan Mackenzie wrote:
> Hello, Fernando.
> On Sun, Sep 17, 2023 at 17:49:22 -0400, Fernando Rodriguez wrote:
> > A few months ago after updating my kernel I started getting an invalid
> > opcode error during boot on the init process on my initramfs which I did
> > rebuild. Switching to the old kernel and initramfs fixed the problem so
> > I kept that kernel for a few months for lack of time.
> > Today I rebuilt the whole system using `emerge -e @world` and after that
> > I'm able to boot the new kernel but now some pre-compiled packages (and
> > some that emerge -e missed because the ebuild was masked) crash with
> > illegal opcode. In the case of chrome it's not crashing but it only
> > renders garbage for webpages.
> > Does anyone have a clue what is happening? It's like the instruction set
> > changed after the kernel update (or was it the microcode?)
> Could it be that you've got a sporadic RAM failure? Running the
> standard RAM test (the one you boot into, I've forgotten its name) for
> many hours might pin down the problem.
I ran the test to be sure, but the failure is not sporadic: it happens
every time, with the same pre-built binaries. My last working kernel was
5.15.122; if I boot from that kernel, everything works. Before the
update everything was built with -march=native, and before the
'emerge -e' I switched to -mtune=generic. I don't think it was the flags
that messed it up, but rather the act of rebuilding, because after
rebuilding the whole system I'm still having issues with pre-compiled
binaries, and those should be generic builds. Strangely, the same
binaries that crash on the host system run fine in a VM using hardware
virtualization.
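
One way to test the "instruction set changed" theory is to save
/proc/cpuinfo under each kernel and compare the `flags` lines. A feature
that appears only under the old kernel would be a candidate for the
illegal opcode. The following is a minimal Python sketch, not a
definitive diagnostic: the dump file names in the usage comment are
hypothetical, and it assumes the x86 /proc/cpuinfo layout, where each
processor entry has a `flags : ...` line.

```python
import sys


def cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no 'flags' line found


def diff_flags(old_text, new_text):
    """Return (flags only in old, flags only in new), each sorted."""
    old, new = cpu_flags(old_text), cpu_flags(new_text)
    return sorted(old - new), sorted(new - old)


# Hypothetical usage, with dumps taken via `cat /proc/cpuinfo > FILE`:
#   python flagdiff.py cpuinfo-old-kernel.txt cpuinfo-new-kernel.txt
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as f_old, open(sys.argv[2]) as f_new:
        lost, gained = diff_flags(f_old.read(), f_new.read())
    print("only under old kernel:", " ".join(lost) or "(none)")
    print("only under new kernel:", " ".join(gained) or "(none)")
```

Only the first `flags` line is read, which assumes all cores report the
same features.
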
I will try to run one of the crashing binaries under gdb to find out
which instruction is triggering the fault.
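
A sketch of how that gdb run could be scripted, assuming gdb is
installed and the crash reproduces when the binary is started directly.
The helper name `gdb_sigill_command` is made up for illustration. It
uses gdb's batch mode to run the program and, once it stops on the
signal, print the instruction at the program counter and a backtrace.

```python
import subprocess
import sys


def gdb_sigill_command(binary, args=()):
    """Build a gdb batch command line for inspecting a crash in `binary`."""
    return [
        "gdb", "--batch",
        "-ex", "run",        # start the program; gdb stops when SIGILL arrives
        "-ex", "x/i $pc",    # disassemble the instruction at the program counter
        "-ex", "bt",         # show where in the program it happened
        "--args", binary, *args,
    ]


# Hypothetical usage: python sigill.py /path/to/crashing-binary [args...]
if __name__ == "__main__" and len(sys.argv) > 1:
    subprocess.run(gdb_sigill_command(sys.argv[1], sys.argv[2:]), check=False)
```

If the program exits without faulting, the `x/i $pc` and `bt` steps
simply report that there is no running process.
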
Thanks,
Fernando