On Fri, 9 May 2025 at 10:45, Reimar Döffinger <[email protected]> wrote:
>
> On May 9, 2025 2:02:49 PM GMT+07:00, Damien Stewart <[email protected]>
> wrote:
> >Hi guys.
> >
> >So I this is really a follow up to the "What are the current available
> >browser options for debian-ppc64?" thread where it there was a technical
> >discussion on why Firefox was crashing which ended up being rather
> >anti-climatic. But I wanted to check myself since I'm aware the last few
> >years all Firefox does is crash on load. At the time PPC was last officially
> >supported on Jessie, Firefox was becoming unstable then. It loaded and
> >worked but would easily crash and exit. Now it's much worse.
> >
> >So unlike most of the PPC people out there I don't have a quad G5 power
> >horse. I do have a rather rare X1000 with a PASemi PA6T. Only dual but 64
> >bit and does the job. I soon found out running Firefox under GDB needs over
> >6GB RAM and I only had 4GB with HDD swap space. I rarely need swap on PPC,
> >unlike my laptop. But I had some spare backup RAM and decided to max it out.
> >After wrestling with DDR2 RAM slots I managed to get it working. A 64 bit
> >PowerPC machine with 8GB RAM and Debian 64 installed on SSD. Okay I've
> >broken the 32 bit barrier and now I'm talking. :-D
> >
> >My results to summarise it are that it is the same crash. Different day,
> >same code. That streqci() function again. This time in firefox_138.0.1. But
> >here's some info I picked up that may help to close in on it. With a running
> >commentary. :-)
> >
> >damien@ubuntu:~$ gdb firefox.real
> >GNU gdb (Debian 16.3-1) 16.3
> >
> >...
> >Reading symbols from firefox.real...
> >Reading symbols from
> >/usr/lib/debug/.build-id/fd/6adabdb8b6655f970f65deffcea09f8d7dac41.debug...
> >
> >(gdb) run
> >Starting program: /usr/bin/firefox.real
> >[Thread debugging using libthread_db enabled]
> >Using host libthread_db library "/lib/powerpc64-linux-gnu/libthread_db.so.1".
> >
> >...
> >A minute or two filling up 6GB of RAM...
> >
> >Thread 1 "firefox.real" received signal SIGSEGV, Segmentation fault.
> >w2c_rlbox_streqci (var_p0=var_p0@entry=262000, var_p1=2016478208,
> >    instance=<optimized out>) at rlbox.wasm.c:55615
> >warning: 55615 rlbox.wasm.c: No such file or directory
> >
> >As you can see different day, same code. Same function but without that
> >i32_load8_u. I don't like the look of that instance. Why is instance
> >optimized out? The frame is omitted.
> >
> >Back trace...
> >
> >(gdb) bt
> >#0  w2c_rlbox_streqci (var_p0=var_p0@entry=262000, var_p1=2016478208,
> >    instance=<optimized out>) at rlbox.wasm.c:55615
> >#1  0x00003fffe8e1e268 in w2c_rlbox_getEncodingIndex (
> >    instance=<optimized out>, var_p0=<optimized out>) at rlbox.wasm.c:55548
> >#2  w2c_rlbox_getEncodingIndex (instance=0x3fffda90f000, var_p0=262000)
> >    at rlbox.wasm.c:55531
> >#3  w2c_rlbox_MOZ_XmlInitEncodingNS_0 (instance=0x3fffda90f000, var_p0=325428,
> >    var_p1=325424, var_p2=262000) at rlbox.wasm.c:57164
> >#4  0x00003fffe8e4ce1c in w2c_rlbox_initializeEncoding (
> >    instance=instance@entry=0x3fffda90f000, var_p0=var_p0@entry=325280)
> >    at rlbox.wasm.c:37816
> >
> >The hit...
> >
> >(gdb) disas
> >Dump of assembler code for function w2c_rlbox_streqci:
> >   0x00003fffe8e1e150 <+0>:     ld      r3,0(r3)
> >   0x00003fffe8e1e154 <+4>:     subf    r4,r5,r4
> >   0x00003fffe8e1e158 <+8>:     nop
> >   0x00003fffe8e1e15c <+12>:    nop
> >=> 0x00003fffe8e1e160 <+16>:    lbzx    r9,r3,r5
> >   0x00003fffe8e1e164 <+20>:    add     r10,r4,r5
> >   0x00003fffe8e1e168 <+24>:    clrlwi  r9,r9,24
> >   0x00003fffe8e1e16c <+28>:    clrldi  r10,r10,32
> >   0x00003fffe8e1e170 <+32>:    lbzx    r10,r3,r10
> >
> >Why is there nop? Does it mean ori? PPC doesn't have nop. Why doesn't gdb
> >list the machine code as standard? Supposed to be a debugger. This code
> >looks sus.
> >
> >Registers...
> >(gdb) info r
> >r0             0x3fffe8e4ce1c      70368356519452
> >r1             0x3fffffffbe90      70368744160912
> >r2             0x3ffff413c500      70368544146688
> >r3             0x3ffb00000000      70347269341184
> >r4             0xffffffff87d2fb70  18446744071693335408
> >r5             0x78310400          2016478208
> >
> >Ok so it doesn't like r9 = [r3 + r5]. What's wrong with 3FFB78310400? Apart
> >from r5 being a large 32 bit integer.
> >
> >I had apt sourced the source but gdb couldn't see it so needed to so some
> >digging...
> >
> >damien@ubuntu:~/Applications/firefox-debug/firefox-138.0.1$ grep -ir "streqci" .
> >./parser/expat/expat/lib/xmltok.c:streqci(const char *s1, const char *s2) {
> >./parser/expat/expat/lib/xmltok.c: /* The following line will never get executed. streqci() is
> >./parser/expat/expat/lib/xmltok.c: if (streqci(name, encodingNames[i]))
> >./parser/expat/expat/lib/xmltok_ns.c: if (streqci(buf, KW_UTF_16) && enc->minBytesPerChar == 2)
> >
> >The source:
> >static int FASTCALL
> >streqci(const char *s1, const char *s2) {
> >  for (;;) {
> >    char c1 = *s1++;
> >    char c2 = *s2++;
> >    if (ASCII_a <= c1 && c1 <= ASCII_z)
> >      c1 += ASCII_A - ASCII_a;
> >    if (ASCII_a <= c2 && c2 <= ASCII_z)
> >      /* The following line will never get executed. streqci() is
> >       * only called from two places, both of which guarantee to put
> >       * upper-case strings into s2.
> >       */
> >      c2 += ASCII_A - ASCII_a; /* LCOV_EXCL_LINE */
> >    if (c1 != c2)
> >      return 0;
> >    if (! c1)
> >      break;
> >  }
> >  return 1;
> >}
> >
> >This code appears to be poor quality. It doesn't validate the input strings
> >nor check for null bytes. Not to mention that icky for ever. That if test is
> >in a strange order making some sort of coding palindrome. Funny. :-)
> >
> >Perhaps for an internal API, not checking input when input must be given is
> >acceptable, but these are the reasons C lib str*() functions are criticised
> >now days. This streqci() is uncommon in my search and particular to XML
> >parsing. Ok, so what is going wrong with it?
> >Given it's embedded into this rlbox.wasm.c where is it generated from?
> >The build itself? I don't see that exact file in source.
> >
> >From what I can tell it wants to use r5 as an index with lbzx but instead
> >does something funky with r5 instead of zeroing it. Before doing nothing
> >twice. It would have been better off using lbzu! So what kind of contraption
> >caused the C compiler to generate asm code like that? The C code looks
> >straight forward enough for a C compiler to understand but the binary code
> >is corrupted. Is this is a result of the build process wrecking it? Or does
> >PPC GCC have some rare bug causing a code side effect of broken code? I know
> >this is old news by now but I just don't know how it ended up generating
> >broken code that is only broken on PPC. :-?
> >
> >
>
> I don't think you are quite on the right track.
> w2c_rlbox_streqci has 3 arguments, not 2 like the source you quote, so that
> will make it real hard to match those up.

I think that's because w2c_rlbox_streqci is an "rlbox" sandboxed version of
the expat streqci function (https://rlbox.dev). It gets transformed somewhere
along the way and isn't necessarily even compiled by gcc. I don't know how
this web stuff (wasm, rlbox, etc.) works; the toolchain is a total mystery to
me. I'm not even sure whether new C source is generated or whether the
sandboxed functions are generated at compile time, skipping C entirely.

That said, there's a fair chance the source you quoted is what eventually
gets transformed into the asm you posted. The bug could equally well be
somewhere in the toolchain, though.
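
For what it's worth, the w2c_ prefix and the rlbox.wasm.c file name in the
backtrace hint that a C file is generated from wasm at some point. That would
explain the third "instance" argument: the two strings would no longer be
real pointers but 32-bit offsets into the sandbox's own memory. Below is a
rough sketch of that idea in plain C. It is not the generated code, and
sandbox_instance, load8_u, put_string, sandboxed_streqci and bswap32 are all
names made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the sandbox instance; the field names and
 * layout are assumptions, not taken from the real generated code. */
typedef struct {
    uint8_t *memory_base;   /* start of the sandbox's linear memory */
    uint32_t memory_size;   /* size of that memory in bytes */
} sandbox_instance;

/* Load one byte at a 32-bit offset into the sandbox memory. */
static uint8_t load8_u(const sandbox_instance *inst, uint32_t offset) {
    assert(offset < inst->memory_size);   /* stand-in for a sandbox trap */
    return inst->memory_base[offset];
}

/* Helper for experiments: copy a NUL-terminated string into the sandbox. */
static void put_string(sandbox_instance *inst, uint32_t offset, const char *s) {
    do {
        assert(offset < inst->memory_size);
        inst->memory_base[offset++] = (uint8_t)*s;
    } while (*s++ != '\0');
}

/* Same logic as expat's streqci(), but p0 and p1 are offsets into the
 * sandbox memory instead of native pointers. */
static int sandboxed_streqci(const sandbox_instance *inst,
                             uint32_t p0, uint32_t p1) {
    for (;;) {
        uint8_t c1 = load8_u(inst, p0++);
        uint8_t c2 = load8_u(inst, p1++);
        if ('a' <= c1 && c1 <= 'z')
            c1 += 'A' - 'a';
        if ('a' <= c2 && c2 <= 'z')
            c2 += 'A' - 'a';
        if (c1 != c2)
            return 0;
        if (!c1)
            break;
    }
    return 1;
}

/* Reverse the byte order of a 32-bit value. */
static uint32_t bswap32(uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}
```

With memory_base as the first field, load8_u boils down to the same "load
the base, then load a byte at base + offset" pattern as the ld r3,0(r3) and
lbzx r9,r3,r5 pair in the disassembly, although whether the real layout
matches is a guess. One more observation rather than a diagnosis: wasm memory
is little-endian while debian-ppc64 is big-endian, and bswap32(0x78310400)
gives 0x00043178 (274808), which sits right among the other offsets in the
backtrace (262000, 325280). That could mean the offset in r5 was loaded
without a byte swap somewhere in the toolchain.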

