On 01/07/2020 19:49, Hauke Mehrtens wrote:
On 7/1/20 6:50 PM, Henrique de Moraes Holschuh wrote:
On 12/05/2020 18:46, Hauke Mehrtens wrote:
On 5/12/20 12:24 PM, Bjørn Mork wrote:
Hauke Mehrtens <[email protected]> writes:
I also get this problem with mainline kernel.
See here for some more details:
https://bugs.openwrt.org/index.php?do=details&task_id=2928
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94506
Hello,
I wondered what the current state of this is? Reading that GCC bug it
looks like all the results we've seen are just arbitrary, and either
triggering a latent bug or not.
How do we proceed from there?
FWIW, I just built current master (146e18af568a) with 5.4 kernel using
GCC 9.3. And like others have reported, this booted just fine without
any other tricks.
But it seems a bit too fragile for any release if it could break any
time we do a GCC or kernel update...
Hi,
I would prefer that we find the root cause; more code could be
miscompiled and just fail silently.
Here's a data point. We *do* have a bug in the toolchain that causes
it to generate broken binaries (maybe only the kernel?) in certain
situations, and it is *not* new.
Since 18.06 (maybe earlier), I can 100% reproduce a broken image here
that will crash-loop on boot on ar71xx *or* ath79 if I do the following
steps:
0. Arch is ar71xx (18.06) or ath79 (19.07). Target device is a TP-Link
Archer C7 v4 (MIPS 74kc).
1. Clean the toolchain and build tree so that everything is rebuilt
from scratch.
2. Enable TARGET_OPTIONS using menuconfig.
3. Change CONFIG_TARGET_OPTIMIZATION: replace -mtune=24kc with -mtune=74kc.
Then build. The resulting image will crash-loop if installed. Note: I
did not try with the default options, though. I enable LuCI and a few
other packages as built-in, etc. If anyone fails to reproduce, I can
send my standard .config.
DISABLE TARGET_OPTIONS (so that defaults are used), clean everything (so
that the toolchain and all binaries will be rebuilt), and rebuild, so
that the toolchain is built without TARGET_OPTIONS.
It works.
Enable (2) and (3) again and do a new build, this time *not* rebuilding
the toolchain, i.e. only the kernel and packages are rebuilt with
TARGET_OPTIONS enabled and -mtune=74kc.
It works.
So, the problem seems to be in the toolchain build itself, which creates
a faulty toolchain when TARGET_OPTIONS is active (I don't recall if it
causes trouble if you just enable TARGET_OPTIONS but don't touch
CONFIG_TARGET_OPTIMIZATION, sorry).
Hi,
I think some MIPS hazard handling in the kernel is wrong. This would
also explain why we see this only on the MIPS 74K CPU, which is an
out-of-order CPU, and not on the MIPS 24K CPUs.
When I add a call to ehb(); at the end of the configure_status()
function the kernel does not crash any more.
This adds the ehb instruction, but it also results in slightly
different instructions for the configure_exception_vector() function.
If the ST0_MX bit goes into effect only some instructions later, this
should not be a problem as it takes pretty long till we call the first
DSP instruction.
I think the problem is somewhere in the configure_exception_vector()
function.
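
To make the hazard idea above concrete, here is a toy user-space model
in C of one way a missing barrier after a Status write could matter to
code that runs later. It is only a sketch under loud assumptions: the
cp0_model struct, the model_* functions and run_model() are
hypothetical, the bit values are illustrative, and "a write becomes
visible only after ehb" is a simplification, not a description of real
74K pipeline timing or of the actual kernel code.

```c
#include <stdint.h>

/* Illustrative Status bits; names follow the kernel's ST0_* style,
 * values are assumptions made for this model. */
#define ST0_CU0 0x10000000u
#define ST0_MX  0x01000000u
#define ST0_BEV 0x00400000u

/* Toy CP0: a written value is only observed by reads after a barrier. */
struct cp0_model {
    uint32_t status;      /* value a read currently observes */
    uint32_t pending;     /* value written by the last mtc0 */
    int      has_pending; /* non-zero while that write is in flight */
};

static void mtc0_status(struct cp0_model *c, uint32_t v)
{
    c->pending = v;
    c->has_pending = 1;
}

static uint32_t mfc0_status(const struct cp0_model *c)
{
    return c->status;
}

/* Model of an execution hazard barrier: commit the in-flight write. */
static void ehb(struct cp0_model *c)
{
    if (c->has_pending) {
        c->status = c->pending;
        c->has_pending = 0;
    }
}

/* Hypothetical stand-in for configure_status(): set the MX bit. */
static void model_configure_status(struct cp0_model *c, int with_ehb)
{
    mtc0_status(c, mfc0_status(c) | ST0_MX);
    if (with_ehb)
        ehb(c);
}

/* Hypothetical later code that saves, modifies and restores Status. */
static void model_save_restore(struct cp0_model *c)
{
    uint32_t saved = mfc0_status(c); /* stale if a write is pending */

    mtc0_status(c, saved | ST0_BEV);
    ehb(c);
    mtc0_status(c, saved);           /* restores the (maybe stale) value */
    ehb(c);
}

/* Run both steps and return the final Status value of the model. */
uint32_t run_model(int with_ehb)
{
    struct cp0_model c = { ST0_CU0, 0, 0 };

    model_configure_status(&c, with_ehb);
    model_save_restore(&c);
    return mfc0_status(&c);
}
```

In this model, run_model(0) ends with ST0_MX cleared, because the later
save/restore read a stale value and wrote it back, while run_model(1)
keeps ST0_MX set. Whether the real crash follows this pattern is not
established by the model.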
Is it detecting that it should try to optimize for 74kc from the
*toolchain*?
i.e. does the kernel build ignore the current TARGET_OPTIONS +
CONFIG_TARGET_OPTIMIZATION?
If it just looks at how the toolchain was compiled, the bug you found
likely explains everything. Otherwise, there is at least one more bug
at play, likely in the toolchain itself.
--
Henrique de Moraes Holschuh