[Bug target/82150] Produces a branch prefetch which causes a hang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82150

Richard Earnshaw changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
         Resolution|---                         |WONTFIX

--- Comment #17 from Richard Earnshaw ---
OK, I've found the erratum number, it's 720247; but it's specific to the
11MPcore and is fixed in r2p1 silicon.

The erratum workaround description states that there is *no* robust software
fix when the MMU is disabled. Some hardware fixes on your platform might be
possible, but are off-topic here. Also, using r2p1 silicon fixes the problem.

Either way, there's nothing we can do in GCC to address this, given the nature
of the problem.
[Bug target/82150] Produces a branch prefetch which causes a hang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82150

Richard Earnshaw changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2021-01-29
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |WAITING

--- Comment #16 from Richard Earnshaw ---
What's the erratum number?
[Bug target/82150] Produces a branch prefetch which causes a hang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82150

david.welch at netronome dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|INVALID                     |---

--- Comment #15 from david.welch at netronome dot com ---
Please read the erratum and do not blow off this ticket. The MMU is not being
used. This is a verified problem, acknowledged by ARM as well as being
independently discovered. The problem has been present and known by ARM for
years, and was reported to gnu/gcc a while ago.

"Use the MMU" is not a valid solution to a known, demonstrable bug in the
compiler.
[Bug target/82150] Produces a branch prefetch which causes a hang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82150

Richard Earnshaw changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |INVALID
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #14 from Richard Earnshaw ---
The code generated is architecturally correct. If your core is prefetching
from addresses that are not valid then this is indicative that the MMU is
incorrectly configured for your system. Prefetches will NOT be attempted from
unmapped pages, or pages that are mapped as device memory. So you need to find
out why your memory system has not been correctly set up.

There's no bug in GCC here.
[Bug target/82150] Produces a branch prefetch which causes a hang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82150

--- Comment #13 from david.welch at netronome dot com ---
Very sorry, it has been years since I did this research. A simple nop won't
fix it, but a branch to self will.

bad

    TEST:
        push {r4,lr}
        pop {r4,pc}
        bx r0 /*.hword 0x4700*/
        nop
        nop

bad

    TEST:
        push {r4,lr}
        pop {r4,pc}
        nop
        bx r0 /*.hword 0x4700*/
        nop
        nop

good

    TEST:
        push {r4,lr}
        pop {r4,pc}
        b .
        bx r0 /*.hword 0x4700*/
        nop
        nop
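[Editorial sketch, not part of the original comment.] The three sequences above suggest a simple check that could be run over a Thumb code image: after a pop that loads pc, look a few halfwords ahead for something that decodes as bx/blx, and stop looking once an unconditional b is seen. The function name find_at_risk_pop and the look-ahead depth `window` are hypothetical; the thread does not say how far ahead the core actually fetches, and this sketch treats every halfword independently, so it does not understand 32-bit Thumb encodings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Thumb POP {reglist, pc}: 1011 110 1 reglist(8), i.e. 0xBDxx. */
static inline bool is_pop_pc(uint16_t hw) { return (hw & 0xFF00u) == 0xBD00u; }

/* Thumb BX/BLX Rm: 0100 0111 L Rm(4) 000. */
static inline bool is_bx_or_blx(uint16_t hw) { return (hw & 0xFF07u) == 0x4700u; }

/* Thumb unconditional B: 11100 imm11 ("b ." assembles to 0xE7FE). */
static inline bool is_uncond_b(uint16_t hw) { return (hw & 0xF800u) == 0xE000u; }

/*
 * Return the index of the first pop-with-pc that has a bx/blx-looking
 * halfword within `window` halfwords after it, with no unconditional b
 * in between. Return n if no such pop is found.
 * `window` is an assumed look-ahead depth, not a documented core property.
 */
static inline size_t find_at_risk_pop(const uint16_t *code, size_t n, size_t window)
{
    assert(code != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!is_pop_pc(code[i]))
            continue;
        for (size_t j = i + 1; j < n && j <= i + window; j++) {
            if (is_uncond_b(code[j]))
                break;      /* guard branch found: nothing behind it is flagged */
            if (is_bx_or_blx(code[j]))
                return i;   /* data or code after the pop decodes as bx/blx */
        }
    }
    return n;
}
```

Fed the halfwords of the three TEST sequences (taking nop as 0x46c0, the ARMv6 Thumb "mov r8, r8" encoding, which is an assumption about how the assembler emitted it), the two "bad" cases are flagged at the pop and the "good" case is not.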
[Bug target/82150] Produces a branch prefetch which causes a hang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82150

--- Comment #12 from david.welch at netronome dot com ---
In my case this was found through a hang, but the problem manifests as a read,
which means it can cause a read of a read-sensitive peripheral, with adverse
effects.
[Bug target/82150] Produces a branch prefetch which causes a hang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82150

--- Comment #11 from david.welch at netronome dot com ---
I wish I had known this when I filed this ticket: there is an ARM erratum for
this issue that was issued in or before 2009.

720247: Speculative Instruction fetches can be made anywhere in the memory map

I have researched this bug on this core and provided a workaround that ARM was
not able or willing to provide (put a nop after unconditional branch
instructions that modify the pc, like pop {r4,pc}, but not bx lr..., anything
other than another branch instruction that causes a speculative fetch).

So if you require an ARM erratum in order to fix something, there you go, it
exists.

It is still present in gcc 10 (and has been present all this time). I have not
examined gcc 11 yet as it has not been formally released.

    unsigned int more_fun ( unsigned int );
    unsigned int fun ( void )
    {
        return(more_fun(0x12344700)+1);
    }

Disassembly of section .text:

<fun>:
   0:   b510        push    {r4, lr}
   2:   4802        ldr     r0, [pc, #8]    ; (c )
   4:   f7ff fffe   bl      0
   8:   3001        adds    r0, #1
   a:   bd10        pop     {r4, pc}
   c:   12344700    .word   0x12344700

    .thumb
    .inst.n 0x4700

Disassembly of section .text:

<.text>:
   0:   4700        bx      r0

and there is the speculative execution that causes a read (which can be
anywhere in the address space).

arm-none-eabi-gcc --version
arm-none-eabi-gcc (GCC) 10.2.0
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

One could examine everything after a branch like this for another branch,
either as a real instruction or embedded in the top of the pool; a nop after
each of the at-risk instructions may be simpler.
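[Editorial sketch, not part of the original comment.] The point of the example is that the literal 0x12344700, stored little-endian directly after pop {r4, pc}, puts the halfword 0x4700 at the first address past the pop, and 0x4700 is the Thumb encoding of bx r0. A small C helper can make that decoding explicit; the function names here are hypothetical and only illustrate the bit pattern.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Thumb BX/BLX (register): 0100 0111 L Rm(4) 000; L = 0 is BX, L = 1 is BLX. */
static inline bool thumb_is_bx_or_blx(uint16_t hw)
{
    return (hw & 0xFF07u) == 0x4700u;
}

/* Register number Rm of a BX/BLX halfword (bits 6:3). */
static inline unsigned thumb_bx_rm(uint16_t hw)
{
    assert(thumb_is_bx_or_blx(hw));
    return (hw >> 3) & 0xFu;
}

/* On a little-endian target the low halfword of a 32-bit literal sits at the
   lower address, so it is the first thing fetched after the preceding
   instruction. */
static inline uint16_t first_halfword_le(uint32_t word)
{
    return (uint16_t)(word & 0xFFFFu);
}

static inline uint16_t second_halfword_le(uint32_t word)
{
    return (uint16_t)(word >> 16);
}
```

With the literal from the test case, first_halfword_le(0x12344700) is 0x4700, which thumb_is_bx_or_blx accepts with Rm = 0, while the upper halfword 0x1234 does not match the pattern.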
[Bug target/82150] Produces a branch prefetch which causes a hang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82150

david.welch at netronome dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|INVALID                     |---

--- Comment #10 from david.welch at netronome dot com ---
How do I get some feedback on this? Do I need to create a new ticket?

This is not about a system hang; this is about GCC output that causes data to
be executed as code in the pipeline. It was detected through a hang, but
perfectly valid address spaces are affected. Quite clearly a gcc bug. The root
cause is that GCC is feeding data into the pipeline to be executed.

Just because ARM didn't publish it doesn't mean their core is without other
undocumented problems. The MMU is too late, the data has already started to
execute, so that is at best a hack, not a solution.
[Bug target/82150] Produces a branch prefetch which causes a hang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82150

--- Comment #9 from david.welch at netronome dot com ---
Basically gcc is generating a sequence where data starts to execute in the
pipe. I can't imagine it is a good idea to let the processor execute data when
you can avoid it. Instead of a

    pop {...pc} ; some data

a

    pop { ... lr} ; bx lr

creates a data hazard: the bx doesn't execute until the register change has
resolved.

Other cores might not execute the words after a pop in the pipeline if pc is
one of the popped values, but this core does. Patching this instruction
sequence after the execution has started is just a kludge.
[Bug target/82150] Produces a branch prefetch which causes a hang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82150

--- Comment #8 from david.welch at netronome dot com ---
gcc is treating these instructions as unconditional branches, but the core
does NOT treat these instructions as unconditional branches. The disconnect
between the code produced and the core behavior is quite clear. Kludges and
workarounds are interesting, but given the volume of other similar situations
that gcc has responded to in its code generation, the response here is
confusing. Why generate code that works for the core in one case but not in
another? Can you please elaborate?
[Bug target/82150] Produces a branch prefetch which causes a hang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82150

--- Comment #7 from david.welch at netronome dot com ---
This is an armv6, not an armv7. So far I have not seen that the mmu or cache
or branch prediction is required for proper operation of the core. I have so
far not seen this on other cores, but I am still working on that; it is very
much present on this core. I would rather not have to use the mmu as a kludge.
[Bug target/82150] Produces a branch prefetch which causes a hang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82150

Ramana Radhakrishnan changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
                 CC|                            |ramana at gcc dot gnu.org
         Resolution|---                         |INVALID

--- Comment #6 from Ramana Radhakrishnan ---
To answer the open question: no options have been added to "avoid" this
behaviour. The code generated by the compiler is as per the architecture
specification, there's nothing wrong here, and thus this is not a valid
report.

In order to prevent such speculation by the ARM11 MPCore, one needs to set the
XN bit as Matt referred to above. Look also at the execute-never bit in the
ARM-ARM - ARMv7-AR Section B3.7.2 (Issue C - DDI0406C, page B3-1351), where it
covers what happens with speculation and no-execute; given that this is in the
architecture, implementations have to follow it.

regards
Ramana
[Bug target/82150] Produces a branch prefetch which causes a hang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82150

--- Comment #5 from david.welch at netronome dot com ---
It is definitely prefetching because it does not realize those instructions
are unconditional branches. I will most likely go with strongly ordered rather
than the XN bit, but that is noted as a workaround.

Since the armv4t does not support the pop pc and there are runtime flags, I
wanted to first know what options are there, or whether they would have to be
added. What other cores have been reported as having this issue, and were
there any compiler additions made for them?
[Bug target/82150] Produces a branch prefetch which causes a hang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82150

mgretton at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mgretton at gcc dot gnu.org

--- Comment #4 from mgretton at gcc dot gnu.org ---
As you suggest in your original comment this hang could be coming from the
instruction pre-fetch going to some place in memory that is mapped (and
executable) but the memory system is not giving a response to memory accesses
to that location.

As a general point all read sensitive devices must be marked as XN to prevent
speculative corruption of those read sensitive devices by instruction fetch
(this is true on future versions of the architecture as well).

Can you ensure that the XN-bit is set on memory pages mapped to read-sensitive
devices?

(XN description ARM11 MP Core TRM:
http://infocenter.arm.com/help/topic/com.arm.doc.ddi0360f/CACHFICI.html)
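[Editorial sketch, not part of the original comment.] For readers unfamiliar with the XN bit, the following shows roughly what "marking a device region XN" means when 1 MB sections are used. It assumes the ARMv6 section descriptor format with subpages disabled (XP bit set in the CP15 control register), domain 0, and full read/write access; the macro and function names are hypothetical, and the linked TRM should be checked before relying on any of the bit positions for a particular core.

```c
#include <assert.h>
#include <stdint.h>

/* First-level section descriptor fields (ARMv6 format, subpages disabled).
   These names are illustrative, not taken from any header. */
#define SECTION_TYPE   0x2u          /* bits [1:0] = 0b10: section entry      */
#define SECTION_B      (1u << 2)     /* bufferable                            */
#define SECTION_C      (1u << 3)     /* cacheable                             */
#define SECTION_XN     (1u << 4)     /* execute never: no instruction fetch   */
#define SECTION_AP_RW  (3u << 10)    /* AP[1:0] = 0b11, APX = 0: read/write   */

/*
 * Build a descriptor for a 1 MB device section that must never be fetched
 * from. TEX, C and B are left at zero, which selects strongly ordered
 * memory; XN is set so speculative instruction fetches are not permitted.
 */
static inline uint32_t device_section_xn(uint32_t phys_base)
{
    assert((phys_base & 0x000FFFFFu) == 0);   /* sections are 1 MB aligned */
    return phys_base | SECTION_AP_RW | SECTION_XN | SECTION_TYPE;
}
```

The returned word would be stored in the first-level translation table entry that covers the device's virtual address; this only has any effect once the MMU is enabled.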
[Bug target/82150] Produces a branch prefetch which causes a hang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82150

--- Comment #3 from david.welch at netronome dot com ---
The problem exists as well with ldr pc,[something]. I have not dug through gcc
but did some compilation experiments, not nearly enough to be 100% sure. For
switch statements the code generated always appears to do a comparison
(perhaps after a subtract or other modification), an ldrls pc,[], then an
unconditional branch to deal with the last item (or a default). If that is
always the rule, that is safe. And for a function table, an array of function
pointers, it did the math using gprs and then a mov lr,pc ; bx rn.

An ldr pc,[] followed by literal pool data will cause this undesired prefetch.
[Bug target/82150] Produces a branch prefetch which causes a hang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82150

--- Comment #2 from david.welch at netronome dot com ---
ARM does not have an erratum on this for this core, from what I was given. I
don't know why they would; at best it would fall into the "unpredictable
results" category. Erratum or not, I was hoping there could be an option, if
there is not one already. The armv4t one is an option, but I would assume it
affects more than just this one thing (I don't know gcc internals), so it is
too big of a hammer.
[Bug target/82150] Produces a branch prefetch which causes a hang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82150

--- Comment #1 from Andrew Pinski ---
This sounds like an erratum in the core you are using. It seems the best way
to fix this is via an option which works around this erratum, if there is not
one already.