[Bug middle-end/21111] IA-64 NaT consumption faults due to uninitialized register reads
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=2 --- Comment #19 from Richard Biener --- (In reply to Alexander Monakov from comment #18) > From my perspective, the main blocker for a nice and clean solution is lack > of "birth" statements on GIMPLE. They were not needed until now ;) > Without them, expansion to RTL would either need to insert initialization at > the top of the function (which is silly, extends lifetimes of pseudos that > only live in a small region, complicating RA), or compute something like a > lowest common dominator of all uses and place an initialization there. But > perhaps that's the right way if "birth statements" aren't happening? Yes, note that an explicit birth also affects SSA coalescing so they would need to be introduced before liveness computation or alternatively be inserted in possibly multiple places when the reg becomes live. > Or is there some other approach? Like not trying to insert a single > initialization, but instead substituting a zero in place of each use of the > default def individually? That would of course also work - use replace_uses_by to replace default defs with zero for modes that have a NaT (I don't think we have an existing way to check that though). But maybe one wants to preserve NaT trapping for uninit uses (though that would require active seeding of default defs with a NaT rather than with a non-NaT since otherwise the register content is just random and not a NaT).
[Bug middle-end/21111] IA-64 NaT consumption faults due to uninitialized register reads
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=2 --- Comment #18 from Alexander Monakov --- >From my perspective, the main blocker for a nice and clean solution is lack of "birth" statements on GIMPLE. Without them, expansion to RTL would either need to insert initialization at the top of the function (which is silly, extends lifetimes of pseudos that only live in a small region, complicating RA), or compute something like a lowest common dominator of all uses and place an initialization there. But perhaps that's the right way if "birth statements" aren't happening? Or is there some other approach? Like not trying to insert a single initialization, but instead substituting a zero in place of each use of the default def individually?
[Bug middle-end/21111] IA-64 NaT consumption faults due to uninitialized register reads
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=2 --- Comment #17 from Richard Biener --- So to sum up, a SSA default def would have to be expanded to an explicit (zero) initialization. I suppose memory doesn't have a NaT state so stack allocation for locals doesn't have to change but decls eventually expanding to pseudos would have to go through the same exercise. Since we now have -ftrivial-auto-var-init=zero that might serve as a workaround for GCC 12. But I guess trying to fix this in RTL expansion shouldn't be too difficult (gated with a new target hook), if anyone really cares. Do any better maintained archs than ia64 have NaTs? NVPTX was mentioned, but I guess that's not really easier to handle since it's also a meta-ISA.
[Bug middle-end/21111] IA-64 NaT consumption faults due to uninitialized register reads
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=2 Eric Gallager changed: What|Removed |Added CC||glaubitz at physik dot fu-berlin.d ||e, jrtc27 at jrtc27 dot com See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=87338 --- Comment #16 from Eric Gallager --- (In reply to Jim Wilson from comment #14) > https://wiki.debian.org/Ports/ia64 > > James Clarke has been active recently on the binutils and/or gcc mailing > lists. My IA-64 work has dwindled down to nothing, as RISC-V work has kept > me too busy to do anything else. With the architecture already set to > end-of-life next year, I'm not planning to do anymore IA-64 work. (In reply to Jim Wilson from comment #15) > See also PR 87338 which has a response earlier today from John Paul Adrian > Glaubitz. ok, thanks
[Bug middle-end/21111] IA-64 NaT consumption faults due to uninitialized register reads
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=2 --- Comment #15 from Jim Wilson --- See also PR 87338 which has a response earlier today from John Paul Adrian Glaubitz.
[Bug middle-end/21111] IA-64 NaT consumption faults due to uninitialized register reads
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=2 --- Comment #14 from Jim Wilson --- https://wiki.debian.org/Ports/ia64 James Clarke has been active recently on the binutils and/or gcc mailing lists. My IA-64 work has dwindled down to nothing, as RISC-V work has kept me too busy to do anything else. With the architecture already set to end-of-life next year, I'm not planning to do anymore IA-64 work.
[Bug middle-end/21111] IA-64 NaT consumption faults due to uninitialized register reads
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=2 --- Comment #13 from Eric Gallager --- (In reply to Jim Wilson from comment #12) > I no longer have access to IA-64 hardware. I was leaving myself as > maintainer just so that there was someone responsible for answering > questions. I don't care if the port survives or not. I can resign if that > makes things easier. > > There is an ia64 debian group that has hardware, and is still doing debian > work. They would be disappointed if the ia64 port was deprecated. I get > the occasional binutils and gcc bug reports from them, and am occasionally > able to fix them even though I don't have hardware. Do they have a bugzilla account here that could be cc-ed? > > I think that solving this bug requires that all locals get initialized to 0 > when defined. Otherwise, you may accidentally read a register with the nat > bit set when doing bit-field operations on a register. I don't care enough > about ia64 to try to fix this.
[Bug middle-end/21111] IA-64 NaT consumption faults due to uninitialized register reads
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=2 --- Comment #12 from Jim Wilson --- I no longer have access to IA-64 hardware. I was leaving myself as maintainer just so that there was someone responsible for answering questions. I don't care if the port survives or not. I can resign if that makes things easier. There is an ia64 debian group that has hardware, and is still doing debian work. They would be disappointed if the ia64 port was deprecated. I get the occasional binutils and gcc bug reports from them, and am occasionally able to fix them even though I don't have hardware. I think that solving this bug requires that all locals get initialized to 0 when defined. Otherwise, you may accidentally read a register with the nat bit set when doing bit-field operations on a register. I don't care enough about ia64 to try to fix this.
[Bug middle-end/21111] IA-64 NaT consumption faults due to uninitialized register reads
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=2 --- Comment #11 from Alexander Monakov --- (In reply to Richard Biener from comment #10) > Hmm, I don't remember whether uninit reads invoke undefined behavior, esp. > result in a trap representation, but the original testcase doesn't seem to > read uninitialized things. Neither the original nor the updated testcase invoke UB, but to answer the question, C11 says uninit reads invoke UB if the object could have been declared with the 'register' keyword, otherwise it results in an indeterminate value (which can be a trap representation if its type allows that). > For it I suspect that store_bitfield is at fault given it will read possibly > uninitialized values for the RMW cycle it performs. I think store_bit_field is fine. I'd say the problem is that "birth" of an object does not result in any tidying/preparing for the underlying storage on RTL. But (except for specific case of probing the stack for large objects) other targets don't need any such tidying afaik (except for a related SIMT issue on NVPTX). > So one of the questions is whether we have to / want to preserve any NaT > in non-speculated code which means simply clearing NaT everywhere in a > mdreorg-like pass? Given the above, I suspect it cannot be fully solved with a late RTL pass: one hardreg may be reused for two different objects, and an attempted speculation on the first object may leave the NaT bit set for the second object.
[Bug middle-end/21111] IA-64 NaT consumption faults due to uninitialized register reads
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=2 --- Comment #10 from Richard Biener --- Hmm, I don't remember whether uninit reads invoke undefined behavior, esp. result in a trap representation, but the original testcase doesn't seem to read uninitialized things. For it I suspect that store_bitfield is at fault given it will read possibly uninitialized values for the RMW cycle it performs. So one of the questions is whether we have to / want to preserve any NaT in non-speculated code which means simply clearing NaT everywhere in a mdreorg-like pass? And yes, pass_initialize_regs papers over a lot of issues here but that pass should go away and the fallout fixed since it masks a lot of otherwise latent issues given it only initializes must-undefs. There's a PR for that as well. Jim, if you no longer have access to IA64 HW does it make sense for you to be the port maintainer? Would you be willing to step down and let the port be deprecated for GCC 9 so we can remove it for GCC 10?
[Bug middle-end/21111] IA-64 NaT consumption faults due to uninitialized register reads
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=2 Alexander Monakov changed: What|Removed |Added Status|RESOLVED|NEW Resolution|INVALID |--- --- Comment #9 from Alexander Monakov --- OK, so on the following testcase: struct s {int f:1;}; struct s f(int x, int y) { struct s r; if (x) { r.f = 1; asm(""); } if (y) { r.f = 1; asm(""); } return r; } with -O -fno-tree-sra we emit: .proc f# f: .prologue .body cmp4.eq p6, p7 = 0, r32 (p6) br.cond.dptk .L2 addl r8 = 1, r0 ;; .L2: ;; cmp4.eq p6, p7 = 0, r33 (p6) br.cond.dptk .L3 dep r8 = -1, r8, 0, 1 ;; .L3: ;; br.ret.sptk.many b0 ;; .endp f# Thus the call f(0, 1) won't clear the NaT bit on r8. So the bug should be valid.
[Bug middle-end/21111] IA-64 NaT consumption faults due to uninitialized register reads
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=2 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org --- Comment #8 from Alexander Monakov --- (In reply to Jim Wilson from comment #7) > An possible fix might be to force all locals to zero, to ensure that we > never read from an uninit register. My understanding is that with introduction of pass_initialize_regs in 4.3 gcc did that for must-uninit uses, masking off many potential issues. I don't know how hard it would be to produce a testcase that breaks today, but it would need to involve may-uninit accesses, not must-uninit.
[Bug middle-end/21111] IA-64 NaT consumption faults due to uninitialized register reads
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=2 --- Comment #7 from Jim Wilson --- I must have failed to realize that this was a bug I had created when I got the earlier message. Looking now, I see that I can still reproduce with mainline, though the problem is not as obvious with this testcase. Compiling with -O -S -fdump-rtl-all, I see for the bitfield insert of i into foo.i (insn 8 5 9 2 (set (reg:DI 348) (zero_extend:DI (reg/v:SI 345 [ i ]))) "tmp.c":10 -1 (nil)) (insn 9 8 10 2 (set (reg:DI 350) (const_int 2147483647 [0x7fff])) "tmp.c":10 -1 (nil)) (insn 10 9 11 2 (set (reg:DI 349) (and:DI (reg:DI 348) (reg:DI 350))) "tmp.c":10 -1 (nil)) (insn 11 10 12 2 (set (reg:DI 352) (const_int -2147483648 [0x8000])) "tmp.c":10 -1 (nil)) (insn 12 11 13 2 (set (reg:DI 351) (and:DI (reg:DI 343 [ D.1463 ]) (reg:DI 352))) "tmp.c":10 -1 (nil)) (insn 13 12 14 2 (set (reg:DI 353) (ior:DI (reg:DI 351) (reg:DI 349))) "tmp.c":10 -1 (nil)) (insn 14 13 15 2 (set (reg:DI 343 [ D.1463 ]) (reg:DI 353)) "tmp.c":10 -1 (nil)) And note that in the fifth insn (reg:DI 343) is used uninitialized. This is not safe on IA-64, as an uninit register may accidentally have the NaT bit set, which would then cause a trap. For this testcase, later optimization passes clean up the mess, and we get correct code at the end. However, it is not safe to have a read of an uninit pseudo on IA-64, as there is no guarantee that a later optimization pass will accidentally fix it. Mostly the compiler works because we rarely speculate, and hence we rarely have registers with the NaT bit set, and hence it is very rare that anyone will ever trigger this problem. Andrew Pinski suggests that it is an NRV problem, but I see the same issue with NRV disabled. An possible fix might be to force all locals to zero, to ensure that we never read from an uninit register. I no longer have access to IA-64 hardware at home or at work. There is no IA-64 hardware in the compile farm either. I also no longer care whether this issue gets addressed.
[Bug middle-end/21111] IA-64 NaT consumption faults due to uninitialized register reads
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=2 Eric Gallager changed: What|Removed |Added Status|WAITING |RESOLVED CC||egallager at gcc dot gnu.org Resolution|--- |INVALID --- Comment #6 from Eric Gallager --- (In reply to Richard Biener from comment #5) > Is this still an issue? Guessing it isn't since the reporter never replied
[Bug middle-end/21111] IA-64 NaT consumption faults due to uninitialized register reads
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=2 Richard Guenther rguenth at gcc dot gnu.org changed: What|Removed |Added Keywords||wrong-code Status|UNCONFIRMED |WAITING Last reconfirmed||2012-01-11 Ever Confirmed|0 |1 --- Comment #5 from Richard Guenther rguenth at gcc dot gnu.org 2012-01-11 13:04:46 UTC --- Is this still an issue?
[Bug middle-end/21111] IA-64 NaT consumption faults due to uninitialized register reads
--- Additional Comments From pinskia at gcc dot gnu dot org 2005-07-19 05:24 --- (In reply to comment #3) Subject: Re: IA-64 NaT consumption faults due to uninitialized The change I am suggesting should not hurt performance, and I expect that it would actually help performance in many cases. Currently, the first assignment to a structure is a bitfield insert. This is PR 15596 and really an expand issue with respect with NVR. Use the C++ front-end for both 3.3 and 4.0.0 and noticed that it happens there too. I would not doubt you could reproduce this with the C++ front-end before 4.0.0 also. This is all NVR related and nothing more. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=2
[Bug middle-end/21111] IA-64 NaT consumption faults due to uninitialized register reads
--- Additional Comments From wilson at gcc dot gnu dot org 2005-04-19 20:41 --- Created an attachment (id=8684) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=8684action=view) synthetic IA-64 testcase to generate NaT consumption fault Compile this with -O, and it will fail with gcc-2.96 and gcc-4.0.0. We get an illegal instruction signal. It works with gcc-3.0 and gcc-3.3 and presumably with every other gcc-3.x release. If you use gcc-4.0.0 to compile with -O -S, and look at the assembly code, you can see that r8 is being used uninitialized, which fails if r8 happens to have the NaT bit set. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=2
[Bug middle-end/21111] IA-64 NaT consumption faults due to uninitialized register reads
--- Additional Comments From pinskia at gcc dot gnu dot org 2005-04-19 20:47 --- To me, the target specific code should be the one to fix this problem up and not the middle-end or at least have a hook for it so you don't mess around with other targets getting the speed up. Anyways seems like someone thought it would be cool if they did this, oh well. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=2
[Bug middle-end/21111] IA-64 NaT consumption faults due to uninitialized register reads
--- Additional Comments From wilson at specifixinc dot com 2005-04-19 23:05 --- Subject: Re: IA-64 NaT consumption faults due to uninitialized register reads pinskia at gcc dot gnu dot org wrote: --- Additional Comments From pinskia at gcc dot gnu dot org 2005-04-19 20:47 --- To me, the target specific code should be the one to fix this problem up and not the middle-end or at least have a hook for it so you don't mess around with other targets getting the speed up. Anyways seems like someone thought it would be cool if they did this, oh well. The change I am suggesting should not hurt performance, and I expect that it would actually help performance in many cases. Currently, the first assignment to a structure is a bitfield insert. If we zero the structure before the first assignment, then combine will give us a simple assignment instead, which will be faster than a bitfield insert for most targets. This may also allow other assignments to be combined in, giving further benefits. (There can be multiple first assignments if there are multiple blocks where the structure becomes live.) I agree that the optimizations being performed by tree-ssa are useful here, but one must not be confused by the big picture issues here into ignoring the details. Emitting a bit-field insert when only a simple assignment is needed is wrong. It may cause performance loss on many targets, and it causes core dumps on IA-64. Take a look at this example. struct s { unsigned long i : 32; unsigned long j : 32;}; int i; struct s sub (void) { struct s foo; foo.i = i; return foo; } Compiling this for x86-64 on mainline, I get 10 instructions, which perform two bit-field insertions. Compiling this with gcc-3.3, I get 7 instructions which perform one bit-field insertion. I think the optimal code is two instructions, one to load i into the low part of the return register, and one to return. The upper bits of the structure are don't care bits, so we can set them to anything we want. There is no need for any bitfield insertion here at all. Mainline does even worse than gcc-3 here because in order to decompose the structure it creates a fake j assignment, and then we end up emitting bitfield insertion code for the fake j assignment, even though this code is completely useless. Furthermore, the RTL optimizer is not able to delete this fake j assignment, because it is a bitfield insert. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=2