[Bug tree-optimization/57359] store motion causes wrong code for union access at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359 Andrew Pinski changed: What|Removed |Added Target Milestone|--- |11.0
[Bug tree-optimization/57359] store motion causes wrong code for union access at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359 Bug 57359 depends on bug 95172, which changed state. Bug 95172 Summary: [11 Regression] wrong code at -O1 on x86_64-linux-gnu since r11-272-gb6ff3ddecfa93d53 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95172 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED
[Bug tree-optimization/57359] store motion causes wrong code for union access at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359 Bug 57359 depends on bug 94988, which changed state. Bug 94988 Summary: [11 Regression] FAIL: gcc.target/i386/pr64110.c scan-assembler vmovd[\\t ] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94988 What|Removed |Added Status|REOPENED|RESOLVED Resolution|--- |FIXED
[Bug tree-optimization/57359] store motion causes wrong code for union access at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359 Bug 57359 depends on bug 94988, which changed state. Bug 94988 Summary: [11 Regression] FAIL: gcc.target/i386/pr64110.c scan-assembler vmovd[\\t ] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94988 What|Removed |Added Status|RESOLVED|REOPENED Resolution|FIXED |---
[Bug tree-optimization/57359] store motion causes wrong code for union access at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359 Bug 57359 depends on bug 95045, which changed state. Bug 95045 Summary: [11 Regression] wrong code at -O3 on x86_64-linux-gnu https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95045 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED
[Bug tree-optimization/57359] store motion causes wrong code for union access at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359 Bug 57359 depends on bug 94988, which changed state. Bug 94988 Summary: [11 Regression] FAIL: gcc.target/i386/pr64110.c scan-assembler vmovd[\\t ] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94988 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED
[Bug tree-optimization/57359] store motion causes wrong code for union access at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359 Bug 57359 depends on bug 95025, which changed state. Bug 95025 Summary: [11 Regression] ICE in execute_sm_exit at gcc/tree-ssa-loop-im.c:2224 since r11-161-g283cb9ea6293e813 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95025 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED
[Bug tree-optimization/57359] store motion causes wrong code for union access at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359 Richard Biener changed: What|Removed |Added CC||msebor at gcc dot gnu.org --- Comment #34 from Richard Biener --- *** Bug 90668 has been marked as a duplicate of this bug. ***
[Bug tree-optimization/57359] store motion causes wrong code for union access at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359 Bug 57359 depends on bug 90668, which changed state. Bug 90668 Summary: loop invariant moving a dependent store out of a loop https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90668 What|Removed |Added Status|NEW |RESOLVED Resolution|--- |DUPLICATE
[Bug tree-optimization/57359] store motion causes wrong code for union access at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359 Richard Biener changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED Known to work||11.0 Known to fail||10.0 --- Comment #33 from Richard Biener --- Fixed on trunk.
[Bug tree-optimization/57359] store motion causes wrong code for union access at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359 --- Comment #32 from CVS Commits --- The master branch has been updated by Richard Biener : https://gcc.gnu.org/g:283cb9ea6293e813e48a1b769e1e0779918ea20a commit r11-161-g283cb9ea6293e813e48a1b769e1e0779918ea20a Author: Richard Biener Date: Mon Apr 27 14:45:54 2020 +0200 tree-optimization/57359 - rewrite SM code This rewrites store-motion to process candidates where we can ensure order preserving separately and with no need to disambiguate against all stores. Those candidates we cannot handle this way are validated to be independent on all stores (w/o TBAA) and then processed as "unordered" (all conditionally executed stores are so as well). This will necessary cause FAIL: gcc.dg/graphite/pr80906.c scan-tree-dump graphite "isl AST to Gimple succeeded" because the SM previously performed is not valid for exactly the PR57359 reason, we still perform SM of qc for the innermost loop but that's not enough. There is still room for improvements because we still check some constraints for the order preserving cases that are only necessary in the current strict way for the unordered ones. Leaving that for the furture. 2020-05-07 Richard Biener PR tree-optimization/57359 * tree-ssa-loop-im.c (im_mem_ref::indep_loop): Remove. (in_mem_ref::dep_loop): Repurpose. (LOOP_DEP_BIT): Remove. (enum dep_kind): New. (enum dep_state): Likewise. (record_loop_dependence): New function to populate the dependence cache. (query_loop_dependence): New function to query the dependence cache. (memory_accesses::refs_in_loop): Rename to ... (memory_accesses::refs_loaded_in_loop): ... this and change to only record loads. (outermost_indep_loop): Adjust. (mem_ref_alloc): Likewise. (gather_mem_refs_stmt): Likewise. (mem_refs_may_alias_p): Add tbaa_p parameter and pass it down. (struct sm_aux): New. (execute_sm): Split code generation on exits, record state into new hash-map. (enum sm_kind): New. (execute_sm_exit): Exit code generation part. (sm_seq_push_down): Helper for sm_seq_valid_bb performing dependence checking on stores reached from exits. (sm_seq_valid_bb): New function gathering SM stores on exits. (hoist_memory_references): Re-implement. (refs_independent_p): Add tbaa_p parameter and pass it down. (record_dep_loop): Remove. (ref_indep_loop_p_1): Fold into ... (ref_indep_loop_p): ... this and generalize for three kinds of dependence queries. (can_sm_ref_p): Adjust according to hoist_memory_references changes. (store_motion_loop): Don't do anything if the set of SM candidates is empty. (tree_ssa_lim_initialize): Adjust. (tree_ssa_lim_finalize): Likewise. * gcc.dg/torture/pr57359-1.c: New testcase. * gcc.dg/torture/pr57359-1.c: Likewise. * gcc.dg/tree-ssa/ssa-lim-14.c: Likewise. * gcc.dg/graphite/pr80906.c: XFAIL.
[Bug tree-optimization/57359] store motion causes wrong code for union access at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359 Richard Biener changed: What|Removed |Added Attachment #48381|0 |1 is obsolete|| --- Comment #31 from Richard Biener --- Created attachment 48422 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48422=edit more complex approac So this is a feature-wise finished patch for the more complex approach (some possible improvements are still marked as ???). I'm throwing this on SPEC now and appreciate correctness testing. It patches ontop of the PR39612 patch which is ready for early GCC 11.
[Bug tree-optimization/57359] store motion causes wrong code for union access at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359 --- Comment #30 from Richard Biener --- Created attachment 48381 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48381=edit more complex approach, POC Another testcase, this time for store ordering (IIRC we may have a duplicate for this). We currently use a fixed order of *p and *r for _both_ exits which is of course wrong. POC patch fixing all issues attached, it needs to be enhanced to not give up when not all stores we want to move are stored in the exit block. Also some memory leaks need fixing and we need to avoid doing redundant work when enhancing the scheme and eventually enhance it to prune SMs we cannot perform. extern void abort(); typedef int A; typedef float B; void __attribute__((noinline,noclone)) foo(A * p, B *r, long unk, long oh) { for (long i = 0; i < unk; ++i) { *p = 1; *r = 2; if (oh & i) break; *r = 3; *p = 4; } } int main(void) { union { A x; B f; } u; foo(, , 1, 1); if (u.x != 4) abort(); foo(, , 2, 1); if (u.f != 2) abort (); return 0; }
[Bug tree-optimization/57359] store motion causes wrong code for union access at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359 --- Comment #29 from Richard Biener --- And another testcase showing that with a conditional invariant store we may _never_ apply store-motion since conditional means we do not know the original order of stores in the loop. *sigh* typedef int A; typedef float B; float x[256]; void __attribute__((noinline,noclone)) foo(long unk, long ack) { B *q = x; for (long i = 0; i < unk; ++i) { q[i] = 42; if (ack & i) *((A*)q + 4) = 1; } } int main(void) { foo(5, 3); if (x[4] != 42) __builtin_abort(); return 0; } implementation-wise it should still work, in hoist_memory_references, when we computed the whole set of refs to apply store-motion to in a loop, track the sequencing of stores from entry to the latch edge and at merge points reject (parts of?) the sequence when there are mismatches in ordering. For the above there'd be { q[i], *((A*)q + 4) } and { q[i] } and thus a mismatch. If we reject only parts of the sequence those parts would need to be disambiguated against the previous stores in the sequence w/o TBAA. Trying to code that up now.
[Bug tree-optimization/57359] store motion causes wrong code for union access at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359 --- Comment #28 from Richard Biener --- OK, so more "advanced" testcases are a bit difficult because the ref_always_accessed_p logic is too simple and it's required for store-motion of accesses in conditional paths. Basically if we have if (test) { *p = 1; } else { *p = 0; } LIM doesn't figure *p is accessed unconditionally (that's a missed optimization). Anyway, here's a testcase circumventing that by using static storage and showing that conditionally executed SM (with and without -fallow-store-data-races) fail. It's supposed to be a testcase for a "merge" point of two different sequenced store sequences we need to check. typedef int A; typedef float B; float x[256]; void __attribute__((noinline,noclone)) foo(long unk, long ack) { B *q = x; for (long i = 0; i < unk; ++i) { if (ack & i) { *((A*)q + 4) = 1; q[i] = 42; } else { q[i] = 42; *((A*)q + 4) = 1; } } } int main(void) { foo(5, 6); if (x[4] != 42) __builtin_abort(); return 0; }
[Bug tree-optimization/57359] store motion causes wrong code for union access at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359 --- Comment #27 from Richard Biener --- (In reply to Richard Biener from comment #24) > (In reply to Richard Biener from comment #22) > > Created attachment 48311 [details] > > patch > > > > Note that apart from the possible bad impact on optimization when fixing > > this > > bug an actual fix is complicated by the custom "optimized" dependence > > analysis > > code in the loop invariant motion pass. > > > > A conservative "simple" patch would be the attached but that doesn't > > preserve > > store-motion for the following (because the LIM data dependence code doesn't > > care about stmt order): > > > > typedef int A; > > typedef float B; > > > > void __attribute__((noinline,noclone)) > > foo(A *p, B *q, long unk) > > { > > for (long i = 0; i < unk; ++i) { > > q[i] = 42; > > *p = 1; > > } > > } > > > > usually this bug doesn't manifest itself but of course the fix will be > > experienced everywhere. Benchmarking the simple patch might reveal > > it's not an issue (but I doubt that...). > > One case like this is gcc.dg/tree-ssa/pr81744.c which fails after the patch > because we do not SM the global induction variable update which is already > last before exit. Similarly gcc.dg/graphite/pr80906.c and > gcc.target/i386/pr64110.c - that's all of the GCC testsuite fallout on > x86_64. I do not > think those regressions are acceptable on its own but I'll throw the patch > on SPEC CPU 2006 to get more data (I fear even a solution preserving the > cited regressions will regress actual code too much). Results on a x86 Haswell CPU (-Ofast -march=native -flto), base unpatched and peak patched (current trunk): Estimated Estimated Base Base BasePeak Peak Peak Benchmarks Ref. Run Time Ratio Ref. Run Time Ratio -- -- - --- - - 410.bwaves 13590170 80.0 * 13590175 77.6 * 416.gamess 19580614 31.9 * 19580614 31.9 * 433.milc 9180335 27.4 *9180338 27.2 * 434.zeusmp 9100227 40.0 *9100228 39.8 * 435.gromacs 7140244 29.2 *7140245 29.2 * 436.cactusADM 11950225 53.2 * 11950224 53.3 * 437.leslie3d 9400217 43.4 *9400225 41.8 * 444.namd 8020304 26.4 *8020302 26.5 * 447.dealII 11440201 56.8 * 11440202 56.6 * 450.soplex 8340226 36.9 *8340227 36.7 * 453.povray 5320101 52.8 *5320101 52.9 * 454.calculix 8250265 31.1 *8250265 31.1 * 459.GemsFDTD10610316 33.5 * 10610315 33.6 * 465.tonto9840258 38.1 *9840258 38.1 * 470.lbm 13740256 53.7 * 13740261 52.7 * 481.wrf 11170235 47.5 * 11170237 47.2 * 482.sphinx3 19490370 52.7 * 19490373 52.3 * Est. SPECfp_base2006 41.3 Est. SPECfp2006 41.0 400.perlbench9770249 39.2 *9770248 39.3 * 401.bzip29650388 24.9 *9650389 24.8 * 403.gcc 8050228 35.3 *8050230 35.0 * 429.mcf 9120246 37.1 *9120241 37.9 * 445.gobmk 10490388 27.1 * 10490388 27.0 * 456.hmmer9330152 61.3 *9330151 61.7 * 458.sjeng 12100426 28.4 * 12100428 28.3 * 462.libquantum 20720314 66.0 * 20720308 67.3 * 464.h264ref 22130414 53.5 * 22130414 53.4 * 471.omnetpp 6250290 21.5 *6250301 20.8 * 473.astar7020308 22.8 *7020308 22.8 * 483.xalancbmk6900180 38.4 *6900181 38.2 * Est. SPECint(R)_base2006 35.5 Est. SPECint2006 35.5 the "positive" ones are actually noise where spec median picked not the fastest run for base. There's enough negatives to not consider this simple solution. Actual consistent negatives to look at (to make sure a better solution handles required cases) are 437.leslie3d and 471.omnetpp, the rest are too much in the noise.
[Bug tree-optimization/57359] store motion causes wrong code for union access at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359 --- Comment #26 from rguenther at suse dot de --- On Mon, 20 Apr 2020, pascal_cuoq at hotmail dot com wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359 > > --- Comment #25 from Pascal Cuoq --- > Would it be reasonable to have three settings for -fstrict-aliasing, rather > then the current two? > > - off > - strict > - assume-no-reuse > > (I would let you find a better name for the third one.) I think it's clearly a GCC bug we need to fix, for users we might want to give more control in telling the compiler about loop dependences. So no, I don't like another -f[no-]strict-* option.
[Bug tree-optimization/57359] store motion causes wrong code for union access at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359 --- Comment #25 from Pascal Cuoq --- Would it be reasonable to have three settings for -fstrict-aliasing, rather then the current two? - off - strict - assume-no-reuse (I would let you find a better name for the third one.) It seems to me that the wrong transformation corresponds to code patterns that a C developers familiar with the compiled code would be able to tell are used or not in the compiled code: repurposing of dynamically allocated memory and access to a union through pointers. (Note: in https://trust-in-soft.com/wp-content/uploads/2017/01/vmcai.pdf we remarked that GCC is particularly user-friendly in both documenting what uses of unions are compatible with -fstrict-aliasing and in sticking to the documented behavior. The place in the GCC documentation where this is documented would already explain a lot of the context for explaining the difference between the strict and assume-no-reuse settings.) Even if you set -fstrict-aliasing=assume-no-reuse as implied by -O2, this would still be a much welcome improvement compared to the current situation. GCC would continue to be compared fairly in benchmarks to other compilers that have the same approximation, most programs would continue to work fine because they do not rely on the dangerous patterns, and those that rely on the dangerous pattern would have a way out. It would be vaguely comparable to -ffast-math. One possible psychological drawback may be that if the “strict” option exists, some users may ask why GCC does not stick to it in the -On settings.
[Bug tree-optimization/57359] store motion causes wrong code for union access at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359 --- Comment #24 from Richard Biener --- (In reply to Richard Biener from comment #22) > Created attachment 48311 [details] > patch > > Note that apart from the possible bad impact on optimization when fixing this > bug an actual fix is complicated by the custom "optimized" dependence > analysis > code in the loop invariant motion pass. > > A conservative "simple" patch would be the attached but that doesn't preserve > store-motion for the following (because the LIM data dependence code doesn't > care about stmt order): > > typedef int A; > typedef float B; > > void __attribute__((noinline,noclone)) > foo(A *p, B *q, long unk) > { > for (long i = 0; i < unk; ++i) { > q[i] = 42; > *p = 1; > } > } > > usually this bug doesn't manifest itself but of course the fix will be > experienced everywhere. Benchmarking the simple patch might reveal > it's not an issue (but I doubt that...). One case like this is gcc.dg/tree-ssa/pr81744.c which fails after the patch because we do not SM the global induction variable update which is already last before exit. Similarly gcc.dg/graphite/pr80906.c and gcc.target/i386/pr64110.c - that's all of the GCC testsuite fallout on x86_64. I do not think those regressions are acceptable on its own but I'll throw the patch on SPEC CPU 2006 to get more data (I fear even a solution preserving the cited regressions will regress actual code too much).
[Bug tree-optimization/57359] store motion causes wrong code for union access at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359 --- Comment #23 from Richard Biener --- (In reply to Richard Biener from comment #22) > Created attachment 48311 [details] > patch > > Note that apart from the possible bad impact on optimization when fixing this > bug an actual fix is complicated by the custom "optimized" dependence > analysis > code in the loop invariant motion pass. > > A conservative "simple" patch would be the attached but that doesn't preserve > store-motion for the following (because the LIM data dependence code doesn't > care about stmt order): > > typedef int A; > typedef float B; > > void __attribute__((noinline,noclone)) > foo(A *p, B *q, long unk) > { > for (long i = 0; i < unk; ++i) { > q[i] = 42; > *p = 1; > } > } > > usually this bug doesn't manifest itself but of course the fix will be > experienced everywhere. Benchmarking the simple patch might reveal > it's not an issue (but I doubt that...). Which means a better approach might be to, in addition to the existing dependence testing, verify we can sink the stores through all exits (IIRC there is/was a similar bug involving ordering of stores sunk where we'd have to preserve the order). There's again the difficult cases like for () { *p = 1; q[i] = 42; if (test) *p = 2; } which we currently sink as for () { p_1 = 1; q[i] = 42; if (test) p_1 = 2; } *p = p_1; if we want to preserve all sinking we'd have to replay all [possibly aliased] stores - again more difficult when the store we sink is in the latch (or when multiple exits are involved). For the above example do for () { p_1 = 1; q[i] = 42; if (test) p_1 = 2; } *p = p_1; q[i] = 42; if (test) *p = p_1; with scanning for valid re-orderings starting from all exits we want to consider sinking two stores that cannot be reordered. Or we want to re-think the whole store-motion data dependence code.
[Bug tree-optimization/57359] store motion causes wrong code for union access at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359 --- Comment #22 from Richard Biener --- Created attachment 48311 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48311=edit patch Note that apart from the possible bad impact on optimization when fixing this bug an actual fix is complicated by the custom "optimized" dependence analysis code in the loop invariant motion pass. A conservative "simple" patch would be the attached but that doesn't preserve store-motion for the following (because the LIM data dependence code doesn't care about stmt order): typedef int A; typedef float B; void __attribute__((noinline,noclone)) foo(A *p, B *q, long unk) { for (long i = 0; i < unk; ++i) { q[i] = 42; *p = 1; } } usually this bug doesn't manifest itself but of course the fix will be experienced everywhere. Benchmarking the simple patch might reveal it's not an issue (but I doubt that...).
[Bug tree-optimization/57359] store motion causes wrong code for union access at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359 Richard Biener changed: What|Removed |Added CC||pascal_cuoq at hotmail dot com --- Comment #21 from Richard Biener --- *** Bug 94658 has been marked as a duplicate of this bug. ***
[Bug tree-optimization/57359] store motion causes wrong code for union access at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359 --- Comment #20 from rguenther at suse dot de --- On Fri, 17 Jan 2020, pinskia at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359 > > --- Comment #19 from Andrew Pinski --- > (In reply to Richard Biener from comment #14) > > Testcase from PR81028 > > > > extern void abort(); > > > > typedef int A; > > typedef float B; > > > > void __attribute__((noinline,noclone)) > > foo(A *p, B *q, long unk) > > { > > for (long i = 0; i < unk; ++i) { > > *p = 1; > > q[i] = 42; > > } > > } > > > > I don't see how the above function cannot be optimized to what we currently > optimize it to? The C/C++ aliasing rules say p and q cannot be a related > pointer at all. > > In the inplacement new case, there is another issue going on. Locals and > changing types was a defect report against C++ (or at least I thought I saw > one). This is about "placement new" or rather the middle-end memory model. You can also simply make p/q point to different union members of the same union where then the above is still valid with unk == 0 but we mess it up, re-ordering the stores (basically we say the stores don't conflict which is not correct for WAW or WAR dependences)
[Bug tree-optimization/57359] store motion causes wrong code for union access at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359 --- Comment #19 from Andrew Pinski --- (In reply to Richard Biener from comment #14) > Testcase from PR81028 > > extern void abort(); > > typedef int A; > typedef float B; > > void __attribute__((noinline,noclone)) > foo(A *p, B *q, long unk) > { > for (long i = 0; i < unk; ++i) { > *p = 1; > q[i] = 42; > } > } > I don't see how the above function cannot be optimized to what we currently optimize it to? The C/C++ aliasing rules say p and q cannot be a related pointer at all. In the inplacement new case, there is another issue going on. Locals and changing types was a defect report against C++ (or at least I thought I saw one).
[Bug tree-optimization/57359] store motion causes wrong code for union access at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359 Andrew Pinski changed: What|Removed |Added CC||msl023508 at gmail dot com --- Comment #18 from Andrew Pinski --- *** Bug 93298 has been marked as a duplicate of this bug. ***
[Bug tree-optimization/57359] store motion causes wrong code for union access at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359 --- Comment #17 from Richard Biener --- We'd need to peel the last iteration as the following shows. This also means it is not enough to prove the loop actually iterates. Peeling the last iteration means we have to be able to identify that iteration. Alternatively, we might be able to re-issue all stores from the last iteration up until the exit on the exit edge though if there are conditional stores on this path this might prove interesting. There's always the option to not apply store-motion here of course. extern void abort(); typedef int A; typedef float B; void __attribute__((noinline,noclone)) foo(A *p, B *q, long unk) { for (long i = 0; i < unk; ++i) { *p = 1; q[i] = 42; } } int main(void) { char *mem = __builtin_malloc (sizeof (A) * 5); foo((A *)mem + 4, (B *)mem, 5); if (*((B *)mem + 4) != 42) abort(); return 0; }
[Bug tree-optimization/57359] store motion causes wrong code for union access at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359 Richard Biener changed: What|Removed |Added Keywords||alias Last reconfirmed|2013-05-31 00:00:00 |2019-1-8 Summary|wrong code for union access |store motion causes wrong |at -O3 on x86_64-linux |code for union access at ||-O3 --- Comment #16 from Richard Biener --- Re-confirmed. Not really working on this - all other compilers make the same "mistake". Cost of "fixing" is quite high (peeling one iteration).