[Bug target/43118] vld4 and vst4 intrinsics are not handled correctly
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43118

rsand...@gcc.gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |rsandifo at gcc dot gnu.org
         Resolution|                            |FIXED

--- Comment #8 from rsandifo at gcc dot gnu.org 2011-07-19 08:41:04 UTC ---
(In reply to comment #7)
> A recent version of 4.6.1 at O1 appears to give me . That would indicate this
> is fixed in trunk.

Yeah, the bug was fixed as part of the load-lanes stuff.  Since it isn't a
regression, and since the fix is too invasive to backport, I hope it's OK
to close as fixed.
[Bug target/43118] vld4 and vst4 intrinsics are not handled correctly
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43118

Ramana Radhakrishnan changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ramana at gcc dot gnu.org
      Known to fail|                            |

--- Comment #7 from Ramana Radhakrishnan 2011-07-08 11:57:04 UTC ---
A recent version of 4.6.1 at O1 appears to give me . That would indicate this
is fixed in trunk.

blend4:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        vld4.8  {d16-d19}, [r0]
        vst4.8  {d16-d19}, [r1]
        bx      lr
        .size   blend4, .-blend4
        .ident  "GCC: (GNU) 4.7.0 20110616 (experimental)"
        .section        .note.GNU-stack,"",%progbits

Ramana
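For readers without NEON hardware at hand, here is a rough portable model of what the two instructions in the listing above do. This is only a sketch: the original testcase is not quoted in this thread, so the blend4 signature (source pointer in r0, destination pointer in r1) is inferred from the listing, and every name prefixed with model_ is hypothetical rather than part of arm_neon.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for uint8x8x4_t: four vectors of eight bytes. */
typedef struct { uint8_t val[4][8]; } model_u8x8x4;

/* Model of vld4.8: read 32 bytes and de-interleave them, so that
   source element i lands in vector i % 4, lane i / 4. */
static model_u8x8x4 model_vld4_u8(const uint8_t *src)
{
    model_u8x8x4 r;
    for (size_t lane = 0; lane < 8; lane++)
        for (size_t vec = 0; vec < 4; vec++)
            r.val[vec][lane] = src[4 * lane + vec];
    return r;
}

/* Model of vst4.8: re-interleave the four vectors into 32 bytes. */
static void model_vst4_u8(uint8_t *dst, model_u8x8x4 v)
{
    for (size_t lane = 0; lane < 8; lane++)
        for (size_t vec = 0; vec < 4; vec++)
            dst[4 * lane + vec] = v.val[vec][lane];
}

/* Assumed shape of the routine in the listing: one load, one store. */
void model_blend4(const uint8_t *src, uint8_t *dst)
{
    assert(src != NULL && dst != NULL);
    model_vst4_u8(dst, model_vld4_u8(src));
}
```

Since the store undoes the load's de-interleaving, the whole routine amounts to a 32-byte copy, which is why two NEON instructions plus the return are all the fixed compiler needs to emit.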
[Bug target/43118] vld4 and vst4 intrinsics are not handled correctly
--- Comment #6 from generalruzzmo at gmail dot com 2010-09-15 20:54 ---
this bug is bugging me too..

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43118
[Bug target/43118] vld4 and vst4 intrinsics are not handled correctly
--- Comment #5 from justin dot lebar+bug at gmail dot com 2010-04-28 21:56 ---
Is there a workaround for this, short of writing inline assembly?

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43118
[Bug target/43118] vld4 and vst4 intrinsics are not handled correctly
-- 

ramana at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
           Keywords|                            |missed-optimization

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43118
[Bug target/43118] vld4 and vst4 intrinsics are not handled correctly
--- Comment #4 from rguenth at gcc dot gnu dot org 2010-02-23 10:42 ---
(In reply to comment #3)
> Subject: Re:  vld4 and vst4 intrinsics are not handled correctly
>
> On Fri, Feb 19, 2010 at 11:08:18AM -, rguenth at gcc dot gnu dot org wrote:
> > Likely because of the union in
> >
> > __extension__ static __inline void __attribute__ ((__always_inline__))
> > vst4_u8 (uint8_t * __a, uint8x8x4_t __b)
> > {
> >   union { uint8x8x4_t __i; __builtin_neon_oi __o; } __bu = { __b };
> >   __builtin_neon_vst4v8qi ((__builtin_neon_qi *) __a, __bu.__o);
> > }
> >
> > which does copy-initialization of __bu.
>
> Right.  FYI, my best idea to date of how to fix this is to convert the
> multiple-vector types (like uint8x8x4_t) to builtin types.  At that
> point we can use the neon_reinterpret patterns to do the necessary
> type punning without involving __builtin_neon_oi and the union.

Ideally we'd be able to get rid of the extra temporary at the tree level.
Value-numbering can in theory do that, but I suppose the testcase at hand
is obfuscated enough to not do it.

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43118
[Bug target/43118] vld4 and vst4 intrinsics are not handled correctly
--- Comment #3 from drow at gcc dot gnu dot org 2010-02-22 21:14 ---
Subject: Re:  vld4 and vst4 intrinsics are not handled correctly

On Fri, Feb 19, 2010 at 11:08:18AM -, rguenth at gcc dot gnu dot org wrote:
> Likely because of the union in
>
> __extension__ static __inline void __attribute__ ((__always_inline__))
> vst4_u8 (uint8_t * __a, uint8x8x4_t __b)
> {
>   union { uint8x8x4_t __i; __builtin_neon_oi __o; } __bu = { __b };
>   __builtin_neon_vst4v8qi ((__builtin_neon_qi *) __a, __bu.__o);
> }
>
> which does copy-initialization of __bu.

Right.  FYI, my best idea to date of how to fix this is to convert the
multiple-vector types (like uint8x8x4_t) to builtin types.  At that
point we can use the neon_reinterpret patterns to do the necessary
type punning without involving __builtin_neon_oi and the union.

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43118
[Bug target/43118] vld4 and vst4 intrinsics are not handled correctly
--- Comment #2 from ramana at gcc dot gnu dot org 2010-02-19 13:45 ---
Trunk behaves similarly - I wonder if this is similar to 41021.

Here's what trunk generates.

        push    {r4, r5, r6, r7}
        vld4.8  {d16-d19}, [r0]
        sub     sp, sp, #96
        mov     r7, r1
        vstmia  sp, {d16-d19}
        mov     r6, sp
        add     r5, sp, #64
        add     ip, sp, #32
        ldmia   r6!, {r0, r1, r2, r3}
        mov     r4, r5
        stmia   r5!, {r0, r1, r2, r3}
        ldmia   r6, {r0, r1, r2, r3}
        stmia   r5, {r0, r1, r2, r3}
        ldmia   r4!, {r0, r1, r2, r3}
        stmia   ip!, {r0, r1, r2, r3}
        ldmia   r4, {r0, r1, r2, r3}
        stmia   ip, {r0, r1, r2, r3}
        add     r3, sp, #32
        vldmia  r3, {d16-d19}
        vst4.8  {d16-d19}, [r7]
        add     sp, sp, #96
        pop     {r4, r5, r6, r7}
        bx      lr

-- 

ramana at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
      Known to fail|                            |4.4.3 4.5.0
   Last reconfirmed|-00-00 00:00:00             |2010-02-19 13:45:57
               date|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43118
[Bug target/43118] vld4 and vst4 intrinsics are not handled correctly
--- Comment #1 from rguenth at gcc dot gnu dot org 2010-02-19 11:08 ---
Likely because of the union in

__extension__ static __inline void __attribute__ ((__always_inline__))
vst4_u8 (uint8_t * __a, uint8x8x4_t __b)
{
  union { uint8x8x4_t __i; __builtin_neon_oi __o; } __bu = { __b };
  __builtin_neon_vst4v8qi ((__builtin_neon_qi *) __a, __bu.__o);
}

which does copy-initialization of __bu.

Also try GCC 4.5.

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43118
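The pattern in the wrapper quoted above can be reproduced without NEON types. The following is a minimal sketch under loud assumptions: sketch_u8x8x4 and sketch_oi are hypothetical stand-ins for uint8x8x4_t and __builtin_neon_oi, and sketch_store is a plain byte copy standing in for __builtin_neon_vst4v8qi (it does not interleave). Only the shape of the union is meant to match.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for uint8x8x4_t. */
typedef struct { uint8_t val[4][8]; } sketch_u8x8x4;

/* Hypothetical stand-in for the opaque 32-byte __builtin_neon_oi. */
typedef struct { uint8_t bytes[32]; } sketch_oi;

/* Stand-in for __builtin_neon_vst4v8qi; a plain copy, no interleaving. */
static void sketch_store(uint8_t *dst, sketch_oi o)
{
    memcpy(dst, o.bytes, sizeof o.bytes);
}

/* Same shape as the vst4_u8 wrapper: pun the argument through a union. */
void sketch_vst4_u8(uint8_t *a, sketch_u8x8x4 b)
{
    assert(sizeof(sketch_u8x8x4) == sizeof(sketch_oi));
    /* Copy-initialization: b is copied into a fresh temporary bu... */
    union { sketch_u8x8x4 i; sketch_oi o; } bu = { b };
    /* ...and then read back through the other member, by value. */
    sketch_store(a, bu.o);
}
```

Each by-value step here (argument into bu, bu.o into the callee) is a 32-byte aggregate copy that the compiler has to prove redundant before the value can stay in registers. Per the diagnosis in this comment, that is the likely source of the extra stack traffic; the sketch itself only shows where the copies come from, not what any particular GCC version emits for it.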