[Bug target/23570] [4.0/4.1 Regression] internal compiler error: in merge_assigned_reloads, at reload1.c:6091
--- Additional Comments From cvs-commit at gcc dot gnu dot org 2005-08-31 17:28 ---
Subject: Bug 23570

CVSROOT:      /cvs/gcc
Module name:  gcc
Changes by:   [EMAIL PROTECTED] 2005-08-31 17:27:54

Modified files:
    gcc: ChangeLog
    gcc/config/i386: sse.md
Added files:
    gcc/testsuite/gcc.target/i386: pr23570.c

Log message:
    PR target/23570
    * config/i386/sse.md (*sse_concatv2sf): Change operand 2 constraint
    to "reg_or_0_operand".
    (sse2_loadld): Change operand 1 constraint to "reg_or_0_operand".

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.9863&r2=2.9864
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/i386/sse.md.diff?cvsroot=gcc&r1=1.23&r2=1.24
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/testsuite/gcc.target/i386/pr23570.c.diff?cvsroot=gcc&r1=NONE&r2=1.1

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23570
[Bug target/23570] [4.0/4.1 Regression] internal compiler error: in merge_assigned_reloads, at reload1.c:6091
--- Additional Comments From uros at kss-loka dot si 2005-08-31 08:48 ---
Patch.

--
              What |Removed                   |Added
        AssignedTo |unassigned at gcc dot gnu |uros at kss-loka dot si
                   |dot org                   |
               URL |                          |http://gcc.gnu.org/ml/gcc-patches/2005-08/msg01819.html
            Status |NEW                       |ASSIGNED
          Keywords |                          |patch
  Last reconfirmed |2005-08-26 03:36:35       |2005-08-31 08:48:43
              date |                          |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23570
[Bug target/23570] [4.0/4.1 Regression] internal compiler error: in merge_assigned_reloads, at reload1.c:6091
--- Additional Comments From uros at kss-loka dot si 2005-08-26 09:35 ---
(In reply to comment #3)
> Unfortunately, ludcompf() result (the second one) is wrong when -O1 or -O2
> is used. It is correct without optimizations.

This is a problem of the infamous i387 precision handling. The error can be
found in this part of the code:

    ...
    if (u.sf[1] > t) { t = u.sf[1]; n = 1; }
    if (u.sf[2] > t) { t = u.sf[2]; n = 2; }
    if (u.sf[3] > t) { t = u.sf[3]; n = 3; }
    ...

Without optimizations, the values of u.sf[1] and t that are at some moment
loaded into x87 registers are:

    u.sf[1] = 1.00119...
    t       = 0.99880...

and the branch is taken. However, with optimizations, the values are
different:

    u.sf[1] = 0.99642...
    t       = 0.99821...

so u.sf[1] > t is false and the branch is not taken.

This is a problem of the i387 design and not a problem of gcc. In your case,
you should use -ffloat-store or -mfpmath=sse.

BTW: At the moment, I have very limited time, so I won't be able to create a
patch to fix the ICE for some time...

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23570
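The precision effect described in the comment above can be shown in isolation. What follows is a minimal sketch, not code from the bug report: the function names and the input values are made up for illustration, and double is used as a portable stand-in for the wider x87 register format. It compares one product against a threshold twice, once after the product has been rounded to float and once while it is still held in the wider type.

```c
#include <float.h>
#include <stdio.h>

/* Compare x*y against t after the product has been rounded to float.
   The volatile store forces the value out to a 32-bit memory slot,
   which is roughly what -ffloat-store does for variables. */
static int greater_when_rounded(float x, float y, float t)
{
    volatile float p = x * y;
    return p > t;
}

/* Compare the same product while it is still held in a wider format.
   double stands in here for the 80-bit x87 register format. */
static int greater_when_wide(float x, float y, float t)
{
    double p = (double)x * (double)y;
    return p > (double)t;
}

/* Hypothetical driver: prints both outcomes for one hand-picked input. */
static void show_precision_effect(void)
{
    float x = 1.0f + FLT_EPSILON;          /* 1 + 2^-23 */
    float t = 1.0f + 2.0f * FLT_EPSILON;   /* 1 + 2^-22 */

    /* The exact product is 1 + 2^-22 + 2^-46; as a float it rounds to t. */
    printf("rounded: %d\n", greater_when_rounded(x, x, t));
    printf("wide:    %d\n", greater_when_wide(x, x, t));
}
```

On an IEEE 754 system, calling show_precision_effect() should print 0 for the rounded comparison and 1 for the wide one: the same source-level test takes a different branch depending only on where the intermediate value was kept.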
[Bug target/23570] [4.0/4.1 Regression] internal compiler error: in merge_assigned_reloads, at reload1.c:6091
--- Additional Comments From uros at kss-loka dot si 2005-08-26 07:50 ---
The problem here is in the sse_concatv2sf pattern:

;; ??? In theory we can match memory for the MMX alternative, but allowing
;; nonimmediate_operand for operand 2 and *not* allowing memory for the SSE
;; alternatives pretty much forces the MMX alternative to be chosen.
(define_insn "*sse_concatv2sf"
  [(set (match_operand:V2SF 0 "register_operand"     "=x,x,*y,*y")
        (vec_concat:V2SF
          (match_operand:SF 1 "nonimmediate_operand" " 0,m, 0, m")
          (match_operand:SF 2 "vector_move_operand"  " x,C,*y, C")))]

and in the "vector_move_operand" operand predicate, defined as:

;; Return 1 when OP is operand acceptable for standard SSE move.
(define_predicate "vector_move_operand"
  (ior (match_operand 0 "nonimmediate_operand")
       (match_operand 0 "const0_operand")))

Please note that "vector_move_operand" allows memory operands, but the
register constraints of operand 2 don't. So, the following pattern confuses
reload:

(insn:HI 63 62 64 3 (set (reg:V2SF 21 xmm0 [117])
        (vec_concat:V2SF (mem:SF (plus:SI (plus:SI (reg/f:SI 68 [ ivtmp.71 ])
                        (reg:SI 88 [ D.1795 ]))
                    (const_int -4 [0xfffc])) [2 S4 A32])
            (mem:SF (plus:SI (plus:SI (reg/f:SI 68 [ ivtmp.71 ])
                        (reg:SI 89 [ D.1800 ]))
                    (const_int -4 [0xfffc])) [2 S4 A32]))) 612 {*sse_concatv2sf} (nil)

(BTW: the "sse2_loadld" pattern could have the same problem, as it has no "m"
register constraint.)

The immediate fix would be to define another operand predicate, similar to
"vector_move_operand":

;; Same as above, but excluding memory operands.
(define_predicate "vector_move_nomem_operand"
  (ior (match_operand 0 "register_operand")
       (match_operand 0 "const0_operand")))

When operand 2 of the sse_concatv2sf pattern uses this new predicate, gcc is
able to compile both testcases, and the following result is produced (for both
-O1 and -O2):

ludcompd(): SSE2 code is used.
1.00 4.00 5.00 3.00 -2.80 0.80 -1.60 -1.00 -1.00
0 2 1 3 0
5.000 6.000 10.000 78.000
0.800 -2.800 -7.000 -55.400
0.600 0.571 -1.000 -15.143
0.200 -0.286 1.000 -12.286

ludcompf(): SSE2 code is used.
1 2 1 0 3
5.000 6.000 10.000 78.000
0.800 -2.800 -7.000 -55.400
0.200 -0.286 -1.000 -27.429
0.600 0.571 1.000 12.286

Unfortunately, the ludcompf() result (the second one) is wrong when -O1 or -O2
is used. It is correct without optimizations.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23570
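The mismatch described in the comment above, an operand predicate that accepts memory paired with constraint alternatives that do not, can be modelled in a few lines of C. This is only a toy sketch under loud assumptions: the enum, the function names other than the two predicate names, and the reduction of an operand to three kinds are all hypothetical, and nothing here is GCC source. It shows which operands pass the predicate yet fit no alternative, so that reload has to step in.

```c
#include <stdio.h>

/* Toy operand kinds; real RTL operands carry far more detail. */
enum operand_kind { OP_REG, OP_MEM, OP_CONST0 };

/* Models vector_move_operand: nonimmediate_operand or const0_operand,
   so registers, memory and zero are all accepted. */
static int vector_move_operand(enum operand_kind k)
{
    return k == OP_REG || k == OP_MEM || k == OP_CONST0;
}

/* Models the proposed vector_move_nomem_operand: register or zero only. */
static int vector_move_nomem_operand(enum operand_kind k)
{
    return k == OP_REG || k == OP_CONST0;
}

/* Models the operand 2 constraint string " x,C,*y, C": every alternative
   is a register class (x, y) or the constant (C); none allows memory. */
static int some_alternative_matches(enum operand_kind k)
{
    return k == OP_REG || k == OP_CONST0;
}

/* Returns 1 when the predicate accepts an operand that no constraint
   alternative can hold directly, i.e. reload is left to fix it up. */
static int needs_reload_fixup(int (*predicate)(enum operand_kind),
                              enum operand_kind k)
{
    return predicate(k) && !some_alternative_matches(k);
}

/* Hypothetical driver: reports the memory-operand case for both predicates. */
static void show_mismatch(void)
{
    printf("vector_move_operand, mem:       %d\n",
           needs_reload_fixup(vector_move_operand, OP_MEM));
    printf("vector_move_nomem_operand, mem: %d\n",
           needs_reload_fixup(vector_move_nomem_operand, OP_MEM));
}
```

In this model the memory operand is the only kind for which the original predicate and the constraints disagree. With the narrower predicate the memory operand is rejected when the insn is matched, so it never reaches reload in this pattern.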
[Bug target/23570] [4.0/4.1 Regression] internal compiler error: in merge_assigned_reloads, at reload1.c:6091
--- Additional Comments From pinskia at gcc dot gnu dot org 2005-08-26 03:36 ---
Reduced as far as I can get this:

typedef float __v4sf __attribute__ ((__vector_size__ (16)));
typedef float __m128 __attribute__ ((__vector_size__ (16)));

static __inline __m128
_mm_cmpeq_ps (__m128 __A, __m128 __B)
{
  return (__m128) __builtin_ia32_cmpeqps ((__v4sf)__A, (__v4sf)__B);
}

static __inline __m128
_mm_setr_ps (float __Z, float __Y, float __X, float __W)
{
  return __extension__ (__m128)(__v4sf){__Z, __Y, __X, __W };
}

typedef long long __v2di __attribute__ ((__vector_size__ (16)));

static __inline __m128
_mm_and_si128 (__m128 __A, __m128 __B)
{
  return (__m128)__builtin_ia32_pand128 ((__v2di)__A, (__v2di)__B);
}

static __inline __m128
_mm_or_si128 (__m128 __A, __m128 __B)
{
  return (__m128)__builtin_ia32_por128 ((__v2di)__A, (__v2di)__B);
}

typedef union
{
  __m128 xmmi;
  int si[4];
} __attribute__ ((aligned(16))) um128;

um128 u;

static inline int
sse_max_abs_indexf(float *v, int step, int n)
{
  __m128 m1, mm;
  __m128 mim, mi, msk;
  um128 u, ui;
  int n4, step2, step3;

  mm = __builtin_ia32_andps((__m128)(__v4sf){0.0, v[step], v[step2], v[step3]},
                            u.xmmi);
  if (n4)
    {
      int i;
      for (i = 0; i < n4; ++i)
        ;
      msk = (__m128)_mm_cmpeq_ps(m1, mm);
      mim = _mm_or_si128(_mm_and_si128(msk, mi), mim);
    }
  ui.xmmi = (__m128)mim;
  return ui.si[n];
}

static void
sse_swap_rowf(float *r1, float *r2, int n)
{
  int n4 = (n / 4) * 4;
  float *r14end = r1 + n4;

  while (r1 < r14end)
    {
      *r1 = *r2;
      r1++;
    }
}

void
ludcompf(float *m, int nw, int *prow, int n)
{
  int i, s = 0;
  float *pm;

  for (i = 0, pm = m; i < n - 1; ++i, pm += nw)
    {
      int vi = sse_max_abs_indexf(pm + i, nw, n - i);
      float *pt;
      int j;

      if (vi != 0)
        {
          sse_swap_rowf(pm, pm + vi * nw, nw);
          swap_index(prow, i, i + vi);
        }
      for (j = i + 1, pt = pm + nw; j < n; ++j, pt += nw)
        sse_add_rowf(pt + i + 1, pm + i + 1, -1.0, n - i - 1);
    }
}

--
              What |Removed                   |Added
            Status |UNCONFIRMED               |NEW
    Ever Confirmed |                          |1
  Last reconfirmed |-00-00 00:00:00           |2005-08-26 03:36:35
              date |                          |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23570
[Bug target/23570] [4.0/4.1 Regression] internal compiler error: in merge_assigned_reloads, at reload1.c:6091
--
              What |Removed                     |Added
           Summary |[4.0/4.1 Regression]        |[4.0/4.1 Regression]
                   |internal compiler error: in |internal compiler error: in
                   |import_export_decl, at      |merge_assigned_reloads, at
                   |cp/decl2.c:1726             |reload1.c:6091

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23570