[Bug target/23570] [4.0/4.1 Regression] internal compiler error: in merge_assigned_reloads, at reload1.c:6091

2005-08-31 Thread cvs-commit at gcc dot gnu dot org

--- Additional Comments From cvs-commit at gcc dot gnu dot org  2005-08-31 17:28 ---
Subject: Bug 23570

CVSROOT:	/cvs/gcc
Module name:	gcc
Changes by:	[EMAIL PROTECTED]	2005-08-31 17:27:54

Modified files:
	gcc            : ChangeLog 
	gcc/config/i386: sse.md 
Added files:
	gcc/testsuite/gcc.target/i386: pr23570.c 

Log message:
	PR target/23570
	* config/i386/sse.md (*sse_concatv2sf): Change operand 2 constraint
	to "reg_or_0_operand".
	(sse2_loadld): Change operand 1 constraint to "reg_or_0_operand".

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.9863&r2=2.9864
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/i386/sse.md.diff?cvsroot=gcc&r1=1.23&r2=1.24
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/testsuite/gcc.target/i386/pr23570.c.diff?cvsroot=gcc&r1=NONE&r2=1.1



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23570


2005-08-31 Thread uros at kss-loka dot si

--- Additional Comments From uros at kss-loka dot si  2005-08-31 08:48 ---
Patch.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |uros at kss-loka dot si
                   |dot org                     |
                URL|                            |http://gcc.gnu.org/ml/gcc-patches/2005-08/msg01819.html
             Status|NEW                         |ASSIGNED
           Keywords|                            |patch
   Last reconfirmed|2005-08-26 03:36:35         |2005-08-31 08:48:43
               date|                            |




2005-08-26 Thread uros at kss-loka dot si

--- Additional Comments From uros at kss-loka dot si  2005-08-26 09:35 ---
(In reply to comment #3)

> Unfortunately, the ludcompf() result (the second one) is wrong when -O1 or -O2
> is used. It is correct without optimizations.

This is the infamous i387 excess-precision problem. The error can be found 
in this part of the code:

  ...
  if (u.sf[1] > t) { t = u.sf[1]; n = 1; }
  if (u.sf[2] > t) { t = u.sf[2]; n = 2; }
  if (u.sf[3] > t) { t = u.sf[3]; n = 3; }
  ...

Without optimizations, the values of u.sf[1] and t that are loaded into the 
x87 registers at this point are:
u.sf[1] = 1.00119...
      t = 0.99880...

and the branch is taken. With optimizations, however, the values are different:

u.sf[1] = 0.99642...
      t = 0.99821...

so the branch is not taken. This is a consequence of the i387 design, not a 
bug in gcc. In your case, you should use -ffloat-store or -mfpmath=sse.

BTW: At the moment, I have very limited time, so I won't be able to create a 
patch to fix the ICE for some time...



2005-08-26 Thread uros at kss-loka dot si

--- Additional Comments From uros at kss-loka dot si  2005-08-26 07:50 ---
The problem here is in the sse_concatv2sf pattern:

;; ??? In theory we can match memory for the MMX alternative, but allowing
;; nonimmediate_operand for operand 2 and *not* allowing memory for the SSE
;; alternatives pretty much forces the MMX alternative to be chosen.
(define_insn "*sse_concatv2sf"
  [(set (match_operand:V2SF 0 "register_operand" "=x,x,*y,*y")
        (vec_concat:V2SF
          (match_operand:SF 1 "nonimmediate_operand" " 0,m, 0, m")
          (match_operand:SF 2 "vector_move_operand"  " x,C,*y, C")))]

and in the "vector_move_operand" predicate, which is defined as:

;; Return 1 when OP is operand acceptable for standard SSE move.
(define_predicate "vector_move_operand"
  (ior (match_operand 0 "nonimmediate_operand")
       (match_operand 0 "const0_operand")))

Please note that the "vector_move_operand" predicate allows memory operands, 
but the constraints of operand 2 (x, C, *y, C) do not. So the following insn 
confuses reload:

(insn:HI 63 62 64 3 (set (reg:V2SF 21 xmm0 [117])
        (vec_concat:V2SF (mem:SF (plus:SI (plus:SI (reg/f:SI 68 [ ivtmp.71 ])
                        (reg:SI 88 [ D.1795 ]))
                    (const_int -4 [0xfffc])) [2 S4 A32])
            (mem:SF (plus:SI (plus:SI (reg/f:SI 68 [ ivtmp.71 ])
                        (reg:SI 89 [ D.1800 ]))
                    (const_int -4 [0xfffc])) [2 S4 A32]))) 612 {*sse_concatv2sf} (nil)

(BTW: the "sse2_loadld" pattern could have the same problem, since it has no 
"m" constraint either.)

The immediate fix would be to define another predicate, similar 
to "vector_move_operand":

;; Same as above, but excluding memory operands.
(define_predicate "vector_move_nomem_operand"
  (ior (match_operand 0 "register_operand")
   (match_operand 0 "const0_operand")))

When operand 2 of the sse_concatv2sf pattern uses this new predicate, gcc is 
able to compile both testcases, and the following result is produced (for 
both -O1 and -O2):

ludcompd(): SSE2 code is used.
1.00 4.00 5.00 3.00 
-2.80 0.80 -1.60 
-1.00 -1.00 
0
2 1 3 0
5.000 6.000 10.000 78.000
0.800 -2.800 -7.000 -55.400
0.600 0.571 -1.000 -15.143
0.200 -0.286 1.000 -12.286
ludcompf(): SSE2 code is used.
1
2 1 0 3
5.000 6.000 10.000 78.000
0.800 -2.800 -7.000 -55.400
0.200 -0.286 -1.000 -27.429
0.600 0.571 1.000 12.286

Unfortunately, the ludcompf() result (the second one) is wrong when -O1 or 
-O2 is used. It is correct without optimizations.




2005-08-25 Thread pinskia at gcc dot gnu dot org

--- Additional Comments From pinskia at gcc dot gnu dot org  2005-08-26 03:36 ---
Reduced as far as I can get this:
typedef float __v4sf __attribute__ ((__vector_size__ (16)));
typedef float __m128 __attribute__ ((__vector_size__ (16)));
static __inline __m128 _mm_cmpeq_ps (__m128 __A, __m128 __B)
{
  return (__m128) __builtin_ia32_cmpeqps ((__v4sf)__A, (__v4sf)__B);
}
static __inline __m128 _mm_setr_ps (float __Z, float __Y, float __X, float __W)
{
  return __extension__ (__m128)(__v4sf){__Z, __Y, __X, __W };
}
typedef long long __v2di __attribute__ ((__vector_size__ (16)));
static __inline __m128 _mm_and_si128 (__m128 __A, __m128 __B) {
  return (__m128)__builtin_ia32_pand128 ((__v2di)__A, (__v2di)__B);
}
static __inline __m128 _mm_or_si128 (__m128 __A, __m128 __B) {
  return (__m128)__builtin_ia32_por128 ((__v2di)__A, (__v2di)__B);
}
typedef union { __m128 xmmi; int si[4]; } __attribute__ ((aligned(16))) um128;
um128 u;
static inline int sse_max_abs_indexf(float *v, int step, int n)
{
  __m128 m1, mm;
  __m128 mim, mi, msk;
  um128 u, ui;
  int n4, step2, step3;
  mm = __builtin_ia32_andps((__m128)(__v4sf){0.0, v[step], v[step2], v[step3]},
                            u.xmmi);
  if (n4) {
    int i;
    for (i = 0; i < n4;  ++i) ;
    msk = (__m128)_mm_cmpeq_ps(m1, mm);
    mim = _mm_or_si128(_mm_and_si128(msk, mi), mim);
  }
  ui.xmmi = (__m128)mim;
  return ui.si[n];
}
static void sse_swap_rowf(float *r1, float *r2, int n) {
  int n4 = (n / 4) * 4;
  float *r14end = r1 + n4;
  while (r1 < r14end) {
    *r1 = *r2;
    r1++;
  }
}
void ludcompf(float *m, int nw, int *prow, int n) {
  int i, s = 0;
  float *pm;
  for (i = 0, pm = m; i < n - 1; ++i, pm += nw)
  {
    int vi = sse_max_abs_indexf(pm + i, nw, n - i);
    float *pt;
    int j;
    if (vi != 0)
    {
      sse_swap_rowf(pm, pm + vi * nw, nw);
      swap_index(prow, i, i + vi);
    }
    for (j = i + 1, pt = pm + nw; j < n; ++j, pt += nw)
      sse_add_rowf(pt + i + 1, pm + i + 1, -1.0, n - i - 1);
  }
}


-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|                            |1
   Last reconfirmed|-00-00 00:00:00             |2005-08-26 03:36:35
               date|                            |




2005-08-25 Thread pinskia at gcc dot gnu dot org


-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[4.0/4.1 Regression]        |[4.0/4.1 Regression]
                   |internal compiler error: in |internal compiler error: in
                   |import_export_decl, at      |merge_assigned_reloads, at
                   |cp/decl2.c:1726             |reload1.c:6091

