[Bug rtl-optimization/64286] Redundant extend removal ignores vector element type

2015-01-12 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64286

--- Comment #6 from Jakub Jelinek jakub at gcc dot gnu.org ---
Created attachment 34420
  -- https://gcc.gnu.org/bugzilla/attachment.cgi?id=34420action=edit
gcc5-pr64286.patch

Full patch I'm going to bootstrap/regtest.


[Bug rtl-optimization/64286] Redundant extend removal ignores vector element type

2015-01-09 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64286

Jakub Jelinek jakub at gcc dot gnu.org changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek jakub at gcc dot gnu.org ---
(In reply to Igor Zamyatin from comment #1)
 Perhaps something like below to restrict ree for such cases?
 
 diff --git a/gcc/ree.c b/gcc/ree.c
 index 3376901..92370ea 100644
 --- a/gcc/ree.c
 +++ b/gcc/ree.c
 @@ -1004,6 +1004,11 @@ add_removable_extension (const_rtx expr, rtx_insn
 *insn,
struct df_link *defs, *def;
ext_cand *cand;
  
 +  if (!SCALAR_INT_MODE_P (GET_MODE (dest))
 +(GET_MODE_UNIT_PRECISION (mode) !=
 +   GET_MODE_UNIT_PRECISION (GET_MODE (XEXP (src, 0)
 + return;
 +
/* First, make sure we can get all the reaching definitions.  */
defs = get_defs (insn, XEXP (src, 0), NULL);
if (!defs)

I think your patch is too restrictive.
Consider -O2 -mavx2:
typedef char __v16qi __attribute__((__vector_size__(16)));
typedef int __m128i __attribute__((__vector_size__(16)));
__m128i bar (__m128i);
typedef int __m256i __attribute__((__vector_size__(32)));
__m256i v;

void
foo (char *p)
{
  __m128i a = (__m128i)__builtin_ia32_loaddqu (p);
  __m128i ps1 = bar (a);
  v = (__m256i) __builtin_ia32_pmovzxbw256 ((__v16qi) a);
}

Here, there is:
(insn 19 9 11 2 (set (reg:V16QI 22 xmm1 [92])
(mem/c:V16QI (plus:DI (reg/f:DI 6 bp)
(const_int -32 [0xffe0])) [2 %sfp+-16 S16 A128]))
pr64286.i:12 1185 {*movv16qi_internal}
 (nil))
(insn 11 19 13 2 (set (reg:V16HI 22 xmm1 [orig:93 D.2299 ] [93])
(zero_extend:V16HI (reg:V16QI 22 xmm1 [92]))) pr64286.i:12 3826
{avx2_zero_extendv16qiv16hi2}
 (nil))
and there is no reason to restrict it.  I also don't understand the
GET_MODE_UNIT_PRECISION != GET_MODE_UNIT_PRECISION test, do you know about
SIGN_EXTEND/ZERO_EXTEND where the unit precision is the same?  That wouldn't be
an extension.
The important difference between vectors and scalars is that for scalars the
lowpart subreg of the zero/sign extended value is still the original value,
while for vectors that is not the case.  So, for vectors you can REE optimize
them only if all the uses are the same extension (zero vs. sign, and to the
same mode).

Therefore, supposedly for non-scalar modes (i.e. vector ones, other than scalar
int and vector int hopefully don't have zero/sign_extend) I think what should
be done is bail out if any of the defs has any uses that are not the sign resp.
zero extension that has been found.
We have there the:

  /* Second, make sure the reaching definitions don't feed another and
 different extension.  FIXME: this obviously can be improved.  */
  for (def = defs; def; def = def-next)
if ((idx = def_map[INSN_UID (DF_REF_INSN (def-ref))])
 (cand = (*insn_list)[idx - 1])
 cand-code != code)
  {
if (dump_file)
  {
fprintf (dump_file, Cannot eliminate extension:\n);
print_rtl_single (dump_file, insn);
fprintf (dump_file,  because of other extension\n);
  }
return;
  }

loop, perhaps for the vector modes we could add
else if (!SCALAR_INT_MODE_P (...)  idx == 0)
and in that case look using DU chains (which are supposedly computed) if any
uses of it other than the current insn are not a sign/zero extension at all or
are different extension or to different mode than the current instruction, and
in that case record some magic value to def_map (e.g -1U) and treat later that
magic def_map value as a sign that we should give up (disregard that
extension).

[Bug rtl-optimization/64286] Redundant extend removal ignores vector element type

2015-01-09 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64286

Jakub Jelinek jakub at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2015-01-09
 Ever confirmed|0   |1

--- Comment #5 from Jakub Jelinek jakub at gcc dot gnu.org ---
Reduced testcase (-O2 -mavx2):
#include string.h
#include stdlib.h
#include x86intrin.h

__m128i v;
__m256i w;

__attribute__((noinline, noclone)) void
foo (__m128i *p, __m128i *q)
{
  __m128i a = _mm_loadu_si128 (p);
  __m128i b = _mm_xor_si128 (a, v);
  w = _mm256_cvtepu8_epi16 (a);
  *q = b;
}

int
main ()
{
  v = _mm_set1_epi8 (0x40);
  __m128i c = _mm_set_epi8 (16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2,
1);
  __m128i d;
  foo (c, d);
  __m128i e = _mm_set_epi8 (0x50, 0x4f, 0x4e, 0x4d, 0x4c, 0x4b, 0x4a, 0x49,
0x48, 0x47, 0x46, 0x45, 0x44, 0x43, 0x42, 0x41);
  __m256i f = _mm256_set_epi16 (16, 15, 14, 13, 12, 11, 10, 9,
8, 7, 6, 5, 4, 3, 2, 1);
  if (memcmp (w, f, sizeof (w)) != 0
  || memcmp (d, e, sizeof (d)) != 0)
abort ();
  return 0;
}


[Bug rtl-optimization/64286] Redundant extend removal ignores vector element type

2015-01-09 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64286

--- Comment #4 from Jakub Jelinek jakub at gcc dot gnu.org ---
Untested patch (just tried it on the testcase so far).

Trying to reduce the testcase now.

--- gcc/ree.c.jj2015-01-05 13:07:15.0 +0100
+++ gcc/ree.c2015-01-09 18:08:30.820754926 +0100
@@ -1021,6 +1021,7 @@ add_removable_extension (const_rtx expr,
  different extension.  FIXME: this obviously can be improved.  */
   for (def = defs; def; def = def-next)
 if ((idx = def_map[INSN_UID (DF_REF_INSN (def-ref))])
+ idx != -1U
  (cand = (*insn_list)[idx - 1])
  cand-code != code)
   {
@@ -1032,6 +1033,51 @@ add_removable_extension (const_rtx expr,
   }
 return;
   }
+else if (VECTOR_MODE_P (GET_MODE (XEXP (src, 0
+  {
+if (idx == 0)
+  {
+struct df_link *ref_chain, *ref_link;
+
+ref_chain = DF_REF_CHAIN (def-ref);
+for (ref_link = ref_chain; ref_link; ref_link = ref_link-next)
+  {
+if (ref_link-ref == NULL
+|| DF_REF_INSN_INFO (ref_link-ref) == NULL)
+  {
+idx = -1U;
+break;
+  }
+rtx_insn *use_insn = DF_REF_INSN (ref_link-ref);
+const_rtx use_set;
+if (use_insn == insn || DEBUG_INSN_P (use_insn))
+  continue;
+if (!(use_set = single_set (use_insn))
+|| !REG_P (SET_DEST (use_set))
+|| GET_MODE (SET_DEST (use_set)) != GET_MODE (dest)
+|| GET_CODE (SET_SRC (use_set)) != code
+|| !rtx_equal_p (XEXP (SET_SRC (use_set), 0),
+ XEXP (src, 0)))
+  {
+idx = -1U;
+break;
+  }
+  }
+if (idx == -1U)
+  def_map[INSN_UID (DF_REF_INSN (def-ref))] = idx;
+  }
+if (idx == -1U)
+  {
+if (dump_file)
+  {
+fprintf (dump_file, Cannot eliminate extension:\n);
+print_rtl_single (dump_file, insn);
+fprintf (dump_file,
+  because some vector uses aren't extension\n);
+  }
+return;
+  }
+  }

   /* Then add the candidate to the list and insert the reaching
definitions
  into the definition map.  */


[Bug rtl-optimization/64286] Redundant extend removal ignores vector element type

2015-01-08 Thread law at redhat dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64286

--- Comment #2 from Jeffrey A. Law law at redhat dot com ---
On 12/16/14 09:10, izamyatin at gmail dot com wrote:
 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64286

 --- Comment #1 from Igor Zamyatin izamyatin at gmail dot com ---
 Perhaps something like below to restrict ree for such cases?

 diff --git a/gcc/ree.c b/gcc/ree.c
 index 3376901..92370ea 100644
 --- a/gcc/ree.c
 +++ b/gcc/ree.c
 @@ -1004,6 +1004,11 @@ add_removable_extension (const_rtx expr, rtx_insn 
 *insn,
 struct df_link *defs, *def;
 ext_cand *cand;

 +  if (!SCALAR_INT_MODE_P (GET_MODE (dest))
 +   (GET_MODE_UNIT_PRECISION (mode) !=
 +  GET_MODE_UNIT_PRECISION (GET_MODE (XEXP (src, 0)
 +return;
 +
 /* First, make sure we can get all the reaching definitions.  */
 defs = get_defs (insn, XEXP (src, 0), NULL);
 if (!defs)

Funny, this is exactly the situation Jakub was concerned about in a 
comment for PR 59754.

I think you should remove the SCALAR_INT_MODE_P test in 
combine_reaching_defs which I think should be dead code if we filter 
things in add_removable_extension.


If you could make that change and confirm with Kirill that avx512f still 
looks good (in particular avx512f-vpmovzxwd-2.c) as well as doing a 
bootstrap and regression test, it'd be appreciated.

Thanks,
jeff


[Bug rtl-optimization/64286] Redundant extend removal ignores vector element type

2014-12-16 Thread izamyatin at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64286

--- Comment #1 from Igor Zamyatin izamyatin at gmail dot com ---
Perhaps something like below to restrict ree for such cases?

diff --git a/gcc/ree.c b/gcc/ree.c
index 3376901..92370ea 100644
--- a/gcc/ree.c
+++ b/gcc/ree.c
@@ -1004,6 +1004,11 @@ add_removable_extension (const_rtx expr, rtx_insn *insn,
   struct df_link *defs, *def;
   ext_cand *cand;

+  if (!SCALAR_INT_MODE_P (GET_MODE (dest))
+   (GET_MODE_UNIT_PRECISION (mode) !=
+  GET_MODE_UNIT_PRECISION (GET_MODE (XEXP (src, 0)
+return;
+
   /* First, make sure we can get all the reaching definitions.  */
   defs = get_defs (insn, XEXP (src, 0), NULL);
   if (!defs)