Re: [RFC PATCH 3/3] Misaligned MEM_REF reads

2012-02-28 Thread Richard Guenther
On Tue, 28 Feb 2012, Martin Jambor wrote:

 Hi,
 
 this patch fixes misaligned reads through MEM_REFs such as the one in
 the testcase which currently fails on both sparc64 and ia64 (again,
 the array and the loop are there to cross ia64 cache line and fail
 there too).  The patch has to be applied last in the series so that
 the current LHS expansion does not attempt to use the new code.
 
 The mechanism is again very similar, except that we call
 extract_bit_field instead now.  We do not need to care about the
 mem_ref_refers_to_non_mem_p case because that is already handled at
 this stage.  On the other hand, messing with complex types actually
 breaks currently working testcases such as the one from the previous
 patch (I have not really fully investigated what goes on so far but it
 genuinely seems to work) so again I punt for complex modes.
 
 There are two more movmisalign_optab generations in this function.
 There is the TARGET_MEM_REF case which I intend to piggy-back on in
 the same way like MEM_REF is handled in this patch once it leaves the
 RFC stage.  Finally, movmisalign_optab is also generated in
 VIEW_CONVERT_EXPR case but as far as I understand the code, misaligned
 loads are already handled there (only perhaps we should use
 SLOW_UNALIGNED_ACCESS instead of STRICT_ALIGNMENT there?).
 
 Thanks,
 
 Martin
 
 
 2012-02-28  Martin Jambor  mjam...@suse.cz
 
   * expr.c (expand_expr_real_1): handle misaligned scalar reads from
   memory through MEM_REFs by calling extract_bit_field.
 
   * testsuite/gcc.dg/misaligned-expand-1.c: New test.
 
 
 Index: src/gcc/expr.c
 ===
 --- src.orig/gcc/expr.c
 +++ src/gcc/expr.c
 @@ -9453,21 +9453,29 @@ expand_expr_real_1 (tree exp, rtx target
   if (TREE_THIS_VOLATILE (exp))
 MEM_VOLATILE_P (temp) = 1;
   if (mode != BLKmode
 -  align  GET_MODE_ALIGNMENT (mode)
 - /* If the target does not have special handling for unaligned
 -loads of mode then it can use regular moves for them.  */
 -  ((icode = optab_handler (movmisalign_optab, mode))
 - != CODE_FOR_nothing))
 +  !COMPLEX_MODE_P (mode)

I think the COMPLEX_MODE_P checks are wrong, you need to debug what
exactly goes wrong if we enter with such mode here.

Otherwise this patch looks ok as well.

Thanks,
Richard.

 +  align  GET_MODE_ALIGNMENT (mode))
 {
 - struct expand_operand ops[2];
 + if ((icode = optab_handler (movmisalign_optab, mode))
 + != CODE_FOR_nothing)
 +   {
 + struct expand_operand ops[2];
  
 - /* We've already validated the memory, and we're creating a
 -new pseudo destination.  The predicates really can't fail,
 -nor can the generator.  */
 - create_output_operand (ops[0], NULL_RTX, mode);
 - create_fixed_operand (ops[1], temp);
 - expand_insn (icode, 2, ops);
 - return ops[0].value;
 + /* We've already validated the memory, and we're creating a
 +new pseudo destination.  The predicates really can't fail,
 +nor can the generator.  */
 + create_output_operand (ops[0], NULL_RTX, mode);
 + create_fixed_operand (ops[1], temp);
 + expand_insn (icode, 2, ops);
 + return ops[0].value;
 +   }
 + else if (!COMPLEX_MODE_P (mode)
 +   SLOW_UNALIGNED_ACCESS (mode, align))
 +   temp = extract_bit_field (temp, GET_MODE_BITSIZE (mode),
 + 0, TYPE_UNSIGNED (TREE_TYPE (exp)), 
 true,
 + (modifier == EXPAND_STACK_PARM
 +  ? NULL_RTX : target),
 + mode, mode);
 }
   return temp;
}
 Index: src/gcc/testsuite/gcc.dg/misaligned-expand-1.c
 ===
 --- /dev/null
 +++ src/gcc/testsuite/gcc.dg/misaligned-expand-1.c
 @@ -0,0 +1,41 @@
 +/* Test that expand can generate correct loads of misaligned data even on
 +   strict alignment platforms.  */
 +
 +/* { dg-do run } */
 +/* { dg-options -O0 } */
 +
 +extern void abort ();
 +
 +typedef unsigned int myint __attribute__((aligned(1)));
 +
 +unsigned int
 +foo (myint *p)
 +{
 +  return *p;
 +}
 +
 +#define cst 0xdeadbeef
 +#define NUM 8
 +
 +struct blah
 +{
 +  char c;
 +  myint i[NUM];
 +};
 +
 +struct blah g;
 +
 +int
 +main (int argc, char **argv)
 +{
 +  int i, k;
 +  for (k = 0; k  NUM; k++)
 +{
 +  g.i[k] = cst;
 +  i = foo (g.i[k]);
 +
 +  if (i != cst)
 + abort ();
 +}
 +  return 0;
 +}
 
 

-- 
Richard Guenther rguent...@suse.de
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer

[RFC PATCH 3/3] Misaligned MEM_REF reads

2012-02-27 Thread Martin Jambor
Hi,

this patch fixes misaligned reads through MEM_REFs such as the one in
the testcase which currently fails on both sparc64 and ia64 (again,
the array and the loop are there to cross ia64 cache line and fail
there too).  The patch has to be applied last in the series so that
the current LHS expansion does not attempt to use the new code.

The mechanism is again very similar, except that we call
extract_bit_field instead now.  We do not need to care about the
mem_ref_refers_to_non_mem_p case because that is already handled at
this stage.  On the other hand, messing with complex types actually
breaks currently working testcases such as the one from the previous
patch (I have not really fully investigated what goes on so far but it
genuinely seems to work) so again I punt for complex modes.

There are two more movmisalign_optab generations in this function.
There is the TARGET_MEM_REF case which I intend to piggy-back on in
the same way like MEM_REF is handled in this patch once it leaves the
RFC stage.  Finally, movmisalign_optab is also generated in
VIEW_CONVERT_EXPR case but as far as I understand the code, misaligned
loads are already handled there (only perhaps we should use
SLOW_UNALIGNED_ACCESS instead of STRICT_ALIGNMENT there?).

Thanks,

Martin


2012-02-28  Martin Jambor  mjam...@suse.cz

* expr.c (expand_expr_real_1): handle misaligned scalar reads from
memory through MEM_REFs by calling extract_bit_field.

* testsuite/gcc.dg/misaligned-expand-1.c: New test.


Index: src/gcc/expr.c
===
--- src.orig/gcc/expr.c
+++ src/gcc/expr.c
@@ -9453,21 +9453,29 @@ expand_expr_real_1 (tree exp, rtx target
if (TREE_THIS_VOLATILE (exp))
  MEM_VOLATILE_P (temp) = 1;
if (mode != BLKmode
-align  GET_MODE_ALIGNMENT (mode)
-   /* If the target does not have special handling for unaligned
-  loads of mode then it can use regular moves for them.  */
-((icode = optab_handler (movmisalign_optab, mode))
-   != CODE_FOR_nothing))
+!COMPLEX_MODE_P (mode)
+align  GET_MODE_ALIGNMENT (mode))
  {
-   struct expand_operand ops[2];
+   if ((icode = optab_handler (movmisalign_optab, mode))
+   != CODE_FOR_nothing)
+ {
+   struct expand_operand ops[2];
 
-   /* We've already validated the memory, and we're creating a
-  new pseudo destination.  The predicates really can't fail,
-  nor can the generator.  */
-   create_output_operand (ops[0], NULL_RTX, mode);
-   create_fixed_operand (ops[1], temp);
-   expand_insn (icode, 2, ops);
-   return ops[0].value;
+   /* We've already validated the memory, and we're creating a
+  new pseudo destination.  The predicates really can't fail,
+  nor can the generator.  */
+   create_output_operand (ops[0], NULL_RTX, mode);
+   create_fixed_operand (ops[1], temp);
+   expand_insn (icode, 2, ops);
+   return ops[0].value;
+ }
+   else if (!COMPLEX_MODE_P (mode)
+ SLOW_UNALIGNED_ACCESS (mode, align))
+ temp = extract_bit_field (temp, GET_MODE_BITSIZE (mode),
+   0, TYPE_UNSIGNED (TREE_TYPE (exp)), 
true,
+   (modifier == EXPAND_STACK_PARM
+? NULL_RTX : target),
+   mode, mode);
  }
return temp;
   }
Index: src/gcc/testsuite/gcc.dg/misaligned-expand-1.c
===
--- /dev/null
+++ src/gcc/testsuite/gcc.dg/misaligned-expand-1.c
@@ -0,0 +1,41 @@
+/* Test that expand can generate correct loads of misaligned data even on
+   strict alignment platforms.  */
+
+/* { dg-do run } */
+/* { dg-options -O0 } */
+
+extern void abort ();
+
+typedef unsigned int myint __attribute__((aligned(1)));
+
+unsigned int
+foo (myint *p)
+{
+  return *p;
+}
+
+#define cst 0xdeadbeef
+#define NUM 8
+
+struct blah
+{
+  char c;
+  myint i[NUM];
+};
+
+struct blah g;
+
+int
+main (int argc, char **argv)
+{
+  int i, k;
+  for (k = 0; k  NUM; k++)
+{
+  g.i[k] = cst;
+  i = foo (g.i[k]);
+
+  if (i != cst)
+   abort ();
+}
+  return 0;
+}