On Mon, Jul 23, 2018 at 5:05 PM Richard Sandiford <richard.sandif...@arm.com> wrote: > > Marc Glisse <marc.gli...@inria.fr> writes: > > On Fri, 20 Jul 2018, Richard Sandiford wrote: > > > >> --- gcc/match.pd 2018-07-18 18:44:22.565914281 +0100 > >> +++ gcc/match.pd 2018-07-20 11:24:33.692045585 +0100 > >> @@ -4924,3 +4924,37 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > >> (if (inverse_conditions_p (@0, @2) > >> && element_precision (type) == element_precision (op_type)) > >> (view_convert (cond_op @2 @3 @4 @5 (view_convert:op_type @1))))))) > >> + > >> +/* For pointers @0 and @2 and unsigned constant offset @1, look for > >> + expressions like: > >> + > >> + A: (@0 + @1 < @2) | (@2 + @1 < @0) > >> + B: (@0 + @1 <= @2) | (@2 + @1 <= @0) > >> + > >> + If pointers are known not to wrap, B checks whether @1 bytes starting > >> at > >> + @0 and @2 do not overlap, while A tests the same thing for @1 + 1 > >> bytes. > >> + A is more efficiently tested as: > >> + > >> + ((sizetype) @0 - (sizetype) @2 + @1) > (@1 * 2) > >> + > >> + as long as @1 * 2 doesn't overflow. B is the same with @1 replaced > >> + with @1 - 1. */ > >> +(for ior (truth_orif truth_or bit_ior) > >> + (for cmp (le lt) > >> + (simplify > >> + (ior (cmp (pointer_plus:s @0 INTEGER_CST@1) @2) > >> + (cmp (pointer_plus:s @2 @1) @0)) > > > > Do you want :c on cmp, in case it appears as @2 > @0 + @1 ? (may need some > > care with "cmp == LE_EXPR" below) > > Do you want :s on cmp as well? > > > >> + (if (!flag_trapv && !TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))) > > > > Don't you want TYPE_OVERFLOW_UNDEFINED? > > Thanks, fixed below. Think the cmp == LE_EXPR stuff is still ok with :c, > since the generated code sets cmp to LE_EXPR when matching GE_EXPR. > > Tested as before. OK to install? > > Richard > > > 2018-07-23 Richard Sandiford <richard.sandif...@arm.com> > > gcc/ > * match.pd: Optimise pointer range checks. > > gcc/testsuite/ > * gcc.dg/pointer-range-check-1.c: New test. > * gcc.dg/pointer-range-check-2.c: Likewise. > > Index: gcc/match.pd > =================================================================== > --- gcc/match.pd 2018-07-23 15:56:47.000000000 +0100 > +++ gcc/match.pd 2018-07-23 15:58:33.480269844 +0100 > @@ -4924,3 +4924,37 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > (if (inverse_conditions_p (@0, @2) > && element_precision (type) == element_precision (op_type)) > (view_convert (cond_op @2 @3 @4 @5 (view_convert:op_type @1))))))) > + > +/* For pointers @0 and @2 and unsigned constant offset @1, look for > + expressions like: > + > + A: (@0 + @1 < @2) | (@2 + @1 < @0) > + B: (@0 + @1 <= @2) | (@2 + @1 <= @0) > + > + If pointers are known not to wrap, B checks whether @1 bytes starting at > + @0 and @2 do not overlap, while A tests the same thing for @1 + 1 bytes. > + A is more efficiently tested as: > + > + ((sizetype) @0 - (sizetype) @2 + @1) > (@1 * 2) > + > + as long as @1 * 2 doesn't overflow. B is the same with @1 replaced > + with @1 - 1. */ > +(for ior (truth_orif truth_or bit_ior) > + (for cmp (le lt) > + (simplify > + (ior (cmp:cs (pointer_plus:s @0 INTEGER_CST@1) @2) > + (cmp:cs (pointer_plus:s @2 @1) @0)) > + (if (!flag_trapv && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
no need to check flag_trapv, pointer arithmetic is not covered by -ftrapv. > + /* Convert the B form to the A form. */ > + (with { offset_int off = wi::to_offset (@1) - (cmp == LE_EXPR ? 1 : 0); } > + /* Always fails for negative values. */ > + (if (wi::min_precision (off, UNSIGNED) * 2 <= TYPE_PRECISION (sizetype)) > + /* It doesn't matter whether we use @2 - @0 or @0 - @2, so let > + tree_swap_operands_p pick a canonical order. */ gimple_resimplify takes care of that - well, it doesn't since minus isn't commutative... I guess you get better CSE later when doing this thus ok, but it does look a bit off here ;) I think you shouldn't use 'sizetype' without guarding this whole thing with TYPE_PRECISION (sizetype) == TYPE_PRECISION (TREE_TYPE (@0)). Since the original IL performs an ordered compare of two eventually unrelated pointers (or pointers adjusted in a way that crosses object boundaries) (uh... ;)) I wonder if you can use POINTER_DIFF_EXPR here to avoid the sizetype conversion? Since POINTER_PLUS_EXPR offsets are supposed to be interpreted "signed" that should also handle negative offsets correctly then? Also, when @2 == @0 + (@1+1) then the original condition is true but ((sizetype) @0 - (sizetype) @2 + @1) > (@1 * 2) is not? (sizetype) @0 - (sizetype) (@0 + @1 + 1) + @1 > @1 * 2 -> -1 > @1 * 2 which is false. So I can't really see how you can apply this transform in general (it would be fine for generating alias checks but a bit more pessimistic). But maybe I am missing something? Richard. > + (with { tree ptr1 = @0, ptr2 = @2; > + if (tree_swap_operands_p (ptr1, ptr2)) > + std::swap (ptr1, ptr2); } > + (gt (plus (minus (convert:sizetype { ptr1; }) > + (convert:sizetype { ptr2; })) > + { wide_int_to_tree (sizetype, off); }) > + { wide_int_to_tree (sizetype, off * 2); })))))))) > Index: gcc/testsuite/gcc.dg/pointer-range-check-1.c > =================================================================== > --- /dev/null 2018-07-09 14:52:09.234750850 +0100 > +++ gcc/testsuite/gcc.dg/pointer-range-check-1.c 2018-07-23 > 15:58:33.480269844 +0100 > @@ -0,0 +1,37 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fno-ipa-icf -fdump-tree-optimized" } */ > + > +/* All four functions should be folded to: > + > + ((sizetype) a - (sizetype) b + 15) < 30. */ > + > +_Bool > +f1 (char *a, char *b) > +{ > + return (a + 16 <= b) || (b + 16 <= a); > +} > + > +_Bool > +f2 (char *a, char *b) > +{ > + return (a + 15 < b) || (b + 15 < a); > +} > + > +_Bool > +f3 (char *a, char *b) > +{ > + return (a + 16 <= b) || (b + 16 <= a); > +} > + > +_Bool > +f4 (char *a, char *b) > +{ > + return (a + 15 < b) || (b + 15 < a); > +} > + > +/* { dg-final { scan-tree-dump-times { = [^\n]* - [^\n]*;} 4 "optimized" } } > */ > +/* { dg-final { scan-tree-dump-times { = [^\n]* \+ [^\n]*;} 4 "optimized" } > } */ > +/* { dg-final { scan-tree-dump-times { = [^\n]*\ > [^\n]*;} 4 "optimized" } > } */ > +/* { dg-final { scan-tree-dump-not {=[^\n]*\ < [^\n]*;} "optimized" } } */ > +/* { dg-final { scan-tree-dump-times { \+ 15} 4 "optimized" } } */ > +/* { dg-final { scan-tree-dump-times { > 30} 4 "optimized" } } */ > Index: gcc/testsuite/gcc.dg/pointer-range-check-2.c > =================================================================== > --- /dev/null 2018-07-09 14:52:09.234750850 +0100 > +++ gcc/testsuite/gcc.dg/pointer-range-check-2.c 2018-07-23 > 15:58:33.480269844 +0100 > @@ -0,0 +1,31 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fno-ipa-icf -fwrapv-pointer -fdump-tree-optimized" } */ > + > +_Bool > +f1 (char *a, char *b) > +{ > + return (a + 16 <= b) || (b + 16 <= a); > +} > + > +_Bool > +f2 (char *a, char *b) > +{ > + return (a + 15 < b) || (b + 15 < a); > +} > + > +_Bool > +f3 (char *a, char *b) > +{ > + return (a + 16 <= b) || (b + 16 <= a); > +} > + > +_Bool > +f4 (char *a, char *b) > +{ > + return (a + 15 < b) || (b + 15 < a); > +} > + > +/* { dg-final { scan-tree-dump-not { = [^\n]* - [^\n]*;} "optimized" } } */ > +/* { dg-final { scan-tree-dump-times { = [^\n]* \+ [^\n]*;} 8 "optimized" } > } */ > +/* { dg-final { scan-tree-dump-times { \+ 15} 4 "optimized" } } */ > +/* { dg-final { scan-tree-dump-times { \+ 16} 4 "optimized" } } */