On Sun, Aug 3, 2014 at 10:40 PM, Prathamesh Kulkarni <bilbotheelffri...@gmail.com> wrote: > This patch adds the following rotate pattern: > ((T) ((T2) X << CNT1)) + ((T) ((T2) X >> CNT2)) iff CNT1 + CNT2 == B > > Depends on the following patch (not yet committed): > https://gcc.gnu.org/ml/gcc-patches/2014-08/msg00128.html > > * genmatch.c (dt_simplify::capture_max): Change value to 6. > * match-rotate.pd: Add new rotate pattern > > [gcc/testsuite/gcc.dg/tree-ssa] > * match-rotate.c (rotate_5): New test-case. > > I have come across some issues while writing the following pattern: > (X << Y) OP (X >> (B - Y)) > > I wrote the following pattern (assuming OP as plus) > (match_and_simplify > (plus (lshift @0 @1) > (rshift @0 (minus INTEGER_CST_P@2 @1)) > > (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type) > && TYPE_PRECISION (type) > == GET_MODE_PRECISION (TYPE_MODE (type)) > && wi::eq_p (TYPE_PRECISION (type), @2)) > > (lrotate @0 @1)) > > a) Is the condition TYPE_PRECISION (type) == GET_MODE_...() > required ? I used it because it was also used in first pattern.
Yes, that's to guard against bitfield types - see also comments in simplify_rotate (). > I tried with the following test-case (which wasn't simplified, however > simplify_rotate simplified it correctly). > > // (X << Y) OP (X >> (B - Y)) > > unsigned > f (unsigned x, unsigned y) > { > unsigned t1 = x << y; > unsigned t2 = 32 - y; > unsigned t3 = x >> t2; > unsigned t4 = t1 + t3; > > return t4; > } > > forwprop1 input (output of ccp1 pass): > f (unsigned int x, unsigned int y) > { > unsigned int t4; > unsigned int t3; > unsigned int t2; > unsigned int t1; > int y.0_2; > int t2.1_6; > > <bb 2>: > y.0_2 = (int) y_1(D); > t1_4 = x_3(D) << y.0_2; > t2_5 = 32 - y_1(D); > t2.1_6 = (int) t2_5; > t3_7 = x_3(D) >> t2.1_6; > t4_8 = t1_4 + t3_7; > t4_9 = t4_8; > return t4_9; > > } > > It appears that implicit casts eg - (int) y_1(D) get generated > (do lshift/rshift expect signed 2nd operand ?) It's the C frontend that applies this conversion. Signed 2nd operand is not necessary. In fact removing this conversion with a separate pattern might be easiest. (for op in lshift rshift lrotate rrotate (match-and-simplify (op @0 (convert@1 @2)) if (TYPE_UNSIGNED (TREE_TYPE (@2)) && TYPE_PRECISION (TREE_TYPE (@1)) >= TYPE_PRECISION (TREE_TYPE (@2))) (op @0 @2))) > I tried with the following, however neither is this version of the > pattern matched: > > /* (X << Y) OP (X >> (B - Y)) */ > (match_and_simplify > (plus (lshift @2 (convert@0 @3)) :c on the plus? > (rshift @2 (convert@1 (minus INTEGER_CST_P@4 @3)))) Seems to literally match the IL above though (note this uses 5 captures), and it works for me: gimple_match_and_simplified to t4_8 = x_3(D) r<< y_1(D); > (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type) > && TYPE_PRECISION (type) > == GET_MODE_PRECISION (TYPE_MODE (type)) > && wi::eq_p (TYPE_PRECISION (type), @4))) > (lrotate @2 @3)) > > and different types generate different sets of gimple statements > (more implicit casts in different places), > for example with unsigned char, following input was generated for > forwprop1: > > f (unsigned char x, unsigned char y) > { > unsigned char t4; > unsigned char t3; > unsigned char t2; > unsigned char t1; > int _2; > int _4; > int _5; > int _8; > int _9; > int _10; > > <bb 2>: > _2 = (int) x_1(D); > _4 = (int) y_3(D); > _5 = _2 << _4; In C all smaller types are promoted to int thus if you write char << char this gets (int)char << (int)char and only the result gets casted back to char. This is what you see here. > t1_6 = (unsigned char) _5; > t2_7 = 8 - y_3(D); Same for this, but we "optimize" that earlier from >> (int)8 - (int)y to >> (int)(8 - y) ... Conversions are the "fun" thing during pattern matching. > _8 = (int) x_1(D); > _9 = (int) t2_7; > _10 = _8 >> _9; > t3_11 = (unsigned char) _10; > t4_12 = t1_6 + t3_11; > t4_13 = t4_12; > return t4_13; > > } > > How to handle the pattern for different types ? You chose some of the more complicated transforms. Basically you need to understand the actual matching code in simplify_rotate, not just look at the patterns listed in the comment (those may not always be 100% correct). There are comments in the code that explain how the conversions appear. I think this is a good use for the conditional converts. The interesting part then is of course the proper if-expression ... Note that the first forwprop pass has a harder time than necessary for catching equivalencies as no CSE has been done. If you'd test for matching during the first CSE pass (that is the 025t.fre1 dump file) the pattern would see at least (int)x_1(D) as trivially equivalent. Note that I have to merge from trunk first to see any effect I think. I'l try to do that today. Richard. > Thanks, > Prathamesh