On Fri, Sep 6, 2013 at 3:24 PM, Cong Hou <co...@google.com> wrote: > First, thank you for your detailed comments again! Then I deeply > apologize for not explaining my patch properly and responding to your > previous comment. I didn't understand thoroughly the problem before > submitting the patch. > > Previously I only considered the following three conversions for sqrt(): > > > 1: (float) sqrt ((double) float_val) -> sqrtf (float_val) > 2: (float) sqrtl ((long double) float_val) -> sqrtf (float_val) > 3: (double) sqrtl ((long double) double_val) -> sqrt (double_val) > > > We have four types here: > > TYPE is the type to which the result of the function call is converted. > ITYPE is the type of the math call expression. > TREE_TYPE(arg0) is the type of the function argument (before type conversion). > NEWTYPE is chosen from TYPE and TREE_TYPE(arg0) with higher precision. > It will be the type of the new math call expression after conversion. > > For all three cases above, TYPE is always the same as NEWTYPE. That is > why I only considered TYPE during the precision comparison. ITYPE can > only be double_type_node or long_double_type_node depending on the > type of the math function. That is why I explicitly used those two > types instead of ITYPE (no correctness issue). But you are right, > ITYPE is more elegant and better here. > > After further analysis, I found I missed two more cases. Note that we > have the following conditions according to the code in convert.c: > > TYPE_PRECISION(NEWTYPE) >= TYPE_PRECISION(TYPE) > TYPE_PRECISION(NEWTYPE) >= TYPE_PRECISION(TREE_TYPE(arg0)) > TYPE_PRECISION (NEWTYPE) < TYPE_PRECISION (ITYPE) > > the last condition comes from the fact that we only consider > converting a math function call into another one with narrower type. > Therefore we have > > TYPE_PRECISION(TYPE) < TYPE_PRECISION (ITYPE) > TYPE_PRECISION(TREE_TYPE(arg0)) < TYPE_PRECISION (ITYPE) > > So for sqrt(), TYPE and TREE_TYPE(arg0) can only be float, and for > sqrtl(), TYPE and TREE_TYPE(arg0) can be either float or double with > four possible combinations. Therefore we have two more conversions to > consider besides the three ones I mentioned above: > > > 4: (float) sqrtl ((long double) double_val) -> (float) sqrt (double_val) > 5: (double) sqrtl ((long double) float_val) -> sqrt ((double) float_val) > > > For the first conversion here, TYPE (float) is different from NEWTYPE > (double), and my previous patch doesn't handle this case.The correct > way is to compare precisions of ITYPE and NEWTYPE now. > > To sum up, we are converting the expression > > (TYPE) sqrtITYPE ((ITYPE) expr) > > to > > (TYPE) sqrtNEWTYPE ((NEWTYPE) expr) > > and we require > > PRECISION (ITYPE) >= PRECISION (NEWTYPE) * 2 + 2 > > to make it a safe conversion. > > > The new patch is pasted below. > > I appreciate your detailed comments and analysis, and next time when I > submit a patch I will be more carefully about the reviewer's comment. > > > Thank you! > > Cong > > > > Index: gcc/convert.c > =================================================================== > --- gcc/convert.c (revision 201891) > +++ gcc/convert.c (working copy) > @@ -135,16 +135,19 @@ convert_to_real (tree type, tree expr) > CASE_MATHFN (COS) > CASE_MATHFN (ERF) > CASE_MATHFN (ERFC) > - CASE_MATHFN (FABS) > CASE_MATHFN (LOG) > CASE_MATHFN (LOG10) > CASE_MATHFN (LOG2) > CASE_MATHFN (LOG1P) > - CASE_MATHFN (LOGB) > CASE_MATHFN (SIN) > - CASE_MATHFN (SQRT) > CASE_MATHFN (TAN) > CASE_MATHFN (TANH) > + /* The above functions are not safe to do this conversion. */ > + if (!flag_unsafe_math_optimizations) > + break; > + CASE_MATHFN (SQRT) > + CASE_MATHFN (FABS) > + CASE_MATHFN (LOGB) > #undef CASE_MATHFN > { > tree arg0 = strip_float_extensions (CALL_EXPR_ARG (expr, 0)); > @@ -155,6 +158,27 @@ convert_to_real (tree type, tree expr) > if (TYPE_PRECISION (TREE_TYPE (arg0)) > TYPE_PRECISION (type)) > newtype = TREE_TYPE (arg0); > > + /* We consider to convert > + > + (T1) sqrtT2 ((T2) exprT3) > + to > + (T1) sqrtT4 ((T4) exprT3)
Should this be (T4) sqrtT4 ((T4) exprT3) > + > + , where T1 is TYPE, T2 is ITYPE, T3 is TREE_TYPE (ARG0), > + and T4 is NEWTYPE. NEWTYPE is also the wider one between T1 and T3. David > All those types are of floating point types. > + T4 (NEWTYPE) should be narrower than T2 (ITYPE). This conversion > + is safe only if P1 >= P2*2+2, where P1 and P2 are precisions of > + T2 and T4. See the following URL for a reference: > + > http://stackoverflow.com/questions/9235456/determining-floating-point-square-root > + */ > + if (fcode == BUILT_IN_SQRT || fcode == BUILT_IN_SQRTL) > + { > + int p1 = REAL_MODE_FORMAT (TYPE_MODE (itype))->p; > + int p2 = REAL_MODE_FORMAT (TYPE_MODE (newtype))->p; > + if (p1 < p2 * 2 + 2 && !flag_unsafe_math_optimizations) > + break; > + } > + > /* Be careful about integer to fp conversions. > These may overflow still. */ > if (FLOAT_TYPE_P (TREE_TYPE (arg0)) > Index: gcc/testsuite/gcc.c-torture/execute/20030125-1.c > =================================================================== > --- gcc/testsuite/gcc.c-torture/execute/20030125-1.c (revision 201891) > +++ gcc/testsuite/gcc.c-torture/execute/20030125-1.c (working copy) > @@ -44,11 +44,11 @@ __attribute__ ((noinline)) > double > sin(double a) > { > - abort (); > + return a; > } > __attribute__ ((noinline)) > float > sinf(float a) > { > - return a; > + abort (); > } > > On Wed, Sep 4, 2013 at 3:26 PM, Joseph S. Myers <jos...@codesourcery.com> > wrote: >> On Wed, 4 Sep 2013, Xinliang David Li wrote: >> >>> > This patch submission still fails to pay attention to various of my >>> > comments. >>> > >>> >>> If you can provide inlined comments in the patch, that will be more >>> useful and productive. >> >> I have explained things several times in this thread already. I see no >> point in repeating things when what I would say has already been said and >> ignored. As far as I can tell, this latest patch submission has taken one >> line from the message it is in response to, and largely ignored the >> following two paragraphs (including where I explicitly say that said line >> should not appear literally in the source code at all). But, repeating >> what I said before yet again: >> >> (but you should be referring to the relevant types >> >> The patch does not do properly that. It refers explicitly to >> long_double_type_node and double_type_node. >> >> - "type", the type >> being converted to, "itype", the type of the function being called in the >> source code, "TREE_TYPE (arg0)", the type of the argument after extensions >> have been removed, and "newtype", computed from those >> >> The patch only engages with "type". I suspect "newtype" is what it really >> means there when using "type". When it uses long_double_type_node and >> double_type_node, those should be "itype". >> >> - so you should have >> expressions like the above with two or more of those four types, but not >> with long_double_type_node directly). >> >> See above. The patch uses long_double_type_node directly, contrary to >> what I explicitly said. You are free to disagree with something I say in >> a review - but in that case you need to reply specifically to the review >> comment explaining your rationale for disagreeing with it. Just ignoring >> the comment and not mentioning the disagreement wastes the time of >> reviewers. >> >> The patch submission will need to include a proper analysis to justify to >> the reader why the particular inequality with particular types from those >> four is correct in all cases where the relevant code may be executed. >> >> The comments only refer to "T1" and "T2" without explaining how they >> relate to the original expression (three types) or the variables within >> GCC dealing with various types (four types, because newtype gets >> involved). I said back in >> <http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00161.html> and >> <http://gcc.gnu.org/ml/gcc-patches/2013-08/msg01384.html> that three types >> are involved - when I say "the patch submission needs to include its own >> analysis for the full generality of three types", again, it's >> inappropriate for a patch to omit such an analysis without justification. >> The patch submission needs to include an analysis that properly explains >> the transformation involved and the conditions under which it is safe. >> Maybe starting along the lines of: >> >> We have an expression of the form (T1) sqrtT2 ((T2) exprT3), where exprT3 >> has type T3 (TREE_TYPE (ARG0)), T2 is the type of the floating-point >> square root function being used (ITYPE), T1 is TYPE and all these types >> are binary floating-point types. We wish to optimize if possible to an >> expression of the form (T1) sqrtT4 ((T4) exprT3), where T4 (NEWTYPE) is >> narrower than T2. (Then explain the choice of T4 and the conditions under >> which the transformation is safe, with appropriate references.) >> >> I suggest that for the next patch submission you (the patch submitter) >> make sure you spend at least an hour properly understanding the issues and >> all the previous messages in the thread and writing up the detailed, >> coherent explanation of the transformation and analysis of the issues that >> I asked for some time ago, as if for a reader who hasn't read any of this >> thread or looked at this transformation in GCC before. I've certainly >> spent longer than that on review in this thread. It's normal for a patch >> involving anything at all tricky to take hours to write up (and also >> normal for one to discover, in the course of writing the detailed coherent >> analysis for people who aren't familiar with the code and the issues >> involved, that there's in fact something wrong with the patch and it needs >> revisiting before submission). >> >> -- >> Joseph S. Myers >> jos...@codesourcery.com