Re: [PHP-DEV] Floating point comparisons - A fix

Rasmus Lerdorf Tue, 19 Mar 2002 16:56:48 -0800

That'd be a pretty serious performance hit to take.  Do you know of any
language where floating point comparisons work like that?  Certainly in
both C and Perl you will never get 0.8 to equal 0.7+0.1 exactly.


Just because nobody else does it is of course not reason enough not to do
it, but doing 2 sprintf's for every floating point comparison makes me
cringe.

-Rasmus

On Tue, 19 Mar 2002, George Whiffen wrote:

> Hi Folks,
>
> 0.8 == 0.7 + 0.1 or does it?
>
> I know I've brought this up before, but it's still driving me nuts, and
> I still don't know how to explain it to a novice, so rather than go on
> ranting I thought I'd try a fix.
>
> I'm no C programmer, and what I know about php source can be written on
> the back of a very small envelope.  But at least you can laugh at my
> appalling code, and, hopefully, explain to me why it can't go in the
> next release ;).
>
> Summary
> =======
> The fix is based on a string convert and comparison. It seems to work,
> is not too serious from a performance point of view, and only involves
> two sources, four functions and a 100 lines or so of code.  It does not
> create cumulative rounding errors, is easily controlled via an existing
> ini variable i.e. "precision" and improves the consistency of php's use
> of precision.
>
> The Bugs/Undesirable Features
> =====================
> 0.8 == 0.7 + 0.1 evaluates to false
> (int) (8.2 - 0.2) evaluates 7
> intval(8.2 - 0.2) evaluates to 7
> floor(10 * (0.7 + 0.1)) evaluates to 7
> ceil (10 * (-0.7 + -0.1) evaluates to -7
>
> Basis of the Fix
> ==========
> In general these would all evaluate correctly if they were evaluated to
> the precision specified in php.ini (typically 14 for IEEE 64bit).
>
> The fix replaces the current equality test on doubles, (a - b == 0),
> with  a string compare to 'precision' decimal places e.g.
> sprintf("%.14G",a) == sprintf("%.14G",b).  This is a modification to the
> doubles section of the Zend/zend_operators.c compare_function, which
> ultimately handles all comparisons, ==, !=, >=, >, <, <=.
>
> Floor, ceil, (int), intval(), are fixed with an equality check of their
> integer result to one above or below it, (as  appropriate) via the
> modified compare_function.
>
> e.g. floor becomes
> if  a == floor(a) + 1
> return floor(a) + 1;
> else
> return floor(a)
>
> Apart from some performance tweaks, that's about it i.e.
> Zend/zend_operators.c:compare_function
> Zend/zend_operators.c:convert_to_long_base
> ext/standard/math.c:ceil
> ext/standard/math.c:floor
>
> Testing
> =======
> The fixes seem to work. However, they have only been tested on a 4.1.2
> source under Linux 2.4.8-26mdk. They solve the problems listed above
> and  do as well as a basic cgi build of 4.1.2 on run-tests.php (i.e.
> they pass everything except pow.phpt, pear_registry.phpt and 029.phpt).
> They also seem to behave identically to an unfixed version if
> 'precision' is set high enough e.g. set_ini('precision',18).
>
>
> Backward Compatibility
> ================
> I've tried and failed to come up with a realistic scenario where this
> fix compromises existing user code.
>
> The main reason is that  string, printf, sprintf already force the
> precision set in php.ini.  This means it quite difficult for someone to
> have exploited the fact that compare_function does not.  To hit problems
> with the fix they would have to have first  decided they care about the
> digits beyond 'precision',  but do not care enough to use the bc
> library.  They then have to gone to some trouble to get hold of those
> extra digits. They could not, for instance, easily get them into a page
> or database without a conversion to string automatically removing the
> extra digits along the way.
>
> In contrast to a "round to precision after each floating point
> operation" approach, (which would be a nightmare), these fixes should
> not create any new issues with cumulative rounding errors.  All the
> changed functions already return booleans, ints, or integer-rounded
> floats.
>
> In any case, everything can easily be reverted to the old functionality
> at execution time simply by  increasing the value of precision e.g.
> set_ini('precision',18).
>
>
> Performance
> ===========
> Only doubles are effected.  Long comparisons, such as integer for-loops
> (for($i=0;$i<$sizeof($aray);++$i) etc.), are unchanged and run just as
> fast.
>
> The effect on double comparisons is not negligible.  sprintf's are
> relatively expensive in terms of performance and adding the overhead of
> two sprintfs to every single double compare would have been nasty.
>
> To minimise this the new compare_function code first of all checks that
> the operands are not already equal and are "close enough" that a string
> compare is necessary before forcing the sprintfs. The test is for (a-b)
> != 0 && ((a-b)/a) < 1e-(precision-1) e.g. (0.2 - 0.1) != 0 && ((0.2 -
> 0.1)/0.1) < 1e-13.
>
> Even when the operands are close and this test is the only overhead,
> this still means an  extra floating point division which accounts for
> nearly all the performance degradation.  compare_function is so fast
> already, that this extra division seems to make comparisons of non-equal
> doubles about twice as slow.
>
> Fortunately this is pretty small in absolute terms and when compared to
> other simple comparisons. A double comparison using the fix e.g. 0.1 ==
> 0.2, is still no slower than a single character non-numeric string
> comparison e.g. 'a' == 'b'.  They remain much faster than a numeric
> string compare e.g. '0.1' == '0.2'.
>
> When operands are close e.g. 1.2345678901234e123 == 1.2345678901233e123
> then the performance hit is much bigger as the two sprintfs and a string
> compare are required.  Even then it is still faster than the old
> workaround of casting to string before the compare i.e. (string) 0.1 ==
> (string) 0.2.
>
> Further Optimisations
> =====================
>
> The code can definitely be optimised further.
>
> The main slowdown on the ordinary, non-close, comparisons comes from the
> floating point divide which is needed to make sure that it is the
> "relative" difference not the absolute difference which is compared to
> the precision.  If the precision were converted to binary and adjusted
> by the value of the double's exponent it should be possible to avoid the
> floating point division.  But this is significantly more complicated, or
> rather I haven't worked out how to do it!
>
> The sprintf/string compare could also be improved.   For example,
> sprintf("%a") seems to be about twice as fast as sprintf("%G").
> Unfortunately %a can return un-normalized formats which would need
> tweaking to stop them failing the string comparison.
>
> There are also significant performance improvements possible in the
> floor, ceil, intval functions by not using compare_function but instead
> doing an in-line comparison.  Since the results of these functions are
> themselves integers their differences can be compared directly to
> 1e(-precision) without the floating point divide, (provided of course
> they are less than 1e13 themselves).
>
>
> Many thanks for taking the time to consider this. Any feedback will be
> much appreciated.
>
>
> George
>
> Changes to Zend/zend_operators.c:
>
> compare_function
> ============
>
> OLD CODE:
> --------------
>  if ((op1->type == IS_DOUBLE || op1->type == IS_LONG)
>   && (op2->type == IS_DOUBLE || op2->type == IS_LONG)) {
>   result->value.dval = (op1->type == IS_LONG ? (double) op1->value.lval
> : op1->value.dval) - (op2->type == IS_LONG ? (double) op2->value.lval :
> op2->value.dval);
>    result->value.lval = ZEND_NORMALIZE_BOOL(result->value.dval);
>    result->type = IS_LONG;
>    return SUCCESS;
>  }
>
> NEW CODE:
> --------------
> ...
>         double dval1, dval2;
>         char   *sval1, *sval2;
>         long   slen1, slen2;
> ...
>  if ((op1->type == IS_DOUBLE || op1->type == IS_LONG)
>   && (op2->type == IS_DOUBLE || op2->type == IS_LONG)) {
>
>           if (op1->type == IS_DOUBLE) {
>             dval1 = op1->value.dval;
>           } else {
>             dval1 = (double) op1->value.lval;
>           }
>           if (op2->type == IS_DOUBLE) {
>             dval2 = op2->value.dval;
>           } else {
>             dval2 = (double) op2->value.lval;
>           }
>           result->value.dval = dval1 - dval2;
>
>           if (result->value.dval != 0 && fabs((dval1 - dval2)/(dval1 ==
> 0 ? 1 : dval1)) < pow(10.0,0 - (EG(precision) -1))) {
>        sval1 = (char *) emalloc(MAX_LENGTH_OF_DOUBLE + EG(precision) +
> 1);
>        slen1 = sprintf(sval1, "%.*G", (int) EG(precision),dval1);  /*
> SAFE */
>        sval2 = (char *) emalloc(MAX_LENGTH_OF_DOUBLE + EG(precision) +
> 1);
>        slen2 = sprintf(sval2, "%.*G", (int) EG(precision),dval2);  /*
> SAFE */
>        if (0 == zend_binary_strcmp(sval1,slen1,sval2,slen2))
>       {
>          result->value.dval = 0;
>        }
>        efree(sval1);
>        efree(sval2);
>      }
>      result->value.lval = ZEND_NORMALIZE_BOOL(result->value.dval);
>      result->type = IS_LONG;
>      return SUCCESS;
>  }
>
>
> convert_to_long_base
> ===============
>
> OLD CODE:
> --------------
> ...
>   case IS_DOUBLE:
>    DVAL_TO_LVAL(op->value.dval, op->value.lval);
>    break;
> ...
>
> NEW CODE:
> --------------
> ...
>  zval *op1, *op2, *result;
> ...
>   case IS_DOUBLE:
>     MAKE_STD_ZVAL(op1);
>     MAKE_STD_ZVAL(op2);
>     MAKE_STD_ZVAL(result);
>     op1->type = IS_DOUBLE;
>     op2->type = IS_LONG;
>                   op1->value.dval = op->value.dval;
>     DVAL_TO_LVAL(op->value.dval, op->value.lval);
>                   if (op1->value.dval >= 0)
>                    {
>                       op2->value.lval = op->value.lval + 1;
>                     } else {
>                       op2->value.lval = op->value.lval - 1;
>                     }
>     compare_function(result, op1, op2  TSRMLS_CC);
>     if (result->value.lval == 0) {
>       op->value.lval = op2->value.lval;
>     }
>     zval_dtor(result);
>     zval_dtor(op1);
>     zval_dtor(op2);
>    break;
>
> The changes to floor and ceil in ext/standard/math.c are very similar to
> the convert_to_long_base change.
>


-- 
PHP Development Mailing List <http://www.php.net/>
To unsubscribe, visit: http://www.php.net/unsub.php

Re: [PHP-DEV] Floating point comparisons - A fix

Reply via email to