Re: [PHP-DEV] Floating point comparisons - A fix

George Whiffen Wed, 20 Mar 2002 04:05:46 -0800

Alan Knowles wrote:

> wouldn't ~==, ~>, ~<, ~>=, etc. be relevant here?
> From what I remember of maths (too long ago), ~ was the symbol for
> similar (and two together for approximately equal). Hey, I even found a
> reference for it...
> http://www.gomath.com/htdocs/ToGosheet/algebra/mathsymbols.html

I certainly wouldn't advocate any new operators for this.  The point is to
make string and floating point comparisons consistent in php and so, by
default, remove the recurring problem of decimals that have no exact binary
representation, i.e. so that both sides of each of the following agree:

0.8 == 0.7 + 0.1
('Answer='. 0.8) == ('Answer=' . (0.7 + 0.1))
(string) 0.8 == (string) (0.7 + 0.1)
(int) (10 * 0.8) == (int) (10 * (0.7 + 0.1))
print 0.8 and print 0.7 + 0.1
etc.
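
To make the mismatch concrete, here is a minimal C sketch of the idea. The helper name equal_at_precision is made up for illustration and is not part of php or the patch; it simply formats both doubles to the same number of significant digits, as php's string conversion does with 'precision', and compares the text.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the php source): two doubles count as
 * equal when they print identically at `precision` significant digits. */
static int equal_at_precision(double a, double b, int precision)
{
    char sa[64], sb[64];

    assert(precision > 0 && precision <= 17);
    snprintf(sa, sizeof sa, "%.*G", precision, a);
    snprintf(sb, sizeof sb, "%.*G", precision, b);
    return strcmp(sa, sb) == 0;
}
```

With IEEE 64-bit doubles, 0.8 == 0.7 + 0.1 is false, while equal_at_precision(0.8, 0.7 + 0.1, 14) is true because both sides print as "0.8". At 17 digits the two strings differ again.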

Introducing new operators would only make php more "typed" rather than
less.

George


> the only thing is it would confuse perl people :) =~ (it's the regex
> symbol in perl....)
>
> regards
> alan
>
> Rasmus Lerdorf wrote:
>
> >That'd be a pretty serious performance hit to take.  Do you know of any
> >language where floating point comparisons work like that?  Certainly in
> >both C and Perl you will never get 0.8 to equal 0.7+0.1 exactly.
> >
> >Just because nobody else does it is of course not reason enough not to do
> >it, but doing 2 sprintf's for every floating point comparison makes me
> >cringe.
> >
> >-Rasmus
> >
> >On Tue, 19 Mar 2002, George Whiffen wrote:
> >
> >>Hi Folks,
> >>
> >>0.8 == 0.7 + 0.1 or does it?
> >>
> >>I know I've brought this up before, but it's still driving me nuts, and
> >>I still don't know how to explain it to a novice, so rather than go on
> >>ranting I thought I'd try a fix.
> >>
> >>I'm no C programmer, and what I know about php source can be written on
> >>the back of a very small envelope.  But at least you can laugh at my
> >>appalling code, and, hopefully, explain to me why it can't go in the
> >>next release ;).
> >>
> >>Summary
> >>=======
> >>The fix is based on a string conversion and comparison. It seems to
> >>work, is not too serious from a performance point of view, and only
> >>involves two source files, four functions and 100 lines or so of code.
> >>It does not create cumulative rounding errors, is easily controlled via
> >>an existing ini variable, i.e. "precision", and improves the consistency
> >>of php's use of precision.
> >>
> >>The Bugs/Undesirable Features
> >>=============================
> >>0.8 == 0.7 + 0.1 evaluates to false
> >>(int) (8.2 - 0.2) evaluates to 7
> >>intval(8.2 - 0.2) evaluates to 7
> >>floor(10 * (0.7 + 0.1)) evaluates to 7
> >>ceil(10 * (-0.7 + -0.1)) evaluates to -7
> >>
> >>Basis of the Fix
> >>==========
> >>In general these would all evaluate correctly if they were evaluated to
> >>the precision specified in php.ini (typically 14 for IEEE 64bit).
> >>
> >>The fix replaces the current equality test on doubles, (a - b == 0),
> >>with a string compare to 'precision' significant digits, e.g.
> >>sprintf("%.14G",a) == sprintf("%.14G",b).  This is a modification to the
> >>doubles section of the Zend/zend_operators.c compare_function, which
> >>ultimately handles all comparisons: ==, !=, >=, >, <, <=.
> >>
> >>Floor, ceil, (int) and intval() are fixed by comparing the original
> >>value, via the modified compare_function, with the integer one above
> >>or below their result (as appropriate), and returning that neighbour
> >>if the two compare equal.
> >>
> >>e.g. floor becomes
> >>  if (a == floor(a) + 1)
> >>    return floor(a) + 1;
> >>  else
> >>    return floor(a);
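
The same snap-to-the-neighbour rule can be sketched in plain C, shown here for the (int) cast rather than floor. This is only an illustration under stated assumptions: equal_at_precision and to_long_at_precision are hypothetical names standing in for the modified compare_function and the patched conversion, not functions from the php source.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the modified compare_function: equal when
 * both values print identically at `precision` significant digits. */
static int equal_at_precision(double a, double b, int precision)
{
    char sa[64], sb[64];

    assert(precision > 0 && precision <= 17);
    snprintf(sa, sizeof sa, "%.*G", precision, a);
    snprintf(sb, sizeof sb, "%.*G", precision, b);
    return strcmp(sa, sb) == 0;
}

/* Hypothetical stand-in for the patched (int) cast: truncate as usual,
 * then move one step away from zero if the original value is
 * indistinguishable from that neighbour at the given precision. */
static long to_long_at_precision(double d, int precision)
{
    long truncated = (long) d;
    long neighbour = d >= 0 ? truncated + 1 : truncated - 1;

    if (equal_at_precision(d, (double) neighbour, precision)) {
        return neighbour;
    }
    return truncated;
}
```

For example, (long) (10 * (0.7 + 0.1)) is 7, while to_long_at_precision(10 * (0.7 + 0.1), 14) gives 8. A value such as 7.5 is nowhere near its neighbour and still truncates to 7.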
> >>
> >>Apart from some performance tweaks, that's about it i.e.
> >>Zend/zend_operators.c:compare_function
> >>Zend/zend_operators.c:convert_to_long_base
> >>ext/standard/math.c:ceil
> >>ext/standard/math.c:floor
> >>
> >>Testing
> >>=======
> >>The fixes seem to work. However, they have only been tested on a 4.1.2
> >>source under Linux 2.4.8-26mdk. They solve the problems listed above
> >>and  do as well as a basic cgi build of 4.1.2 on run-tests.php (i.e.
> >>they pass everything except pow.phpt, pear_registry.phpt and 029.phpt).
> >>They also seem to behave identically to an unfixed version if
> >>'precision' is set high enough e.g. ini_set('precision',18).
> >>
> >>
> >>Backward Compatibility
> >>================
> >>I've tried and failed to come up with a realistic scenario where this
> >>fix compromises existing user code.
> >>
> >>The main reason is that string, printf and sprintf already force the
> >>precision set in php.ini.  This means it is quite difficult for someone
> >>to have exploited the fact that compare_function does not.  To hit
> >>problems with the fix they would first have to have decided that they
> >>care about the digits beyond 'precision', but not enough to use the bc
> >>library.  They would then have to have gone to some trouble to get hold
> >>of those extra digits. They could not, for instance, easily get them
> >>into a page or database without a conversion to string automatically
> >>removing the extra digits along the way.
> >>
> >>In contrast to a "round to precision after each floating point
> >>operation" approach, (which would be a nightmare), these fixes should
> >>not create any new issues with cumulative rounding errors.  All the
> >>changed functions already return booleans, ints, or integer-rounded
> >>floats.
> >>
> >>In any case, everything can easily be reverted to the old functionality
> >>at execution time simply by  increasing the value of precision e.g.
> >>ini_set('precision',18).
> >>
> >>
> >>Performance
> >>===========
> >>Only doubles are affected.  Long comparisons, such as integer for-loops
> >>(for ($i = 0; $i < sizeof($array); ++$i) etc.), are unchanged and run
> >>just as fast.
> >>
> >>The effect on double comparisons is not negligible.  sprintf's are
> >>relatively expensive in terms of performance and adding the overhead of
> >>two sprintfs to every single double compare would have been nasty.
> >>
> >>To minimise this the new compare_function code first checks that the
> >>operands are not already equal and are "close enough" that a string
> >>compare is necessary before forcing the sprintfs. The test is for (a-b)
> >>!= 0 && |(a-b)/a| < 1e-(precision-1), e.g. (0.2 - 0.1) != 0 && |(0.2 -
> >>0.1)/0.2| < 1e-13.
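
A rough standalone sketch of that two-stage test follows. It assumes plain doubles instead of zvals, and compare_doubles is a made-up name, not the Zend function. The tolerance is built with a loop and the absolute value taken by hand only so that the sketch needs nothing beyond stdio and string; the patch itself uses pow() and fabs().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch: returns -1, 0 or 1. Values that are unequal but
 * print identically at `precision` significant digits compare as 0. */
static int compare_doubles(double a, double b, int precision)
{
    double diff = a - b;
    double rel, tol = 1.0;
    char sa[64], sb[64];
    int i;

    assert(precision > 0 && precision <= 17);
    if (diff == 0) {
        return 0;                       /* exactly equal, nothing to do */
    }

    /* tol = 10^-(precision-1) */
    for (i = 1; i < precision; i++) {
        tol /= 10.0;
    }

    /* relative difference, guarding against division by zero */
    rel = diff / (a == 0 ? 1.0 : a);
    if (rel < 0) {
        rel = -rel;
    }

    /* only close operands pay for the two conversions */
    if (rel < tol) {
        snprintf(sa, sizeof sa, "%.*G", precision, a);
        snprintf(sb, sizeof sb, "%.*G", precision, b);
        if (strcmp(sa, sb) == 0) {
            return 0;
        }
    }
    return diff < 0 ? -1 : 1;
}
```

With precision 14, compare_doubles(0.8, 0.7 + 0.1, 14) returns 0 after the string compare, whereas compare_doubles(0.1, 0.2, 14) returns -1 without ever formatting a string.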
> >>
> >>Even when the operands are not close and this test is the only
> >>overhead, this still means an extra floating point division, which
> >>accounts for nearly all the performance degradation.  compare_function
> >>is so fast already that this extra division seems to make comparisons
> >>of non-equal doubles about twice as slow.
> >>
> >>Fortunately this is pretty small in absolute terms and when compared to
> >>other simple comparisons. A double comparison using the fix e.g. 0.1 ==
> >>0.2, is still no slower than a single character non-numeric string
> >>comparison e.g. 'a' == 'b'.  They remain much faster than a numeric
> >>string compare e.g. '0.1' == '0.2'.
> >>
> >>When operands are close e.g. 1.2345678901234e123 == 1.2345678901233e123
> >>then the performance hit is much bigger as the two sprintfs and a string
> >>compare are required.  Even then it is still faster than the old
> >>workaround of casting to string before the compare i.e. (string) 0.1 ==
> >>(string) 0.2.
> >>
> >>Further Optimisations
> >>=====================
> >>
> >>The code can definitely be optimised further.
> >>
> >>The main slowdown on the ordinary, non-close, comparisons comes from the
> >>floating point divide which is needed to make sure that it is the
> >>"relative" difference not the absolute difference which is compared to
> >>the precision.  If the precision were converted to binary and adjusted
> >>by the value of the double's exponent it should be possible to avoid the
> >>floating point division.  But this is significantly more complicated, or
> >>rather I haven't worked out how to do it!
> >>
> >>The sprintf/string compare could also be improved.   For example,
> >>sprintf("%a") seems to be about twice as fast as sprintf("%G").
> >>Unfortunately %a can return un-normalized formats which would need
> >>tweaking to stop them failing the string comparison.
> >>
> >>There are also significant performance improvements possible in the
> >>floor, ceil, intval functions by not using compare_function but instead
> >>doing an in-line comparison.  Since the results of these functions are
> >>themselves integers their differences can be compared directly to
> >>1e(-precision) without the floating point divide, (provided of course
> >>they are less than 1e13 themselves).
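
As a sketch of what such an in-line check might look like for the (int) cast: to_long_inline is a hypothetical name, and the absolute tolerance is passed in by the caller because the exact threshold is an assumption here, not something taken from the patch.

```c
#include <assert.h>

/* Hypothetical in-line variant: no sprintf and no division. The gap to
 * the neighbouring integer is compared against an absolute tolerance,
 * which is only meaningful while |d| stays well below 1e13. */
static long to_long_inline(double d, double abs_tol)
{
    long truncated = (long) d;
    long neighbour = d >= 0 ? truncated + 1 : truncated - 1;
    double gap = (double) neighbour - d;

    assert(abs_tol > 0);
    if (gap < 0) {
        gap = -gap;
    }
    return gap < abs_tol ? neighbour : truncated;
}
```

For instance, to_long_inline(10 * (0.7 + 0.1), 1e-13) gives 8, since the value sits less than 1e-15 below 8, while to_long_inline(7.5, 1e-13) still gives 7.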
> >>
> >>
> >>Many thanks for taking the time to consider this. Any feedback will be
> >>much appreciated.
> >>
> >>
> >>George
> >>
> >>Changes to Zend/zend_operators.c:
> >>
> >>compare_function
> >>============
> >>
> >>OLD CODE:
> >>--------------
> >> if ((op1->type == IS_DOUBLE || op1->type == IS_LONG)
> >>  && (op2->type == IS_DOUBLE || op2->type == IS_LONG)) {
> >>   result->value.dval =
> >>     (op1->type == IS_LONG ? (double) op1->value.lval : op1->value.dval)
> >>     - (op2->type == IS_LONG ? (double) op2->value.lval : op2->value.dval);
> >>   result->value.lval = ZEND_NORMALIZE_BOOL(result->value.dval);
> >>   result->type = IS_LONG;
> >>   return SUCCESS;
> >> }
> >>
> >>NEW CODE:
> >>--------------
> >>...
> >>        double dval1, dval2;
> >>        char   *sval1, *sval2;
> >>        long   slen1, slen2;
> >>...
> >> if ((op1->type == IS_DOUBLE || op1->type == IS_LONG)
> >>  && (op2->type == IS_DOUBLE || op2->type == IS_LONG)) {
> >>
> >>   if (op1->type == IS_DOUBLE) {
> >>     dval1 = op1->value.dval;
> >>   } else {
> >>     dval1 = (double) op1->value.lval;
> >>   }
> >>   if (op2->type == IS_DOUBLE) {
> >>     dval2 = op2->value.dval;
> >>   } else {
> >>     dval2 = (double) op2->value.lval;
> >>   }
> >>   result->value.dval = dval1 - dval2;
> >>
> >>   /* string compare only for operands that are unequal but close */
> >>   if (result->value.dval != 0
> >>    && fabs((dval1 - dval2) / (dval1 == 0 ? 1 : dval1))
> >>       < pow(10.0, 0 - (EG(precision) - 1))) {
> >>     sval1 = (char *) emalloc(MAX_LENGTH_OF_DOUBLE + EG(precision) + 1);
> >>     slen1 = sprintf(sval1, "%.*G", (int) EG(precision), dval1); /* SAFE */
> >>     sval2 = (char *) emalloc(MAX_LENGTH_OF_DOUBLE + EG(precision) + 1);
> >>     slen2 = sprintf(sval2, "%.*G", (int) EG(precision), dval2); /* SAFE */
> >>     if (0 == zend_binary_strcmp(sval1, slen1, sval2, slen2)) {
> >>       result->value.dval = 0;
> >>     }
> >>     efree(sval1);
> >>     efree(sval2);
> >>   }
> >>   result->value.lval = ZEND_NORMALIZE_BOOL(result->value.dval);
> >>   result->type = IS_LONG;
> >>   return SUCCESS;
> >> }
> >>
> >>
> >>convert_to_long_base
> >>===============
> >>
> >>OLD CODE:
> >>--------------
> >>...
> >>  case IS_DOUBLE:
> >>   DVAL_TO_LVAL(op->value.dval, op->value.lval);
> >>   break;
> >>...
> >>
> >>NEW CODE:
> >>--------------
> >>...
> >> zval *op1, *op2, *result;
> >>...
> >>  case IS_DOUBLE:
> >>    MAKE_STD_ZVAL(op1);
> >>    MAKE_STD_ZVAL(op2);
> >>    MAKE_STD_ZVAL(result);
> >>    op1->type = IS_DOUBLE;
> >>    op2->type = IS_LONG;
> >>    op1->value.dval = op->value.dval;
> >>    DVAL_TO_LVAL(op->value.dval, op->value.lval);
> >>    if (op1->value.dval >= 0) {
> >>      op2->value.lval = op->value.lval + 1;
> >>    } else {
> >>      op2->value.lval = op->value.lval - 1;
> >>    }
> >>    compare_function(result, op1, op2 TSRMLS_CC);
> >>    if (result->value.lval == 0) {
> >>      op->value.lval = op2->value.lval;
> >>    }
> >>    zval_dtor(result);
> >>    zval_dtor(op1);
> >>    zval_dtor(op2);
> >>    break;
> >>
> >>The changes to floor and ceil in ext/standard/math.c are very similar to
> >>the convert_to_long_base change.
> >>
> >
> >




-- 
PHP Development Mailing List <http://www.php.net/>
To unsubscribe, visit: http://www.php.net/unsub.php