Great! Assuming all tests pass, I'm for applying it.

-Andrei

On Dec 12, 2006, at 6:58 AM, Matt Wilmas wrote:

Hi all,

I rewrote is_numeric_string/unicode to be faster and to change a couple of
things. The changes are:
1) Previously, large numbers (very long or "1e500") that became INF were
ignored (Bug #26349), which is not the behavior anywhere else.
2) Leading whitespace with hex numbers or ones that started with . (" .123")
also caused them to be ignored.
3) Hex strings were limited to LONG_MAX, and in scripts/parser, ULONG_MAX. I
added a zend_hex_strtod() function to handle numbers > LONG_MAX in both
places. From the previous comments like "strtod() messes up hex numbers,"
it seems there was a desire to support them. :-)
4) Small change, but the string "0x" was considered non-numeric before, and
is now a partial match of the 0 (basically to get a more accurate error
level/message with zend_parse_parameters(), for example).
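
To illustrate points 3 and 4, here is a minimal sketch of accumulating hex
digits into a double so the result is not clamped at LONG_MAX. It is not the
zend_hex_strtod() from the patch; the name hex_to_double_sketch and its
end-pointer convention are hypothetical, chosen only for this example.

```c
#include <stddef.h>

/* Hypothetical helper, NOT the patch's zend_hex_strtod().
 * Accumulates hex digits into a double, so values above LONG_MAX are
 * representable. Beyond 2^53 the result is only approximate, since each
 * step may round. */
static double hex_to_double_sketch(const char *s, const char **end)
{
    const char *p = s;
    double value = 0.0;
    int ndigits = 0;
    int has_prefix = (p[0] == '0' && (p[1] == 'x' || p[1] == 'X'));

    if (has_prefix) {
        p += 2; /* skip "0x" / "0X" */
    }

    for (;; p++) {
        int digit;

        if (*p >= '0' && *p <= '9') {
            digit = *p - '0';
        } else if (*p >= 'a' && *p <= 'f') {
            digit = *p - 'a' + 10;
        } else if (*p >= 'A' && *p <= 'F') {
            digit = *p - 'A' + 10;
        } else {
            break; /* first non-hex character ends the number */
        }
        value = value * 16.0 + digit;
        ndigits++;
    }

    if (end) {
        if (ndigits > 0) {
            *end = p;
        } else {
            /* Assumption for this sketch: a bare "0x" counts as a partial
             * match of the leading "0", as in point 4. */
            *end = has_prefix ? s + 1 : s;
        }
    }
    return value;
}
```

With this sketch, "0x100000000" yields 4294967296.0 even where long is 32
bits, and "0x" leaves the end pointer just past the "0".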

Now the performance... The errno stuff has been removed from is_numeric_*
(and optimized in the parser) to save function calls with thread-safe
libraries (are they used even when ZTS is disabled?).  In my tests on
Windows, I saw a 5-15% improvement with longs (less with more digits; on
64-bit systems, it could be slower at 12-15+ digits, but they're not
common). (With HEAD, everything I checked was consistent, but in 5.2, a few
random long tests were slower; must be some compiler weirdness? :-/)

So not much difference there for these changes, BUT doubles are over *twice*
as fast, and non-numeric string comparisons are up to nearly 3 times faster!
(Slightly less % improvement in Unicode mode.) Yeah, non-numeric strings are
detected very fast, which may be more significant since is_numeric_* is
always used on them (from compare_function(), zendi_smart_strcmp(), etc.).
Also, no number conversion is done if there's no corresponding pointer to
fill -- much faster when code is "just checking."
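
The two ideas in the last sentences (reject non-numeric input on the first
character, and skip the conversion when the caller passes no pointer) can be
sketched roughly as below. This is an integer-only illustration, not the code
from the patch; looks_like_long_sketch is a hypothetical name, and overflow
handling is deliberately left out.

```c
#include <stdlib.h>

/* Hypothetical, integer-only sketch; NOT the patch's is_numeric_string().
 * Returns 1 if s is optional whitespace, an optional sign, and then only
 * decimal digits; returns 0 otherwise. Overflow is not checked here. */
static int looks_like_long_sketch(const char *s, long *lval)
{
    const char *p = s;

    /* Skip leading whitespace. */
    while (*p == ' ' || *p == '\t' || *p == '\n' ||
           *p == '\r' || *p == '\v' || *p == '\f') {
        p++;
    }
    if (*p == '-' || *p == '+') {
        p++;
    }

    /* Fast path: most non-numeric strings are rejected right here. */
    if (*p < '0' || *p > '9') {
        return 0;
    }
    while (*p >= '0' && *p <= '9') {
        p++;
    }
    if (*p != '\0') {
        return 0; /* trailing garbage such as "12abc" */
    }

    /* Convert only when the caller actually wants the value. */
    if (lval) {
        *lval = strtol(s, NULL, 10);
    }
    return 1;
}
```

A caller that is "just checking" passes NULL for lval and never pays for
strtol().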

The larger inline function did increase the binary size by a few K... The
patches:

http://realplain.com/php/is_numeric.diff
http://realplain.com/php/is_numeric_5_2.diff

You can see that I changed MAX_LENGTH_OF_LONG to be accurate on 32-/64-bit,
which my changes rely on.  I also fixed a few places where memory
calculations that use it could be too small, in theory.
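
As a rough illustration of what "accurate on 32-/64-bit" can mean, the
sketch below derives the longest decimal form of a long from sizeof(long).
The SKETCH_* macro names and the formula are assumptions made for this
example; the definition actually used in the patch may differ.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical macros, not taken from the patch.
 * A long has (bits - 1) value bits; log10(2) is about 0.30103, so the
 * number of decimal digits is floor(value_bits * 0.30103) + 1. */
#define SKETCH_LONG_DIGITS \
    ((sizeof(long) * CHAR_BIT - 1) * 30103 / 100000 + 1)

/* One extra character for the minus sign. */
#define SKETCH_MAX_LENGTH_OF_LONG (SKETCH_LONG_DIGITS + 1)

/* Prints LONG_MIN (the longest value) and returns its string length. */
static size_t printed_length_of_long_min(void)
{
    char buf[SKETCH_MAX_LENGTH_OF_LONG + 1]; /* +1 for the trailing NUL */

    snprintf(buf, sizeof(buf), "%ld", LONG_MIN);
    return strlen(buf);
}
```

Sizing buffers as SKETCH_MAX_LENGTH_OF_LONG + 1 leaves room for the
terminating NUL, which is the kind of off-by-one the memory calculations
mentioned above need to account for.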

I wanted to get this in before Ilia's Thursday deadline (if it's still on
:-)), in case it can be applied soon. Finally, I don't know if you'd want to
use it as is, but I've attached possible NEWS file updates about this stuff.

Thoughts, questions?  Thanks.


Matt
<NEWS.diff.txt>--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
