Great! Assuming all tests pass, I'm for applying it.
-Andrei
On Dec 12, 2006, at 6:58 AM, Matt Wilmas wrote:
Hi all,
I rewrote is_numeric_string/unicode to be faster and to change a couple of
things.
The changes being:
1) Previously, large numbers (very long or "1e500") that became INF were
ignored (Bug #26349), which is not the behavior anywhere else.
2) Leading whitespace with hex numbers or ones that started with . (" .123")
also caused them to be ignored.
3) Hex strings were limited to LONG_MAX, and in scripts/parser, ULONG_MAX.
I added a zend_hex_strtod() function to handle numbers > LONG_MAX in both
places. From the previous comments like "strtod() messes up hex numbers,"
it seems there was desire to support them. :-)
4) Small change, but the string "0x" was considered non-numeric before, and
is now a partial match of the 0 (basically to get a more accurate error
level/message with zend_parse_parameters(), for example).
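
To illustrate points 3 and 4, here is a rough sketch of how hex digits can be accumulated into a double so that values above LONG_MAX/ULONG_MAX are not clipped. This is not the code from the patch: hex_to_double_sketch() and its signature are made up for the example, and the real zend_hex_strtod() may work differently.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (NOT the patch's zend_hex_strtod()): converts a hex
 * string to a double by accumulating one digit at a time, so the result is
 * not limited to the range of long. */
static double hex_to_double_sketch(const char *s, const char **endptr)
{
    double value = 0.0;

    assert(s != NULL);

    /* Skip "0x"/"0X" only when a hex digit follows. For the bare string
     * "0x" the leading 0 is parsed and the scan stops at 'x', which gives
     * a partial match as described in point 4. */
    if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X')
            && isxdigit((unsigned char)s[2])) {
        s += 2;
    }

    for (;; s++) {
        char c = *s;
        int digit;

        if (c >= '0' && c <= '9') {
            digit = c - '0';
        } else if (c >= 'a' && c <= 'f') {
            digit = c - 'a' + 10;
        } else if (c >= 'A' && c <= 'F') {
            digit = c - 'A' + 10;
        } else {
            break; /* first non-hex character ends the number */
        }
        value = value * 16.0 + digit;
    }

    if (endptr != NULL) {
        *endptr = s; /* points at the first unparsed character */
    }
    return value;
}
```

With this sketch, "0xFFFFFFFFFF" yields 1099511627775.0, which does not fit in a 32-bit long. Very long inputs lose precision once they pass 53 significant bits, as with any double.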
Now the performance... The errno stuff has been removed from is_numeric_*
(and optimized in the parser) to save function calls with thread-safe
libraries (are they used even when ZTS is disabled?). In my tests on
Windows, I saw a 5-15% improvement with longs (less with more digits; on
64-bit systems, it could be slower at 12-15+ digits, but they're not
common). (With HEAD, everything I checked was consistent, but in 5.2, a few
random long tests were slower; must be some compiler weirdness? :-/) So not
much difference there for these changes, BUT doubles are over *twice* as
fast, and non-numeric string comparisons are up to nearly 3 times faster!
(Slightly less % improvement in Unicode mode.) Yeah, non-numeric strings
are detected very fast, which may be more significant since is_numeric_* is
always used on them (from compare_function(), zendi_smart_strcmp(), etc.).
Also, no number conversion is done if there's no corresponding pointer to
fill -- much faster when code is "just checking."
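
The "just checking" idea can be sketched roughly as below: scan the string first, reject non-numeric input early, and only call a conversion function when the caller passed a pointer to fill. The names (classify_numeric_sketch, enum num_kind) are invented for this example and are not taken from the patch. The sketch also leaves out exponents, hex, partial matches, and overflow handling.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical result tags for this sketch only. */
enum num_kind { NUM_NONE = 0, NUM_LONG, NUM_DOUBLE };

/* Classifies a NUL-terminated string as long, double, or non-numeric.
 * lval and dval are optional: pass NULL to skip the conversion and only
 * get the classification. */
static enum num_kind classify_numeric_sketch(const char *s, long *lval,
                                             double *dval)
{
    const char *p;
    int digits = 0;
    int is_double = 0;

    assert(s != NULL);

    /* Leading whitespace is allowed. */
    while (isspace((unsigned char)*s)) {
        s++;
    }

    /* Scan only: no conversion happens in this phase. */
    p = s;
    if (*p == '-' || *p == '+') {
        p++;
    }
    while (isdigit((unsigned char)*p)) {
        p++;
        digits++;
    }
    if (*p == '.') {
        is_double = 1;
        p++;
        while (isdigit((unsigned char)*p)) {
            p++;
            digits++;
        }
    }
    if (digits == 0 || *p != '\0') {
        return NUM_NONE; /* non-numeric strings exit without converting */
    }

    /* Convert only when the caller asked for the value. */
    if (is_double) {
        if (dval != NULL) {
            *dval = strtod(s, NULL);
        }
        return NUM_DOUBLE;
    }
    if (lval != NULL) {
        *lval = strtol(s, NULL, 10); /* overflow is not handled here */
    }
    return NUM_LONG;
}
```

A caller that only needs to know whether a string looks numeric would pass NULL for both pointers, so neither strtol() nor strtod() runs.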
The larger inline function did increase the binary size by a few K...
The patches:
http://realplain.com/php/is_numeric.diff
http://realplain.com/php/is_numeric_5_2.diff
You can see that I changed MAX_LENGTH_OF_LONG to be accurate on 32-/64-bit,
which my changes rely on. I also fixed a few places where memory
calculations that use it could be too small, in theory.
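
As a sanity check on why the length differs between 32- and 64-bit, the sketch below measures how many characters the longest long value (LONG_MIN, sign included) needs in decimal. MAX_LONG_CHARS_SKETCH and max_long_chars_sketch() are names invented for this example, and the macro assumes long is either 4 or 8 bytes. The actual MAX_LENGTH_OF_LONG values are in the diffs.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical constant for this sketch: characters needed to print
 * LONG_MIN in decimal, sign included, terminating NUL NOT included.
 * Assumes long is 4 or 8 bytes. */
#define MAX_LONG_CHARS_SKETCH (sizeof(long) == 8 ? 20 : 11)

/* Measures the same quantity at run time. With a NULL buffer and size 0,
 * C99 snprintf() returns the length the output would have had. */
static int max_long_chars_sketch(void)
{
    int n = snprintf(NULL, 0, "%ld", LONG_MIN);

    assert(n > 0);
    return n;
}
```

A buffer meant to hold any long as a decimal string then needs that many bytes plus one for the terminating NUL, which is the kind of calculation that comes out too small if the constant is off.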
I wanted to get this in before Ilia's Thursday deadline (if it's still on
:-)), in case it can be applied soon. Finally, don't know if you'd want to
use it as is, but I've attached possible NEWS file updates about this
stuff.
Thoughts, questions? Thanks.
Matt
<NEWS.diff.txt>--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php