cmbecke...@gmx.de ("Christoph M. Becker") wrote:

> [...] I tend to prefer the non-locale aware behavior, i.e. float to
> string conversion should always produce a decimal *point*.  Users still
> can explicitly use number_format() or NumberFormatter if they wish.

We all agree that the basic features of the language should NOT be
locale-aware: this makes error reporting and logging, data file writing
and parsing, session management, and library portability easier. But I
would like to restate this goal more clearly:

FLOAT TO STRING CAST CONVERSION REPLACEMENT

Given a floating-point value, retrieve its canonical PHP source-code
string representation. By "canonical" I mean something that the PHP
interpreter parses as a floating-point number, not as an int or
anything else. For example, 123.0 must be rendered as "123.0", not as
"123", because the latter looks like an int; the non-finite values
NAN and INF must be rendered as "NAN" and "INF". The "(string)"
cast and a variable embedded in a string ("$f") are locale-aware, and so
are all the printf() & Co. functions, including var_dump() (the latter is
a big surprise; is anyone willing to send a data structure dump to the
end user?).

The simplest way I found to get such a canonical representation is

        $s = var_export($f, TRUE);

which returns exactly what I expect, does not depend on the current
locale, does not depend on exotic libraries, and it is very short and
simple.  It depends only on the current serialize_precision php.ini
parameter, which should already be set right (or you are going to have
problems elsewhere).
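
As a minimal sketch of the idea above, the var_export() call can be wrapped
in a small helper. The name floatToString() and the is_float() guard are
illustrative assumptions, not part of any existing API, and the locale names
passed to setlocale() may not be installed on a given system (in which case
setlocale() simply returns FALSE and the locale stays unchanged).

```php
<?php
/**
 * Hypothetical helper: returns the PHP source-code representation of a
 * float, relying only on var_export() as described above.
 */
function floatToString($f)
{
        if( ! is_float($f) )
                throw new InvalidArgumentException("not a float");
        return var_export($f, TRUE);
}

// Try to switch to a locale that uses a decimal comma; this is only an
// attempt, since the locale may be missing on this machine.
setlocale(LC_NUMERIC, "de_DE.UTF-8", "de_DE", "German");

echo floatToString(123.0), "\n";   // 123.0
echo floatToString(-INF), "\n";    // -INF
echo floatToString(NAN), "\n";     // NAN
```

Values with a fractional part are rendered according to the
serialize_precision setting, so their exact digits may vary between
configurations.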

STRING TO FLOAT CAST CONVERSION REPLACEMENT

Given a string carrying the canonical representation of a floating-point
number, retrieve the floating-point number. Syntax errors must be
detectable. The result must be "float", not int or anything else.
I am unsure how strict the parser should be in these edge cases:

"+1.2" (redundant plus sign)
"123" (looks like int, not a float)
"0123" (looks like int octal base)

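For reference, the following sketch probes how the built-in "(float)" cast
treats the edge cases listed above, together with a few strings that are
not plain decimal numbers. The sample list is an illustrative assumption;
only the cast itself is exercised.

```php
<?php
// Probe the built-in (float) cast on edge cases and on malformed input.
$samples = array("+1.2", "123", "0123", "NAN", "INF", "abc");
$casts = array();
foreach( $samples as $s ){
        $casts[$s] = (float) $s;
        echo str_pad("'$s'", 8), " => ", var_export($casts[$s], TRUE), "\n";
}
// "+1.2" is accepted, "123" becomes 123.0 and "0123" is read as decimal
// 123.0 (not octal). "NAN", "INF" and "abc" all silently become 0.0, so
// the caller cannot tell a genuine zero from a syntax error.
```
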
Getting all this is a bit more tricky.  The "(float)" cast does not work
because it does not support the non-finite values NAN and INF and does not
allow errors to be detected.  The simplest way I found is to use the
unserialize() function:

/**
 * Parses the PHP canonical representation of a floating point number. This
 * function parses any valid PHP source code representation of a "float",
 * including NAN, INF, -INF and -0 (IEEE 754 negative zero). Not locale aware.
 * @param string $s String to parse. No spaces allowed, apply trim() if needed.
 * @return float Parsed floating-point number.
 * @throws InvalidArgumentException Invalid syntax.
 */
function parseFloat($s)
{
        // Security: untrusted strings must be checked against a basic syntax
        // before being blindly submitted to unserialize():
        if( preg_match("/^[-+]?(NAN|INF|[-+.0-9eE]++)\$/sD", $s) !== 1 )
                throw new InvalidArgumentException(
                        "cannot parse as a floating point number: '$s'");
        // unserialize() raises an E_NOTICE on parse error and then returns
        // FALSE.
        $m = @unserialize("d:$s;");
        if( is_int($m) )
                return (float) $m; // always return what we promised
        if( is_float($m) )
                return $m;
        throw new InvalidArgumentException(
                "cannot parse as a floating point number: '$s'");
}

Here again, only core libraries are involved and there is no dependency on
the locale; it is not so short, but it is the best I have found up to now.
Things like NumberFormatter require the 'intl' extension to be enabled, and
often it isn't.

By using these functions, all the possible "float" values survive the
round trip back and forth, including NAN, INF, -INF and -0 (negative zero,
for what it's worth), at the highest accuracy possible with the IEEE 754
representation.
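
The round-trip claim can be checked with a short sketch along these lines.
The helper names sameFloat() and roundTrips() are hypothetical, the sample
values are arbitrary, and the script assumes PHP 7.1 or later so that
serialize_precision can be set to -1. Since the strings come straight from
var_export(), they are fed to unserialize() directly here; untrusted input
should still go through the syntax check first.

```php
<?php
// Assumption: PHP >= 7.1, where -1 selects the shortest representation
// that still round-trips exactly.
ini_set("serialize_precision", "-1");

/** Bit-level comparison, so that 0.0 and -0.0 are told apart. */
function sameFloat($a, $b)
{
        if( is_nan($a) || is_nan($b) )
                return is_nan($a) && is_nan($b); // NAN never equals itself
        return pack("d", $a) === pack("d", $b);
}

/** Converts $f to its source-code string and back, then compares. */
function roundTrips($f)
{
        $s = var_export($f, TRUE);
        $g = @unserialize("d:$s;");
        return is_float($g) && sameFloat($f, $g);
}

$values = array(0.1, 1/3, 123.0, -0.0, 1.0e300, INF, -INF, NAN);
foreach( $values as $f )
        echo var_export($f, TRUE), " => ",
                roundTrips($f) ? "ok" : "MISMATCH", "\n";
```

Comparing the packed bytes rather than using "===" is a deliberate choice:
with the ordinary comparison operators 0.0 and -0.0 are equal, so a lost
sign on zero would go unnoticed.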


Regards,
 ___
/_|_\  Umberto Salsi
\/_\/  www.icosaedro.it


--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
