Re: [PHP-DEV] Inconsistent float to string vs. string to float casting

2019-01-07 Thread BohwaZ
AFAIK, gettext functions do depend on setlocale.

I wish so much that it wasn't the case (as you then need to have the
locale installed on the system), but it is, so setlocale definitely is
quite used in the wild and deprecating it seems a bit far-fetched
unless we can actually replace it with something else (better).

But gettext has other issues related to being cached in the current
process, as you need to restart apache if the compiled .mo files have
changed to get the new strings :(

Another function that is influenced by setlocale is strftime. This is
often the common way to display a date in a different language.

So I'm all for deprecating setlocale but before that we would need to
have something better for everything that's currently depending on it :)

BohwaZ

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] Inconsistent float to string vs. string to float casting

2019-01-04 Thread Christoph M. Becker
On 03.01.2019 at 02:19, Stanislav Malyshev wrote:

> If this is part of a data pipeline, the difference between 1,500 and
> 1.500 can be huge (about 1000 times ;). With luck, there would be unit
> tests, so instead of broken bank account we'd have broken unit tests,
> but we all know how unit test coverage tends to lag behind...
> Number formatting difference may be a funny quirk in an average website
> context, but could be absolutely disastrous in scientific or financial
> application context.

Using floats for currency calculations may have more subtle issues.  And
for scientific applications, one may not want to have

   echo 123456789012345678.9; // 1,2345678901235E+17

-- 
Christoph M. Becker




Re: [PHP-DEV] Inconsistent float to string vs. string to float casting

2019-01-04 Thread Christoph M. Becker
On 02.01.2019 at 16:51, Zeev Suraski wrote:

> If we do end up adding a new INI entry - maybe it can be a tristate -
> legacy, legacy+notice, or new.  Just a thought.  And I wouldn’t commit to
> actually removing it at any time by officially deprecating it...

I have some doubts that an INI setting would be an appropriate solution.
 If it's PHP_INI_SYSTEM (or such), libraries may have a hard time
dealing with this.  If it's PHP_INI_ALL, the same issues as now could
still happen.

-- 
Christoph M. Becker





Re: [PHP-DEV] Inconsistent float to string vs. string to float casting

2019-01-03 Thread Michael Kliewe

Hi,

On 03.01.2019 at 10:21, Zeev Suraski wrote:

> On Wed, Jan 2, 2019 at 8:57 PM Nikita Popov  wrote:
>
>> It's usually the other way around. The current behavior is prone to
>> breaking integration code, because data interchange layers generally do not
>> expect floats to use comma separators. The reason why things don't break
>> quite as terribly as they could is that PHP has introduced a number of
>> workarounds over time, as these issues have been reported. That's why you
>> usually don't run into this when inserting float values into a DB query, at
>> least when using prepared statements. This issue is not handled everywhere
>> though (one recent example I remember is passing floats to bcmath) and I
>> don't think that introducing more of these special cases is how we should
>> be approaching this problem.
>
> Again, I'm not disputing that the current behavior isn't desired.  I am
> disputing that it's a "no big deal" to change it 20+ years after the fact,
> and I am disputing that while many may not be fond of this behavior - they
> can still have code that has grown to rely on it over the years.
>
> I do think that if we do decide to change it, it should be while providing
> users a long-term (and probably permanent) language level way to keep the
> current behavior.  Yes, it's against our motto - but then, so is such a
> widescale compatibility breakage without an easy forward path that does not
> involve a full line by line code audit.


Would it be possible to write a patch where every 
float-to-string-conversion that changes the representation because of 
the locale-setting, would produce a DEPRECATION or NOTICE or similar? 
The conversion output stays as it is today. Then people could run that 
patch against their own private projects. Or run against most popular 
100 PHP Github projects. Run against 1000 random PHP projects from 
Github. Or create a specific PHP version that could run on Travis-CI or 
so...
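As a rough userland approximation of that idea (the real thing would
have to live in the engine), a sketch could look like the following. It
assumes that sprintf()'s %f is locale-aware while %F is not; both
function names are made up for illustration and are not part of PHP or
of any patch mentioned in this thread.

```php
<?php
// Hypothetical helpers, not part of PHP or of any patch in this thread.

// True when the active LC_NUMERIC locale changes how $value is printed.
// sprintf()'s %f is locale-aware, %F is not.
function locale_changes_float(float $value): bool
{
    return sprintf('%f', $value) !== sprintf('%F', $value);
}

// Converts exactly as PHP does today, but flags conversions that the
// locale affects.
function float_to_string_with_notice(float $value): string
{
    if (locale_changes_float($value)) {
        trigger_error(
            'float to string conversion depends on LC_NUMERIC',
            E_USER_DEPRECATED
        );
    }
    return (string) $value; // the output itself is left unchanged
}

setlocale(LC_NUMERIC, 'C');
echo float_to_string_with_notice(1.5), "\n"; // no notice under the C locale
```

Routing conversions through such a helper only covers code you control,
which is why measuring real projects would need an engine-level patch.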


Currently there is a lot of guessing about the potentially affected 
libraries and projects, maybe it's better to measure. Then we get 
numbers of how many affected projects there might be. Maybe it's a huge 
problem, maybe only 0.001% of the projects are affected, and if they are 
affected, most likely it's considered a bug, not "intended behaviour". 
Nobody should rely on the current behaviour, best practice is to use 
number_format() if needed I think.
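For illustration, a minimal sketch of what that looks like. The
separators are passed explicitly, so the result does not depend on
setlocale(); the variable names are only examples.

```php
<?php
// number_format() takes its separators as arguments, so the result does
// not depend on setlocale().
$value = 1234.5;

$machine = number_format($value, 2, '.', '');   // "1234.50" for data exchange
$german  = number_format($value, 2, ',', '.');  // "1.234,50" for display

echo $machine, "\n", $german, "\n";
```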


If such a DEPRECATION/NOTICE would be emitted in 7.4, 8.0, and 8.x, then 
people have many years to fix it before the change finally is done in 
9.0 or 10.0. We don't have to hurry, this might be a feature with a long 
period of DEPRECATION/NOTICE, potential bugs can be fixed during the 
years, before the final change happens. If there are really huge 
problems that we see during those years (but didn't see while 
measuring), thousands of developers complaining, in the worst case the 
change could be reverted.


We should measure first, and then hopefully fix strange behaviour of PHP 
in the long run.


Michael




Re: [PHP-DEV] Inconsistent float to string vs. string to float casting

2019-01-03 Thread Zeev Suraski
On Thu, Jan 3, 2019 at 3:30 PM Rowan Collins 
wrote:

> On Thu, 3 Jan 2019 at 09:21, Zeev Suraski  wrote:
>
> >
> > I agree, but the real question is how many of those who are explicitly
> > calling setlocale() are relying on this behavior
> >
>
>
> It occurs to me that we could ask the opposite question: what *other*
> reasons do people have for explicitly calling setlocale()? If we can
> provide alternatives to the majority of use cases, we can put a big fat
> warning on the setlocale() manual page suggesting that people completely
> avoid it - more prominent than the current one on thread-safety, mentioning
> some of the other undesirable side effects, and explicitly recommending
> alternatives.
>
> Obviously, that advice won't be followed overnight, but it would at least
> give ammunition for people to raise PRs against libraries saying "hey,
> please use this instead of setlocale() because you broke my float
> conversions".
>

Very interesting direction, I definitely think it's worthwhile to try and
answer the question you raise and depending on what we find - implement
your proposal on deprecating or semi-deprecating setlocale().

Zeev


Re: [PHP-DEV] Inconsistent float to string vs. string to float casting

2019-01-03 Thread Rowan Collins
On Thu, 3 Jan 2019 at 09:21, Zeev Suraski  wrote:

>
> I agree, but the real question is how many of those who are explicitly
> calling setlocale() are relying on this behavior
>


It occurs to me that we could ask the opposite question: what *other*
reasons do people have for explicitly calling setlocale()? If we can
provide alternatives to the majority of use cases, we can put a big fat
warning on the setlocale() manual page suggesting that people completely
avoid it - more prominent than the current one on thread-safety, mentioning
some of the other undesirable side effects, and explicitly recommending
alternatives.

Obviously, that advice won't be followed overnight, but it would at least
give ammunition for people to raise PRs against libraries saying "hey,
please use this instead of setlocale() because you broke my float
conversions".

Regards,
-- 
Rowan Collins
[IMSoP]


Re: [PHP-DEV] Inconsistent float to string vs. string to float casting

2019-01-03 Thread Zeev Suraski
On Wed, Jan 2, 2019 at 8:57 PM Nikita Popov  wrote:

> What I mean is that there are not many people who use float to string
> conversion with the express intention of receiving a locale-dependent
> result (and use a locale where the question is relevant). Those are the
> only people who would be (negatively) affected by such a change.
>

While you may very well be correct that some (maybe even most, not sure)
people don't have the express intention of receiving this behavior -
nonetheless, this is the behavior they've been seeing in the last 20
years.  Many, arguably most developers code based on the behavior they see
in practice.  Whether or not they thought this behavior is sensible, once
they saw this is the behavior in practice - it's likely that they relied on
it.  Of course, some may have been put off by what they saw and decided to
use something else (e.g. avoiding setlocale() altogether) - but I doubt
this is anywhere close to 100% of the developers.



>> 2. Perhaps you meant they weren't proactively relying on this behavior,
>> which could be true - but it doesn't matter whether people were expecting
>> or otherwise desiring this behavior when they wrote the code.  Whatever the
>> current behavior is - they adjusted for it, and ended up using it,
>> consequently relying on it.
>>
>
> As said, I'm sure there are people relying on this. What I'm saying is
> that the number of people who rely on float conversions to *not* be
> locale-sensitive is vastly, orders of magnitude larger than the number of
> people who *do* rely on it being locale sensitive.
>
> The only saving grace is that this issue only turns up relatively rarely,
> because it requires you to explicitly call setlocale, as the locale is not
> inherited from the environment. Or more likely, you're not going to call
> setlocale, but discover this wonderful behavior because something else does
> for entirely unrelated reasons.
>

I agree, but the real question is how many of those who are explicitly
calling setlocale() are relying on this behavior - as the change we're
proposing affects only them anyway.  So while the fact those who are using
setlocale() are likely a small minority is a given, the real question is
within this subgroup - what's the breakdown of people relying on it.  I'd
argue that within that group, those relying on it are likely a majority,
even if when they first bumped into this behavior they thought to
themselves "Huh, that's funny, I didn't expect that.".  Ultimately, their
code now relies on it.

There shouldn't be any developers who are using setlocale() and are relying
on a behavior that never existed in PHP (which doesn't mean they don't
exist - but I can't imagine they're a sizable subgroup let alone a
majority).

>> 3. I view a UX change as a big deal.  As we should in a language that is
>> very commonly used to create UI.
>> 4. This could affect not only UX, but also integration code.  You could
>> have PHP output feeding into something else - and suddenly, the format
>> breaks.  With the fix HAVING to be in the other side, no less.
>>
>
> The fix doesn't have to be on the other side. Most likely you'd prefer to
> fix it on your side by explicitly formatting the float in the desired
> manner.
>

I agree here, it doesn't have to be on the "other side" like I claimed.  It
may still be easier in many cases, as at least fixing it on the PHP side
would be quite difficult (again, involve a line by line code audit).


> It's usually the other way around. The current behavior is prone to
> breaking integration code, because data interchange layers generally do not
> expect floats to use comma separators. The reason why things don't break
> quite as terribly as they could is that PHP has introduced a number of
> workarounds over time, as these issues have been reported. That's why you
> usually don't run into this when inserting float values into a DB query, at
> least when using prepared statements. This issue is not handled everywhere
> though (one recent example I remember is passing floats to bcmath) and I
> don't think that introducing more of these special cases is how we should
> be approaching this problem.
>

Again, I'm not disputing that the current behavior isn't desired.  I am
disputing that it's a "no big deal" to change it 20+ years after the fact,
and I am disputing that while many may not be fond of this behavior - they
can still have code that has grown to rely on it over the years.

I do think that if we do decide to change it, it should be while providing
users a long-term (and probably permanent) language level way to keep the
current behavior.  Yes, it's against our motto - but then, so is such a
widescale compatibility breakage without an easy forward path that does not
involve a full line by line code audit.

Zeev


Re: [PHP-DEV] Inconsistent float to string vs. string to float casting

2019-01-02 Thread Stanislav Malyshev
Hi!

> 2. Even if somebody is using this functionality, the only thing that's
> going to happen is that their number display switches from 1,5 to 1.5.
> That's a minor UX regression, not a broken application. It's something
> that will have to be fixed, but it's also not critical, and for a legacy
> application one might even not bother.

If this is part of a data pipeline, the difference between 1,500 and
1.500 can be huge (about 1000 times ;). With luck, there would be unit
tests, so instead of broken bank account we'd have broken unit tests,
but we all know how unit test coverage tends to lag behind...
Number formatting difference may be a funny quirk in an average website
context, but could be absolutely disastrous in scientific or financial
application context.
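
To make the factor concrete, here is a small sketch. It assumes a
hypothetical consumer that follows the en_US convention ("," groups
thousands, "." is the decimal point); parse_en_us() is an illustrative
name, not an existing function.

```php
<?php
// Hypothetical consumer that follows the en_US convention:
// "," groups thousands, "." is the decimal point.
function parse_en_us(string $text): float
{
    return (float) str_replace(',', '', $text);
}

$fromComma = parse_en_us('1,500'); // read as fifteen hundred
$fromDot   = parse_en_us('1.500'); // read as one and a half

echo $fromComma / $fromDot, "\n"; // 1000
```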

> I think we should just put this to an RFC vote. We regularly have these
> types of discussions, and people just disagree about level of
> anticipated BC break relative to benefit of the change.

I do not object to the RFC vote. What we're doing now is something that
comes before the vote - laying out arguments for and against it. I think
that'd be prerequisite to having an informed vote. I don't think this
change would absolutely ruin PHP if voted in, but I think I'd vote
against it, given the arguments laid out so far.
-- 
Stas Malyshev
smalys...@gmail.com




Re: [PHP-DEV] Inconsistent float to string vs. string to float casting

2019-01-02 Thread Lauri Kenttä

On 2019-01-02 19:22, Zeev Suraski wrote:

> 1. I'm not sure what you mean "not many people use this"?  People don't
> convert floats to strings?


People don't format their floats using setlocale + echo. People use 
things like sprintf and number_format to get the right number of 
decimals, and many use number_format or even str_replace to change the 
decimal separator because setlocale has weird side effects (like the one 
being discussed).
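
A sketch of that style, assuming the documented difference between
sprintf()'s %f (locale-aware) and %F (not locale-aware):

```php
<?php
$value = 1.5;

// %F is the locale-independent variant of %f.
$fixed = sprintf('%.2F', $value);          // "1.50"

// Swap the separator explicitly instead of relying on setlocale().
$comma = str_replace('.', ',', $fixed);    // "1,50"

echo $fixed, "\n", $comma, "\n";
```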


On 2019-01-02 19:22, Zeev Suraski wrote:

> 3. I view a UX change as a big deal.  As we should in a language that is
> very commonly used to create UI.


Then what about existing UI bugs? Thanks to this discussion, I found 
exactly one instance of setlocale in my whole PHP code base (used to 
format a printf nicely), and I also found a bug where "stringparam=$x" 
was ill-formatted because of this and produced a visible error in a 
generated image (although not critical and thus unnoticed until now). I 
was certainly not relying on this behaviour. It was just bad luck that 
the output (at least when I saw it – don't know about other users!) was 
”good enough” that I didn't notice the bug.


On 2019-01-02 19:22, Zeev Suraski wrote:

> 4. This could affect not only UX, but also integration code.  You could
> have PHP output feeding into something else - and suddenly, the format
> breaks.  With the fix HAVING to be in the other side, no less.


It's reasonable to expect that a float (with known range) is a valid 
number in most programming contexts such as CSS (width: px) 
or HTML (input type=number value=) or JavaScript (var width 
= ). Using number_format to fix these would feel almost as 
bad as using number_format before every arithmetic operation.


Because of this behaviour, using setlocale will break many libraries 
which output floating-point values in any other context than 
user-visible text.


On 2019-01-02 19:22, Zeev Suraski wrote:

> With the fix HAVING to be in the other side, no less.


How so? If you send floats, you can format them yourself (and you 
certainly should, if they are locale-dependent!). If you receive floats, 
you can parse them yourself. No need to change ”the other side”.
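
On the receiving side, a minimal sketch might look like this.
parse_decimal() is a hypothetical helper, and the default separators
are just an example of German-style input; the caller states the
format instead of relying on the process locale.

```php
<?php
// Hypothetical receiver-side helper: the caller states which separators
// the incoming text uses instead of relying on the process locale.
function parse_decimal(
    string $text,
    string $decimal = ',',
    string $thousands = '.'
): float {
    $plain = str_replace($thousands, '', $text);  // drop grouping
    $plain = str_replace($decimal, '.', $plain);  // normalise the decimal point
    if (!is_numeric($plain)) {
        throw new InvalidArgumentException("Not a number: $text");
    }
    return (float) $plain;
}

$german  = parse_decimal('1.234,5');            // German-style input
$english = parse_decimal('1,234.5', '.', ',');  // English-style input
```

Both calls yield the same float, since the string to float cast itself
always expects a dot.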


--
Lauri Kenttä




Re: [PHP-DEV] Inconsistent float to string vs. string to float casting

2019-01-02 Thread Nikita Popov
On Wed, Jan 2, 2019 at 6:22 PM Zeev Suraski  wrote:

>
>
> On Wed, Jan 2, 2019 at 6:11 PM Nikita Popov  wrote:
>
>> On Wed, Jan 2, 2019 at 12:33 PM Zeev Suraski  wrote:
>> I don't expect this to be a particularly large issue for two reasons:
>>
>> 1. Not many people use this. I'm sure that there *are* people who use
>> this and use it intentionally, but I've only ever seen reference to this
>> issue as a bug or a gotcha.
>> 2. Even if somebody is using this functionality, the only thing that's
>> going to happen is that their number display switches from 1,5 to 1.5.
>> That's a minor UX regression, not a broken application. It's something that
>> will have to be fixed, but it's also not critical, and for a legacy
>> application one might even not bother.
>>
>
> FWIW, neither of these are very convincing for me:
> 1. I'm not sure what you mean "not many people use this"?  People don't
> convert floats to strings?
>

No, that's not what I meant. Of course, many people convert floats to
strings. But the vast majority of them expect to get back a floating point
number in the standard format.

What I mean is that there are not many people who use float to string
conversion with the express intention of receiving a locale-dependent
result (and use a locale where the question is relevant). Those are the
only people who would be (negatively) affected by such a change.


> 2. Perhaps you meant they weren't proactively relying on this behavior,
> which could be true - but it doesn't matter whether people were expecting
> or otherwise desiring this behavior when they wrote the code.  Whatever the
> current behavior is - they adjusted for it, and ended up using it,
> consequently relying on it.
>

As said, I'm sure there are people relying on this. What I'm saying is that
the number of people who rely on float conversions to *not* be
locale-sensitive is vastly, orders of magnitude larger than the number of
people who *do* rely on it being locale sensitive.

The only saving grace is that this issue only turns up relatively rarely,
because it requires you to explicitly call setlocale, as the locale is not
inherited from the environment. Or more likely, you're not going to call
setlocale, but discover this wonderful behavior because something else does
for entirely unrelated reasons.


> 3. I view a UX change as a big deal.  As we should in a language that is
> very commonly used to create UI.
> 4. This could affect not only UX, but also integration code.  You could
> have PHP output feeding into something else - and suddenly, the format
> breaks.  With the fix HAVING to be in the other side, no less.
>

The fix doesn't have to be on the other side. Most likely you'd prefer to
fix it on your side by explicitly formatting the float in the desired
manner.

It's usually the other way around. The current behavior is prone to
breaking integration code, because data interchange layers generally do not
expect floats to use comma separators. The reason why things don't break
quite as terribly as they could is that PHP has introduced a number of
workarounds over time, as these issues have been reported. That's why you
usually don't run into this when inserting float values into a DB query, at
least when using prepared statements. This issue is not handled everywhere
though (one recent example I remember is passing floats to bcmath) and I
don't think that introducing more of these special cases is how we should
be approaching this problem.
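
One way to keep the separator out of the picture on the PHP side is to
go through an encoder that does not consult the locale. A sketch,
assuming the data leaves PHP as JSON; whether a German locale is
installed depends on the system, so the setlocale() call may simply
fail and change nothing.

```php
<?php
// Sketch, assuming the data leaves PHP as JSON. json_encode() emits JSON
// number syntax, so the decimal separator is "." whatever LC_NUMERIC says.
// setlocale() returns false if none of the candidate locales is installed.
$previous = setlocale(LC_NUMERIC, '0');   // "0" only queries the setting
setlocale(LC_NUMERIC, 'de_DE.UTF-8', 'de_DE', 'German');

$payload = json_encode(['price' => 1.5]);

setlocale(LC_NUMERIC, $previous);         // restore the earlier setting
echo $payload, "\n"; // {"price":1.5}
```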

Nikita


Re: [PHP-DEV] Inconsistent float to string vs. string to float casting

2019-01-02 Thread Zeev Suraski
On Wed, Jan 2, 2019 at 6:11 PM Nikita Popov  wrote:

> On Wed, Jan 2, 2019 at 12:33 PM Zeev Suraski  wrote:
> I don't expect this to be a particularly large issue for two reasons:
>
> 1. Not many people use this. I'm sure that there *are* people who use this
> and use it intentionally, but I've only ever seen reference to this issue
> as a bug or a gotcha.
> 2. Even if somebody is using this functionality, the only thing that's
> going to happen is that their number display switches from 1,5 to 1.5.
> That's a minor UX regression, not a broken application. It's something that
> will have to be fixed, but it's also not critical, and for a legacy
> application one might even not bother.
>

FWIW, neither of these are very convincing for me:
1. I'm not sure what you mean "not many people use this"?  People don't
convert floats to strings?
2. Perhaps you meant they weren't proactively relying on this behavior,
which could be true - but it doesn't matter whether people were expecting
or otherwise desiring this behavior when they wrote the code.  Whatever the
current behavior is - they adjusted for it, and ended up using it,
consequently relying on it.
3. I view a UX change as a big deal.  As we should in a language that is
very commonly used to create UI.
4. This could affect not only UX, but also integration code.  You could
have PHP output feeding into something else - and suddenly, the format
breaks.  With the fix HAVING to be in the other side, no less.

I fail to understand how we could consider changing such a fundamental
element (to-string behavior of floats) without an in-depth discussion.  We
mustn't.

> I think we should just put this to an RFC vote. We regularly have these
> types of discussions, and people just disagree about level of anticipated
> BC break relative to benefit of the change.
>

The point of an RFC is, in fact, to have these discussions.  This is what
we're doing right now.  This is what the RFC process is all about - not the
vote.  It sounds to me as if you're saying "what's the point of discussing,
let's just vote" (waiting out the two weeks as needed), with which I would
wholeheartedly disagree.  Apologies if you meant something else.

Zeev


Re: [PHP-DEV] Inconsistent float to string vs. string to float casting

2019-01-02 Thread Nikita Popov
On Wed, Jan 2, 2019 at 12:33 PM Zeev Suraski  wrote:

>
>
> On Wed, Jan 2, 2019 at 11:26 AM Nikita Popov  wrote:
>
>> On Wed, Jan 2, 2019 at 12:30 AM Stanislav Malyshev 
>> wrote:
>>
>> We have a rather hard policy against ini options that influence language
>> behavior. Locale-dependent language behavior is essentially the same issue,
>> just worse due to the mentioned issues, in particular lack of
>> thread-safety and the possibility that the locale is changed by third-party
>> libraries at runtime.
>>
>> We have removed existing ini flags controlling language behavior in the
>> past. I would say these removals were much more significant than what is
>> proposed here, but we did them anyway, and I think we are now in a better
>> place for it.
>>
>
> Unless I'm missing something, changing this behavior would require a full,
> line-by-line audit of the code - with no Search & Replace patterns that can
> find these instances in any reasonable level of reliability.  Every place
> where a floating number (which could come from anywhere, so not very easy
> to track) is used in a string context (which too can happen in countless
> different contexts, virtually impossible to track) would be affected.
> Sounds pretty nightmarish to me.  I for one fail to recall a behavioral
> change that was quite as significant as this one in terms of the complexity
> of finding instances that must be updated.  Like Stas, I'm not disputing
> that this is not an ideal behavior or that we'd do it differently if we
> were starting from scratch - but I also agree with him that it's pretty
> much out of the question to simply change it at this point.
>
> Can you point out a change you believe is as or more significant than this
> one that we did?  I think the only one that comes close is
> magic_quotes_runtime, and even that was significantly easier to handle in
> terms of the cost of auditing the code (again, unless I'm missing
> something, which is of course very much a possibility).
>
> The solution for this *might* be a very unholy one - actually going
> against our practices adding a new INI entry that would disable the
> locale-awareness for float->string conversions;  But for upgrade
> considerations, I don't think we can even consider simply changing this
> behavior and forcing virtually everyone using a non-dot decimal separator
> to undergo a full code audit.
>
> My 2c.
>
> Zeev
>

I don't expect this to be a particularly large issue for two reasons:

1. Not many people use this. I'm sure that there *are* people who use this
and use it intentionally, but I've only ever seen reference to this issue
as a bug or a gotcha.
2. Even if somebody is using this functionality, the only thing that's
going to happen is that their number display switches from 1,5 to 1.5.
That's a minor UX regression, not a broken application. It's something that
will have to be fixed, but it's also not critical, and for a legacy
application one might even not bother.

I think we should just put this to an RFC vote. We regularly have these
types of discussions, and people just disagree about level of anticipated
BC break relative to benefit of the change.

Nikita


Re: [PHP-DEV] Inconsistent float to string vs. string to float casting

2019-01-02 Thread Zeev Suraski
On Wed, 2 Jan 2019 at 17:12 Christoph M. Becker  wrote:

> On 02.01.2019 at 12:32, Zeev Suraski wrote:
>
> > Unless I'm missing something, changing this behavior would require a full,
> > line-by-line audit of the code - with no Search & Replace patterns that can
> > find these instances in any reasonable level of reliability.  Every place
> > where a floating number (which could come from anywhere, so not very easy
> > to track) is used in a string context (which too can happen in countless
> > different contexts, virtually impossible to track) would be affected.
> > Sounds pretty nightmarish to me.  I for one fail to recall a behavioral
> > change that was quite as significant as this one in terms of the complexity
> > of finding instances that must be updated.  Like Stas, I'm not disputing
> > that this is not an ideal behavior or that we'd do it differently if we
> > were starting from scratch - but I also agree with him that it's pretty
> > much out of the question to simply change it at this point.
> >
> > Can you point out a change you believe is as or more significant than this
> > one that we did?  I think the only one that comes close is
> > magic_quotes_runtime, and even that was significantly easier to handle in
> > terms of the cost of auditing the code (again, unless I'm missing
> > something, which is of course very much a possibility).
>
> Wasn't the removal of register_globals a similar change?  Not so long
> ago I've stumbled upon a script which counteracted this by extract()ing
> the superglobals manually (surely, a very bad practise, but at least
> some kind of workaround to keep legacy scripts going).  However, the
> introduction of “Uniform Variable Syntax”[1] may have caused similar
> issues; likely without any possible workaround.
>

Well, the removal of register_globals was a very big deal - and was done
for arguably much more pressing reasons (security).  So I wouldn't refer to
it as basis to illustrate that this isn't a big deal...  That said - as you
pointed out yourself, there was a very easy workaround for those that
didn't want or couldn't afford to do a full code audit - a few lines of
userland code that emulated it.

Regarding Uniform Variable Syntax - the cases where the behavior changed
there were truly edge cases, that nobody in his right mind should be using
anyway, and that can probably also be searched for using a clever regex.
This isn’t the case here.  Unless I’m missing something, code as simple
as $x = 3.99; print "Price:  $x"; would be affected.
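
A sketch of that case, for anyone who wants to try it. Which locales
exist depends on the system, so the setlocale() call below may return
false and leave the "C" locale in place; the locale names are only
candidates. Note that the implicit conversion only follows the locale
on PHP versions before 8.0.

```php
<?php
// Sketch only: which locales exist depends on the system, so setlocale()
// may return false and leave the "C" locale in place.
$previous = setlocale(LC_NUMERIC, '0');   // "0" only queries the setting
setlocale(LC_NUMERIC, 'de_DE.UTF-8', 'de_DE', 'German');

$x = 3.99;
$separator = localeconv()['decimal_point']; // "," if a German locale is active
$message   = "Price:  $x";                  // implicit float to string conversion

setlocale(LC_NUMERIC, $previous);           // restore the earlier setting

echo "decimal point: $separator\n";
echo $message, "\n"; // before PHP 8.0 this follows $separator; 8.0+ prints 3.99
```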

So, I think it has a much bigger impact than the UVS incompatibility, it’s
much more difficult to find, and does not have a userland workaround unless
we introduce a language level one.

>
> > The solution for this *might* be a very unholy one - actually going against
> > our practices adding a new INI entry that would disable the locale-awareness
> > for float->string conversions;  But for upgrade considerations, I don't
> > think we can even consider simply changing this behavior and forcing
> > virtually everyone using a non-dot decimal separator to undergo a full code
> > audit.
>
> Would it be a sensible option to trigger a warning or notice whenever a
> float is converted to string yielding a different result than before,
> using an ini directive to control this?  Or perhaps even throw a
> deprecation notice in this case, without even introducing an ini directive?


It would be technically possible, but given the context these conversions
often occur in I think it would look awful...   Also, one would have to run
their software through all possible code flows in order to know for sure
it’s safe to turn it off and move to the new behavior.  And legend has it,
that not all PHP users (or developers in general) have 100% testing
coverage :)

If we do end up adding a new INI entry - maybe it can be a tristate -
legacy, legacy+notice, or new.  Just a thought.  And I wouldn’t commit to
actually removing it at any time by officially deprecating it...

Zeev


Re: [PHP-DEV] Inconsistent float to string vs. string to float casting

2019-01-02 Thread Christoph M. Becker
On 02.01.2019 at 12:32, Zeev Suraski wrote:

> On Wed, Jan 2, 2019 at 11:26 AM Nikita Popov  wrote:
> 
>> On Wed, Jan 2, 2019 at 12:30 AM Stanislav Malyshev 
>> wrote:
>>
>> We have a rather hard policy against ini options that influence language
>> behavior. Locale-dependent language behavior is essentially the same issue,
>> just worse due to the mentioned issues, in particular lack of
>> thread-safety and the possibility that the locale is changed by third-party
>> libraries at runtime.
>>
>> We have removed existing ini flags controlling language behavior in the
>> past. I would say these removals were much more significant than what is
>> proposed here, but we did them anyway, and I think we are now in a better
>> place for it.
> 
> Unless I'm missing something, changing this behavior would require a full,
> line-by-line audit of the code - with no Search & Replace patterns that can
> find these instances in any reasonable level of reliability.  Every place
> where a floating number (which could come from anywhere, so not very easy
> to track) is used in a string context (which too can happen in countless
> different contexts, virtually impossible to track) would be affected.
> Sounds pretty nightmarish to me.  I for one fail to recall a behavioral
> change that was quite as significant as this one in terms of the complexity
> of finding instances that must be updated.  Like Stas, I'm not disputing
> that this is not an ideal behavior or that we'd do it differently if we
> were starting from scratch - but I also agree with him that it's pretty
> much out of the question to simply change it at this point.
> 
> Can you point out a change you believe is as or more significant than this
> one that we did?  I think the only one that comes close is
> magic_quotes_runtime, and even that was significantly easier to handle in
> terms of the cost of auditing the code (again, unless I'm missing
> something, which is of course very much a possibility).

Wasn't the removal of register_globals a similar change?  Not so long
ago I stumbled upon a script which counteracted this by extract()ing
the superglobals manually (surely, a very bad practise, but at least
some kind of workaround to keep legacy scripts going).  However, the
introduction of “Uniform Variable Syntax”[1] may have caused similar
issues; likely without any possible workaround.

> The solution for this *might* be a very unholy one - actually going against
> our practices and adding a new INI entry that would disable the locale-awareness
> for float->string conversions;  But for upgrade considerations, I don't
> think we can even consider simply changing this behavior and forcing
> virtually everyone using a non-dot decimal separator to undergo a full code
> audit.

Would it be a sensible option to trigger a warning or notice whenever a
float is converted to string yielding a different result than before,
using an ini directive to control this?  Or perhaps even throw a
deprecation notice in this case, without even introducing an ini directive?
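
A userland approximation of this idea might look like the sketch below.
It assumes a PHP version where the (string) cast honours LC_NUMERIC (the
behaviour discussed in this thread); the helper name
checked_float_to_string() is hypothetical and not part of PHP.  It keeps
the existing conversion but raises E_USER_DEPRECATED whenever the result
contains a locale-specific decimal separator.

```php
<?php
// Hypothetical helper, not a PHP built-in: performs the usual cast and
// reports when the active LC_NUMERIC locale changed the outcome.
function checked_float_to_string(float $value): string
{
    // localeconv() reports the decimal separator of the current locale.
    $separator = localeconv()['decimal_point'];

    // The cast under discussion; it may or may not be locale-aware,
    // depending on the PHP version in use.
    $result = (string) $value;

    // Only complain if a non-dot separator actually ended up in the string.
    if ($separator !== '.' && strpos($result, $separator) !== false) {
        trigger_error(
            "float to string conversion used locale separator '$separator'",
            E_USER_DEPRECATED
        );
    }

    return $result;
}
```

In the default C locale the helper stays silent, so it could be dropped
into existing code to find the conversions that a change would affect.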

[1] 

-- 
Christoph M. Becker




Re: [PHP-DEV] Inconsistent float to string vs. string to floatcasting

2019-01-02 Thread Zeev Suraski
On Wed, Jan 2, 2019 at 11:26 AM Nikita Popov wrote:

> On Wed, Jan 2, 2019 at 12:30 AM Stanislav Malyshev wrote:
>
> We have a rather hard policy against ini options that influence language
> behavior. Locale-dependent language behavior is essentially the same issue,
> just worse due to the mentioned issues, in particular the lack of
> thread-safety and the possibility that the locale is changed by third-party
> libraries at runtime.
>
> We have removed existing ini flags controlling language behavior in the
> past. I would say these removals were much more significant than what is
> proposed here, but we did them anyway, and I think we are now in a better
> place for it.
>

Unless I'm missing something, changing this behavior would require a full,
line-by-line audit of the code - with no Search & Replace patterns that can
find these instances with any reasonable level of reliability.  Every place
where a floating number (which could come from anywhere, so not very easy
to track) is used in a string context (which too can happen in countless
different contexts, virtually impossible to track) would be affected.
Sounds pretty nightmarish to me.  I for one fail to recall a behavioral
change that was quite as significant as this one in terms of the complexity
of finding instances that must be updated.  Like Stas, I'm not disputing
that this is not an ideal behavior or that we'd do it differently if we
were starting from scratch - but I also agree with him that it's pretty
much out of the question to simply change it at this point.
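
To illustrate why such an audit is hard, here is a short sketch of a few
places where a float ends up in a string context without any searchable
marker such as an explicit cast.  The variable names are made up for the
example, and the list is certainly not exhaustive.

```php
<?php
$price = 1.5;

// Each entry converts $price to a string implicitly; none of them
// contains a token like "(string)" that a search could reliably find.
$contexts = [
    'concatenation' => 'Total: ' . $price,
    'interpolation' => "Total: $price",
    'implode'       => implode(';', [$price, 2.25]),
    'string param'  => str_pad($price, 5, '0'), // float coerced to string
];

foreach ($contexts as $name => $text) {
    echo $name, ' => ', $text, "\n";
}
```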

Can you point out a change you believe is as or more significant than this
one that we did?  I think the only one that comes close is
magic_quotes_runtime, and even that was significantly easier to handle in
terms of the cost of auditing the code (again, unless I'm missing
something, which is of course very much a possibility).

The solution for this *might* be a very unholy one - actually going against
our practices and adding a new INI entry that would disable the locale-awareness
for float->string conversions;  But for upgrade considerations, I don't
think we can even consider simply changing this behavior and forcing
virtually everyone using a non-dot decimal separator to undergo a full code
audit.

My 2c.

Zeev


Re: [PHP-DEV] Inconsistent float to string vs. string to floatcasting

2019-01-02 Thread Lester Caine

On 01/01/2019 23:29, Stanislav Malyshev wrote:

>> Finally, I don't think that the global locale is the real problem for
>> PHP.  Rather it's PHP locale handling and the fact that setlocale()
>> works per process (and not per thread).  When PHP starts up, no locale
>
> That's part of locale being global. Though even in environments where
> threads are not involved, many apps do not account for locale quirks.


Like many things that originated in the 'Personal' age of PHP, the
'Server' nature is somewhat inconsistent in many areas. Take working with
'time': while some people still insist on using LOCAL time on their
servers, the more consistent method is to use UTC and then identify the
CLIENT'S preferred locale. Displaying other numbers has exactly the same
problem, and it should be a client locale setting that decides how to
display them, with a global base of something ASCII based. Making
validation client specific removes the need to mess up the server by
trying to run multiple locales, with the possible conflicts between them,
just as trying to manage multiple times is complicated if the server is
running yet another locale.
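
A minimal sketch of that approach, assuming the client's locale has
already been identified somehow: the server locale is never touched, and
the separators are passed explicitly to number_format().  The lookup
table and the function name format_for_client() are hypothetical.

```php
<?php
// Hypothetical per-client formatting table; a real application would more
// likely derive this from user settings or a localisation library.
const CLIENT_NUMBER_FORMATS = [
    'en_GB' => ['decimal' => '.', 'thousands' => ','],
    'de_DE' => ['decimal' => ',', 'thousands' => '.'],
];

function format_for_client(float $value, string $clientLocale, int $decimals = 2): string
{
    // Fall back to a plain ASCII form when the client locale is unknown.
    $format = CLIENT_NUMBER_FORMATS[$clientLocale]
        ?? ['decimal' => '.', 'thousands' => ''];

    // With explicit separators the result does not depend on setlocale().
    return number_format($value, $decimals, $format['decimal'], $format['thousands']);
}

echo format_for_client(1234.5, 'de_DE'), "\n";
echo format_for_client(1234.5, 'en_GB'), "\n";
```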


--
Lester Caine - G8HFL
-
Contact - https://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - https://lsces.co.uk
EnquirySolve - https://enquirysolve.com/
Model Engineers Digital Workshop - https://medw.co.uk
Rainbow Digital Media - https://rainbowdigitalmedia.co.uk




Re: [PHP-DEV] Inconsistent float to string vs. string to floatcasting

2019-01-02 Thread Nikita Popov
On Wed, Jan 2, 2019 at 12:30 AM Stanislav Malyshev wrote:

> Hi!
>
> > So yes, (string)0.3 should return 0.3 in any locale.
>
> If we designed it now, without any doubt. But since we have 20 years of
> history behind... I'm not so sure.
>
> > Finally, I don't think that the global locale is the real problem for
> > PHP.  Rather it's PHP locale handling and the fact that setlocale()
> > works per process (and not per thread).  When PHP starts up, no locale
>
> That's part of locale being global. Though even in environments where
> threads are not involved, many apps do not account for locale quirks.
>

We have a rather hard policy against ini options that influence language
behavior. Locale-dependent language behavior is essentially the same issue,
just worse due to the mentioned issues, in particular the lack of
thread-safety and the possibility that the locale is changed by third-party
libraries at runtime.

We have removed existing ini flags controlling language behavior in the
past. I would say these removals were much more significant than what is
proposed here, but we did them anyway, and I think we are now in a better
place for it.

Regards,
Nikita


Re: [PHP-DEV] Inconsistent float to string vs. string to floatcasting

2019-01-01 Thread Stanislav Malyshev
Hi!

> So yes, (string)0.3 should return 0.3 in any locale.

If we designed it now, without any doubt. But since we have 20 years of
history behind... I'm not so sure.

> Finally, I don't think that the global locale is the real problem for
> PHP.  Rather it's PHP locale handling and the fact that setlocale()
> works per process (and not per thread).  When PHP starts up, no locale

That's part of locale being global. Though even in environments where
threads are not involved, many apps do not account for locale quirks.

-- 
Stas Malyshev
smalys...@gmail.com




Re: [PHP-DEV] Inconsistent float to string vs. string to floatcasting

2018-12-31 Thread Christoph M. Becker
On 31.12.2018 at 00:25, Christoph M. Becker wrote:

> Well, to begin with it would fix the broken behavior of var_export(),
> which is documented to “output or return a parsable string
> representation of a variable”, which it does not necessarily.

Nonsense.  There's nothing wrong with var_export() per se.  Sorry.

-- 
Christoph M. Becker




Re: [PHP-DEV] Inconsistent float to string vs. string to floatcasting

2018-12-30 Thread Christoph M. Becker
On 29.12.2018 at 01:02, Stanislav Malyshev wrote:
>
>> As for me, the question is not *if*, but rather *when* and *how* this
>> inconsistency should be resolved.  Regarding the *when*, it seems to me
> 
> If you mean that loading "0.3" from string would suddenly stop working
> in Germany, then never. If you mean that (string)0.3 would return "0.3"
> and not "0,3" in Germany, then again I'd say probably never, though very
> slightly less confidently. Those people who didn't want it already have
> code to deal with it, and those that wanted it would have their unit
> tests crash and burn and their data pipelines blow up.

It seems to me that Nikita put it nicely[1]:

> […] but core language behavior should *never* be locale-sensitive.

So yes, (string)0.3 should return 0.3 in any locale.

> I think it's a very bad idea to change such things, which would create
> hundreds of person-years of headache for anybody daring to upgrade. I
> understand it's a bad situation, but I think the right way out of it is to
> tell people that want predictable roundtrip results to use specific
> number conversion functions, and not exchange one mess for another,
> BC-breaking and havoc-wreaking, mess. I agree that the right thing to do
> would be to have (string)0.3 always use a dot and never use the locale (did
> I mention I think global locale is a horrible idea?) but that ship has
> sailed. I don't see a use case that would be well served by breaking the
> BC now.

Well, to begin with it would fix the broken behavior of var_export(),
which is documented to “output or return a parsable string
representation of a variable”, which it does not necessarily.

Then, I'm not only thinking about the huge amount of existing code, but
also about the huge amount of code yet to be written.  I'm pretty sure that
many new PHP developers (especially those coming from other programming
languages) stumble over the locale-aware float to string conversion
at some point.

Finally, I don't think that the global locale is the real problem for
PHP.  Rather it's PHP locale handling and the fact that setlocale()
works per process (and not per thread).  When PHP starts up, no locale
is set from the environment (except for LC_CTYPE).  Only if a user
explicitly calls setlocale() with the second argument not equal to "0",
the locale is changed from C to whatever has been chosen and is
available.  Now consider a multi-threaded environment, which is rather
common on Windows.  While it is possible to set the desired locale
immediately before the float to string conversion, it is not easily
possible in PHP to make that really thread-safe.  This makes it very
hard to write robust code for these environments.  Even if thread-safety
is not an issue, a program that worked fine for years may be subtly
broken by inserting a call to setlocale() somewhere.  This “spooky
action at a distance” could also “have their unit tests crash and burn
and their data pipelines blow up”.
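
One way to insulate a conversion from a setlocale() call elsewhere in
the process is to avoid the locale-aware cast altogether.  The sketch
below uses var_export() and the non-locale-aware "%F" conversion of
sprintf(); the two wrapper function names are made up for illustration.

```php
<?php
// Sketch: conversions that do not depend on the process-wide locale.
function float_to_portable_string(float $value): string
{
    // var_export() emits a parsable literal with "." as decimal separator.
    return var_export($value, true);
}

function fixed_point_string(float $value, int $decimals): string
{
    // "%F" is the non-locale-aware counterpart of "%f".
    return sprintf('%.' . $decimals . 'F', $value);
}

$text = float_to_portable_string(1.5);
$back = (float) $text; // string to float parsing always expects "."

echo $text, ' ', fixed_point_string(0.3, 1), "\n";
```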

[1] 

-- 
Christoph M. Becker
