Re: [PHP-DEV] What's the purpose of zend_result?

2023-02-20 Thread G. P. B.
On Sun, 19 Feb 2023 at 08:45, Nikita Popov  wrote:

> On Sun, Feb 19, 2023, at 09:21, Max Kellermann wrote:
> > On 2023/02/19 08:56, Nikita Popov  wrote:
> > > If you have a function like zend_stream_open_function(), SUCCESS and
> FAILURE are directly meaningful values.
> >
> > Agree, but that doesn't explain why FAILURE needs to be negative.
>
> I expect that there are two main reasons for that:
>  - There are probably some places that return a (non-negative) value or
> FAILURE.
>  - There are probably some places that check for success/failure using >=
> 0 and < 0. Another POSIX-ism.
>
> I don't think we endorse such usage, but it likely exists.
>
> Let me turn the question around: Is there some reason to change the value
> of FAILURE at this point?
>
> > > The current guideline for use of bool and zend_result in php-src is
> that bool is an appropriate return value for "is" or "has" style functions,
> which return a yes/no answer. zend_result is an appropriate return value
> for functions that perform some operation that may succeed or fail.
> >
> > What does the return value of these functions mean?
> >
> > - zend_make_printable_zval()
> > - zend_make_callable()
> > - zend_parse_arg_bool()
> > - zend_fiber_init_context()
> > - zend_observer_remove_begin_handler()
> > - php_execute_script()1
> >
> > If I understand the guideline correctly, then those examples (and
> > countless others) are defective and should be fixed, correct?
>
> At least in principle, yes. Of course, actually doing a change may not
> always be worthwhile, especially as this might result in a silent behavior
> change for extensions (as putting the return value into an "if" would
> invert behavior now).
>
> I believe Girgias has done extensive work on making the int vs bool vs
> zend_result situation more consistent, so you might want to coordinate with
> him.
>
> Regards,
> Nikita


Yeah, I spent a lot of time around the release of PHP 8.0 and a bit in
between releases to convert various int types to bool or zend_result just
to make the API clearer.
One of the main cases I missed was to change the return value of object
handlers, as those, for the most part, are meant to return zend_result.
I didn't push forward with that one as when I realized we were already in
beta releases or maybe RC and even changing it for a minor version did feel
a bit disruptive.
However, it may make sense to tackle this sooner than later if we are doing
some other object handler changes.

There are however definitely still cases where FAILURE is assumed to be -1
either because this is what the function is expected to return on failure
or checking the value of a POSIX like API.
Ideally, all cases where zend_result is used would actually indicate this
to make it possible to *maybe* convert SUCCESS to true and FAILURE to false
and typedef zend_result to bool.

Best,

George P. Banyard


Re: [PHP-DEV] What's the purpose of zend_result?

2023-02-19 Thread Max Kellermann
On 2023/02/19 11:56, Niels Dossche  wrote:
> It's also worth noting that there's a couple of places where there's
> just a check for if (function()) { failure code }. i.e. there is no
> check for == FAILURE, it's just "implied".

Ouch.  That's not only fragile, but also very obscure.

(Yadda yadda, C++ wouldn't even allow this, etc., you know)

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] What's the purpose of zend_result?

2023-02-19 Thread Niels Dossche
On 2/19/23 11:32, Max Kellermann wrote:
> On 2023/02/19 09:45, Nikita Popov  wrote:
>> I expect that there are two main reasons for that:
>>  - There are probably some places that return a (non-negative) value or 
>> FAILURE.
>>  - There are probably some places that check for success/failure using >= 0 
>> and < 0. Another POSIX-ism.
>>
>> I don't think we endorse such usage, but it likely exists.
> 
> Oh.  That's  bad.
> 
> (Again, with C++'s strictly-typed enums, this would be easy to find:
> if it compiles, it's correct.)
> 
>> Let me turn the question around: Is there some reason to change the value of 
>> FAILURE at this point?
> 
> Right now, I'm trying to understand this choice of number and the
> obscure code comment NOT explaining it.  A comprehensive understanding
> of a design is an important first step for any other decision.
> 
> And indeed there would be a good practical reason to change this.  For
> example, on x86, the check for -1 compiles to:
> 
>  83 f8 ff cmp$0x,%eax
> 
> While the check for false (or 0) compiles to:
> 
>  85 c0test   %eax,%eax
> 
> On Aarch64, check for -1 is:
> 
>  3100041f cmnw0, #0x1
>  5460 b.eq   1c 
> 
> While false/0 compile to just one conditional branch:
> 
>  3560 cbnz   w0, 3c 
> 
> For such a basic definition used everywhere, this does make a
> difference

Just chiming in real quickly here for two small notes...

We could work around this issue by checking if (function() != SUCCESS)
instead of if (function() == FAILURE). Not sure if it gives any measurable
improvement though.

It's also worth noting that there's a couple of places where there's
just a check for if (function()) { failure code }. i.e. there is no
check for == FAILURE, it's just "implied". So if there would be changes
to their values, this needs to be taken into account or fixed. Although
there might be 3rd party extensions doing this kind of code as well.

> 
> It was rather clumsy to choose -1 for FAILURE because it leads to
> less-than-optimal machine code.  Larger machine code means slower
> execution.  For one central basic definition as zend_result that gets
> used everywhere, it can make a measurable difference.
> 
> (Note that using "bool" would still be faster than using an enum with
> 0 and 1, but everything's better than -1.)
> 
>>> If I understand the guideline correctly, then those examples (and
>>> countless others) are defective and should be fixed, correct?
>>
>> At least in principle, yes. Of course, actually doing a change may not 
>> always be worthwhile, especially as this might result in a silent behavior 
>> change for extensions (as putting the return value into an "if" would invert 
>> behavior now).
> 
> (... and yet again, C++ would have saved us from this pain by not
> allowing implicit bool casts - sorry for annoying you with my C++
> fanboyism, but any pain that stems from choosing C is self-inflicted.)
> 
> The zend_fibers thing is a fairly recent addition, yet nobody spotted
> this API mistake befor merging it.  I've submitted a PR to fix it:
> https://github.com/php/php-src/pull/10622
> 
> Surprisingly, CODING_STANDARDS.md doesn't mention this API convention.
> Maybe it should?
> 
> Max
> 

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] What's the purpose of zend_result?

2023-02-19 Thread Max Kellermann
On 2023/02/19 09:45, Nikita Popov  wrote:
> I expect that there are two main reasons for that:
>  - There are probably some places that return a (non-negative) value or 
> FAILURE.
>  - There are probably some places that check for success/failure using >= 0 
> and < 0. Another POSIX-ism.
> 
> I don't think we endorse such usage, but it likely exists.

Oh.  That's  bad.

(Again, with C++'s strictly-typed enums, this would be easy to find:
if it compiles, it's correct.)

> Let me turn the question around: Is there some reason to change the value of 
> FAILURE at this point?

Right now, I'm trying to understand this choice of number and the
obscure code comment NOT explaining it.  A comprehensive understanding
of a design is an important first step for any other decision.

And indeed there would be a good practical reason to change this.  For
example, on x86, the check for -1 compiles to:

 83 f8 ff cmp$0x,%eax

While the check for false (or 0) compiles to:

 85 c0test   %eax,%eax

On Aarch64, check for -1 is:

 3100041f cmnw0, #0x1
 5460 b.eq   1c 

While false/0 compile to just one conditional branch:

 3560 cbnz   w0, 3c 

For such a basic definition used everywhere, this does make a
difference.

It was rather clumsy to choose -1 for FAILURE because it leads to
less-than-optimal machine code.  Larger machine code means slower
execution.  For one central basic definition as zend_result that gets
used everywhere, it can make a measurable difference.

(Note that using "bool" would still be faster than using an enum with
0 and 1, but everything's better than -1.)

> > If I understand the guideline correctly, then those examples (and
> > countless others) are defective and should be fixed, correct?
> 
> At least in principle, yes. Of course, actually doing a change may not always 
> be worthwhile, especially as this might result in a silent behavior change 
> for extensions (as putting the return value into an "if" would invert 
> behavior now).

(... and yet again, C++ would have saved us from this pain by not
allowing implicit bool casts - sorry for annoying you with my C++
fanboyism, but any pain that stems from choosing C is self-inflicted.)

The zend_fibers thing is a fairly recent addition, yet nobody spotted
this API mistake befor merging it.  I've submitted a PR to fix it:
https://github.com/php/php-src/pull/10622

Surprisingly, CODING_STANDARDS.md doesn't mention this API convention.
Maybe it should?

Max

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] What's the purpose of zend_result?

2023-02-19 Thread Nikita Popov
On Sun, Feb 19, 2023, at 09:21, Max Kellermann wrote:
> On 2023/02/19 08:56, Nikita Popov  wrote:
> > If you have a function like zend_stream_open_function(), SUCCESS and 
> > FAILURE are directly meaningful values.
> 
> Agree, but that doesn't explain why FAILURE needs to be negative.

I expect that there are two main reasons for that:
 - There are probably some places that return a (non-negative) value or FAILURE.
 - There are probably some places that check for success/failure using >= 0 and 
< 0. Another POSIX-ism.

I don't think we endorse such usage, but it likely exists.

Let me turn the question around: Is there some reason to change the value of 
FAILURE at this point?

> > The current guideline for use of bool and zend_result in php-src is that 
> > bool is an appropriate return value for "is" or "has" style functions, 
> > which return a yes/no answer. zend_result is an appropriate return value 
> > for functions that perform some operation that may succeed or fail.
> 
> What does the return value of these functions mean?
> 
> - zend_make_printable_zval()
> - zend_make_callable()
> - zend_parse_arg_bool()
> - zend_fiber_init_context()
> - zend_observer_remove_begin_handler()
> - php_execute_script()1
> 
> If I understand the guideline correctly, then those examples (and
> countless others) are defective and should be fixed, correct?

At least in principle, yes. Of course, actually doing a change may not always 
be worthwhile, especially as this might result in a silent behavior change for 
extensions (as putting the return value into an "if" would invert behavior now).

I believe Girgias has done extensive work on making the int vs bool vs 
zend_result situation more consistent, so you might want to coordinate with him.

Regards,
Nikita

Re: [PHP-DEV] What's the purpose of zend_result?

2023-02-19 Thread Max Kellermann
On 2023/02/19 08:56, Nikita Popov  wrote:
> If you have a function like zend_stream_open_function(), SUCCESS and FAILURE 
> are directly meaningful values.

Agree, but that doesn't explain why FAILURE needs to be negative.

> The current guideline for use of bool and zend_result in php-src is that bool 
> is an appropriate return value for "is" or "has" style functions, which 
> return a yes/no answer. zend_result is an appropriate return value for 
> functions that perform some operation that may succeed or fail.

What does the return value of these functions mean?

- zend_make_printable_zval()
- zend_make_callable()
- zend_parse_arg_bool()
- zend_fiber_init_context()
- zend_observer_remove_begin_handler()
- php_execute_script()1

If I understand the guideline correctly, then those examples (and
countless others) are defective and should be fixed, correct?

Max

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] What's the purpose of zend_result?

2023-02-18 Thread Nikita Popov
On Sun, Feb 19, 2023, at 08:31, Max Kellermann wrote:
> Hi,
> 
> I've done a bit of refactoring work around code using "zend_result",
> but I keep wondering why it even exists.
> 
> It was added in 1999 by commit 573b46022c46 in a huge 16k line commit
> (as macros, converted to enum in 2012 by commit e3ef84c59bf).
> 
> In 1999, C99 was brand new, and the "bool" type had just been
> introduced to the C language - yes, C was 18 years old when a native
> boolean type was added - but PHP didn't switch to C99 for another 19
> years later (b51a99ae358b).
> 
> C had a long history of abusing "int" as boolean type, wasting 2 or 4
> bytes for something that could have easily fit in 1 byte.  As if that
> wasn't obscure enough, POSIX used 0 for success and -1 for error (and
> missed the chance to return error codes, and instead added a global
> variable for that which caused more headaches).  (And don't get me
> started on the horrible strcmp() return value.)  Returning errors in C
> is a huge obscure and inconsistent mess; every function does it
> differently.
> 
> This is PHP's original definition:
> 
> #define SUCCESS 0
> #define FAILURE -1 /* this MUST stay a negative number, or it may effect 
> functions! */
> 
> This appears to follow the POSIX school of bad error return values.
> There's a comment which makes the thing even more obscure.
> 
> Really, why does it need to be negative?
> 
> Why does it imitate POSIX instead of using a boolean? (i.e. "int" and
> non-zero for success for old-schoolers)
> 
> The obvious way to indicate success or failure (without giving details
> about the nature of the failure) would be to just return "bool".  With
> C99, any discussion on "0 or 1" vs "-1 or 0" is obsolete - there is
> now a canonical boolean type that should be used.
> 
> Of course, I know already that getting rid of "zend_result" in favor
> of "bool" would be a major API breakage that requires careful
> refactoring of almost all extensions.
> 
> I just want to understand why "zend_result" was ever invented and why
> it uses those surprising integer values.  The commit message and that
> code comment doesn't explain it.
> 
> Rephrased: do you consider it a worthy goal to eventually get rid of
> "zend_result", or do you believe it's good API design that should stay
> forever?
> 
> (Yet again I wish PHP was fully C++ - unlike C, C++ has strongly-typed
> enums which cannot be casted implicitly to "int" or "bool"; that makes
> refactoring a lot easier and safer.)
> 
> Max

Any type that has only two values is isomorphic to a boolean. However, for us 
humans, not all two-valued types are semantically equivalent.

If you have a function like zend_hash_exists(), true and false are directly 
meaningful values.

If you have a function like zend_stream_open_function(), SUCCESS and FAILURE 
are directly meaningful values.

Now, if you make zend_stream_open_function() return a boolean instead, it's no 
longer clear what the return value means. Does a true return value mean that 
the stream was opened successfully, or that it failed?

For a PHP programmer, that might sound silly -- of course true means success. 
However, in C the common error reporting convention is actually the other way 
around: Non-zero return values indicate failure. This means that false 
indicates success and true indicates failure. (I'm not kidding you -- I'm 
literally working on a project that uses boolean return values with this 
convention right now.)

The current guideline for use of bool and zend_result in php-src is that bool 
is an appropriate return value for "is" or "has" style functions, which return 
a yes/no answer. zend_result is an appropriate return value for functions that 
perform some operation that may succeed or fail.

I think that's a pretty reasonable state of things, and don't think there is a 
need to change it.

Regards,
Nikita

[PHP-DEV] What's the purpose of zend_result?

2023-02-18 Thread Max Kellermann
Hi,

I've done a bit of refactoring work around code using "zend_result",
but I keep wondering why it even exists.

It was added in 1999 by commit 573b46022c46 in a huge 16k line commit
(as macros, converted to enum in 2012 by commit e3ef84c59bf).

In 1999, C99 was brand new, and the "bool" type had just been
introduced to the C language - yes, C was 18 years old when a native
boolean type was added - but PHP didn't switch to C99 for another 19
years later (b51a99ae358b).

C had a long history of abusing "int" as boolean type, wasting 2 or 4
bytes for something that could have easily fit in 1 byte.  As if that
wasn't obscure enough, POSIX used 0 for success and -1 for error (and
missed the chance to return error codes, and instead added a global
variable for that which caused more headaches).  (And don't get me
started on the horrible strcmp() return value.)  Returning errors in C
is a huge obscure and inconsistent mess; every function does it
differently.

This is PHP's original definition:

 #define SUCCESS 0
 #define FAILURE -1 /* this MUST stay a negative number, or it may effect 
functions! */

This appears to follow the POSIX school of bad error return values.
There's a comment which makes the thing even more obscure.

Really, why does it need to be negative?

Why does it imitate POSIX instead of using a boolean? (i.e. "int" and
non-zero for success for old-schoolers)

The obvious way to indicate success or failure (without giving details
about the nature of the failure) would be to just return "bool".  With
C99, any discussion on "0 or 1" vs "-1 or 0" is obsolete - there is
now a canonical boolean type that should be used.

Of course, I know already that getting rid of "zend_result" in favor
of "bool" would be a major API breakage that requires careful
refactoring of almost all extensions.

I just want to understand why "zend_result" was ever invented and why
it uses those surprising integer values.  The commit message and that
code comment doesn't explain it.

Rephrased: do you consider it a worthy goal to eventually get rid of
"zend_result", or do you believe it's good API design that should stay
forever?

(Yet again I wish PHP was fully C++ - unlike C, C++ has strongly-typed
enums which cannot be casted implicitly to "int" or "bool"; that makes
refactoring a lot easier and safer.)

Max

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php