Hi,

> -----Message d'origine-----
> De : Bob Weinand [mailto:bobw...@hotmail.com]
>
> Hey, looks like I'm a bit late to the party, but I'd like to object against 
> the
> accessor macros on zend_string.
> 
> What does it help to replace very nice ->val, ->len etc. with macros?

If you find it 'very nice', probably nothing. I suggest you stop using macros 
everywhere. After all, why did we define Z_STRVAL() instead of writing 
something 'very nice' like 'zv.value.str->val' ? Just one reason : if we had 
written 'zv.value.val' everywhere, zend_strings couldn't exist today.

In a few words, we are improving encapsulation : publishing an API instead of a 
C structure.

> ZSTR_VAL(str)[ZSTR_LEN(str)] = 0; or str->val[str->len] = 0;
> zend_string_set_len(str, ZSTR_LEN(str) - 2); or str->len -= 2;
> Also, that particular function suggests to me that the string is expanded if
> lengths don't match.

I agree it would be quite natural and I'd like to implement it in a future 
version. We just need to store the allocated size in the struct, which also 
allows to over-allocate memory in advance. Anyway, performance is critical 
there, so the code must remain simple and fast.

> I much prefer direct accessors which indicate a) no side-effects happening
> here, b) less functions to remember (you just look at the struct and know
> everything about it) and c) end up being more readable.

a) You deal with an API, not a structure. So, you call methods instead of 
accessing variables. The user does not have to care about side effects of the 
methods he's using. Don't think about C macros, think about methods.
b) Getting documentation from the code is something we should forget. We still 
need to do it because PHP lacks a usable core API documentation, especially for 
PHP 7. So, getting API usage from the code is a workaround, not the appropriate 
way to publish an API. The same with some recently-published articles about new 
PHP 7 structures. These articles should *never* include the underlying 
structures. Then, we complain about people bypassing the 'official' APIs but, 
if they use such documentation, how can they know what they should access or 
not ? IMO, the rules are 'never publish structure details' and 'never allow 
direct access to structure elements'.
c) Using ZSTR_VAL/ZSTR_LEN macros can also be considered as more readable. I 
personally consider it as more readable because other unrelated structures may 
also define 'len' and 'val' elements.

> Yes, I'm totally aware that when we ever want to change the API (I don't see
> coming it in near future anyway), it could help having wrappers. But
> wrappers don't help against every API change… let's say for some reason
> we'd want to go back to the old char*/int; would wrappers help? no. (Yes,
> that's maybe a bad example, but just trying to show that, while it has maybe
> a theoretical gain, I don't see a real practical benefit.)

The fact that you don't imagine the API changing in the near future is not an 
argument. First, wrappers don't protect against API changes, they allow to 
maintain the same API when the backend implementation is modified. I see at 
least two reasons that could bring changes in the zend_string API, and it could 
be quite soon. The first one is unifying zend_string with smart_str, including 
in zend_string the smart_str allocation scheme. I don't know exactly the form 
it would take but reading and/or writing the string length could involve more 
than just writing a value in a structure element. The other subject that could 
require implementation changes is Unicode. It failed for a number of reasons in 
PHP6, but it doesn't mean we'll avoid the question forever. If we want to 
include Unicode at a low level, we may have to work at the zend_string level. 
Once again, in this case, reading and writing the string length will probably 
involve different internal operations.

So, the question is not to blindly apply the theory and say 'tight coupling is 
evil'. There are objective reasons to encapsulate APIs. Zend_string is just the 
first one. It was choosen because no API existed to access the 'val' and 'len' 
elements, and because it was relatively easy to fix, allowing for an 
integration in 7.0. But the next one is HashTable, which is *much* more 
complex. If you look at the implementation and usage of HashTables in the code, 
you'll see how a combination of growing complexity and tight coupling can make 
code unmanageable. Actually, we are at a point where changing the HashTable 
internal implementation would be extremely heavy and complex. If we don't fix 
it now, it can only become worse over time.

> zend_string is really a core structure and extremely simple. Please don't
> trade simplicity for a small other benefit.
> 
> Also, we still leave zend_string struct exposed (presumably for allowing
> inlining). At the point where we are now, many extension authors probably
> already are also accessing zend_string directly. We're going to have a 
> colorful
> mix of code using the APIs and not using it, which is the worst we can get
> now.

Providing an API when it was missing allows to start 'cleaning' the existing 
code. It will take time and, during this time, we'll have a mix of code, yes. 
As long as the code is not 100 % clean, we cannot modify the API, yes. But the 
most important, IMO, is to know where we are going to. What alternative do we 
have ? Provide no encapsulation at all and get blocked when we *need* to change 
implementation in 1 or 2 years ? It will be too late.

> To end up, I very much like unifying prefixes (ZSTR_*), but I strongly dislike
> hurting readability and consistency so much for such little benefit.

You will be satisfied. I just sent a new patch to Dmitry where the whole 
zend_string API is prefixed with 'ZSTR_', providing a better consistency (with 
compatibility macros for old names).

Regards

François



--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php

Reply via email to