Re: [PHP-DEV] [RFC] UString
On Tue, Jun 30, 2015 at 10:36 PM, Joe Watkins pthre...@pthreads.org wrote:
> Another possible issue is engine integration:
>
>     $string = (UString) $someString;
>     $string = (UString) someString;

That sounds like a cool idea to discuss as a completely separate, unrelated RFC, and not specific to UString. e.g.

    $obj = (ClassName)$arg;
    /* turns into */
    $obj = new ClassName($arg);

So you could use casting with any class which supports single-argument constructors. But again, orthogonal to this RFC.

-Sara

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP-DEV] Improved zend_string API
On 30.06.2015 at 20:16, François Laupretre franc...@php.net wrote:
> From: morrison.l...@gmail.com [mailto:morrison.l...@gmail.com] On behalf of
>> Just to chime in with my $0.02: I feel that using macros as an abstraction in this case is bad practice. I believe that in *most* cases macros as an abstraction is a bad practice. Furthermore, there isn't any reason that `zend_string_*` functions cannot act as an abstraction layer since zend_strings are passed by pointer.
>
> Agreed. That's why ZSTR_VAL() and ZSTR_LEN() are functions now. Macros don't provide enough isolation. The choice of renaming 'zend_string_' to 'ZSTR_' is just a question of name consistency. The most important point is that these are functions. zend_strings are passed by pointer and, in theory, this type should be opaque, (void *) for instance.
>
> @Bob - I remember an idea I had, that I should discuss with Dmitry, and which can be implemented without any change to the proposed API. The idea is to return the address of the string instead of the address of the struct. This would allow using this address for the zend_string API and for any other function expecting a plain (char *) address. Z_STRVAL() and '->val' would become useless, of course. The other struct elements would just be below the used address (as it is done in malloc()). In this case, the calling code must consider the address as either an opaque value that can be passed to the zend_string API, or the address of an allocated memory buffer that can be read and written, up to the declared size. This is possible only through an encapsulated API. Compare it with malloc(), as both would use a similar mechanism. When you call malloc() on a system, you don't care about the underlying allocated structure, and it may be very different on different systems. This is the same: a malloc() with a couple of additional features. You don't have to know more about the implementation.

Interesting idea (from the concept). I even could get on board with this; I'm just not that sure about the performance impact (alignment of refcount to base address etc.). Also, you have to make sure not to accidentally pass a real char * array to something that expects a char * from a zend_string *… which is where I'm not so sure if we can do it. It may lose some type safety :-/

> Working on the allocation scheme just requires storing a new 'allocated size' element, which does not cost much and can avoid a lot of costly [e]realloc() calls. Then new functions may be defined, if needed, to control allocation policy. All of this doesn't require changing the existing API; these are just additions. I really don't understand why you are so sure that any change to the internal representation will require changes to the API. New functions can be added, yes, but we can improve a lot of things while keeping the same API.

It's just my experience… I might be wrong, but that's what I experienced generally for such low-level structures. Also, wrappers around such low-level structures often tend to be leaky, or to be such a shitload of individual functions that nobody can remember them without having worked with them all day for a few months. (I mean leaky in the sense that you may be able to do everything, but not quite in the performant way we would like.)

> You base all your examples on the fact that zend_string represents a structure. I don't assume anything at this level. Maybe we'll find that performance is better with a 2-level storage, storing fixed-length information in a pre-allocated array, for example, and storing the strings elsewhere. I leave it as open as possible, while you prefer constraints just because you cannot imagine today how it can evolve tomorrow.

That one would not even be an issue with the current zend_string *. zstr->val is a char *. Whether the char pointer is now zend_string * + 24 or elsewhere doesn't matter and user code also shouldn't rely on *that*.

> About hash values, nobody said we should automatically reset the hash value any time something is written. And you're wrong: we don't end up controlling the hash value manually. We control it through two well-defined methods. This is not low-level control, not the same as using 'zstr->h', for instance. It is part of the API, nothing shocking there. And nothing says that someone won't find some way to make hash management 'smarter', without doing millions of useless operations. There may be new operations but, once again, the existing ones will remain unchanged.
>
> As a conclusion, the zend_string API I propose provides some isolation, but you will be glad to know that it is not as advanced as I'd like, mostly for historical reasons. As an example, I am sure we will be annoyed by the 'persistent' argument to init/alloc/realloc. For init and alloc, it would be better to have a flag mask, allowing other flags to be defined in the future. We'll need this,
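The malloc()-style layout discussed in this message (hand out the address of the characters, keep the bookkeeping just below it, and track an allocated size next to the length) can be sketched roughly as follows. This is only an illustration under stated assumptions: the names hstr, hstr_header, hstr_alloc, hstr_from, hstr_set_len, hstr_len, hstr_capacity and hstr_release are hypothetical and are not part of the zend_string API, plain malloc()/free() stand in for the Zend allocators, and hashing and the real refcount/GC header are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header stored immediately *below* the character data.
 * This is not the real zend_string layout. */
typedef struct {
    uint32_t refcount;
    size_t   len;       /* bytes in use, excluding the trailing NUL */
    size_t   capacity;  /* bytes allocated for data, excluding the NUL */
} hstr_header;

/* Callers only ever hold a pointer to the character data. */
typedef char hstr;

/* Step back from the data pointer to the hidden header. */
static hstr_header *hstr_head(const hstr *s)
{
    assert(s != NULL);
    return (hstr_header *)((char *)s - sizeof(hstr_header));
}

/* Allocate header + data in one block and return the data address. */
hstr *hstr_alloc(size_t capacity)
{
    hstr_header *h = malloc(sizeof(hstr_header) + capacity + 1);
    if (h == NULL) {
        return NULL;
    }
    h->refcount = 1;
    h->len = 0;
    h->capacity = capacity;
    hstr *s = (char *)h + sizeof(hstr_header);
    s[0] = '\0';
    return s;
}

size_t hstr_len(const hstr *s)      { return hstr_head(s)->len; }
size_t hstr_capacity(const hstr *s) { return hstr_head(s)->capacity; }

/* Record the length after the caller wrote into the buffer directly. */
void hstr_set_len(hstr *s, size_t len)
{
    hstr_header *h = hstr_head(s);
    assert(len <= h->capacity);
    h->len = len;
    s[len] = '\0';
}

/* Build a string from a NUL-terminated C string. */
hstr *hstr_from(const char *src)
{
    size_t n = strlen(src);
    hstr *s = hstr_alloc(n);
    if (s == NULL) {
        return NULL;
    }
    memcpy(s, src, n);
    hstr_set_len(s, n);
    return s;
}

/* Drop one reference; free the whole block (header included) at zero. */
void hstr_release(hstr *s)
{
    hstr_header *h = hstr_head(s);
    if (--h->refcount == 0) {
        free(h);
    }
}
```

A value returned by hstr_from() can be handed to strlen() or printf("%s") as an ordinary char *, which is the attraction of the scheme. The type-safety concern raised in the reply is also visible here: nothing stops a caller from passing an arbitrary char * to hstr_len(), which would then read memory below an unrelated buffer.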
Re: [PHP-DEV] [RFC] UString
On Jul 1, 2015, at 1:06 PM, Sara Golemon poll...@php.net wrote:
> On Tue, Jun 30, 2015 at 10:36 PM, Joe Watkins pthre...@pthreads.org wrote:
>> Another possible issue is engine integration:
>>
>>     $string = (UString) $someString;
>>     $string = (UString) someString;
>
> That sounds like a cool idea to discuss as a completely separate, unrelated RFC, and not specific to UString. e.g.
>
>     $obj = (ClassName)$arg;
>     /* turns into */
>     $obj = new ClassName($arg);
>
> So you could use casting with any class which supports single-argument constructors. But again, orthogonal to this RFC.
>
> -Sara

Expanding on this idea, a separate RFC could propose a magic __cast($value) static method that would be called for code like below:

    $obj = (ClassName) $scalarOrObject; // Invokes ClassName::__cast($scalarOrObject);

This would allow UString to implement casting a string to a UString and allow users to implement such behavior with their own classes.

However, I would not implement such casting syntax for UString only. Being able to write $ustring = (UString) $string; without the ability to do so for other classes would be unusual and confusing in my opinion. If an RFC adding such behavior was implemented, UString could be updated to support casting.

Obviously a UString should be able to be cast to a scalar string using (string) $ustring. If performance is a concern, UString::__toString() should cache the result so multiple casts of the same object are quick.

Aaron Piotrowski
RE: [PHP-DEV] [RFC] UString
Hi,

> -----Original Message-----
> From: Aaron Piotrowski [mailto:aa...@icicle.io]
> Sent: Wednesday, July 1, 2015 9:00 PM
> To: Sara Golemon
> Cc: pthre...@pthreads.org; internals@lists.php.net
> Subject: Re: [PHP-DEV] [RFC] UString
>
> > [..]
>
> Expanding on this idea, a separate RFC could propose a magic __cast($value) static method that would be called for code like below:
>
>     $obj = (ClassName) $scalarOrObject; // Invokes ClassName::__cast($scalarOrObject);
>
> This would allow UString to implement casting a string to a UString and allow users to implement such behavior with their own classes.
>
> However, I would not implement such casting syntax for UString only. Being able to write $ustring = (UString) $string; without the ability to do so for other classes would be unusual and confusing in my opinion. If an RFC adding such behavior was implemented, UString could be updated to support casting.
>
> Obviously a UString should be able to be cast to a scalar string using (string) $ustring. If performance is a concern, UString::__toString() should cache the result so multiple casts of the same object are quick.

One way of doing this is already there, thanks to https://wiki.php.net/rfc/operator_overloading_gmp . Consider

    $n = gmp_init(42);
    var_dump($n, (int)$n);

However, the other way round could be done on a case-by-case basis, IMHO. While it could make sense for class vs. scalar, casting class to class is quite an unpredictable thing. While users could implement it, how would it be handled with arbitrary objects? How would it map properties, would those classes need to implement the same interface, et cetera? We're not in C at this point, where we would just force a block of memory to be interpreted as we want.

Regards

Anatol
RE: [PHP-DEV] Improved zend_string API
> From: Bob Weinand [mailto:bobw...@hotmail.com]
>
> Interesting idea (from the concept). I even could get on board with this; I'm just not that sure about the performance impact (alignment of refcount to base address etc.).

    zend_string *s;

    s = zend_string_alloc(256, 0);
    ...
    ZSTR_SET_LEN(s, snprintf(s, ZSTR_LEN(s)+1, format, args...)); /* Overflow-protected */
    ...
    hash = ZSTR_HASH(s);
    ...
    char *p = estrndup(s, ZSTR_LEN(s)); /* Result is a pure 'char *', not a zend_string */
    ...
    ZSTR_RELEASE(s);

Isn't it nice?

You're right, refcount alignment is the serious issue to solve. Performance needs to be tested of course, but we compute '->val' much more often than we use the struct base address (for realloc/free only, except when accessing gc). Accessing length and hash using an offset has the same cost as before.

> Also, you have to make sure not to accidentally pass a real char * array to something that expects a char * from a zend_string *… which is where I'm not so sure if we can do it. It may lose some type safety :-/

The zend_string type would remain, even if typedef-ed to 'char'. So 'zend_string *' declarations would remain; they wouldn't become 'char *' (just a question of declarations, because it would work too). In debug mode, we can also add a marker in the structure to detect when we receive an invalid address, as is sometimes done in memory management libs.

> Also, wrappers around such low-level structures often tend to be leaky, or to be such a shitload of individual functions that nobody can remember them without having worked with them all day for a few months. (I mean leaky in the sense that you may be able to do everything, but not quite in the performant way we would like.)

I agree, that's often the case but, even when performance imposes low-level control, an appropriate abstraction layer often allows for a cleaner future. What I fear most is not this; it's having an idea that is impossible to implement because the people designing the API did not think widely enough at the time. PHP's history is full of such changes, which required a lot of time and energy just because everyone had been working for years with APIs that were too short-sighted. The art of APIs is to find the best compromise between usability, performance, and extensibility, knowing there's no ideal solution. And one additional difficulty is that many people think it is easy!

> Meh, that persistent/non-persistent is annoying me a bit too sometimes. I don't disagree that we should change that. We could e.g. use GC_TYPE() and add our custom flag here. But maybe it's there to be able to tell the compiler what branch in perealloc() will be taken, so that that branch can be compiled out. Abstraction APIs are a powerful tool, but they come at a price…

Actually, the information is already stored in the struct; there's nothing to add. The zend_string_release/free() functions even use it to determine which kind of memory they are freeing. It is not even consistent, because only realloc/extend/truncate use this 'useless' arg. Actually, the only reason given is to optimize compilation. I know that everything has a price but I think that's going too far on the performance side. An argument with a single allowed value is not an argument, IMO. Remember, if the wrong value is given, your program crashes! Anyway, I'd first like to measure the performance gain/loss of such choices.

> But the issue now is that with 7.1 we don't want to do major API changes. We're *allowed* to, but we shouldn't.

While the policy on BC breaks at the PHP level is now quite well-defined, it is less clear at the C level. During 5.x, we saw several macros and functions disappear without notice, and nobody seemed very surprised about it. IMO, the hardest part is to get a consensus on such changes.

> I don't feel controlled by Zend. The only major appearance from Zend was in the scalar types discussion, in the form of Zeev. Err yeah, and maybe the PHP 7 name. Hah. Yes, it's maybe Zend which pushed Dmitry to optimize PHP to death, I have no idea. But that's a good thing, stop ranting against that :-P

I don't say I'm controlled by Zend. I say we shouldn't have given phpng carte blanche with so little supervision and control from the community.

Regards

François
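Two points from this message, releasing a string without a caller-supplied 'persistent' argument and adding a debug marker to catch invalid addresses, can be illustrated with a small sketch. Everything here is an assumption made for illustration: xstr, XSTR_FLAG_PERSISTENT, XSTR_MAGIC and the xstr_* functions are hypothetical names, not the zend_string API, and malloc()/free() stand in for both the persistent and the per-request allocators (two counters record which path the release took).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define XSTR_FLAG_PERSISTENT 0x01u        /* hypothetical flag bit */
#define XSTR_MAGIC           0x5A535452u  /* arbitrary debug marker */

typedef struct {
    uint32_t magic;  /* set at allocation, cleared on release */
    uint32_t flags;  /* allocation policy lives with the string */
    size_t   len;
    char     val[];
} xstr;

/* Counters standing in for "which allocator freed this string". */
static int persistent_frees = 0;
static int request_frees = 0;

/* A flag mask instead of a boolean leaves room for future flags. */
xstr *xstr_alloc(size_t len, uint32_t flags)
{
    xstr *s = malloc(sizeof(xstr) + len + 1);
    if (s == NULL) {
        return NULL;
    }
    s->magic = XSTR_MAGIC;
    s->flags = flags;
    s->len = len;
    s->val[0] = '\0';
    return s;
}

/* Debug check: does this address look like one of our strings? */
int xstr_is_valid(const xstr *s)
{
    return s != NULL && s->magic == XSTR_MAGIC;
}

int xstr_is_persistent(const xstr *s)
{
    return (s->flags & XSTR_FLAG_PERSISTENT) != 0;
}

/* The caller passes no 'persistent' argument: the stored flag decides. */
void xstr_release(xstr *s)
{
    assert(xstr_is_valid(s));
    if (xstr_is_persistent(s)) {
        persistent_frees++;
    } else {
        request_frees++;
    }
    s->magic = 0;  /* make a later use of the stale pointer detectable */
    free(s);
}
```

The trade-off mentioned in the quoted reply still applies: reading the flag at run time costs a branch that a compile-time constant argument could let the compiler remove, so the gain or loss would need to be measured.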
Re: [PHP-DEV] [RFC] UString
On Jul 1, 2015, at 2:25 PM, Anatol Belski anatol@belski.net wrote:
> > [..]
>
> Hi,
>
> One way of doing this is already there, thanks to https://wiki.php.net/rfc/operator_overloading_gmp . Consider
>
>     $n = gmp_init(42);
>     var_dump($n, (int)$n);
>
> However, the other way round could be done on a case-by-case basis, IMHO. While it could make sense for class vs. scalar, casting class to class is quite an unpredictable thing. While users could implement it, how would it be handled with arbitrary objects? How would it map properties, would those classes need to implement the same interface, et cetera? We're not in C at this point, where we would just force a block of memory to be interpreted as we want.
>
> Regards
>
> Anatol

Hello,

I was thinking that the __cast() static method would examine the parameter given, then use that value to build a new object and return it, or return null (which would then result in the engine throwing an Error saying that $scalarOrObject could not be cast to ClassName). It was just a suggestion to see what others thought, because someone suggested supporting casting syntax such as $ustring = (UString) $scalarString.

I don't really care for either method though (__cast() or enabling casting just for UString), as they don't offer any advantage over writing new UString($string) or UString::fromString($string).

Aaron Piotrowski
Re: [PHP-DEV] [RFC] UString
Hi Joe.

On 01.07.15 at 07:36, Joe Watkins wrote:
> [..]
> Another possible issue is engine integration:
>
>     $string = (UString) $someString;
>     $string = (UString) someString;
>
> These aren't very different to 'new UString', but for an integrated solution, kind of expected to work.

Why would that be expected behaviour? I mean I can't do

    $date = (DateTime) $timestring;

after all, can I? But I can use

    $date = new DateTime($timestring);

Just my 2 cents.

Cheers

Andreas

--
Andreas Heigl
mailto:andr...@heigl.org
N 50°22'59.5 E 08°23'58
http://andreas.heigl.org
http://hei.gl/wiFKy7
http://hei.gl/root-ca
Re: [PHP-DEV] [RFC] UString
Morning,

> Why would that be expected behaviour? I mean I can't do
>
>     $date = (DateTime) $timestring;

No, but you can't do:

    $string = (string) $datetime;

But can do:

    $string = (string) $ustring;

Where $ustring is an instance of UString.

Even if you never write $string = (string) $ustring, the engine will perform the same action all the time, whenever you pass a UString to anything expecting a string.

It feels like a complete implementation should support both casts.

Cheers

Joe

On Wed, Jul 1, 2015 at 7:38 AM, Andreas Heigl andr...@heigl.org wrote:
> Hi Joe.
>
> On 01.07.15 at 07:36, Joe Watkins wrote:
>> [..]
>> Another possible issue is engine integration:
>>
>>     $string = (UString) $someString;
>>     $string = (UString) someString;
>>
>> These aren't very different to 'new UString', but for an integrated solution, kind of expected to work.
>
> Why would that be expected behaviour? I mean I can't do $date = (DateTime) $timestring; after all, can I? But I can use $date = new DateTime($timestring);
>
> Just my 2 cents.
>
> Cheers
>
> Andreas
Re: [PHP-DEV] Fix division by zero to throw exception (round 2)
Hi Bob,

On 2 Jul 2015, at 01:26, Bob Weinand bobw...@hotmail.com wrote:
> On 29.06.2015 at 19:14, Andrea Faulds a...@ajf.me wrote:
>> Hmm. Using Error might make some sense given it used to raise E_WARNING. I think DivisionByZeroError sounds like a good idea.
>
> Hey, I just committed that to master…

Great!

> But I noticed that intdiv(PHP_INT_MIN, -1) isn't very well suited for a DivisionByZeroError. What do you think about adding an ArithmeticError for that case (and making DivisionByZeroError a subclass of it)? That ArithmeticError could then be reused for negative bitshifts, which would solve the question of what to do with that too.

Well, that specific case is integer overflow. Normally in PHP we just upgrade to float instead of throwing an error in these situations, but for intdiv() I didn't think that made sense (it's *integer* division). So, maybe OverflowError would be a better name. But we don't really do overflow errors anywhere else that I can think of, so the more general ArithmeticError might be fine.

Thanks.

--
Andrea Faulds
http://ajf.me/
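The distinction drawn above between division by zero and the overflow case can be sketched in C, where dividing the most negative value by -1 is undefined behaviour because the result is not representable, so it has to be checked before dividing. This is only an illustration, not PHP's implementation: div_status and checked_intdiv are hypothetical names, and the two non-OK status codes stand in for the DivisionByZeroError and ArithmeticError/OverflowError cases discussed in the thread.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical status codes mirroring the two error cases in the thread. */
typedef enum {
    DIV_OK,
    DIV_BY_ZERO,   /* divisor is 0 */
    DIV_OVERFLOW   /* LONG_MIN / -1: quotient does not fit in a long */
} div_status;

/* Integer division that reports both failure modes instead of
 * invoking undefined behaviour. On success the quotient is stored
 * in *out; on failure *out is left untouched. */
div_status checked_intdiv(long dividend, long divisor, long *out)
{
    assert(out != NULL);
    if (divisor == 0) {
        return DIV_BY_ZERO;
    }
    if (dividend == LONG_MIN && divisor == -1) {
        return DIV_OVERFLOW;
    }
    *out = dividend / divisor;  /* truncates toward zero (C99 and later) */
    return DIV_OK;
}
```

Keeping the two failures as separate codes matches the naming question in the message: the second case is not a division by zero at all, so reporting it under a more general arithmetic or overflow category is the more accurate description.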