Re: [PHP-DEV] New handler for retrieving properties when object is serialized
On 27 July 2013 15:58, Jakub Zelenka bu...@php.net wrote: Alternatively, could the problem perhaps be mitigated by making available some utility functions for serializing native types? We could definitely make some parts of serializer API-accessible, or even make serializer as a whole to be more API-friendly, I think it'd be a nice idea if making implementing Serializable would be made easy by reusing serializer code and combining pieces. If now it's hard making it easier definitely a good idea, much better than creating one more API IMHO. I was thinking about that and it could actually be done in a way that would resolve the problem. I have just created RFC that specify the API. https://wiki.php.net/rfc/internal_serialize_api It's backward compatible and allow portable object generation in more abstract way. What's about unserialization? Where does the state come from when recursively calling serialize? -- Regards, Mike -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP-DEV] New handler for retrieving properties when object is serialized
Hi What's about unserialization? Yeah that's actually a good point! :) It could be handled by implicit object unserialization if the unserialize callback is NULL. However this is not optimal for performance and also create properties in the object which doesn't have to be necessary. There should be functions that help with parsing the string buffer that is passed to unserialize class callback. Sort of opposite to API for serialization. It should be part of RFC. I will add it. Where does the state come from when recursively calling serialize? I am not sure if I understand the question. The serialize_state will be a sort of wrapper for smart_str with some additions for checking whether the functions are called in correct order. The state instance will be in most cases a local variable (as it is in the example). In case you try to serialize object zval (using php_serailize_property_zval), it would try to serialize the object the same way and the resulted string would be added to the state buffer (will be in RFC...). Did I answer your question? I will try to clarify the RFC a bit. Any feedback would be much appreciated. Thanks. Jakub
Re: [PHP-DEV] New handler for retrieving properties when object is serialized
Alternatively, could the problem perhaps be mitigated by making available some utility functions for serializing native types? We could definitely make some parts of serializer API-accessible, or even make serializer as a whole to be more API-friendly, I think it'd be a nice idea if making implementing Serializable would be made easy by reusing serializer code and combining pieces. If now it's hard making it easier definitely a good idea, much better than creating one more API IMHO. I was thinking about that and it could actually be done in a way that would resolve the problem. I have just created RFC that specify the API. https://wiki.php.net/rfc/internal_serialize_api It's backward compatible and allow portable object generation in more abstract way. It's just draft so I am happy to do changes. Jakub
Re: [PHP-DEV] New handler for retrieving properties when object is serialized
What about using our Serializable interface? - That was invented to serialize internal objects with arbitrary internal data. This is a bit painful to generate and then parse (unserialize) string if you do it in C... Especially if you need only add few properties that will contain data for serialization. In that case Serailizable is a bit overkill... And for my use case (DateTime serialization) it would be BC as well. Different format (Serializable always generate Custom string). Jakub
Re: [PHP-DEV] New handler for retrieving properties when object is serialized
You can check for BG(serialize_lock) instead of GC_G(gc_active) What if there was an extension that call php_var_serialize without locking? I know there probably isn't any but still I don't think that this is a nice way how to provide data for serialization. What I think is that we should add an extra parameter to get_properties instead of keeping adding these. That would be nice. I will send a patch later and then we can discuss it... ;)
Re: [PHP-DEV] New handler for retrieving properties when object is serialized
On Fri, 2013-07-26 at 10:38 +0100, Jakub Zelenka wrote: This is a bit painful to generate and then parse (unserialize) string if you do it in C... Especially if you need only add few properties that will contain data for serialization. In that case Serailizable is a bit overkill... So rather change the engine, which affects way more things and add yet another serialisation hook making it even more confusing what to use? And for my use case (DateTime serialization) it would be BC as well. Different format (Serializable always generate Custom string). You're solution also depends on an API break and also changes the way the objects behave during serialization/unserialization. Doesn't convince me. johannes -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP-DEV] New handler for retrieving properties when object is serialized
So rather change the engine, which affects way more things and add yet another serialisation hook making it even more confusing what to use? Yes I agree that the new hook is confusing. I think that what Gustavo said would be actually much better. Having some flag that would say what is the purpose of retrieving properties could be very useful. It's not just serialization but also casting object to array. There can be even more use cases in the future. On top of it the hooks like get_gc and get_debug_info could be part of it. But I would rather discuss this leter when I send a patch. You're solution also depends on an API break and also changes the way the objects behave during serialization/unserialization. Not sure what you mean by API break and changing the way the objects behave? When you look to this pach https://github.com/bukka/php-src/compare/date_serialize there shouldn't be any difference in object behaviour. Or am I wrong? By BC I meant the resulted string that you get after using serialize function. If you use Serializable you get something that is prefixed by C: . But if you have object, the prefix is O: . Please see this http://3v4l.org/Sd8aT . Jakub
Re: [PHP-DEV] New handler for retrieving properties when object is serialized
On Fri, 2013-07-26 at 11:58 +0100, Jakub Zelenka wrote: So rather change the engine, which affects way more things and add yet another serialisation hook making it even more confusing what to use? Yes I agree that the new hook is confusing. I think that what Gustavo said would be actually much better. Having some flag that would say what is the purpose of retrieving properties could be very useful. It's not just serialization but also casting object to array. There can be even more use cases in the future. On top of it the hooks like get_gc and get_debug_info could be part of it. But I would rather discuss this leter when I send a patch. This means one has to touch every extension using those hooks. more use cases in the future means having to do this over and over again, for a feature I see little benefit in. Serializable was added precisely for the purpose to allow arbitrary internal classes to provide serializer logic. You're solution also depends on an API break and also changes the way the objects behave during serialization/unserialization. Not sure what you mean by API break Breaking PHP's API. Requiring modifications to other extensions. and changing the way the objects behave? When you look to this pach https://github.com/bukka/php-src/compare/date_serialize there shouldn't be any difference in object behaviour. Or am I wrong? I haven't looked deeper but isn't your whole point to change the serialized representation of the object, to include all data properly? By BC I meant the resulted string that you get after using serialize function. If you use Serializable you get something that is prefixed by C: . But if you have object, the prefix is O: . Please see this http://3v4l.org/Sd8aT . I know the difference. So yes, new serialized files can't be properly read on old PHP. johannes -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP-DEV] New handler for retrieving properties when object is serialized
This means one has to touch every extension using those hooks. more use cases in the future means having to do this over and over again, for a feature I see little benefit in. I don't want to change get_properties. That would break too many extensions. The idea is to create something like HashTable *get_special_properties(zval *object, int purpose, int *is_tmp TSRMLS_DC) This would replace get_debug_info so the number of hooks would stay the same. The only extensions that would need to be changed are the ones that use get_debug_info. The purposes could be debug, serialize, array_casting. If there is a new use case, you would just add new purpose and wouldn't have to change anything. I haven't looked deeper but isn't your whole point to change the serialized representation of the object, to include all data properly? No the whole point was returning the same properties but only if the object is serialized or debugged. Deeper explanation is in this thread: http://news.php.net/php.internals/68305 Jakub
Re: [PHP-DEV] New handler for retrieving properties when object is serialized
On Fri, 2013-07-26 at 14:14 +0100, Jakub Zelenka wrote: This means one has to touch every extension using those hooks. more use cases in the future means having to do this over and over again, for a feature I see little benefit in. I don't want to change get_properties. That would break too many extensions. The idea is to create something like HashTable *get_special_properties(zval *object, int purpose, int *is_tmp TSRMLS_DC) This would replace get_debug_info so the number of hooks would stay the same. The only extensions that would need to be changed are the ones that use get_debug_info. The purposes could be debug, serialize, array_casting. If there is a new use case, you would just add new purpose and wouldn't have to change anything. Such a way for overloading is bad. This is painful on each update. If there are different things to do use different hooks. The implementations might call the same implementation but a purpose option leads to a long term maintenance hell whenever there is a change.(a tiny bit better would be using an enum instead of int as in some situations compilers/analyzers might give a hint ... but still a mess) But I still don't buy the argument for this hook. Typically internal classes use some C data structure for keeping their internal state. Often actually in a form that it can not be represented in zval. From that observation (after a few years of looking at phpsrc, pecl external and close-source exts) Serializable is the preferred way to serialize these in an efficient way (converting everything in zvals, then serialize, finally freeing temporary data is a quite expensive way ..) (mind: comparison with debug hooks is only of limited use. a) a debug feature has less performance relevance b) a debug feature doesn't need to convert the full state and has no need to be reversible) So what is being proposed is to extend our already too large API with yet another way of doing something for edge cases for which we have a good solution for the general case already. I haven't looked deeper but isn't your whole point to change the serialized representation of the object, to include all data properly? No the whole point was returning the same properties but only if the object is serialized or debugged. Deeper explanation is in this thread: So the purpose is to have a different serialized representation, so there would be compatibility issues, similar to going to Serializable. johannes -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP-DEV] New handler for retrieving properties when object is serialized
On Fri, Jul 26, 2013 at 2:59 PM, Johannes Schlüter johan...@schlueters.dewrote: Such a way for overloading is bad. This is painful on each update. If there are different things to do use different hooks. The implementations might call the same implementation but a purpose option leads to a long term maintenance hell whenever there is a change.(a tiny bit better would be using an enum instead of int as in some situations compilers/analyzers might give a hint ... but still a mess) Ok so probably better not to do that :) But I still don't buy the argument for this hook. Typically internal classes use some C data structure for keeping their internal state. Often actually in a form that it can not be represented in zval. From that observation (after a few years of looking at phpsrc, pecl external and close-source exts) Serializable is the preferred way to serialize these in an efficient way (converting everything in zvals, then serialize, finally freeing temporary data is a quite expensive way ..) I agree that Serializable is better for performance. The reason why I proposed this is to differentiate serialization properties from normal properties in cases where Serializable would require lots of coding for a small benefit (object is not serialized often) or it is not an options (DateTime - see bellow). So the purpose is to have a different serialized representation, so there would be compatibility issues, similar to going to Serializable. Yeah but that serialized representation is exactly the same as the current representation (the serialized strings before the patch is exactly the same as the string after the patch :)). The difference is that get_properties returns empty HashTable now. The serialization is backward compatible though. I just want to find a solution for this problem. What would you do? Would you use Serialazable even if it breaks apps after updating PHP. Or would you still use get_properties (possibly with testing serialize_lock) ? Jakub
Re: [PHP-DEV] New handler for retrieving properties when object is serialized
Hi! This is a bit painful to generate and then parse (unserialize) string if you do it in C... Especially if you need only add few properties that will contain data for serialization. In that case Serailizable is a bit overkill... And for my use case (DateTime serialization) it would be BC as well. Different format (Serializable always generate Custom string). I think given that we already have not one, but two interfaces for dealing with serialization (__sleep and Serializable), adding a third one that does essentially the same but with a little difference would be really too much. -- Stanislav Malyshev, Software Architect SugarCRM: http://www.sugarcrm.com/ (408)454-6900 ext. 227 -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP-DEV] New handler for retrieving properties when object is serialized
On 26-07-2013 15:59, Johannes Schlüter wrote: Such a way for overloading is bad. This is painful on each update. If there are different things to do use different hooks. The implementations might call the same implementation but a purpose option leads to a long term maintenance hell whenever there is a change.(a tiny bit better would be using an enum instead of int as in some situations compilers/analyzers might give a hint ... but still a mess) This is a gross misrepresentation. The vast majority of extensions will not care about this purpose, just like most extensions do not care about get_debug_info or get_gc or about the purpose in read_property. If a new purpose is added, this would not be any different than adding a new hook as long as: 1) the extension doesn't purposefully break forward compatibility, for example by doing switch(purpose) { ... default: panic(); } and 2) the new hook would default to get_properties, either by explicitly setting the pointer to get_properties in the standard handlers array or by having the engine fall back to get_properties (the last one was the solution adopted for get_debug_info and get_gc afaik). Yes, this would break extensions. But we never made any promises about source BC for extensions. In fact, we actually break it for every release, sometimes in rather subtle ways like in 5.4 (due to interned strings, and object properties storage). So what is being proposed is to extend our already too large API with yet another way of doing something for edge cases for which we have a good solution for the general case already. Alternatively, could the problem perhaps be mitigated by making available some utility functions for serializing native types? -- Gustavo Lopes -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP-DEV] New handler for retrieving properties when object is serialized
On 26-07-2013 11:53, Jakub Zelenka wrote: You can check for BG(serialize_lock) instead of GC_G(gc_active) What if there was an extension that call php_var_serialize without locking? I know there probably isn't any but still I don't think that this is a nice way how to provide data for serialization. I think not using PHP_VAR_SERIALIZE_INIT() would always be an error in the extension anyway. However, you are right this method is not nice by any means. -- Gustavo Lopes -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP-DEV] New handler for retrieving properties when object is serialized
Hi! Yes, this would break extensions. But we never made any promises about source BC for extensions. In fact, we actually break it for every Actually, we try to preserve source BC for extensions, unless they do some weird things or they go very deep into the implementation details. Regular extension generally should not suffer from adding a feature. Alternatively, could the problem perhaps be mitigated by making available some utility functions for serializing native types? We could definitely make some parts of serializer API-accessible, or even make serializer as a whole to be more API-friendly, I think it'd be a nice idea if making implementing Serializable would be made easy by reusing serializer code and combining pieces. If now it's hard making it easier definitely a good idea, much better than creating one more API IMHO. -- Stanislav Malyshev, Software Architect SugarCRM: http://www.sugarcrm.com/ (408)454-6900 ext. 227 -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: http://www.php.net/unsub.php
[PHP-DEV] New handler for retrieving properties when object is serialized
Hi, I have just sent this PR: https://github.com/php/php-src/pull/397 It's reaction to the comment from nikic in PR: https://github.com/php/php-src/pull/393 The patch is about adding new object handler that is used when object is serialized. Currently this needs to be done using get_properties which is sometimes a bit over kill. It's useful in situation if you only need to change properties for serialization. It's very similar to get_debug_info that is used only in specific situations (print_r...) Currently I call it get_serialize_info . It was the first name that came to my mind. :) I am happy to change if someone has a better idea? Any thoughts? Jakub
Re: [PHP-DEV] New handler for retrieving properties when object is serialized
Hi! It's useful in situation if you only need to change properties for serialization. It's very similar to get_debug_info that is used only in specific situations (print_r...) Isn't __sleep already doing this? -- Stanislav Malyshev, Software Architect SugarCRM: http://www.sugarcrm.com/ (408)454-6900 ext. 227 -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP-DEV] New handler for retrieving properties when object is serialized
On Thu, Jul 25, 2013 at 9:09 PM, Stas Malyshev smalys...@sugarcrm.comwrote: Hi! It's useful in situation if you only need to change properties for serialization. It's very similar to get_debug_info that is used only in specific situations (print_r...) Isn't __sleep already doing this? __sleep returns an array of object property *names* to be serialized. Using __sleep you can restrict the set of serialized properties, but the properties still need to exist. Nikita
Re: [PHP-DEV] New handler for retrieving properties when object is serialized
Hi! __sleep returns an array of object property *names* to be serialized. Using __sleep you can restrict the set of serialized properties, but the properties still need to exist. You can create those properties when processing __sleep. Nothing requires them to exist before __sleep was called. -- Stanislav Malyshev, Software Architect SugarCRM: http://www.sugarcrm.com/ (408)454-6900 ext. 227 -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP-DEV] New handler for retrieving properties when object is serialized
On Thu, Jul 25, 2013 at 8:56 PM, Stas Malyshev smalys...@sugarcrm.comwrote: You can create those properties when processing __sleep. Nothing requires them to exist before __sleep was called. The problem is that properties will exist after __sleep was called. :) There are situations that you don't want them to exist (for example DateTime).
Re: [PHP-DEV] New handler for retrieving properties when object is serialized
On 25-07-2013 18:42, Jakub Zelenka wrote: I have just sent this PR: https://github.com/php/php-src/pull/397 It's reaction to the comment from nikic in PR: https://github.com/php/php-src/pull/393 The patch is about adding new object handler that is used when object is serialized. Currently this needs to be done using get_properties which is sometimes a bit over kill. It's useful in situation if you only need to change properties for serialization. It's very similar to get_debug_info that is used only in specific situations (print_r...) This seems too narrow a situation to warrant a new serialization method, in addition to the two we already have. That said, there are two precedents: * get_debug_info (variant for var_dump) * get_gc (variant for garbage collection) And anyway, if you really want to add special properties for serialization, you can use the same hack that was used before get_gc was available: http://lxr.php.net/xref/PHP_5_3/ext/spl/spl_observer.c#297 (and http://lxr.php.net/xref/PHP_5_3/ext/spl/spl_observer.c#258 ) You can check for BG(serialize_lock) instead of GC_G(gc_active) What I think is that we should add an extra parameter to get_properties instead of keeping adding these. -- Gustavo Lopes -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP-DEV] New handler for retrieving properties when object is serialized
On Thu, 2013-07-25 at 21:01 +0100, Jakub Zelenka wrote: On Thu, Jul 25, 2013 at 8:56 PM, Stas Malyshev smalys...@sugarcrm.comwrote: You can create those properties when processing __sleep. Nothing requires them to exist before __sleep was called. The problem is that properties will exist after __sleep was called. :) There are situations that you don't want them to exist (for example DateTime). What about using our Serializable interface? - That was invented to serialize internal objects with arbitrary internal data. johannes -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: http://www.php.net/unsub.php