Re: [PHP-DEV] New handler for retrieving properties when object is serialized

2013-07-29 Thread Michael Wallner
On 27 July 2013 15:58, Jakub Zelenka bu...@php.net wrote:

  Alternatively, could the problem perhaps be mitigated by making
  available some utility functions for serializing native types?

 We could definitely make some parts of serializer API-accessible, or
 even make serializer as a whole to be more API-friendly, I think it'd be
 a nice idea if making implementing Serializable would be made easy by
 reusing serializer code and combining pieces. If now it's hard making it
 easier definitely a good idea, much better than creating one more API IMHO.


 I was thinking about that and it could actually be done in a way that would
 resolve the problem. I have just created RFC that specify the API.

 https://wiki.php.net/rfc/internal_serialize_api

 It's backward compatible and allow portable object generation in more
 abstract way.

What's about unserialization?
Where does the state come from when recursively calling serialize?


-- 
Regards,
Mike

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] New handler for retrieving properties when object is serialized

2013-07-29 Thread Jakub Zelenka
Hi

What's about unserialization?


Yeah that's actually a good point! :) It could be handled by implicit
object unserialization if the unserialize callback is NULL. However this is
not optimal for performance and also create properties in the object which
doesn't have to be necessary. There should be functions that help with
parsing the string buffer that is passed to unserialize class callback.
Sort of opposite to API for serialization.

It should be part of RFC. I will add it.


 Where does the state come from when recursively calling serialize?


I am not sure if I understand the question. The serialize_state will be a
sort of wrapper for smart_str with some additions for checking whether the
functions are called in correct order. The state instance will be in most
cases a local variable (as it is in the example). In case you try to
serialize object zval (using php_serailize_property_zval), it would try to
serialize the object the same way and the resulted string would be added to
the state buffer (will be in RFC...). Did I answer your question?

I will try to clarify the RFC a bit. Any feedback would be much
appreciated. Thanks.

Jakub


Re: [PHP-DEV] New handler for retrieving properties when object is serialized

2013-07-27 Thread Jakub Zelenka

  Alternatively, could the problem perhaps be mitigated by making
  available some utility functions for serializing native types?

 We could definitely make some parts of serializer API-accessible, or
 even make serializer as a whole to be more API-friendly, I think it'd be
 a nice idea if making implementing Serializable would be made easy by
 reusing serializer code and combining pieces. If now it's hard making it
 easier definitely a good idea, much better than creating one more API IMHO.


I was thinking about that and it could actually be done in a way that would
resolve the problem. I have just created RFC that specify the API.

https://wiki.php.net/rfc/internal_serialize_api

It's backward compatible and allow portable object generation in more
abstract way.

It's just draft so I am happy to do changes.

Jakub


Re: [PHP-DEV] New handler for retrieving properties when object is serialized

2013-07-26 Thread Jakub Zelenka


 What about using our Serializable interface? - That was invented to
 serialize internal objects with arbitrary internal data.


This is a bit painful to generate and then parse (unserialize) string if
you do it in C... Especially if you need only add few properties that will
contain data for serialization. In that case Serailizable is a bit
overkill...

And for my use case (DateTime serialization) it would be BC as well.
Different format (Serializable always generate Custom string).

Jakub


Re: [PHP-DEV] New handler for retrieving properties when object is serialized

2013-07-26 Thread Jakub Zelenka
 You can check for BG(serialize_lock) instead of GC_G(gc_active)


What if there was an extension that call php_var_serialize without locking?
I know there probably isn't any but still I don't think that this is a nice
way how to provide data for serialization.



 What I think is that we should add an extra parameter to get_properties
 instead of keeping adding these.


That would be nice. I will send a patch later and then we can discuss it...
;)


Re: [PHP-DEV] New handler for retrieving properties when object is serialized

2013-07-26 Thread Johannes Schlüter
On Fri, 2013-07-26 at 10:38 +0100, Jakub Zelenka wrote:


 This is a bit painful to generate and then parse (unserialize) string
 if you do it in C... Especially if you need only add few properties
 that will contain data for serialization. In that case Serailizable is
 a bit overkill...

So rather change the engine, which affects way more things and add yet
another serialisation hook making it even more confusing what to use?
 
 
 And for my use case (DateTime serialization) it would be BC as well.
 Different format (Serializable always generate Custom string).
 
You're solution also depends on an API break and also changes the way
the objects behave during serialization/unserialization.

Doesn't convince me.

johannes




-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] New handler for retrieving properties when object is serialized

2013-07-26 Thread Jakub Zelenka

 So rather change the engine, which affects way more things and add yet
 another serialisation hook making it even more confusing what to use?
 


Yes I agree that the new hook is confusing. I think that what Gustavo said
would be actually much better. Having some flag that would say what is
the purpose of retrieving properties could be very useful. It's not just
serialization
but also casting object to array. There can be even more use cases
in the future. On top of it the hooks like get_gc and get_debug_info
could be part of it. But I would rather discuss this leter
when I send a patch.



 You're solution also depends on an API break and also changes the way
 the objects behave during serialization/unserialization.


Not sure what you mean by API break and changing the way the objects
behave? When you look to this pach
https://github.com/bukka/php-src/compare/date_serialize there shouldn't be
any difference in object behaviour. Or am I wrong?

By BC I meant the resulted string that you get after using serialize
function. If you use Serializable you get something that is prefixed by C:
. But if you have object, the prefix is O: . Please see this
http://3v4l.org/Sd8aT .

Jakub


Re: [PHP-DEV] New handler for retrieving properties when object is serialized

2013-07-26 Thread Johannes Schlüter
On Fri, 2013-07-26 at 11:58 +0100, Jakub Zelenka wrote:
 
  So rather change the engine, which affects way more things and add yet
  another serialisation hook making it even more confusing what to use?
  
 
 
 Yes I agree that the new hook is confusing. I think that what Gustavo said
 would be actually much better. Having some flag that would say what is
 the purpose of retrieving properties could be very useful. It's not just
 serialization
 but also casting object to array. There can be even more use cases
 in the future. On top of it the hooks like get_gc and get_debug_info
 could be part of it. But I would rather discuss this leter
 when I send a patch.

This means one has to touch every extension using those hooks. more use
cases in the future means having to do this over and over again, for a
feature I see little benefit in. Serializable was added precisely for
the purpose to allow arbitrary internal classes to provide serializer
logic.

 
 
 
  You're solution also depends on an API break and also changes the way
  the objects behave during serialization/unserialization.
 
 
 Not sure what you mean by API break

Breaking PHP's API. Requiring modifications to other extensions.

  and changing the way the objects
 behave? When you look to this pach
 https://github.com/bukka/php-src/compare/date_serialize there shouldn't be
 any difference in object behaviour. Or am I wrong?

I haven't looked deeper but isn't your whole point to change the
serialized representation of the object, to include all data properly?

 By BC I meant the resulted string that you get after using serialize
 function. If you use Serializable you get something that is prefixed by C:
 . But if you have object, the prefix is O: . Please see this
 http://3v4l.org/Sd8aT .

I know the difference. So yes, new serialized files can't be properly
read on old PHP. 

johannes



-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] New handler for retrieving properties when object is serialized

2013-07-26 Thread Jakub Zelenka


 This means one has to touch every extension using those hooks. more use
 cases in the future means having to do this over and over again, for a
 feature I see little benefit in.


I don't want to change get_properties. That would break too many
extensions. The idea is to create something like

HashTable *get_special_properties(zval *object, int purpose, int *is_tmp
TSRMLS_DC)

This would replace get_debug_info so the number of hooks would stay the
same.
The only extensions that would need to be changed are the ones that use
get_debug_info. The purposes could be debug, serialize, array_casting. If
there is
a new use case, you would just add new purpose and wouldn't have to change
anything.



 I haven't looked deeper but isn't your whole point to change the
 serialized representation of the object, to include all data properly?


No the whole point was returning the same properties but only if the object
is
serialized or debugged. Deeper explanation is in this thread:

http://news.php.net/php.internals/68305

Jakub


Re: [PHP-DEV] New handler for retrieving properties when object is serialized

2013-07-26 Thread Johannes Schlüter
On Fri, 2013-07-26 at 14:14 +0100, Jakub Zelenka wrote:
 
 
  This means one has to touch every extension using those hooks. more use
  cases in the future means having to do this over and over again, for a
  feature I see little benefit in.
 
 
 I don't want to change get_properties. That would break too many
 extensions. The idea is to create something like
 
 HashTable *get_special_properties(zval *object, int purpose, int *is_tmp
 TSRMLS_DC)
 
 This would replace get_debug_info so the number of hooks would stay the
 same.
 The only extensions that would need to be changed are the ones that use
 get_debug_info. The purposes could be debug, serialize, array_casting. If
 there is a new use case, you would just add new purpose and wouldn't have to 
 change
 anything.

Such a way for overloading is bad. This is painful on each update. If
there are different things to do use different hooks. The
implementations might call the same implementation but a purpose
option leads to a long term maintenance hell whenever there is a
change.(a tiny bit better would be using an enum instead of int as in
some situations compilers/analyzers might give a hint ... but still a
mess)

But I still don't buy the argument for this hook. 

Typically internal classes use some C data structure for keeping their
internal state. Often actually in a form that it can not be represented
in zval. From that observation (after a few years of looking at phpsrc,
pecl external and close-source exts) Serializable is the preferred way
to serialize these in an efficient way (converting everything in zvals,
then serialize, finally freeing temporary data is a quite expensive
way ..)
(mind: comparison with debug hooks is only of limited use. a) a debug
feature has less performance relevance b) a debug feature doesn't need
to convert the full state and has no need to be reversible)


So what is being proposed is to extend our already too large API with
yet another way of doing something for edge cases for which we have a
good solution for the general case already.

  I haven't looked deeper but isn't your whole point to change the
  serialized representation of the object, to include all data properly?
 
 
 No the whole point was returning the same properties but only if the object
 is serialized or debugged. Deeper explanation is in this thread:

So the purpose is to have a different serialized representation, so
there would be compatibility issues, similar to going to Serializable.

johannes



-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] New handler for retrieving properties when object is serialized

2013-07-26 Thread Jakub Zelenka
On Fri, Jul 26, 2013 at 2:59 PM, Johannes Schlüter
johan...@schlueters.dewrote:


 Such a way for overloading is bad. This is painful on each update. If
 there are different things to do use different hooks. The
 implementations might call the same implementation but a purpose
 option leads to a long term maintenance hell whenever there is a
 change.(a tiny bit better would be using an enum instead of int as in
 some situations compilers/analyzers might give a hint ... but still a
 mess)


Ok so probably better not to do that :)


 But I still don't buy the argument for this hook.

 Typically internal classes use some C data structure for keeping their
 internal state. Often actually in a form that it can not be represented
 in zval. From that observation (after a few years of looking at phpsrc,
 pecl external and close-source exts) Serializable is the preferred way
 to serialize these in an efficient way (converting everything in zvals,
 then serialize, finally freeing temporary data is a quite expensive
 way ..)


I agree that Serializable is better for performance. The reason why
I proposed this is to differentiate serialization properties from normal
properties in cases where Serializable would require lots
of coding for a small benefit (object is not serialized often)
or it is not an options (DateTime - see bellow).



 So the purpose is to have a different serialized representation, so
 there would be compatibility issues, similar to going to Serializable.


Yeah but that serialized representation is exactly the same as
the current representation (the serialized strings before the patch is
exactly the same as the string after the patch :)).  The difference is that
get_properties returns empty HashTable now. The serialization is
backward compatible though.

I just want to find a solution for this problem. What would you do?
Would you use Serialazable even if it breaks apps after updating PHP.
Or would you still use get_properties (possibly with testing
serialize_lock) ?

Jakub


Re: [PHP-DEV] New handler for retrieving properties when object is serialized

2013-07-26 Thread Stas Malyshev
Hi!

 This is a bit painful to generate and then parse (unserialize) string if
 you do it in C... Especially if you need only add few properties that
 will contain data for serialization. In that case Serailizable is a bit
 overkill...
 
 And for my use case (DateTime serialization) it would be BC as well.
 Different format (Serializable always generate Custom string).

I think given that we already have not one, but two interfaces for
dealing with serialization (__sleep and Serializable), adding a third
one that does essentially the same but with a little difference would be
really too much.

-- 
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] New handler for retrieving properties when object is serialized

2013-07-26 Thread Gustavo Lopes

On 26-07-2013 15:59, Johannes Schlüter wrote:


Such a way for overloading is bad. This is painful on each update. If
there are different things to do use different hooks. The
implementations might call the same implementation but a purpose
option leads to a long term maintenance hell whenever there is a
change.(a tiny bit better would be using an enum instead of int as in
some situations compilers/analyzers might give a hint ... but still a
mess)


This is a gross misrepresentation. The vast majority of extensions will 
not care about this purpose, just like most extensions do not care about 
get_debug_info or get_gc or about the purpose in read_property. If a new 
purpose is added, this would not be any different than adding a new hook 
as long as:


1) the extension doesn't purposefully break forward compatibility, for 
example by doing switch(purpose) { ... default: panic(); } and
2) the new hook would default to get_properties, either by explicitly 
setting the pointer to get_properties in the standard handlers array or 
by having the engine fall back to get_properties (the last one was the 
solution adopted for get_debug_info and get_gc afaik).


Yes, this would break extensions. But we never made any promises about 
source BC for extensions. In fact, we actually break it for every 
release, sometimes in rather subtle ways like in 5.4 (due to interned 
strings, and object properties storage).




So what is being proposed is to extend our already too large API with
yet another way of doing something for edge cases for which we have a
good solution for the general case already.



Alternatively, could the problem perhaps be mitigated by making 
available some utility functions for serializing native types?


--
Gustavo Lopes

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] New handler for retrieving properties when object is serialized

2013-07-26 Thread Gustavo Lopes

On 26-07-2013 11:53, Jakub Zelenka wrote:

You can check for BG(serialize_lock) instead of GC_G(gc_active)



What if there was an extension that call php_var_serialize without locking?
I know there probably isn't any but still I don't think that this is a nice
way how to provide data for serialization.



I think not using PHP_VAR_SERIALIZE_INIT() would always be an error in 
the extension anyway. However, you are right this method is not nice 
by any means.


--
Gustavo Lopes

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] New handler for retrieving properties when object is serialized

2013-07-26 Thread Stas Malyshev
Hi!

 Yes, this would break extensions. But we never made any promises about 
 source BC for extensions. In fact, we actually break it for every 

Actually, we try to preserve source BC for extensions, unless they do
some weird things or they go very deep into the implementation details.
Regular extension generally should not suffer from adding a feature.

 Alternatively, could the problem perhaps be mitigated by making 
 available some utility functions for serializing native types?

We could definitely make some parts of serializer API-accessible, or
even make serializer as a whole to be more API-friendly, I think it'd be
a nice idea if making implementing Serializable would be made easy by
reusing serializer code and combining pieces. If now it's hard making it
easier definitely a good idea, much better than creating one more API IMHO.
-- 
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



[PHP-DEV] New handler for retrieving properties when object is serialized

2013-07-25 Thread Jakub Zelenka
Hi,

I have just sent this PR:  https://github.com/php/php-src/pull/397

It's reaction to the comment from nikic in PR:
https://github.com/php/php-src/pull/393

The patch is about adding new object handler that is used when object is
serialized. Currently this needs to be done using get_properties which is
sometimes a bit over kill.

It's useful in situation if you only need to change properties for
serialization. It's very similar to get_debug_info that is used only in
specific situations (print_r...)

Currently I call it get_serialize_info . It was the first name that came to
my mind. :) I am happy to change if someone has a better idea?

Any thoughts?

Jakub


Re: [PHP-DEV] New handler for retrieving properties when object is serialized

2013-07-25 Thread Stas Malyshev
Hi!

 It's useful in situation if you only need to change properties for
 serialization. It's very similar to get_debug_info that is used only in
 specific situations (print_r...)

Isn't __sleep already doing this?

-- 
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] New handler for retrieving properties when object is serialized

2013-07-25 Thread Nikita Popov
On Thu, Jul 25, 2013 at 9:09 PM, Stas Malyshev smalys...@sugarcrm.comwrote:

 Hi!

  It's useful in situation if you only need to change properties for
  serialization. It's very similar to get_debug_info that is used only in
  specific situations (print_r...)

 Isn't __sleep already doing this?


__sleep returns an array of object property *names* to be serialized. Using
__sleep you can restrict the set of serialized properties, but the
properties still need to exist.

Nikita


Re: [PHP-DEV] New handler for retrieving properties when object is serialized

2013-07-25 Thread Stas Malyshev
Hi!

 __sleep returns an array of object property *names* to be serialized.
 Using __sleep you can restrict the set of serialized properties, but the
 properties still need to exist.

You can create those properties when processing __sleep. Nothing
requires them to exist before __sleep was called.

-- 
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] New handler for retrieving properties when object is serialized

2013-07-25 Thread Jakub Zelenka
On Thu, Jul 25, 2013 at 8:56 PM, Stas Malyshev smalys...@sugarcrm.comwrote:


 You can create those properties when processing __sleep. Nothing
 requires them to exist before __sleep was called.


The problem is that properties will exist after __sleep was called. :)
There are situations that you don't want them to exist (for example
DateTime).


Re: [PHP-DEV] New handler for retrieving properties when object is serialized

2013-07-25 Thread Gustavo Lopes

On 25-07-2013 18:42, Jakub Zelenka wrote:

I have just sent this PR:  https://github.com/php/php-src/pull/397

It's reaction to the comment from nikic in PR:
https://github.com/php/php-src/pull/393

The patch is about adding new object handler that is used when object is
serialized. Currently this needs to be done using get_properties which is
sometimes a bit over kill.

It's useful in situation if you only need to change properties for
serialization. It's very similar to get_debug_info that is used only in
specific situations (print_r...)



This seems too narrow a situation to warrant a new serialization method, 
in addition to the two we already have.


That said, there are two precedents:

* get_debug_info (variant for var_dump)
* get_gc (variant for garbage collection)

And anyway, if you really want to add special properties for 
serialization, you can use the same hack that was used before get_gc was 
available:


http://lxr.php.net/xref/PHP_5_3/ext/spl/spl_observer.c#297
(and http://lxr.php.net/xref/PHP_5_3/ext/spl/spl_observer.c#258 )

You can check for BG(serialize_lock) instead of GC_G(gc_active)

What I think is that we should add an extra parameter to get_properties 
instead of keeping adding these.


--
Gustavo Lopes


--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] New handler for retrieving properties when object is serialized

2013-07-25 Thread Johannes Schlüter
On Thu, 2013-07-25 at 21:01 +0100, Jakub Zelenka wrote:
 On Thu, Jul 25, 2013 at 8:56 PM, Stas Malyshev smalys...@sugarcrm.comwrote:
 
 
  You can create those properties when processing __sleep. Nothing
  requires them to exist before __sleep was called.
 
 
 The problem is that properties will exist after __sleep was called. :)
 There are situations that you don't want them to exist (for example
 DateTime).

What about using our Serializable interface? - That was invented to
serialize internal objects with arbitrary internal data.

johannes



-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php