Re: [PHP-DEV] [RFC] Lazy Objects

Tim Düsterhus Fri, 05 Jul 2024 12:50:35 -0700

Hi

On 7/2/24 16:48, Nicolas Grekas wrote:

Thanks for the detailed feedback again, it's very helpful!
Let me try to answer many emails at once, in chronological order:

Note that this kind of bulk reply makes it very hard for me to keep track of mailing list threads. It breaks threading, which makes it much harder for me to find the original context of a quoted part, especially since you did not include the author / date for the quotes.

That said, I've taken a look at the differences since my email and also gave the entire RFC another read.

don't touch `readonly` because of lazy objects: this feature is too niche
to cripple a major-major feature like `readonly`. I would suggest deferring
until after the first bits of this RFC landed.


Following Marco's advice, we've decided to remove all the flags related to
the various ways to handle readonly. This also removes the secondary vote.
The behavior related to readonly properties is now that they are skipped if
already initialized when calling resetAsLazy* methods, throw in the
initializer as usual, and are resettable only if the class is not final, as
already allowed in userland (and as explained in the RFC).

The 'readonly' section still mentions 'makeInstanceLazy', which likely is a left-over from a previous version of the RFC. You should have another look and clean up the naming there.

There are not many reasons to do that. The only intended use-case that
doesn't involve an object freshly created with
->newInstanceWithoutConstructor() is to let an object manage its own
laziness by making itself lazy in its constructor:


Okay. But the RFC (and your email) does not explain why I would want to do
that. It appears that much of the RFC's complexity (e.g. around readonly
properties and destructors) stems from the wish to support turning an
existing object into a lazy object. If there is no strong reason to
support that, I would suggest dropping that. It could always be added in
a future PHP version.


This capability is needed for two reasons: 1. completeness and 2. feature
parity with what can be currently done using magic methods (so that it's
already used to solve real-world problems).

Many things are already possible in userland. That does not always mean that the cost-benefit ratio is appropriate for inclusion in core. I get behind the two examples in the “About Lazy-Loading Strategies” section, but I'm afraid I still can't wrap my head around why I would want an object that makes itself lazy in its own constructor: I have not yet seen a real-world example.

True, thanks for raising this point. After brainstorming with Arnaud, we
improved this behavior by:
1. allowing only parent classes, not child classes
2. requiring that all properties from a real instance have a corresponding
one on the proxy OR that the extra properties on the proxy are skipped/set
before initialization.

This means that it's now possible for a child class to add a property,
private or not. There's one requirement: the property must be skipped or
set before initialization.

For the record, with magic methods, we currently have no choice but to
create an inheritance proxy. This means the situation of having Proxy
extend Real like in your example is the norm. While doing so, it's pretty
common to attach some interface so that we can augment Real with extra
capabilities (let's say Proxy implements LazyObjectInterface). Being able
to use class Real as a backing store for Proxy gives us a very smooth
upgrade path (the implementation of the laziness can remain an internal
detail), and it's also sometimes the only way to leverage a factory that
returns Real, not Proxy.

I'm not entirely convinced that this is sound now, but I'm not in a state to think this through in detail.

I have one question regarding the updated initialization sequence. The RFC writes:

Properties that are declared on the real instance are uninitialized on the 
proxy instance (including overlapping properties used with 
ReflectionProperty::skipLazyInitialization() or 
setRawValueWithoutLazyInitialization()) to synchronize the state shared by both 
instances.

I do not understand this. Specifically I do not understand the "to synchronize the state" bit. My understanding is that the proxy will always forward the property access, so there effectively is no state on the proxy?! A more expansive explanation would be helpful. Possibly with an example that explains what would break if this did not happen.

That is very true. I had a look at the userland implementation and indeed,
we keep the wrapper while cloning the backing instance (it's not that we
have the choice, the engine doesn't give us any other options).
RFC updated.

We also updated the behavior when an uninitialized proxy is cloned: we now
postpone calling $real->__clone to the moment where the proxy clone is
initialized.

Do I understand it correctly that the initializer of the cloned proxy is effectively replaced by the following:


    function (object $clonedProxy) use ($originalProxy) {
        return clone $originalProxy->getRealObject();
    }

? Then I believe this is unsound. Consider the following:

    $myProxy = $r->newLazyProxy(...);
    $clonedProxy = clone $myProxy;
    $r->initialize($myProxy);
    $myProxy->someProp++;
    var_dump($clonedProxy->someProp);

The clone was created before `someProp` was modified, but it outputs the value after modification!

Also: What happens if the cloned proxy is initialized *before* the original proxy? There is no real object to clone.

I believe the correct behavior would be: Just clone the proxy and keep the same initializer. Then both proxies are actually fully independent after cloning, as I would expect from the clone operation.
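
The difference between the two cloning strategies can be illustrated with a minimal userland sketch. It emulates a proxy with magic methods and does not use the RFC's engine-level implementation; `Counter`, `SketchProxy` and its `getRealObject()` helper are hypothetical names introduced only for this illustration, and cloning is modelled by constructing a second proxy by hand.

```php
<?php
// Hypothetical real class with one public property.
final class Counter
{
    public int $someProp = 0;
}

// Hypothetical userland stand-in for a lazy proxy (assumption: not the RFC's API).
final class SketchProxy
{
    private ?object $real = null;

    public function __construct(private \Closure $initializer) {}

    // Creates the real instance on first use, then keeps returning it.
    public function getRealObject(): object
    {
        return $this->real ??= ($this->initializer)($this);
    }

    // All property reads and writes are forwarded to the real instance.
    public function __get(string $name): mixed
    {
        return $this->getRealObject()->$name;
    }

    public function __set(string $name, mixed $value): void
    {
        $this->getRealObject()->$name = $value;
    }
}

// Strategy 1: the clone's initializer clones the original's real instance later.
$original = new SketchProxy(fn (object $proxy): Counter => new Counter());
$deferredClone = new SketchProxy(
    fn (object $proxy): Counter => clone $original->getRealObject()
);
$original->getRealObject(); // initialize the original
$original->someProp++;      // modify it after the "clone" was taken
var_dump($deferredClone->someProp); // int(1): the later modification leaks in

// Strategy 2: the clone simply keeps the same initializer.
$initializer = fn (object $proxy): Counter => new Counter();
$original2 = new SketchProxy($initializer);
$independentClone = new SketchProxy($initializer);
$original2->someProp++;
var_dump($independentClone->someProp); // int(0): the clone is independent
```

In this emulation the first strategy also forces the original proxy to initialize as soon as the clone is touched, which is one way to read the "no real object to clone" question.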

   Any access to a non-existent (i.e. dynamic) property will trigger
initialization and this is not preventable using
'skipLazyInitialization()' and 'setRawValueWithoutLazyInitialization()'
because these only work with known properties?

While dynamic properties are deprecated, this should be clearly spelled
out in the RFC for voters to make an informed decision.



Absolutely. From a behavioral PoV, dynamic vs non-dynamic properties
doesn't matter: both kinds are uninitialized at this stage and the engine
will trigger object handlers in the same way (it will just not trigger the
same object handlers).

Unless I missed it, you didn't update the RFC to mention this. Please do so, I find it important to have a record of all details that were discussed (e.g. for the documentation or when evaluating bug reports).

   > If the object is already lazy, a ReflectionException is thrown with
the message “Object is already lazy”.

What happens when calling the method on an *initialized* proxy object?
i.e. the following:

      class Obj { public function __construct(public string $name) {} }
      $obj1 = new Obj('obj1');
      $r->resetAsLazyProxy($obj1, ...);
      $r->initialize($obj1);
      $r->resetAsLazyProxy($obj1, ...);

What happens when calling it for the actual object of an initialized
proxy object?



Once initialized, a lazy object should be indistinguishable from a non-lazy
one.
This means that the second call to resetAsLazyProxy will just do that:
reset the object like it does for any regular object.

It's probably not possible to prevent this, but will this
allow for proxy chains? Example:

      class Obj { public function __construct(public string $name) {} }
      $obj1 = new Obj('obj1');
      $r->resetAsLazyProxy($obj1, function () use (&$obj2) {
          $obj2 = new Obj('obj2');
          return $obj2;
      });
      $r->resetAsLazyProxy($obj2, function () {
          return new Obj('obj3');
      });
      var_dump($obj1->name); // what will this print?



This example doesn't work because $obj2 doesn't exist when trying to make
it lazy but you probably mean this instead?

Ah, yes you are right. An initialization is missing in the middle of the two `reset` calls (like in the previous example). My question was specifically about resetting an initialized proxy, so your adjusted example is *not quite* what I was looking for, but the results should probably be the same?


      class Obj { public function __construct(public string $name) {} }

      $obj1 = new Obj('obj1');
      $obj2 = new Obj('obj2');
      $r->resetAsLazyProxy($obj1, function () use ($obj2) {
          return $obj2;
      });
      $r->resetAsLazyProxy($obj2, function () {
          return new Obj('obj3');
      });
      var_dump($obj1->name); // what will this print?



This will print "obj3": each object is separate from the other from a
behavioral perspective, but with such a chain, accessing $obj1 will trigger
its initializer and will then access $obj2->name, which will trigger the
second initializer then access $obj3->name, which contains "obj3".
(I just confirmed with the implementation I have, which is from a previous
API flavor, but the underlying mechanisms are the same).


Okay, that works as expected then.

Please let me know if any topics remain unanswered.


I've indeed found two more questions.

1.

Just to confirm my understanding: The RFC mentions that the initializer of a proxy receives the proxy object as the first parameter. It further mentions that making changes is legal (but likely useless).

My understanding is that attempting to read a property of the initializer object will most likely fail, because it still is uninitialized? Or are the properties of the proxy object initialized with their default value before calling the initializer?


For ghost objects the behavior is clear, just not for proxies.

2.

> Properties are not initialized to their default value yet (they are initialized before calling the initializer).

I see that you removed the bit about this being not observable. What is the reason that you removed that? One possible reason that comes to my mind is a default value that refers to a non-existing constant. It would be observable because the initialization emits an error. Are there any other reasons?
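
As a sketch of the non-existing constant case: the snippet below uses a regular (non-lazy) object, so it only shows that initializing default values is an observable step in general, not how the lazy object implementation behaves. `WithBrokenDefault` and `SOME_UNDEFINED_CONSTANT` are hypothetical names chosen for the illustration.

```php
<?php
// Declaring the class succeeds: the default value expression is only
// evaluated once the property defaults are actually initialized.
class WithBrokenDefault
{
    public $value = SOME_UNDEFINED_CONSTANT;
}

$r = new ReflectionClass(WithBrokenDefault::class);

$caught = null;
try {
    // No constructor runs here, yet the property defaults are still
    // initialized, which is where the undefined constant is noticed.
    $r->newInstanceWithoutConstructor();
} catch (Error $e) {
    $caught = $e;
}

// The failure surfaces as a thrown Error, so the moment at which the
// defaults are initialized can be observed from userland.
var_dump($caught instanceof Error); // bool(true)
```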


Best regards
Tim Düsterhus
