Re: [PHP-DEV] [RFC] Lazy Objects

Benjamin Außenhofer Fri, 12 Jul 2024 00:54:45 -0700

Am 12.07.2024, 08:00:18 schrieb Rob Landers <rob@bottled.codes>:

>
>
> On Fri, Jul 12, 2024, at 01:40, Benjamin Außenhofer wrote:
>
>
>
> Am 11.07.2024, 20:31:44 schrieb Tim Düsterhus <t...@bastelstu.be>:
>
> Hi
>
> On 7/11/24 10:32, Nicolas Grekas wrote:
>
> > Many things are already possible in userland. That does not always mean
>
> > that the cost-benefit ratio is appropriate for inclusion in core. I get
>
> > behind the two examples in the “About Lazy-Loading Strategies” section,
>
> > but I'm afraid I still can't wrap my head why I would want an object
>
> > that makes itself lazy in its own constructor: I have not yet seen a
>
> > real-world example.
>
> >
>
>
> Keeping this capability for userland is not an option for me as it would
>
> mostly defeat my goal, which is to get rid of any userland code on this
>
> topic (and is achieved by the RFC).
>
>
> Here is a real-world example:
>
>
> https://github.com/doctrine/DoctrineBundle/blob/2.12.x/src/Repository/LazyServiceEntityRepository.php
>
>
> This class currently uses a poor-man's implementation of lazy objects and
>
> would greatly benefit from resetAsLazyGhost().
>
>
>
> Sorry, I was probably a little unclear with my question. I was not
> specifically asking if anyone did that, because I am fairly sure that
> everything possible has been done before.
>
> I was interested in learning why I would want to promote a
> "LazyServiceEntityRepository" instead of the user of my library just
> making the "ServiceEntityRepository" lazy themselves.
>
> I understand that historically making the "ServiceEntityRepository" lazy
> yourself would have been very complicated, but the new RFC makes this
> super easy.
>
> So based on my understanding the "LazyServiceEntityRepository"
> (c|sh)ould be deprecated with the reason that PHP 8.4 provides all the
> necessary tools to do it yourself, no? That would also match your goal
> of getting rid of userland code on this topic.
>
> To me this is what the language evolution should do: Enable users to do
> things that previously needed to be provided by userland libraries,
> because they were complicated and fragile, not enabling userland
> libraries to simplify things that they should not need to provide in the
> first place because the language already provides it.
>
>
> I agree with Tim here, the Doctrine ORM EntityRepository plus Symfony
> Service Entity Repository extension are not a necessary real world case
> that would require this RFC  to include a way for classes to make
> themselves lazy.
>
> I took the liberty at rewriting the code of DefaultRepositoryFactory
> (Doctrine code itself) and ContainerRepositoryFactory in a way to make the
> repositories lazy without needing resetAsLazy, just
> $reflector->createLazyProxy. In case of the second the
> LazyServiceEntityRepository class could be deleted.
>
> https://gist.github.com/beberlei/80d7a3219b6a2a392956af18e613f86a
>
> Please let me know if this is not how it works or can work or if my
> reasoning is flawed.
>
> Unless you have no way of getting to the „new $object“ in the code, there
> is always a way to just use newLazy*. And when a library does not expose
> new $object to you to override, then that is an architectural choice (and
> maybe flaw that you have to accept).
>
> I still think not having the reset* methods would greatly simplify this
> RFC and would allow to force more constraints, have less footguns.
>
> For example we could simplify the API of newLazyProxy to not receive a
> $factory that can arbitrarily create and get objects from somewhere, but
> also initializer and always force the lazy object to be an instance created
> by newInstanceWithoutConstructor.
>
> You said in a previous mail about reset*()
>
> From a technical pov, this is just  a different flavor of the same code
> infrastructure, so this is pretty  aligned with the rest of the proposed
> API.
>
>
> We are not specifically considering the technical POV, but even more
> importantly the user facing API. And this just adds to the surface of the
> API a lot of things that are pushing only a 1-5% edge case.
>
>
> > I have one question regarding the updated initialization sequence. The
>
> > RFC writes:
>
> >
>
> >> Properties that are declared on the real instance are uninitialized on
>
> > the proxy instance (including overlapping properties used with
>
> > ReflectionProperty::skipLazyInitialization() or
>
> > setRawValueWithoutLazyInitialization()) to synchronize the state shared
> by
>
> > both instances.
>
> >
>
> > I do not understand this. Specifically I do not understand the "to
>
> > synchronize the state" bit.
>
>
>
> We reworded this sentence a bit. Clearer?
>
>
> Yes, I think it is clearer. Let me try to rephrase this differently to
> see if my understanding is correct:
>
> ---
>
> For every property on that exists on the real instance, the property on
> the proxy instance effectively [1] is replaced by a property hook like
> the following:
>
>     public PropertyType $propertyName {
>         get {
>             return $this->realInstance->propertyName;
>         }
>         set(PropertyType $value) {
>             $this->realInstance->propertyName = $value;
>         }
>     }
>
> And value that is stored in the property will be freed (including
> calling the destructor if it was the last reference), as if `unset()`
> was called on the property.
>
> [1] No actual property hook will be created and the `realInstance`
> property does not actually exist, but the semantics behave as if such a
> hook would be applied.
>
> ---
>
>
> > My understanding is that the proxy will
>
> > always forward the property access, so there effectively is no state on
>
> > the proxy?!
>
>
>
> It follows that more properties can exist on the proxy itself (declared by
>
> child classes of the real object that the proxy implements).
>
>
>
> Right, that's mentioned in (2), so all clear.
>
>
> >> That is very true. I had a look at the userland implementation and
>
> > indeed,
>
> >> we keep the wrapper while cloning the backing instance (it's not that we
>
> >> have the choice, the engine doesn't give us any other options).
>
> >> RFC updated.
>
> >>
>
> >> We also updated the behavior when an uninitialized proxy is cloned: we
>
> > now
>
> >> postpone calling $real->__clone to the moment where the proxy clone is
>
> >> initialized.
>
> >
>
> > Do I understand it correctly that the initializer of the cloned proxy is
>
> > effectively replaced by the following:
>
> >
>
> >       function (object $clonedProxy) use ($originalProxy) {
>
> >           return clone $originalProxy->getRealObject();
>
> >       }
>
> >
>
>
> Nope, that's not what we describe in the RFC so I hope you can read it
>
> again and get where you were confused and tell us if we're not clear enough
>
> (to me we are :) )
>
>
> The "cloning of the real instance" bit is what lead me to this
> understanding.
>
> The $originalProxy is *not* shared with $clonedProxy. Instead, it's
>
> *initializers* that are shared between clones.
>
> And then, when we call that shared initializer in the $clonedProxy, we
>
> clone the returned instance, so that even if the initializer returns a
>
> shared instance, we don't share anything with the $originalProxy.
>
>
>
> Ah, so you mean if the initializer would look like this instead of
> creating a fresh object within the initializer?
>
>      $predefinedObject = new SomeObj();
>      $myProxy = $r->newLazyProxy(function () use ($predefinedObject) {
>          return $predefinedObject;
>      });
>      $clonedProxy = clone $myProxy;
>      $r->initialize($myProxy);
>      $r->initialize($clonedProxy);
>
> It didn't even occur to me that one would be able to return a
> pre-existing object: I assume that simply reusing the initializer would
> create a separate object and that would be sufficient to ensure that the
> cloned instance would be independent.
>
>
> > ? Then I believe this is unsound. Consider the following:
>
> >
>
> >       $myProxy = $r->newLazyProxy(...);
>
> >       $clonedProxy = clone $myProxy;
>
> >       $r->initialize($myProxy);
>
> >       $myProxy->someProp++;
>
> >       var_dump($clonedProxy->someProp);
>
> >
>
> > The clone was created before `someProp` was modified, but it outputs the
>
> > value after modification!
>
> >
>
> > Also: What happens if the cloned proxy is initialized *before* the
>
> > original proxy? There is no real object to clone.
>
> >
>
> > I believe the correct behavior would be: Just clone the proxy and keep
>
> > the same initializer. Then both proxies are actually fully independent
>
> > after cloning, as I would expect from the clone operation.
>
> >
>
>
> That's basically what we do and what we describe in the RFC, just with the
>
> added lazy-clone operation on the instance returned by the initializer.
>
>
>
> This means that if I would return a completely new object within the
> initializer then for a cloned proxy the new object would immediately be
> cloned and the original object be destructed, yes?
>
> Frankly, thinking about this cloning behavior gives me a headache,
> because it quickly leads to very weird semantics. Consider the following
> example:
>
>      $predefinedObject = new SomeObj();
>      $initializer = function () use ($predefinedObject) {
>          return $predefinedObject;
>      };
>      $myProxy = $r->newLazyProxy($initializer);
>      $otherProxy = $r->newLazyProxy($initializer);
>      $clonedProxy = clone $myProxy;
>      $r->initialize($myProxy);
>      $r->initialize($otherProxy);
>      $r->initialize($clonedProxy);
>
> To my understanding both $myProxy and $otherProxy would share the
> $predefinedObject as the real instance and $clonedProxy would have a
> clone of the $predefinedObject at the time of the initialization as its
> real instance?
>
> To me this sounds like cloning an uninitialized proxy would need to
> trigger an initialization to result in semantics that do not violate the
> principle of least astonishment.
>
> I would assume that cloning a proxy is something that rarely happens,
> because my understanding is that proxies are most useful for service
> objects, whereas ghost objects would be used for entities / value
> objects, so this should not be too much of a problem.
>
> > 2.
>
> >
>
> >   > Properties are not initialized to their default value yet (they are
>
> > initialized before calling the initializer).
>
> >
>
> > I see that you removed the bit about this being not observable. What is
>
> > the reason that you removed that? One possible reason that comes to my
>
> > mind is a default value that refers to a non-existing constant. It would
>
> > be observable because the initialization emits an error. Are there any
>
> > other reasons?
>
> >
>
>
> That's because this is observable using e.g. (array) or var_dump.
>
>
>
> I see. Perhaps add a short sentence with the reasoning. Something like:
>
> Properties are not initialized to their default value yet (they are
> initialized before calling the initializer). As an example, this has an
> impact on the behavior of an (array) cast on uninitialized objects and
> also when the default value is based on a constant that is not yet
> defined when creating the lazy object, but will be defined at the point
> of initialization.
>
> Best regards
> Tim Düsterhus
>
>
> For what it’s worth, I see “resetAsLazy()” being most useful for unit
> testing libraries that build proxies. While this feature will remove most
> of the tricky nuances around proxies, it doesn’t make it any easier in
> generating the code for them, so that has to be tested. Being able to write
> a test like this (abbreviated):
>
> $realObj = new $foo()
> $proxy = clone $realObj;
> makeTestProxy($proxy); // resets as lazy with initializer
> assert($realObj == $proxy);
>
> Is really simple. Without a reset method, this isn’t straightforward.
>


I don’t think this RFC can replace any logic from mock testing libraries
and doesn’t need the objects to be lazy.  Maybe I am not seeing the use
case here though.

The code generation part of a mock library to add the assertion logic needs
to happen anyways and making them lazy to defer initialization does not
seem a useful thing for a test library to do from my POV.

You can already do with ReflectionClass::newInstanceWithoutConstructor
everything that is needed for building mocks.

The only thing a lazy proxy / ghost could reasonbly do for mocking is to
allow saying what method was first called on the mock, but only when using
debug_backtrace in the factory method.

Maybe we could extend the proxy functionality in a follow-up RFC to allow
passing a $callInterceptor callback that gets invoked on every call to the
proxy. But this does not make reset* methods necessary.


> — Rob
>

Re: [PHP-DEV] [RFC] Lazy Objects

Reply via email to