Hi
On 7/2/24 16:48, Nicolas Grekas wrote:
Thanks for the detailed feedback again, it's very helpful!
Let me try to answer many emails at once, in chronological order:
Note that this kind of bulk reply make it very hard for me to keep track
of mailing list threads. It breaks threading, which makes it much harder
for me to find original context of a quoted part, especially since you
did not include the author / date for the quotes.
That said, I've taken a look at the differences since my email and also
gave the entire RFC another read.
don't touch `readonly` because of lazy objects: this feature is too niche
to cripple a major-major feature like `readonly`. I would suggest deferring
until after the first bits of this RFC landed.
Following Marco's advice, we've decided to remove all the flags related to
the various ways to handle readonly. This also removes the secondary vote.
The behavior related to readonly properties is now that they are skipped if
already initialized when calling resetAsLazy* methods, throw in the
initializer as usual, and are resettable only if the class is not final, as
already allowed in userland (and as explained in the RFC).
The 'readonly' section still mentions 'makeInstanceLazy', which likely
is a left-over from a previous version of the RFC. You should have
another look and clean up the naming there.
There are not many reasons to do that. The only indented use-case that
doesn't involve an object freshly created with
->newInstanceWithoutConstructor() is to let an object manage its own
laziness by making itself lazy in its constructor:
Okay. But the RFC (and your email) does not explain why I would want do
that. It appears that much of the RFC's complexity (e.g. around readonly
properties and destructors) stems from the wish to support turning an
existing object into a lazy object. If there is no strong reason to
support that, I would suggest dropping that. It could always be added in
a future PHP version.
This capability is needed for two reasons: 1. completeness and 2. feature
parity with what can be currently done using magic methods (so that it's
already used to solve real-world problems).
Many things are already possible in userland. That does not always mean
that the cost-benefit ratio is appropriate for inclusion in core. I get
behind the two examples in the “About Lazy-Loading Strategies” section,
but I'm afraid I still can't wrap my head why I would want an object
that makes itself lazy in its own constructor: I have not yet seen a
real-world example.
True, thanks for raising this point. After brainstorming with Arnaud, we
improved this behavior by:
1. allowing only parent classes, not child classes
2. requiring that all properties from a real instance have a corresponding
one on the proxy OR that the extra properties on the proxy are skipped/set
before initialization.
This means that it's now possible for a child class to add a property,
private or not. There's one requirement: the property must be skipped or
set before initialization.
For the record, with magic methods, we currently have no choice but to
create an inheritance proxy. This means the situation of having Proxy
extend Real like in your example is the norm. While doing so, it's pretty
common to attach some interface so that we can augment Real with extra
capabilities (let's say Proxy implements LazyObjectInterface). Being able
to use class Real as a backing store for Proxy gives us a very smooth
upgrade path (the implementation of the laziness can remain an internal
detail), and it's also sometimes the only way to leverage a factory that
returns Real, not Proxy.
I'm not entirely convinced that this is sound now, but I'm not in a
state to think this through in detail.
I have one question regarding the updated initialization sequence. The
RFC writes:
Properties that are declared on the real instance are uninitialized on the
proxy instance (including overlapping properties used with
ReflectionProperty::skipLazyInitialization() or
setRawValueWithoutLazyInitialization()) to synchronize the state shared by both
instances.
I do not understand this. Specifically I do not understand the "to
synchronize the state" bit. My understanding is that the proxy will
always forward the property access, so there effectively is no state on
the proxy?! A more expansive explanation would be helpful. Possibly with
an example that explains what would break if this would not happen.
That is very true. I had a look at the userland implementation and indeed,
we keep the wrapper while cloning the backing instance (it's not that we
have the choice, the engine doesn't give us any other options).
RFC updated.
We also updated the behavior when an uninitialized proxy is cloned: we now
postpone calling $real->__clone to the moment where the proxy clone is
initialized.
Do I understand it correctly that the initializer of the cloned proxy is
effectively replaced by the following:
function (object $clonedProxy) use ($originalProxy) {
return clone $originalProxy->getRealObject();
}
? Then I believe this is unsound. Consider the following:
$myProxy = $r->newLazyProxy(...);
$clonedProxy = clone $myProxy;
$r->initialize($myProxy);
$myProxy->someProp++;
var_dump($clonedProxy->someProp);
The clone was created before `someProp` was modified, but it outputs the
value after modification!
Also: What happens if the cloned proxy is initialized *before* the
original proxy? There is no real object to clone.
I believe the correct behavior would be: Just clone the proxy and keep
the same initializer. Then both proxies are actually fully independent
after cloning, as I would expect from the clone operation.
Any access to a non-existant (i.e. dynamic) property will trigger
initialization and this is not preventable using
'skipLazyInitialization()' and 'setRawValueWithoutLazyInitialization()'
because these only work with known properties?
While dynamic properties are deprecated, this should be clearly spelled
out in the RFC for voters to make an informed decision.
Absolutely. From a behavioral PoV, dynamic vs non-dynamic properties
doesn't matter: both kinds are uninitialized at this stage and the engine
will trigger object handlers in the same way (it will just not trigger the
same object handlers).
Unless I missed it, you didn't update the RFC to mention this. Please do
so, I find it important to have a record of all details that were
discussed (e.g. for the documentation or when evaluating bug reports).
> If the object is already lazy, a ReflectionException is thrown with
the message “Object is already lazy”.
What happens when calling the method on a *initialized* proxy object?
i.e. the following:
class Obj { public function __construct(public string $name) {} }
$obj1 = new Obj('obj1');
$r->resetAsLazyProxy($obj, ...);
$r->initialize($obj);
$r->resetAsLazyProxy($obj, ...);
What happens when calling it for the actual object of an initialized
proxy object?
Once initialized, a lazy object should be indistinguishable from a non-lazy
one.
This means that the second call to resetAsLazyProxy will just do that:
reset the object like it does for any regular object.
It's probably not possible to prevent this, but will this
allow for proxy chains? Example:
class Obj { public function __construct(public string $name) {} }
$obj1 = new Obj('obj1');
$r->resetAsLazyProxy($obj1, function () use (&$obj2) {
$obj2 = new Obj('obj2');
return $obj2;
});
$r->resetAsLazyProxy($obj2, function () {
return new Obj('obj3');
});
var_dump($obj1->name); // what will this print?
This example doesn't work because $obj2 doesn't exist when trying to make
it lazy but you probably mean this instead?
Ah, yes you are right. An initialization is missing in the middle of the
two `reset` calls (like in the previous example). My question was
specifically about resetting an initialized proxy, so your adjusted
example is *not quite* what I was looking for, but the results should
probably be the same?
class Obj { public function __construct(public string $name) {} }
$obj1 = new Obj('obj1');
$obj2 = new Obj('obj2');
$r->resetAsLazyProxy($obj1, function () use ($obj2) {
return $obj2;
});
$r->resetAsLazyProxy($obj2, function () {
return new Obj('obj3');
});
var_dump($obj1->name); // what will this print?
This will print "obj3": each object is separate from the other from a
behavioral perspective, but with such a chain, accessing $obj1 will trigger
its initializer and will then access $obj2->name, which will trigger the
second initializer then access $obj3->name, which contains "obj3".
(I just confirmed with the implementation I have, which is from a previous
API flavor, but the underlying mechanisms are the same).
Okay, that works as expected then.
Please let me know if any topics remain unanswered.
I've indeed found two more questions.
1.
Just to confirm my understanding: The RFC mentions that the initializer
of a proxy receives the proxy object as the first parameter. It further
mentions that making changes is legal (but likely useless).
My understanding is that attempting to read a property of the
initializer object will most likely fail, because it still is
uninitialized? Or are the properties of the proxy object initialized
with their default value before calling the initializer?
For ghost objects the behavior is clear, just not for proxies.
2.
> Properties are not initialized to their default value yet (they are
initialized before calling the initializer).
I see that you removed the bit about this being not observable. What is
the reason that you removed that? One possible reason that comes to my
mind is a default value that refers to a non-existing constant. It would
be observable because the initialization emits an error. Are there any
other reasons?
Best regards
Tim Düsterhus