On Sun, Jul 8, 2018 at 8:36 PM, Nicolas Grekas <nicolas.grekas+...@gmail.com
> wrote:

>
> Before talking about solutions, can the people who need this first outline
>>>> what functionality is needed and what it is needed for (and maybe what
>>>> workarounds you currently use). E.g. do you only need to know whether
>>>> something is a reference, or do you need to know whether two somethings
>>>> are
>>>> part of the same reference, etc. There are probably multiple use cases
>>>> for
>>>> this with different needs.
>>>>
>>>
>>> We're using reference introspection to do both: we need to know when a
>>> zval is a reference, and we also need to track each of them separately.
>>>
>>> The use case is being able to intropect any arbitrary PHP datastructure,
>>> with one main application: providing an enhanced "dump()" function.
>>>
>>> See e.g. this screenshot for what we get using the dump() function
>>> provided by Symfony VarDumper component:
>>> https://symfony.com/doc/current/_images/07-hard-ref.png
>>>
>>> In PHP5 days, Julien Pauli wrote a PHP extension to do zval
>>> introspection.
>>> Here is the code + README (see test case 001.phpt for example with
>>> references.):
>>> https://github.com/symfony/symfony/tree/3.4/src/Symfony/Comp
>>> onent/Debug/Resources/ext
>>>
>>> With PHP7, using pure PHP introspection is easier to maintain and still
>>> very fast so we deprecated the extension.
>>> Here is the code doing reference introspection:
>>> https://github.com/symfony/symfony/blob/master/src/Symfony/C
>>> omponent/VarDumper/Cloner/VarCloner.php#L83
>>>
>>> it might not be easy to follow, but the basic blocks are:
>>>
>>> $array2 = $array1;
>>> $array2[$key] = $unique_cookie;
>>> if ($array1[$key] === $unique_cookie) => we found a reference
>>> then we also maintain a registry of $unique_cookie so that we know if we
>>> already saw that reference or not (the check is done before the above "if"
>>> or course.)
>>>
>>
>> Thanks for the explanation. I think that the VarCloner use case needs two
>> bits of functionality:
>>
>> 1. Detecting whether a variable is a reference, so you can handle this
>> specially.
>> 2. An efficient way of determining whether a variable is part of a
>> reference that has already been seen (and which one).
>>
>> The second requirement is stronger than just the ability to detect
>> whether two variables are part of the same reference. Given just a
>> same_ref($v1, $v2) function, one would have to check against a list of all
>> previously seen references one at a time, rather than only performing a
>> hashtable lookup.
>>
>> Currently this functionality is implemented as:
>>
>> 1. Copying the array, assigning a cookie to the copy and seeing if the
>> original array is modified. With an extra catch for TypeErrors, this is
>> compatible with typed properties.
>> 2. Replacing the reference with a Stub object, which can be looked up by
>> object id. At the end the Stub objects are replaced with their values
>> again. This is fundamentally incompatible with typed properties, as the
>> type will likely not permit the Stub class.
>>
>> Here are my thoughts on possible APIs for this use case.
>>
>> Construction of reference-reflection objects
>> -----
>>
>> An issue already discussed in the other threads is that in PHP we need to
>> specify whether a parameter is accepted by reference, by value or by
>> preferred-reference. We don't have the possibility of accepting either a
>> value or reference, whatever we get. This leaves us with a few options:
>>
>> 1. Introducing a VM-level primitive that is not subject to this
>> limitation. The typed properties thread suggested a reflect_variable()
>> language construct. I'm not too fond of this option because reference
>> reflection seems like an awfully specific thing to introduce a new language
>> construct for.
>>
>> 2. A ReflectionReference::fromVariable(&$var) constructor. Contrary to
>> what was said in the other thread, this does not cause issues with the
>> copy-on-write mechanism. Since PHP 7 references and non-references can
>> share values (including immutablized values in SHM). However, this approach
>> does have two issues:
>> a) It is impossible to distinguish whether $var was a singleton reference
>> or a value beforehand. Both will show up as rc=2 references inside
>> ReflectionReference::fromVariable(). (This may also be an advantage,
>> because from a language-design perspective, we treat singleton references
>> as non-references.)
>> b) In case the original $var was a variable, it will now be a reference,
>> so this has a side-effect.
>>
>> 3. A ReflectionReference::fromArrayElem(array $array, string|int $key)
>> constructor, as suggested by Nicolas. This avoids the reference/value
>> problem and solves the specific VarCloner case efficiently and directly. On
>> the other hand, introspection of references inside non-arrays requires some
>> workarounds (e.g, casting objects to arrays).
>>
>> 4. A combination of these. For example we could have...
>> ... ReflectionReference::fromArrayElem(array $array, string|int $key)
>> for array items.
>> ... ReflectionReference::fromObjectProp(object $object, string $key) for
>> object properties.
>> ... ReflectionReference::fromVariable(&$var) for any other special cases.
>> This would allow to cover the common and interesting cases with
>> specialized methods, and leave a less efficient fallback for the general
>> case. This is probably the option I'd favor.
>>
>
> Either 3. or 4. would be good to me.
> IMHO 3. is enough, because :
> - ReflectionReference::fromObjectProp() => there need to be a way to deal
> with visibility, e.g. accessing private ones. Maybe get the
> ReflectionReference from a ReflectionProperty instead? Or just the
> array-cast is enought?
>

Good point, visibility handling is a problem here. In that case I agree
that it's best to just go through the array cast here.


> - ReflectionReference::fromVariable() => honestly, I don't see any use of
> local scope introspection. And if there is one, getting it as an array
> first is always possible, so you might prefer less complexity here (i.e not
> support this constructor)
>

Yeah, I don't see the use case either. Especially as this constructor has
the likely unwanted side-effect of turning the argument into a reference,
it's best not to include it.

If it turns out that these constructors are useful after all, it's always
possible to add them later.

Determining whether something is a reference
>> -----
>>
>> I think the best way to handle this (and the reason why I used named
>> constructors above) is to return null if the value is not a reference. This
>> should be the most common case and it would be best to avoid the overhead
>> of constructing an unnecessary object in this case.
>>
>> One important question in this context would be whether we consider
>> singleton references as references or not. If we do, then the
>> ReflectionReference::fromVariable() constructor will always return a
>> non-null value, as the variables will be turned into a singleton reference
>> if it was a reference. If we consider them as references, we'll also want
>> an API method to distinguish them. E.g. a specialized isSingleton() or more
>> generally getNumUsers() == 1.
>>
>> The alternative would be to always construct a ReflectionReference object
>> which may or may not be a reference and has an isReference() method. I
>> don't see any advantages to that approach though.
>>
>
> If we're seeking for a benefit, it would be by turning this to a
> ReflectionZval insteaf of ReflectionReference. There would be isRef, but
> also getType, etc. But honeslty I'm with you on this, only refs make sense.
>

I'm wondering if there is any legitimate use for something like this. The
only thing that comes to mind is comparing arrays by identity. E.g.
https://github.com/symfony/symfony/blob/master/src/Symfony/Component/VarDumper/Cloner/VarCloner.php#L153
could check directly check whether an array is the $GLOBALS array. It would
also allow direct recursion detection on arrays (ref
https://stackoverflow.com/questions/9105816/is-there-a-way-to-detect-circular-arrays-in-pure-php).
Doing the same on just references is nearly the same, but doesn't handle
the $GLOBALS case correctly.


> Reference equality
>> -----
>>
>> A couple of approaches:
>>
>> 1. Have an isEqual(ReflectionReference $other): bool method, which
>> determined whether two references are the same. The disadvantage is that
>> this only allows pair-wise comparisons, so it does not fully solve the
>> VarCloner use-case.
>>
>> 2. Make ReflectionReference constructor uniquing. That is, if a
>> ReflectionReference for a certain reference already exists, then the
>> constructor will return the same object. This means that references can be
>> compared by identity $ref1 === $ref2. It also means that they can be used
>> in hashtables via spl_object_id(). (Caveat: It's important to keep the
>> ReflectionReference object alive for the during in which spl_object_id() is
>> used, as usual.)
>>
>> 3. Some variation on 2 via a separate API. That is don't unique
>> ReflectionReferences themselves, but provide a separate getId() API. The
>> returned ID would only be meaningful as long as at least one
>> ReflectionReference object for the reference is live, otherwise it may be
>> reused.
>>
>
> 2. would work, as I could use that with an SplObjectStorage (which would
> satisfy your condition of keeping the object around).
>
> There is a 4th possibility: have a "ReflectionReference::getHash" method,
> that would return a unique string/id per reference (what
> symfony_zval_debug() does.)
>

Is this the same as my 3rd suggestion with a different name, or is there a
distinction here in that it removes the requirement for the
ReflectionReference object to be live? The reason I included that was to
make sure we have a (non-leaking) way of assigning ids to references.
Otherwise the only way to do is to use the address of the zend_reference as
the hash/id. I don't like exposing memory addresses directly to userland
(as this is generally a security issue), but we could return the address
hashed with a per-process key. This would make the function somewhat slower
though.

Nikita

Reply via email to