[perl #129019] [BUG] Range.WHICH fails on many kinds of endpoints

Brian S. Julin via RT Wed, 13 Sep 2017 09:11:26 -0700

On Wed, 30 Aug 2017 07:03:10 -0700, zef...@fysh.org wrote:
> Brian S. Julin via RT wrote:
> >For example "Foo^".."Bar" and "Foo"^.."Bar" would put out the same WHICH.
> 
> Yes, and that's a bigger problem.  In general Rakudo's .WHICH methods
> suffer this sort of problem when incorporating the .WHICH values of
> subobjects.  See [perl #128943] (Set, and in which I sketched out how
> to fix it) and [perl #128947] (Pair, just like your Pair example).
> 
> >we have no choice than to either use an escape character quoteish
> >construct, or prepend a length:
> 
> The former would be much nicer to work with.
> 
> -zefram



OK, a few things to note going forward.  First, the .Str of a WHICH is not
necessarily the value of the WHICH.  I think in the case where we have
really large objects it would be OK for there to be some tiny chance
of a collision between two WHICH.Str's as long as the actual WHICHs
do not collide.  So using a hash in the visual presentation may be
OK, but using that hash for actually implementing === or eqv would
be wrong (as is done now in some cases).

Note Range does _NOT_ do so for eqv but seems to punt to comparing WHICH
for ===, and Range's WHICH is itself a Str, creating "accidental collisions"
which the spec advises against.

$ perl6 -e 'say ("200^".."foo") eqv ("200"^.."foo")'
False
$ perl6 -e 'say ("200^".."foo") === ("200"^.."foo")'
True
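
To make the collision concrete, here is a minimal sketch in Python (used
only as a neutral illustration; this is not Rakudo code, and the key layout
and the names naive_key, escaped_key and length_prefixed_key are made up for
the example).  It shows how plain concatenation lets endpoint text imitate
the exclusion marker, and how either an escape scheme or a length prefix
keeps the two ranges apart.

```python
# Hypothetical key layout: "Range|<min>[^]..[^]<max>". Not Rakudo's format.

def _markers(excl_lo: bool, excl_hi: bool) -> str:
    """Render the structural part between the two endpoints."""
    return f"{'^' if excl_lo else ''}..{'^' if excl_hi else ''}"

def naive_key(lo: str, hi: str, excl_lo: bool = False, excl_hi: bool = False) -> str:
    """Concatenate endpoints as-is; endpoint text can imitate the markers."""
    return f"Range|{lo}{_markers(excl_lo, excl_hi)}{hi}"

def escaped_key(lo: str, hi: str, excl_lo: bool = False, excl_hi: bool = False) -> str:
    """Backslash-escape every character that carries structural meaning."""
    def esc(s: str) -> str:
        # Escape the backslash first so later escapes stay unambiguous.
        return s.replace("\\", "\\\\").replace("^", "\\^").replace(".", "\\.")
    return f"Range|{esc(lo)}{_markers(excl_lo, excl_hi)}{esc(hi)}"

def length_prefixed_key(lo: str, hi: str, excl_lo: bool = False, excl_hi: bool = False) -> str:
    """Prefix each endpoint with its length so a parser never scans for markers."""
    def part(s: str) -> str:
        return f"{len(s)}:{s}"
    return f"Range|{part(lo)}{_markers(excl_lo, excl_hi)}{part(hi)}"

# "200^".."foo" versus "200"^.."foo"
a = ("200^", "foo", False, False)
b = ("200", "foo", True, False)
print(naive_key(*a) == naive_key(*b))                      # same string
print(escaped_key(*a) == escaped_key(*b))                  # distinct strings
print(length_prefixed_key(*a) == length_prefixed_key(*b))  # distinct strings
```

The escaped form stays readable for ordinary endpoints, which is presumably
why it is the nicer one to work with; the length-prefixed form is easier to
parse but noisier to read.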

Second, the spec only says that mutables must produce an ObjAt (and in fact
there's room left open for not requiring a WHICH implementation at all on
value types).  For value types, .WHICH could very well be just identity... or
a mixin on the value that stringifies differently if we so desire, or which
can be used to associate a cached digest.  It would seem to me that
implementing eqv and === candidates to work directly on values would be faster
than making up some weird WHICH value and then comparing those values, unless
we have a cache of WHICHs which include a hash which can be used to shortcut
the test and only do the longhand comparison when the hashes match...
preferably the same hash that needs to be done when using them as hash bucket
keys anyway.
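
A rough sketch of that shortcut, again in Python purely for illustration
(the ValueRange class and its attribute names are hypothetical, and nothing
here reflects how Rakudo stores a Range): the digest is computed once, reused
as the hash-bucket key, and a mismatch rejects equality before any longhand
comparison happens.

```python
class ValueRange:
    """Hypothetical value type that caches one digest for hashing and equality."""

    __slots__ = ("lo", "hi", "excl_lo", "excl_hi", "_digest")

    def __init__(self, lo, hi, excl_lo=False, excl_hi=False):
        self.lo, self.hi = lo, hi
        self.excl_lo, self.excl_hi = excl_lo, excl_hi
        # Computed once; the same number serves dict/set bucketing below.
        self._digest = hash(self._fields())

    def _fields(self):
        return (self.lo, self.hi, self.excl_lo, self.excl_hi)

    def __hash__(self):
        return self._digest

    def __eq__(self, other):
        if not isinstance(other, ValueRange):
            return NotImplemented
        # Cheap rejection: different digests can never mean equal values.
        if self._digest != other._digest:
            return False
        # Digests match, which may be a collision, so compare the real values.
        return self._fields() == other._fields()


a = ValueRange("200^", "foo")
b = ValueRange("200", "foo", excl_lo=True)
print(a == b)                                    # False
print(a == ValueRange("200^", "foo"))            # True
print(len({a, b, ValueRange("200^", "foo")}))    # 2
```

The digest only ever rules candidates out; the answer for matching digests
always comes from the values themselves, so a digest collision cannot make
two different ranges compare equal.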

Third, there is a test or two in the current test suite that test for the
current stringlike behavior, which will probably have to be fixed in errata
whatever is done to fix the current situation.

Fourth, as an aside for non-value-types, even ignoring NUMA, it is not enough
to munge a pointer address... the WHICH cannot be a deterministic derivative
of the object storage address, which the GC might change and which of course
might be reused.  I can't remember what scheme we are doing here but ISTR it
was sane, if not the most efficient possible.
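
For what it's worth, one way to get an identity that survives relocation is
to hand out a token from a counter the first time it is asked for and keep it
on the object.  The sketch below is Python for illustration only; it is NOT a
description of the scheme Rakudo actually uses (which I don't recall), and
the Tracked mixin and its which method are hypothetical names.

```python
import itertools

# Process-wide counter: tokens only ever go up, so none is handed out twice.
_next_token = itertools.count(1)

class Tracked:
    """Hypothetical mixin giving each object a stable, address-free identity."""

    _which_token = None  # class-level default; set per instance on first use

    def which(self) -> str:
        # Allocate lazily so objects never asked for identity cost nothing.
        # Not thread-safe as written; a real implementation would need a lock
        # or an atomic compare-and-swap here.
        if self._which_token is None:
            self._which_token = next(_next_token)
        return f"{type(self).__name__}|{self._which_token}"


class Box(Tracked):
    pass

a, b = Box(), Box()
print(a.which() == a.which())   # True: stable for the object's lifetime
print(a.which() == b.which())   # False: distinct objects, distinct tokens
```

Because the token is stored with the object rather than computed from where
the object lives, a moving collector can relocate it freely, and a freed
address being reused by a new object cannot resurrect an old identity.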

Finally, in the cases where we might not present a WHICH.Str which is
guaranteed to be unique, a clue to the user that "hey this is a value type so
just look at the .perl" might help.

So, I'd suggest the stringification of a .WHICH be of limited length, give a
few clues about the type/value and what is in it, and above some threshold or
when anything messy is detected, use an ellipsis and a hash, and we just tell
the users we are doing that and not to rely on that stringification for
end-of-the-world scenario things.
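
Something along these lines, sketched in Python for illustration (the
display_which function, the 40-character limit, the "...#" separator and the
choice of SHA-1 are all assumptions of the example, not a proposal for exact
values):

```python
import hashlib

def display_which(type_name: str, value_repr: str, limit: int = 40) -> str:
    """Build a short, human-oriented identity string.

    Short, clean values are shown in full. Long or messy ones are cut at
    `limit` characters and followed by an ellipsis and a short digest.
    """
    full = f"{type_name}|{value_repr}"
    # "Messy" here just means non-printable characters such as control codes.
    messy = any(not ch.isprintable() for ch in value_repr)
    if len(full) <= limit and not messy:
        return full
    # The digest covers the whole string, not only the part that is shown.
    digest = hashlib.sha1(full.encode("utf-8")).hexdigest()[:12]
    head = "".join(ch if ch.isprintable() else "?" for ch in full[:limit])
    return f"{head}...#{digest}"


print(display_which("Range", '"a".."b"'))
print(display_which("Range", "x" * 500))
print(display_which("Range", "bad\x00value"))
```

The truncated form is a hint for humans and logs only; two different values
could in principle share it, which is exactly why === and eqv must not be
built on it.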

Meanwhile, implement value-type .WHICHs as either identity or a mixin, and try
like heck to code around actually using the latter for anything that does not
involve implementing speedups, since they tie up resources.

Phew. Maybe this was one of those brain pretzels S02 warned us about.
