Re: [perl #129019] [BUG] Range.WHICH fails on many kinds of endpoints

2017-09-13 Thread Zefram
Brian S. Julin via RT wrote:
> it would be OK for there to be some tiny chance
>of a collision between two WHICH.Str's as long as the actual WHICHs
>do not collide.

One could make that distinction, but then the .Str of the .WHICH would
not fulfill the purposes for which .WHICH is used, and would seem pretty
pointless.

Consider the Set class, as an example user of .WHICH.  It needs something
that it can easily hash and compare, and whose equality corresponds
precisely to object identity.  That doesn't have to be a string, but a
string is a convenient format for that.  So OK, if .WHICH.Str doesn't do
the job then Set can go to extra effort to use the `real' non-colliding
.WHICH value.  In this context, a colliding .WHICH.Str is worse than
useless: it's an attractive nuisance, because it'll function well enough
as a .WHICH substitute to pass most test suites but then fail in real use.
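
A minimal sketch of that failure mode, written in Python as a neutral modelling language rather than as Rakudo code. The names `IdentitySet`, `full_key`, and `truncated_key` are hypothetical; the lossy key simply truncates the value, standing in for any stringification that is allowed to collide.

```python
class IdentitySet:
    """Toy stand-in for Set: stores one element per identity key."""

    def __init__(self, key_func):
        self._key_func = key_func   # maps a value to its identity key
        self._items = {}            # identity key -> stored value

    def add(self, value):
        # Two values with the same key are treated as the same element.
        self._items.setdefault(self._key_func(value), value)

    def __len__(self):
        return len(self._items)


def full_key(value):
    # Exact for the plain strings used here: type name plus complete value.
    return f"{type(value).__name__}|{value}"


def truncated_key(value, limit=8):
    # Lossy key (assumption): cut the value off after `limit` characters.
    return f"{type(value).__name__}|{str(value)[:limit]}"


values = ["abcdefgh-1", "abcdefgh-2", "xyz"]

exact = IdentitySet(full_key)
lossy = IdentitySet(truncated_key)
for v in values:
    exact.add(v)
    lossy.add(v)
```

With short test data the two sets behave identically; only values that share a prefix longer than the limit expose the lost element, which is why such a key can pass most test suites and still fail in real use.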

>So, I'd suggest the stringification of a .WHICH be of limited length, give a
>few clues about the type/value and what is in it, and above some threshold or
>when anything messy is detected, ellipses and a hash, and we just tell the
>users we are doing that and not to rely on that stringification for
>end-of-the-world scenario things.

Key question: what value does a colliding .WHICH.Str add?  You seem to
envision it as a human-oriented inspection mechanism.  But we've already
got more than one of those.  .perl, .gist, and to a lesser extent .Str all
provide some kind of representation of an object's content as a string.
There is conceptual room for more than one of these, since they can take
different views of what aspects of an object are important and what kind
of ambiguities in the output are acceptable.  But in practice there isn't
enough attention paid to these to justify even the range of methods that
already exist.  In many respects .perl and .gist are near clones, because
.gist has neither a clearly defined output format nor clearly different
rules about what to represent.  It would make a fair bit of sense at this
point to delete the less-useful .gist.  It would be crazy to go the other
way, adding yet another stringification mechanism with no hard identity
requirements, no defined format, and no rules about how much to represent.

Even if there were some demand for yet another stringification method,
.WHICH.Str would be the wrong place to put it.  Half-hearted inspection
is entirely contrary to the basic concept of .WHICH.  If such a method
is to be added, it should be a method directly on the principal object.
.WHICH doesn't have to supply a string directly, or even just a string
wrapped in a funny class, but the object it supplies should be concerned
entirely with the precise identity of the principal object.  For the
.WHICH value to stringify to anything that doesn't have the same identity
properties would be misleading.

If you're interested in a human inspecting the .WHICH value itself,
rather than inspecting the principal object, then the most important
method to consider is .WHICH.perl.  By the intent of .perl, this ought
to produce a string that fully represents the actual .WHICH value.
Ellipsis is not useful here.  It would be acceptable for .WHICH.gist
to provide a lossy representation, but .gist is so loosely defined that
almost anything is acceptable.

>there's room left open for not requiring WHICH implementation at all on value 
>types).
>For value types, .WHICH could very well be just identity

That too would break Set and anything else that needs a way to hash
object identity.  A .WHICH method producing a consistent type of output
is useful on *all* types.

> It would seem to me that implementing eqv and === candidates to work
>directly on values would be faster than making up some weird WHICH value
>and then comparing those values,

Sure.  eqv/=== should absolutely be implemented in type-specific ways
where that can be done faster than .WHICH comparison.  And code comparing
for object identity should use the likely-faster === in preference to
comparing .WHICH values.  But .WHICH still needs to be there, for users
like Set that need to do more than identity comparison, and equality
comparison of .WHICH values must always behave the same as ===.
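
The invariant in the last sentence can be sketched as follows, again in Python as a modelling language. `Version`, `identical`, and `which` are hypothetical names: `identical` plays the role of a type-specific ===, and `which` plays the role of .WHICH.

```python
from itertools import product


class Version:
    """Hypothetical value type with two integer fields."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def identical(self, other):
        # Type-specific fast path, playing the role of ===.
        return (type(other) is Version
                and self.major == other.major
                and self.minor == other.minor)

    def which(self):
        # Identity key, playing the role of .WHICH: hashable and exact.
        return ("Version", self.major, self.minor)


samples = [Version(1, 0), Version(1, 0), Version(1, 1), Version(2, 0)]

# The invariant: key equality must agree with the identity comparison.
consistent = all(
    a.identical(b) == (a.which() == b.which())
    for a, b in product(samples, repeat=2)
)

# A user such as Set needs the hashable key, not just pairwise comparison.
distinct = {v.which() for v in samples}
```

The fast comparison never builds a key, while the set construction at the end cannot be done with pairwise comparison alone without giving up hashing.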

-zefram



[perl #129019] [BUG] Range.WHICH fails on many kinds of endpoints

2017-09-13 Thread Brian S. Julin via RT
On Wed, 30 Aug 2017 07:03:10 -0700, zef...@fysh.org wrote:
> Brian S. Julin via RT wrote:
> >For example "Foo^".."Bar" and "Foo"^.."Bar" would put out the same WHICH.
> 
> Yes, and that's a bigger problem.  In general Rakudo's .WHICH methods
> suffer this sort of problem when incorporating the .WHICH values of
> subobjects.  See [perl #128943] (Set, and in which I sketched out how
> to fix it) and [perl #128947] (Pair, just like your Pair example).
> 
> >we have no choice than to either use an escape character quoteish
> >construct, or prepend a length:
> 
> The former would be much nicer to work with.
> 
> -zefram


OK, a few things to note going forward.  First, the .Str of a WHICH is not
necessarily the value of the WHICH.  I think in the case where we have
really large objects it would be OK for there to be some tiny chance
of a collision between two WHICH.Str's as long as the actual WHICHs
do not collide.  So using a hash in the visual presentation may be
OK, but using that hash for actually implementing === or eqv would
be wrong (as is done now in some cases).

Note Range does _NOT_ do so for eqv but seems to punt to comparing WHICH
for ===, and Range's WHICH is itself a Str, creating "accidental collisions"
which the spec advises against.

$ perl6 -e 'say ("200^".."foo") eqv ("200"^.."foo")'
False
$ perl6 -e 'say ("200^".."foo") === ("200"^.."foo")'
True
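
The discrepancy can be reproduced with a small Python model. The key format in `naive_which` is an assumption inferred from the `Range|a True..b True` output quoted in this ticket, not a copy of Rakudo's source; the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    min: str
    max: str
    excludes_min: bool = False
    excludes_max: bool = False

    def naive_which(self):
        # Assumed format: plain interpolation of the endpoints.
        lo = "^" if self.excludes_min else ""
        hi = "^" if self.excludes_max else ""
        return f"Range|{self.min}{lo}..{hi}{self.max}"


a = Range("200^", "foo")                      # "200^".."foo"
b = Range("200", "foo", excludes_min=True)    # "200"^.."foo"

structurally_equal = (a == b)                 # field by field, like eqv
same_key = (a.naive_which() == b.naive_which())
```

The field-by-field comparison tells the two ranges apart, while the interpolated key is the same string for both, so any comparison that punts to the key reports them as identical.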

Second, the spec only says that mutables must produce an ObjAt (and in fact
there's room left open for not requiring WHICH implementation at all on value
types).  For value types, .WHICH could very well be just identity... or a
mixin on the value that stringifies differently if we so desire or which can
be used to associate a cached digest.  It would seem to me that implementing
eqv and === candidates to work directly on values would be faster than making
up some weird WHICH value and then comparing those values, unless we have a
cache of WHICHs which include a hash which can be used to shortcut the test
and only do the longhand comparison when the hashes match... preferably the
same hash that needs to be done when using them as hash bucket keys anyway.
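
A rough sketch of the cached-digest shortcut, in Python as a modelling language. `CachedKey` is a hypothetical class and SHA-256 is an arbitrary choice of digest; nothing here reflects what Rakudo actually does.

```python
import hashlib


class CachedKey:
    """Hypothetical identity key that caches a digest of its exact form."""

    def __init__(self, exact: str):
        self.exact = exact
        self.digest = hashlib.sha256(exact.encode("utf-8")).digest()

    def __hash__(self):
        # Reuse the cached digest for hash-bucket placement.
        return int.from_bytes(self.digest[:8], "big")

    def __eq__(self, other):
        if not isinstance(other, CachedKey):
            return NotImplemented
        # Different digests prove the keys differ; skip the long comparison.
        if self.digest != other.digest:
            return False
        # Matching digests are only a hint: confirm with the exact form.
        return self.exact == other.exact


k1 = CachedKey("Range|Str|Foo^..Str|Bar")
k2 = CachedKey("Range|Str|Foo^..Str|Bar")
k3 = CachedKey("Range|Str|Foo..Str|Bar")
buckets = {k1: "first"}
```

The digest only ever rules equality out. Equality itself is still decided by the longhand comparison, so a digest collision costs time but never produces a wrong answer.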

Third, there is a test or two in the current test suite that test for current
stringlike behavior which will probably have to be fixed in errata whatever be
done to fix the current situation.

Fourth, as an aside for non-value-types, even ignoring NUMA, it is not enough
to munge a pointer address... the WHICH cannot be a deterministic derivative
of the object storage address... which the GC might change and of course
might be reused.  I can't remember what scheme we are doing here but ISTR it
was sane, if not the most efficient possible.
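
One scheme that satisfies this constraint is a serial number allocated on first use and stored with the object. The Python sketch below illustrates that idea only; it is an assumption for illustration, not a description of the scheme Rakudo or MoarVM actually uses, and `Tracked` is a hypothetical class.

```python
import itertools

_next_serial = itertools.count(1)   # process-wide, never reused


class Tracked:
    """Hypothetical mutable object with a stable identity token."""

    def __init__(self, payload):
        self.payload = payload
        self._serial = None          # assigned on first request

    def which(self):
        # The token is allocated once and stored with the object, so it
        # does not depend on where the object currently lives in memory
        # and is never handed to another object.
        if self._serial is None:
            self._serial = next(_next_serial)
        return f"{type(self).__name__}|{self._serial}"


x = Tracked([1, 2])
y = Tracked([1, 2])
x_key = x.which()
y_key = y.which()
x.payload.append(3)                  # mutation does not change identity
```

Allocating lazily means objects that are never asked for their identity pay only for one empty slot.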

Finally, in the cases where we might not present a WHICH.Str which is
guaranteed to be unique, a clue to the user that "hey this is a value type so
just look at the .perl" might help.

So, I'd suggest the stringification of a .WHICH be of limited length, give a few
clues about the type/value and what is in it, and above some threshold or when
anything messy is detected, ellipses and a hash, and we just tell the users we
are doing that and not to rely on that stringification for end-of-the-world
scenario things.

Meanwhile, implement value-type .WHICHs as either identity or a mixin, and try
like heck to code around actually using the latter for anything that does not
involve implementing speedups, since they tie up resources.

Phew. Maybe this was one of those brain pretzels S02 warned us about.



Re: [perl #129019] [BUG] Range.WHICH fails on many kinds of endpoints

2017-08-30 Thread Zefram via RT
Brian S. Julin via RT wrote:
>For example "Foo^".."Bar" and "Foo"^.."Bar" would put out the same WHICH.

Yes, and that's a bigger problem.  In general Rakudo's .WHICH methods
suffer this sort of problem when incorporating the .WHICH values of
subobjects.  See [perl #128943] (Set, and in which I sketched out how
to fix it) and [perl #128947] (Pair, just like your Pair example).

>we have no choice than to either use an escape character quoteish
>construct, or prepend a length:

The former would be much nicer to work with.

-zefram



[perl #129019] [BUG] Range.WHICH fails on many kinds of endpoints

2017-08-29 Thread Brian S. Julin via RT
On Sat, 20 Aug 2016 10:24:51 -0700, zef...@fysh.org wrote:
> > (:a..:b).WHICH
> Range|a True..b True
> > (List..Pair).WHICH
> Use of uninitialized value $!min of type List in string context.
> Methods .^name, .perl, .gist, or .say can be used to stringify it to
> something meaningful.  in block  at  line 1
> Use of uninitialized value $!max of type Pair in string context.
> Methods .^name, .perl, .gist, or .say can be used to stringify it to
> something meaningful.  in block  at  line 1
> Range|..
> 
> Even where it doesn't produce these warnings and an empty
> representation
> of an endpoint, this style of .WHICH is very poor and clash-prone.
> Range.WHICH should apply .WHICH to the endpoint objects.
> 
> -zefram

Unfortunately using the .WHICHs of the endpoints will not be sufficient.

For example "Foo^".."Bar" and "Foo"^.."Bar" would put out the same WHICH.
Since any Str can foil any attempt we make to simply bracket the result,
we have no choice than to either use an escape character quoteish
construct, or prepend a length:

Range|(8)|Str|Foo^..Str|Bar

...we could, however, decide whether or not to do so
by detecting anything in $!min which may jump the rails.
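
Both options can be sketched in Python as a modelling language. The helper names and the choice of backslash as the escape character are assumptions made for illustration; the length-prefixed form loosely follows the `(8)|Str|Foo^` notation above.

```python
def escape(part: str) -> str:
    # Escape the escape character first, then the separator.
    return part.replace("\\", "\\\\").replace("|", "\\|")


def range_parts(lo, hi, excludes_min=False, excludes_max=False):
    # Endpoint keys and exclusivity markers kept as separate components.
    return [
        f"Str|{lo}",
        "^" if excludes_min else "",
        "^" if excludes_max else "",
        f"Str|{hi}",
    ]


def which_escaped(parts):
    # After escaping, an unescaped "|" can only be a separator.
    return "Range|" + "|".join(escape(p) for p in parts)


def which_length_prefixed(parts):
    # Each part is announced by its length, so its content is never scanned.
    return "Range|" + "".join(f"({len(p)})|{p}" for p in parts)


a = range_parts("Foo^", "Bar")                      # "Foo^".."Bar"
b = range_parts("Foo", "Bar", excludes_min=True)    # "Foo"^.."Bar"
```

Either encoding keeps the two ranges apart. The escaped form stays readable when the endpoints contain nothing special, while the length-prefixed form never has to inspect endpoint content at all.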

Same problem with Pair, BTW, which already .WHICHs its members:

$ perl6 -e '("Foo|Str|2" => "1").WHICH.say'
Pair|Str|Foo|Str|2|Str|1
$ perl6 -e '("Foo" => "2|Str|1").WHICH.say'
Pair|Str|Foo|Str|2|Str|1

...and given the use of WHICH in hashing, this is more consequential
than it may seem at first glance.
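
A small Python model of why this matters for hashing. `str_which` and `pair_which` are hypothetical helpers whose output format is copied from the two transcripts above; they are not Rakudo's implementation.

```python
def str_which(s: str) -> str:
    # Modelled on the output above: type name, "|", raw content.
    return f"Str|{s}"


def pair_which(key: str, value: str) -> str:
    # Assumed naive composition: member keys joined with "|".
    return f"Pair|{str_which(key)}|{str_which(value)}"


p1 = ("Foo|Str|2", "1")
p2 = ("Foo", "2|Str|1")

w1 = pair_which(*p1)
w2 = pair_which(*p2)

# A table keyed on the composed string cannot hold both pairs.
table = {}
table[w1] = p1
table[w2] = p2
```

The second store silently replaces the first, so a container keyed this way loses an element without reporting any error.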



[perl #129019] [BUG] Range.WHICH fails on many kinds of endpoints

2016-08-20 Thread via RT
# New Ticket Created by  Zefram 
# Please include the string:  [perl #129019]
# in the subject line of all future correspondence about this issue. 
# https://rt.perl.org/Ticket/Display.html?id=129019


> (:a..:b).WHICH
Range|a True..b True
> (List..Pair).WHICH
Use of uninitialized value $!min of type List in string context.
Methods .^name, .perl, .gist, or .say can be used to stringify it to something 
meaningful.  in block  at  line 1
Use of uninitialized value $!max of type Pair in string context.
Methods .^name, .perl, .gist, or .say can be used to stringify it to something 
meaningful.  in block  at  line 1
Range|..

Even where it doesn't produce these warnings and an empty representation
of an endpoint, this style of .WHICH is very poor and clash-prone.
Range.WHICH should apply .WHICH to the endpoint objects.

-zefram