Re: ===, =:=, ~~, eq and == revisited (blame ajs!)

Aaron Sherman Wed, 12 Jul 2006 13:17:22 -0700

On Wed, 2006-07-12 at 19:25 +0300, Yuval Kogman wrote:
> Over at #perl6 we had a short discussion on =:=, ===, and ~~, mostly raised by
> ajs's discussion on Str items and ===.


*wave*

> 1. what is .id on references? Is it related to the memory slot, like refaddr()
> in Perl 5?

That's something I'm not sure of, so I'll let it go, other than to say
that the question should probably avoid the word "memory" (see below).

Now, let me handle this one out of order, since I think it's really key:

> 4. will we have a deep (possibly optimized[1]) equality operator, that *will*
> return true for @foo = ( [ 1, 2 ], 3 ); @bar = ( [ 1, 2 ], 3 ); op(@foo, @bar)?
> Is it going to be easy to make the newbies use that when they mean "it's the
> same", like they currently expect == and eq to work on "simple" values?

Isn't that ~~?

Per S03:

        Array   Array     arrays are comparable    match if $_ »~~« $x

~~ is really the all-purpose, bake-your-bread, clean-your-floors,
wax-your-cat operator that you're looking for.

It sounds like pugs is wrong here WRT the spec, since:

        ( [ 1, 2 ], 3 ) ~~ ( [ 1, 2 ], 3 )

is the same as:

        [1,2]~~[1,2] && 3 ~~ 3

which is the same as:

        (1~~1 && 2~~2) && 3~~3

which is true. Ain't recursive hyperoperators grand? Of course, I'm
assuming that a comparison hyperoperator in boolean context returns the
[&&] reduction of all of the values... that's an interesting assumption,
isn't it? But, it seems to be the assumption made by S03 under Smart
Matching, so I say it's true. ;)
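
The expansion above can be modelled outside of pugs. What follows is a minimal sketch in Python (not Perl 6), meant only as a runnable illustration of the recursion. The name `smart_match` is hypothetical, and two behaviours are assumptions rather than anything S03 says: arrays of different lengths never match, and leaves fall back to plain value equality.

```python
def smart_match(left, right):
    """Model of Array ~~ Array as an element-wise, recursive match."""
    if isinstance(left, list) and isinstance(right, list):
        # Assumption: arrays of different lengths never match.
        if len(left) != len(right):
            return False
        # The [&&] reduction: every pairwise match must succeed.
        return all(smart_match(l, r) for l, r in zip(left, right))
    # Assumption: a list never matches a non-list in this sketch.
    if isinstance(left, list) or isinstance(right, list):
        return False
    # Leaves fall back to plain value equality.
    return left == right


foo = [[1, 2], 3]
bar = [[1, 2], 3]
print(smart_match(foo, bar))          # True
print(smart_match(foo, [[1, 5], 3]))  # False
```

The cost grows with the size of the structure, which is why this kind of comparison belongs to ~~ and not to a cheap identity test.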

> 2. is .id *always* a low level type representation of the object's value? It's
> specced that low level typed items have the same ID when they have the same
> value. What about complex types?

It cannot be for complex types or even strings... well, at least it
I<must> not be I<if> we care about performance.

That is, if C<$anything.id> needs to read every byte of $anything, then
an anything that happened to be a Buf containing the 3GB in-memory raw
image from the Hubble is going to really make C<.id> unhappy. I would
hope that C<.id> is an efficient enough operation that === should not
look like a performance bottleneck in my code....

> 3. Are these descriptions of the operators correct?
> 
>       ~~ matches the left side to a description on the right side

>       =:= makes sure the objects are actually the same single object (if
>       $x =:= $y and you change $x.<foo> then $y.<foo> was also changed... is
>       this .id on refs?) Is =:= really eq .id? or more like
>       variable($x).id eq variable($y).id?

>       === makes sure that the values are equivalent ( @foo = ( 1, 2, 3 );
>       @bar = ( 1, 2, 3); @foo === @bar currently works like that, but
>       @foo = ( [ 1, 2 ], 3 ); @bar = ( [ 1, 2 ], 3 ); @foo === @bar does not
>       (in pugs). This is not useful because we already have this return
>       false with =:=).


Let me counter-propose a slightly different way of saying that:

        ~~ as above. I think we all agree on this.

        =:= looks in the "symbol table" (caveat dragons) to see if LHS
        refers to the same variable as the RHS. Does this dereference?
        Probably not, but I'm not sure, based on S03.
        
        === compares types and .id values. Here is one possible
        implementation, as I interpret S03, with some assumptions made
        and some extra bits filling in the cracks where S03 doesn't
        quite specify one:
        
              * A .id method may return C<int>, C<num> or C<bit>. ===
                returns false for two objects which are not the same
                type (with the same traits), and thus the comparison
                must always be between identical .id return types.
              * As a special case, however, all "undefined" values (not
                objects which have the undefined trait, but true undefs
                with no other functionality) are === to each other.
              * Objects are always compared according to their
                underlying type, not the polymorphic role which they are
                serving at the moment.
              * num, Num and all like values return their num
                representation as a .id.
              * int, Int and all like values return their int
                representation as a .id.
              * Bool, bool and bit all have a bit representation for .id
              * All other code, objects, references, structures, complex
                numbers, etc. are compared strictly on the basis of an
                arbitrary C<int> which Perl will generate to represent
                their storage, and can be overridden by replacing the
                default .id method.
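
The rules in this list can be sketched as a small runnable model. This is Python, not Perl 6, and it covers only the list above, not S03 itself. The names `Undef`, `dot_id` and `identical` are hypothetical, `Undef` stands in for a true undef, and Python's built-in `id()` plays the part of the arbitrary integer that represents an object's storage.

```python
class Undef:
    """Stand-in for a true undef with no other functionality."""


def dot_id(value):
    """Model of .id: cheap, and never walks a container's contents."""
    if isinstance(value, Undef):
        return 0              # all true undefs share one id
    if type(value) in (bool, int, float):
        return value          # low-level values: the representation is the id
    return id(value)          # anything else: an arbitrary int for its storage


def identical(left, right):
    """Model of ===: the types must match first, then the .id values."""
    if type(left) is not type(right):
        return False
    return dot_id(left) == dot_id(right)


a = [1, 2]
b = [1, 2]
print(identical(3, 3))              # True
print(identical(3, 3.0))            # False: different types
print(identical(a, b))              # False: equal contents, different storage
print(identical(a, a))              # True
print(identical(Undef(), Undef()))  # True
```

Note that `identical` never looks inside `a` or `b`, so its cost does not depend on how large the compared structures are.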

The other way to think about === would be that it tells you if its LHS
*could* be constant-folded onto its RHS (if it were constant for long
enough), where =:= tells you if that has already been done. Only ~~ has
some sort of "deep" semantics, and I think the documentation warns users
sufficiently of this magical behavior, so they should not be shocked
when C<$huge_tree_1 ~~ $huge_tree_2> takes a long time.

> If they are not correct, why is there an overlap between =:=? Why is it hard
> to deeply compare values in 2006 without using e.g. Data::Compare?

Because of the word "deep". Deep implies arbitrary work, which isn't
really what you want in such a low-level operator. However, using these
operators, one could easily build whatever one likes.

> 5. is there room for a new operator?
> 
>       =::= makes sure the memory slot is the same (might be different
>       for simple values). refaddr($x) == refaddr($y) in Perl 5

I'd avoid saying "memory", here. Some implementations of Perl 6 might
not know what memory looks like (on a sufficiently abstract VM).

However, otherwise, I think this is what =:= does, except that =:= may
or may not dereference.


In answer to your examples from the other message, I think it's:

        # eq is a bit uglier than I had thought, but not
        # too bad.
        # This is a very C++-like operation where we first
        # establish that the high-level bits are all in line,
        # and then get access to the representation and compare
        # bytes. This assumes:
        # * Str.charset returns a Str::CharSet
        # * Str.encoding returns a Str::Encoding
        # * Str.buf returns the underlying bytes as Buf
        # * a >>===<< b is true in bool context if all a[n]===b[n]
        our Bool multi infix:<eq> ( Str $a, Str $b ) {
                return True unless $a.defined || $b.defined;
                return False unless $a.defined && $b.defined;
                return False unless $a.charset === $b.charset;
                return False unless $a.encoding === $b.encoding;
                my Buf $bufa := $a.buf;
                my Buf $bufb := $b.buf;
                return $bufa >>===<< $bufb;
        }

        our Bool multi infix:<==> ( Int $a, Int $b ) {
                return $a === $b;
        }

        our Bool multi infix:<==> ( Num $a, Num $b) {
                return $a === $b;
        }

        our Bool multi infix:<==> ( Bool $a, Bool $b ) {
                return $a === $b;
        }
        # ... complex, bit, etc.


-- 
Aaron Sherman <[EMAIL PROTECTED]>
Senior Systems Engineer and Toolsmith
"We had some good machines, but they don't work no more." -Shriekback
