On 8/14/06, Smylers wrote:
> David Green writes:
> > I guess my problem is that [1,2] *feels* like it should === [1,2].
> > You can explain that there's this mutable object stuff going on, and I
> > can follow that (sort of...), but it seems like an implementation
> > detail leaking out.
> The currently defined behaviour seems intuitive to me, from a
> starting point of Perl 5.
But is Perl 5 the best place to start? It's something many of us are
used to, but that doesn't mean it's the best solution conceptually,
even if it was the most reasonable way to implement it in P5.
The reason I think it's an implementation wart is that an array --
thought of as a single, self-contained lump -- is different from a
reference or pointer to some other variable. Old versions of Perl
always eagerly exploded arrays, so there was no way to refer to an
array as a whole; put two arrays together and P5 (or P4, etc.) thinks
it's just one big array or list.
Then when references were introduced, "array-refs" provided a way to
encapsulate arrays so we could work with them as single lumps. It's
not the most elegant solution, but being able to nest data structures
at all was a tremendous benefit, and it was backwards-compatible.
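
A minimal Perl 5 sketch of the two behaviours described above: arrays flattening when combined, and array-refs keeping each array as a single lump. The variable names are only illustrative.

```perl
use strict;
use warnings;

my @a = (1, 2);
my @b = (3, 4);

# Combining two arrays in list context flattens them into one list.
my @flat = (@a, @b);

# Taking references keeps each array as a single element.
my @nested = (\@a, \@b);

print scalar(@flat), "\n";     # 4
print scalar(@nested), "\n";   # 2
print $nested[1][0], "\n";     # 3
```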
P6 doesn't have to be that backwards-compatible -- it already isn't.
P6 more naturally treats arrays as lumps; this may or may not be
*implemented* using references as in P5, but it doesn't have to -- or
at least, it doesn't have to *look* as though that's how it's doing
it. Conceptually, an array consisting only of constant literals,
like (1,2,3), isn't referring to anything, so it doesn't need to
behave that way.
> The difference between:
>     my $new = [EMAIL PROTECTED];
>     my $new = [EMAIL PROTECTED];
> is that the second one is a copy; square brackets always create a
> new anonymous array rather than merely referring to an existing one,
> and that's the same thing that's happening here. Think of square
> brackets as meaning something like Array->new and each one is
I agree that [EMAIL PROTECTED] should be distinct from [EMAIL PROTECTED] -- in the former
case, we're deliberately taking a reference to the @orig variable.
What I don't like is that [EMAIL PROTECTED] is distinct from [EMAIL PROTECTED] -- sure,
I'm doing something similar to Array->new(1,2) followed by another
Array->new(1,2), but I still want them to be the same, just as I want
Str->new("foo") to be the same as Str->new("foo"). They're just
constants, they should compare equally regardless of how I created
them. (And arrays should work a lot like strings, because at some
conceptual level, a string is an array [of characters].)
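
To make the distinction concrete, here is a rough Perl 5 sketch. It assumes the two forms under discussion are a reference to an existing array (`\@orig`) and a bracketed copy (`[@orig]`). The `same_contents` helper is hypothetical, only handles flat arrays of numbers, and stands in for the kind of by-value comparison being asked for.

```perl
use strict;
use warnings;

my @orig = (1, 2);
my $ref  = \@orig;     # reference to the existing @orig variable
my $copy = [@orig];    # new anonymous array holding a copy

push @orig, 3;
# $ref follows the change to @orig; $copy does not.
print scalar(@$ref), " ", scalar(@$copy), "\n";    # 3 2

# Hypothetical helper: compare two flat arrays of numbers by contents.
sub same_contents {
    my ($x, $y) = @_;
    return 0 unless @$x == @$y;
    for my $i (0 .. $#$x) {
        return 0 unless $x->[$i] == $y->[$i];
    }
    return 1;
}

my ($p, $q) = ([1, 2], [1, 2]);

# == on references compares addresses, so two fresh arrays differ.
print $p == $q ? "same ref" : "different refs", "\n";              # different refs
print same_contents($p, $q) ? "same contents" : "different", "\n"; # same contents
```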
> > And I feel this way because [1,2] looks like it should be platonically
> I'd say that C< (1, 2) > looks like that. But C< [1, 2] > looks
> like it's its own thing that won't be equal to another one.
Except [1,2] can look like (1,2) in P6 because it automatically
refs/derefs stuff so that things Just Work. That's good, because you
shouldn't have to be referencing arrays yourself (hence my point
above about an array conceptually being a single lump). But if we're
going to hide the [implementational] distinction in some places, we
should hide it everywhere.
Actually, my point isn't even about arrays per se; that's just the
implementation/practical side of it. You can refer to a scalar
literal too:
    perl -e 'print \1, \1'
They're different because the *references* are different, but I don't
care about that. A reference to a constant value is kind of
pointless, because the value isn't going to change. References to
*variables* are useful, because you never know what value that
variable might have, and refs give you a pointer to the current value
of the variable at any time.
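
A small Perl 5 sketch of why a reference to a variable is useful: the reference always sees the variable's current value. The names are illustrative only.

```perl
use strict;
use warnings;

my $count = 1;
my $ref   = \$count;    # reference to the variable, not to the value 1

$count++;
print $$ref, "\n";      # 2: the reference sees the current value

$$ref = 10;             # assigning through the reference changes the variable
print $count, "\n";     # 10
```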
The fact that it's even possible to take a reference to a literal is
kind of funny, really; but since in P5 you had to be explicit about
(de)referencing, it didn't hurt, and you could maybe even find some
cute ways to take advantage of it (such as an easy way to get unique
IDs out of the str/numification of a ref?). P6 just lets you gloss
over certain ref/deref distinctions that in a perfect world wouldn't
have existed in the first place.
Leibniz's "identity of indiscernibles" is a perfectly practical
principle to pursue in programming. Now [EMAIL PROTECTED] may be discernible
from [EMAIL PROTECTED] or [1, @orig] from [1, @other], but \1 is completely the
same as \1 in all ways -- all ways except for being able to get a
representation of its memory location. And that's not anything about
"1", that's a bit of metadata about the reference itself -- something
that definitely is based on the implementation.
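
As a hedged Perl 5 illustration: the values behind two `\1` references compare equal, and the only way to tell them apart is by address, shown here with `refaddr` from the core Scalar::Util module. Whether the two addresses actually differ is up to the interpreter, so this sketch prints them without assuming either outcome.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

my ($r1, $r2) = (\1, \1);

# The referenced values are indistinguishable.
print $$r1 == $$r2 ? "equal values" : "different values", "\n";    # equal values

# The addresses are metadata about the references, not about the value 1.
printf "0x%x 0x%x\n", refaddr($r1), refaddr($r2);
```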
(I can imagine some other implementation where in a ridiculous
attempt to optimise for minimal memory footprint, everything with a
value of 1 points to the same address. When I say "$a=1; $a++", $a
first points to 0x1234567, and when I increment it, I don't change
the bits in that location, instead $a changes to point to address
0x3456789, where my unique 2 value is stored. Then the only way to
differentiate \1 from \1 is to generate some arbitrary unique ID.
Which would be silly.)
Anyway, I hope I'm making sense about why \1 !=== \1, etc. seems a
bit unnatural to me.