Re: Set-returning .keys (was Re: Smart Matching clarification)
HaloO,

Darren Duncan wrote:
> This may be a non-issue from a user's viewpoint, but as a user, I want set operations that have sets as input to return sets as output by default. Eg, unioning 2 Set that have common values should return a Set.

First of all the operations could be overloaded. Secondly the bag operations are defined in terms of min and max and therefore result in Sets remaining Sets at least in the sense of a Bag with all element multiplicities being 1.

> Moreover, since set operations would be a lot more common in practice than bag operations, the set operations should be the most terse, if they both aren't equally terse.

I think the operators are the same. And with Seq a subtype of Bag they are applicable to Seqs as well. Some operations like (+) however make no sense for Sets and so they return a Bag. That is similar to using Bool values as arguments to numeric operations: True + True == 2.

> I see the matter as being similar to Int vs Num. Any operation whose operands are Ints should return Ints wherever it is conceivable to do so. In particular, this means that dividing an Int by an Int should return an Int.

As a kind of compromise an Int division could return a "doubled" value with but: 5 / 4 == 1 but 1.25.

> (Also, the modulus should be undefined for 2 Num.)

This could be achieved by introducing the modulus on the Int level. But I think that modulus can be easily generalized to Num. E.g. 3.2 % 2.4 == 0.8. Handling negative divisors is more of an issue. With the Euclidean definition of modulus a negative sign should not propagate into the result. E.g. 8 % -3 == 8 % 3 == 2.

> Only promote an Int to a Num if one or more other operands are already Num, even if the value of that Num is a whole number.

I'm not sure that this is the intent of the current spec. I personally value adherence to the subtyping of Int and Num higher than keeping a pure Int subsystem.

Regards, TSa.
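The Euclidean modulus mentioned above can be sketched in Python, used here only as a stand-in because the Perl 6 semantics were still under discussion. The helper name `euclid_mod` is an assumption for illustration, not anything from the spec. Python's built-in `%` follows the sign of the divisor, so it serves as a contrast to the Euclidean definition.

```python
import math

def euclid_mod(a, b):
    """Euclidean remainder: the result r always satisfies 0 <= r < |b|."""
    if b == 0:
        raise ZeroDivisionError("modulus by zero")
    # Python's % takes the sign of the divisor, so reducing by |b|
    # yields a non-negative remainder whatever the sign of b is.
    return a % abs(b)

print(8 % -3)             # Python's floored modulus gives -1
print(euclid_mod(8, -3))  # 2, the sign of the divisor does not propagate
print(euclid_mod(8, 3))   # 2
# The same helper works for non-integers, up to floating-point rounding.
print(math.isclose(euclid_mod(3.2, 2.4), 0.8))  # True
```

The floating-point case is compared with `math.isclose` because 3.2 and 2.4 are not exactly representable in binary.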
Re: Set-returning .keys (was Re: Smart Matching clarification)
To start off with, I agree with your comment about making Set the main type and making Bag an extension built upon that, as complex is built upon num, etc.

At 6:01 PM +0100 11/27/06, TSa wrote:
> And I still think that it is a good idea to name the set operations after their equivalent boolean connectives:
>
>     (|) union
>     (&) intersection
>     (^) symmetric difference
>
> Well, and to make them Bag operations to start with.

This may be a non-issue from a user's viewpoint, but as a user, I want set operations that have sets as input to return sets as output by default. Eg, unioning 2 Set that have common values should return a Set.

Moreover, since set operations would be a lot more common in practice than bag operations, the set operations should be the most terse, if they both aren't equally terse.

I see the matter as being similar to Int vs Num. Any operation whose operands are Ints should return Ints wherever it is conceivable to do so. In particular, this means that dividing an Int by an Int should return an Int. (Also, the modulus should be undefined for 2 Num.) Only promote an Int to a Num if one or more other operands are already Num, even if the value of that Num is a whole number.

So taking the semantics of Int vs Num that users see as examples for how Set vs Bag semantics should work, as far as argument and result types go, makes a lot of sense, and is easy to implement with Perl 6 multis.

-- Darren Duncan
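The "sets in, sets out" rule can be illustrated with a rough Python sketch, where `frozenset` stands in for Set and `collections.Counter` for Bag. The function name `union` and the type test are assumptions made for this illustration; Perl 6 multis would express the same choice through dispatch on the argument types.

```python
from collections import Counter

def union(a, b):
    """Return a set when both operands are sets; otherwise promote to a bag."""
    if isinstance(a, frozenset) and isinstance(b, frozenset):
        return a | b
    # At least one operand is a bag: treat set elements as multiplicity 1
    # and take the maximum multiplicity per element.
    return Counter(a) | Counter(b)

print(union(frozenset({1, 2}), frozenset({2, 3})))  # stays a frozenset
print(union(frozenset({1, 2}), Counter([2, 2, 3])))  # promoted to a Counter
```

This mirrors the Int vs Num analogy: the result is only promoted when one of the operands already has the wider type.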
Re: Set-returning .keys (was Re: Smart Matching clarification)
HaloO,

Darren Duncan wrote:
> I was not meaning to get into implementation issues so much as just to say that all bag values are a superset of all set values.

I agree with that. And I end up wanting a nice syntax to create a supertype. This is particularly needed if the Bag type shall be loadable as a module whereas the Set is a base type. The problem is the same as one might want to supertype Num with Complex. OK, Complex is perhaps a base type, but at the latest when it comes to supertyping quaternions onto Complex, a module is asked for. A FuzzySet would be another supertype of Set that will hardly be a core type. So, how do we do supertyping?

Funny result is that with 'Set does Bag' and 'Seq does Bag' and Bag coming from a module, without loading it Set and Seq have not even an indirect relation. But then again 'Seq does Bag' is doubtful anyway because there are several Seq values for a single Bag. This collapsing of several Sequences into a single Bag does not establish a subset relation. Nonetheless, picking a particular order for a Bag constitutes an interface subtype. The latter is the only reason why the issue came up in this thread.

Note that the iteration interface would ideally be available from the Bag type. So we might want to make it a core type. What's @Larry's opinion on that?

> Bag(1,2,2,2,3,3) symmetric_difference Bag(1,2,2,4,4); # Bag(2,3,3,4,4) or Bag(3,3,4,4) ?

This is just the union (1,2,2,2,3,3,4,4) minus the intersection (1,2,2): (2,3,3,4,4).

And I still think that it is a good idea to name the set operations after their equivalent boolean connectives:

    (|) union
    (&) intersection
    (^) symmetric difference

Well, and to make them Bag operations to start with.

Regards, TSa.
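The "union minus intersection" reading of symmetric difference can be checked with Python's `collections.Counter` as a stand-in for the proposed Bag type. The helper name `bag_symmetric_difference` is only illustrative. On Counters, `|` takes the maximum multiplicity, `&` the minimum, and `-` subtracts while dropping non-positive counts.

```python
from collections import Counter

def bag_symmetric_difference(x, y):
    """Union (max multiplicity) minus intersection (min multiplicity)."""
    return (x | y) - (x & y)

a = Counter([1, 2, 2, 2, 3, 3])
b = Counter([1, 2, 2, 4, 4])

result = bag_symmetric_difference(a, b)
print(sorted(result.elements()))  # [2, 3, 3, 4, 4]
```

Under this definition the answer to the question is the first alternative, Bag(2,3,3,4,4).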
Re: Set-returning .keys (was Re: Smart Matching clarification)
At 10:55 AM +0100 11/27/06, TSa wrote:
> > Seq and Set are *both* more specific or restricted than Bag. So it would make more sense to say 'role Set does Bag' (and 'role Seq does Bag'), not 'role Bag does Set'. For illustrative purposes, replace "Set" with "Int" and "Bag" with "Num". Everything that is a valid Set|Seq is a valid Bag, but the reverse isn't true.
>
> All this depends on what kind of subtype you are creating. My proposal is a strict extension of the interface or internal representation. Your proposal is going the other way of restricting multiplicity to 1. It is clear that all operations of a Bag can be carried out with a Set if you take the multiplicity as 1. In the end it's a matter of choice. The coolest thing would actually be to *supertype* Bag atop of Set (see the thread 'set operations for roles').

I was not meaning to get into implementation issues so much as just to say that all bag values are a superset of all set values. The Set and Bag types could very well, and probably should, have disjoint implementations, just as Array and Hash should probably have disjoint implementations even though you could represent all Array values as Hash values if you wanted to, for matters of efficiency. Note that a "value" in the aforementioned refers to the whole collection object, not an element therein.

> > Bag(1,2,2,2,3,3) d_union Bag(1,2,2,4,4); # Bag(2,3,3,4,4) or Bag(3,3,4,4) ?
>
> Disjoint union has a Bag of Pair as output. See http://en.wikipedia.org/wiki/Disjoint_union
> So we get Bag(1=>1, 1=>2, 1=>2, 1=>2, 1=>3, 1=>3, 2=>1, 2=>2, 2=>2, 2=>4, 2=>4). Well, or as Bag of Seq (1,1; 1,2; ... ; 2,4).

I always considered disjoint_union to mean exactly the same thing as symmetric_difference, meaning an analogy to XOR. Still, Wikipedia says they are different, and its http://en.wikipedia.org/wiki/Symmetric_difference article describes the meaning I had been attributing to both.

So what I was really asking was:

    Bag(1,2,2,2,3,3) symmetric_difference Bag(1,2,2,4,4); # Bag(2,3,3,4,4) or Bag(3,3,4,4) ?

-- Darren Duncan
Re: Set-returning .keys (was Re: Smart Matching clarification)
HaloO,

Darren Duncan wrote:
> To start off, I should clarify that I see little value for the existence of a Bag type except for certain matters of syntactic or semantic brevity, but that those alone can still warrant its existence.

I partly agree. The Bag or MultiSet type naturally falls out as an intermediate between Set and Seq in my typing. Since I see Bag as an interface extension subtype of Set, one could e.g. derive a FuzzySet in the same way.

> I think you have something backwards here. While the 3 collection types Seq,Bag,Set could be sequenced like that for some purposes of explanation, where adjacent types have commonalities that the other doesn't, I don't see that it falls to also chain .does() in the same direction all the way across. Seq and Set are *both* more specific or restricted than Bag. So it would make more sense to say 'role Set does Bag' (and 'role Seq does Bag'), not 'role Bag does Set'. For illustrative purposes, replace "Set" with "Int" and "Bag" with "Num". Everything that is a valid Set|Seq is a valid Bag, but the reverse isn't true.

All this depends on what kind of subtype you are creating. My proposal is a strict extension of the interface or internal representation. Your proposal is going the other way of restricting multiplicity to 1. It is clear that all operations of a Bag can be carried out with a Set if you take the multiplicity as 1. In the end it's a matter of choice. The coolest thing would actually be to *supertype* Bag atop of Set (see the thread 'set operations for roles').

> The operators [union, intersection, difference, disjoint-union, etc] have clearly defined and predictable behaviour with a Set, since all inputs and outputs have no duplicates.

The operations of MultiSets are equally well defined. See e.g. http://en.wikipedia.org/wiki/Multiset

> But I would ask whether it is desirable for those Set operators to be present in Bag|Seq, and if so, then what the desired semantics are. For example, what would these return:

The multiplicity is min for intersection and max for union.

> Bag(1,2,2,2,3,3) union Bag(1,2,2,4,4); # Bag(1,1,2,2,2,2,2,3,3,4,4) or Bag(1,2,2,2,3,3,4,4) ?

So this would be Bag(1,2,2,2,3,3,4,4).

> Bag(1,2,2,2,3,3) intersection Bag(1,2,2,4,4); # Bag(1,1,2,2,2,2,2) or Bag(1,2,2) ?

Bag(1,2,2) of course.

> Bag(1,2,2,2,3,3) difference Bag(1,2,2,4,4); # Bag(2,3,3) or Bag(3,3) ?

Difference is taking away the intersection, so we get Bag(2,3,3).

The most interesting new operation in the Bag type is the join. That yields a Bag even for Sets:

    Set(1,2,3) (+) Set(2,3,4) === Bag(1,2,2,3,3,4)

> Bag(1,2,2,2,3,3) d_union Bag(1,2,2,4,4); # Bag(2,3,3,4,4) or Bag(3,3,4,4) ?

Disjoint union has a Bag of Pair as output. See http://en.wikipedia.org/wiki/Disjoint_union
So we get Bag(1=>1, 1=>2, 1=>2, 1=>2, 1=>3, 1=>3, 2=>1, 2=>2, 2=>2, 2=>4, 2=>4). Well, or as Bag of Seq (1,1; 1,2; ... ; 2,4).

Regards, TSa.
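The answers given above can be reproduced with Python's `collections.Counter`, which happens to implement the same min/max multiset semantics. This is only a sketch with Counter standing in for the proposed Bag; the names `bag`, `join` and `disjoint_union` are illustrative, and the tagged tuples `(1, x)` and `(2, y)` play the role of the Pairs `1=>x` and `2=>y`.

```python
from collections import Counter

def bag(counter):
    """Render a Counter as a sorted list of its elements, for display."""
    return sorted(counter.elements())

a = Counter([1, 2, 2, 2, 3, 3])
b = Counter([1, 2, 2, 4, 4])

union = a | b                    # maximum multiplicity per element
intersection = a & b             # minimum multiplicity per element
difference = a - intersection    # "taking away the intersection"

# The join (+) adds multiplicities, so two sets can yield a proper bag.
join = Counter([1, 2, 3]) + Counter([2, 3, 4])

# Disjoint union tags every element with the operand it came from.
disjoint_union = Counter(
    [(1, x) for x in a.elements()] + [(2, y) for y in b.elements()]
)

print(bag(union))         # [1, 2, 2, 2, 3, 3, 4, 4]
print(bag(intersection))  # [1, 2, 2]
print(bag(difference))    # [2, 3, 3]
print(bag(join))          # [1, 2, 2, 3, 3, 4]
print(bag(disjoint_union))
```

The disjoint union keeps all eleven elements, because tagging makes elements from different operands distinct.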
Re: Set-returning .keys (was Re: Smart Matching clarification)
To start off, I should clarify that I see little value for the existence of a Bag type except for certain matters of syntactic or semantic brevity, but that those alone can still warrant its existence.

A Bag is for marking when your duplicate-allowing collection is conceptually not ordered, and that is all that it is for. This marker is useful for optimizing certain places a Seq would otherwise use, such as implicitly permitting hyperthreading (a Set can also hyperthread). And it is also useful as a language-enforced stricture where you are prevented from doing order-dependent operations on that collection because they don't make sense.

Aside from these optimizations and strictures afforded by a Bag type, I see no reason to provide too many operators for them ... in fact, I would argue that what one can do with a Bag be defined as an intersection of what one can do with a Seq and a Set.

That said ...

At 4:16 PM +0100 11/23/06, TSa wrote:
> Adriano Rodrigues wrote:
> > And we may argue as well that being Bag a multiset, the set is a special case where all the elements have the same multiplicity.

Or specifically, a multiplicity of 1.

> Yes, that would be a subset type. The thing I had in mind was 'role Seq does Bag' and 'role Bag does Set'. And classes with the same names for creating instances.

I think you have something backwards here. While the 3 collection types Seq,Bag,Set could be sequenced like that for some purposes of explanation, where adjacent types have commonalities that the other doesn't, I don't see that it falls to also chain .does() in the same direction all the way across.

Seq and Set are *both* more specific or restricted than Bag. So it would make more sense to say 'role Set does Bag' (and 'role Seq does Bag'), not 'role Bag does Set'. For illustrative purposes, replace "Set" with "Int" and "Bag" with "Num". Everything that is a valid Set|Seq is a valid Bag, but the reverse isn't true. (That's not to say that we can't cast a Bag as a Set, but that would change the value, like doing round|floor|ceil|etc on a Num to get an Int, and this is external to a .does relationship.)

This also allows us to reserve operators for Set that Bag can't or won't have (because they depend on all collection elements being distinct), as we can reserve operators for Seq that Bag can't have (because they depend on the order of elements being significant).

Now, there is a small handful of operations that could easily be ascribed to all 3 of those types, such as testing if an element exists, or how many occurrences there are, or iterating through all elements in an order-agnostic fashion. These can all have easily predictable and consistent behaviour.

Moreover, some operations are clearly usable with only the Seq type, such as iterating through elements in order or reading an element at a specific index.

The operators [union, intersection, difference, disjoint-union, etc] have clearly defined and predictable behaviour with a Set, since all inputs and outputs have no duplicates.

> The operational advantage of Set being a supertype of Seq is that all set operations are available for Seq out of the box. Mixed operations of Seq and Set would dispatch to the Set variant. The Seq operations like hypering are naturally precluded for Sets.

But I would ask whether it is desirable for those Set operators to be present in Bag|Seq, and if so, then what the desired semantics are. For example, what would these return:

    Bag(1,2,2,2,3,3) union Bag(1,2,2,4,4); # Bag(1,1,2,2,2,2,2,3,3,4,4) or Bag(1,2,2,2,3,3,4,4) ?
    Bag(1,2,2,2,3,3) intersection Bag(1,2,2,4,4); # Bag(1,1,2,2,2,2,2) or Bag(1,2,2) ?
    Bag(1,2,2,2,3,3) difference Bag(1,2,2,4,4); # Bag(2,3,3) or Bag(3,3) ?
    Bag(1,2,2,2,3,3) d_union Bag(1,2,2,4,4); # Bag(2,3,3,4,4) or Bag(3,3,4,4) ?

Repeat again with Bag->Seq.

In my mind, it would be far simpler to reserve such operators to the Set only, and cast a Bag|Seq as a Set to use them on it, if that is desired, whereupon the results are all distinct. But still, it is something that should be decided on, one way or the other.

-- Darren Duncan
Re: Set-returning .keys (was Re: Smart Matching clarification)
HaloO,

Jonathan Lang wrote:
> Other cases: What would 'Set.push(items)' and 'Set.pop()' do? What _is_ the appropriate way to go about adding items to (or removing items from) a Set, or of searching the Set for an element?

Since Sets are immutable values there should be no push and pop methods. These are part of the rw Array interface. Instead we have:

    my Set $s = set(1,2,3);
    $s (|)= (4,5,6); # $s === set(1,2,3,4,5,6)
    $s (-)= (5,6);   # $s === set(1,2,3,4)

Searching a set is just the element containment test:

    if 3 (in) $s { say "set contains 3" }

Regards, TSa.
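For comparison, Python's immutable `frozenset` behaves much like the Set described above: there are no mutating methods, and the assignment operators rebind the variable to a new value. This is only an analogy, not a statement about how Perl 6 would implement `(|)=`.

```python
s = frozenset({1, 2, 3})
original = s          # keep a reference to the first value

s |= {4, 5, 6}        # builds a new frozenset and rebinds s
s -= {5, 6}           # likewise; the earlier values are untouched

print(sorted(s))         # [1, 2, 3, 4]
print(sorted(original))  # [1, 2, 3]

# Searching is just the element containment test.
if 3 in s:
    print("set contains 3")
```

Because the value itself never changes, anything still holding `original` sees the same three elements.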
Re: Set-returning .keys (was Re: Smart Matching clarification)
HaloO,

Adriano Rodrigues wrote:
> And we may argue as well that being Bag a multiset, the set is a special case where all the elements have the same multiplicity.

Yes, that would be a subset type. The thing I had in mind was 'role Seq does Bag' and 'role Bag does Set'. And classes with the same names for creating instances.

> Going from a Set to a Seq imposes some dilemma which may be thought similar to how going from a list to a scalar (it is the length, a reference, what else?). The mapping of a Set to a Seq could be parameterized via an ordering sub. But there are more possibilities.

Indeed it is easier to drop information than it is to create it in the first place. Note that the order of a Seq can be random, that is, not compressible to a sub that produces it. So there's no other way than storing the order. And this is exactly what a Seq does anyway.

The operational advantage of Set being a supertype of Seq is that all set operations are available for Seq out of the box. Mixed operations of Seq and Set would dispatch to the Set variant. The Seq operations like hypering are naturally precluded for Sets.

Many places where the return value of Hash::keys might be put would be typed as expecting a Set. And actually returning a Set would make %a.keys === %b.keys a natural idiom. Jonathan's concerns of iterating being unavailable for a Set should be unfounded because this functionality should be in Set and hence in Seq from where we know it best. There could e.g. be a role Iterateable that Set, Range, Array and Hash do.

Regards, TSa.
Re: Set-returning .keys (was Re: Smart Matching clarification)
On 11/23/06, TSa <[EMAIL PROTECTED]> wrote:
> HaloO,
>
> Darren Duncan wrote:
> > And if Seq and Set etc are interchangeable for all situations where it doesn't matter whether the elements are ordered or not, then a lot of times users won't have to care which they have.
>
> One can argue that we have the subtyping chain Seq <: Bag <: Set for these immutable types. The idea is that a Bag is a multiset, that is a set that maintains a multiplicity per element.

And we may argue as well that being Bag a multiset, the set is a special case where all the elements have the same multiplicity.

> Apart from that it behaves like Set. The Seq would add an order to the elements and some functionality that builds on that order. The good thing would be that Seq literals are applicable where Sets are expected.

Going from a Set to a Seq imposes some dilemma which may be thought similar to how going from a list to a scalar (it is the length, a reference, what else?). The mapping of a Set to a Seq could be parameterized via an ordering sub. But there are more possibilities.
Re: Set-returning .keys (was Re: Smart Matching clarification)
HaloO,

Darren Duncan wrote:
> And if Seq and Set etc are interchangeable for all situations where it doesn't matter whether the elements are ordered or not, then a lot of times users won't have to care which they have.

One can argue that we have the subtyping chain Seq <: Bag <: Set for these immutable types. The idea is that a Bag is a multiset, that is a set that maintains a multiplicity per element. Apart from that it behaves like Set. The Seq would add an order to the elements and some functionality that builds on that order. The good thing would be that Seq literals are applicable where Sets are expected.

Regards, TSa.
Set-returning .keys (was Re: Smart Matching clarification)
At 3:24 AM -0800 11/18/06, Jonathan Lang wrote:
> Jonathan Lang wrote:
> > Larry Wall wrote:
> > > it seems to me that .keys.sort is suboptimal if sort has to second-guess the ordering provided by the underlying hash.
> >
> > Not only is it suboptimal, it might not be possible. Sorting depends on cmp returning Order::Increase, Order::Same, or Order::Decrease in every case; but what if you're comparing Color::Blue to Bool::True? And what about classes that involve partial ordering, such as sets? Heck, how do you '[cmp] Color::Blue, Color::Gray', and what does "sqrt(-1) cmp 0" (or even "sqrt(-1) <=> 0") return?
>
> OK: IIRC, this original definition also preceded Sets. So instead of "$_.keys.sort »===« $x.keys.sort", perhaps this should simply be "Set($_.keys) === Set($x.keys)", corrected for proper syntax. Heck, perhaps "$_.keys" and "$x.keys" should _be_ Sets.

I seem to remember having this discussion months ago when trying to implement Set::Relation.

Absolutely .keys should return a Set. We *know* already that all keys in a keyed collection are unique, so why not explicitly stamp them as such from the start by returning them in a Set container; then users of those keys don't have to recheck them for uniqueness, such as in a Set's constructor, if they want to use them as a set.

Then we can reliably and tersely say "$_.keys === $x.keys" and it will do the right thing.

Similarly, we can do set operations with the keys more tersely, such as "$_.keys subset $x.keys" or "$_.keys superset $x.keys" or "$_.keys union $x.keys" or "$_.keys intersection $x.keys" or "$_.keys minus $x.keys" etc.

And you can compare the keysets of 2 collections reliably even when keys are objects of a data type that is *not* ordinal, which makes it more general.

Finally, common operations like .keys.sort (which returns a Seq or List) can be written just as tersely or the same as before, so that syntax can be kept.

On a tangential matter, if you follow my suggestion of another email and actually add a (immutable) Bag type to the language, which your documentation already references as a common thing people may use, then .values can return that, since it is an unordered collection that may have duplicates. Once again, you can then compare the value lists of 2 Hashes set without sorting them.

Still, regardless of what you do here, making .keys return a Set should be done.

-- Darren Duncan
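Python's dictionary views give a working precedent for set-like keys: `dict.keys()` returns an object that compares and combines like a set. The sketch below uses it, together with `collections.Counter` as a stand-in Bag for the values; the variable names are illustrative only.

```python
from collections import Counter

a = {"x": 1, "y": 2, "z": 2}
b = {"z": 2, "y": 1, "x": 2}

# Key views compare as sets, regardless of insertion order.
print(a.keys() == b.keys())          # True
print(a.keys() <= b.keys())          # True, the subset test
print(sorted(a.keys() & {"x", "q"}))  # ['x'], intersection yields a set

# Values may repeat, so they are compared as bags without sorting.
print(Counter(a.values()) == Counter(b.values()))  # True

# Keys with no mutual ordering still compare as sets, where sorting fails.
c = {1: "int", "a": "str"}
d = {"a": "str", 1: "int"}
print(c.keys() == d.keys())          # True
try:
    sorted(c.keys())
except TypeError as err:
    print("cannot sort mixed keys:", err)
```

The last case mirrors the point about non-ordinal key types: set equality needs only equality and hashing, not a total order.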
Re: Smart Matching clarification
Jonathan Lang wrote:
> Larry Wall wrote:
> > Jonathan Lang wrote:
> > : Looking through the table provided, I ran across the following:
> > :
> > :     $_      $x      Type of Match Implied    Matching Code
> > :     ======  =====   =====================    =============
> > :     Hash    Hash    hash keys identical      match if $_.keys.sort »eq« $x.keys.sort
> > :
> > : My understanding is that at the time this was written, the working theory was that Hash keys would always be strings. I'm wondering: should this entry replace 'eq' with '===' or 'eqv', so that non-string keys can also be compared for equivalent values? If so, which operator should replace 'eq'? (I'm leaning toward '===', since S03 defines '$a eq $b' as '~$a === ~$b'.)
> >
> > Yes, it should be ===. But in revising the smartmatching tables for this and other ===nesses, and thinking about how hashes may or may not be implemented as ordered underneath, it seems to me that .keys.sort is suboptimal if sort has to second-guess the ordering provided by the underlying hash.
>
> Not only is it suboptimal, it might not be possible. Sorting depends on cmp returning Order::Increase, Order::Same, or Order::Decrease in every case; but what if you're comparing Color::Blue to Bool::True? And what about classes that involve partial ordering, such as sets? Heck, how do you '[cmp] Color::Blue, Color::Gray', and what does "sqrt(-1) cmp 0" (or even "sqrt(-1) <=> 0") return?

OK: IIRC, this original definition also preceded Sets. So instead of "$_.keys.sort »===« $x.keys.sort", perhaps this should simply be "Set($_.keys) === Set($x.keys)", corrected for proper syntax. Heck, perhaps "$_.keys" and "$x.keys" should _be_ Sets.

-- Jonathan "Dataweaver" Lang
Re: Smart Matching clarification
> So maybe we have some or all of:
>
>     .keys     .sortkeys
>     .values   .sortvalues
>     .kv       .sortkv
>     .pairs    .sortpairs
>
> Possible variations: .skeys, .ordkeys, etc. Also could flip the default and make .keys sort by default and then you use .rawkeys to get unordered--shades of PHP.

Taking a page from Template Toolkit.

    .keys # same as perl5
    .sort # the sorted keys

I know that it isn't quite parallel with Array.sort and it doesn't provide for .sortkv or .sortpairs, but it might be an option.

Paul
Re: Smart Matching clarification
Larry Wall wrote:
> Jonathan Lang wrote:
> : Looking through the table provided, I ran across the following:
> :
> :     $_      $x      Type of Match Implied    Matching Code
> :     ======  =====   =====================    =============
> :     Hash    Hash    hash keys identical      match if $_.keys.sort »eq« $x.keys.sort
> :
> : My understanding is that at the time this was written, the working theory was that Hash keys would always be strings. I'm wondering: should this entry replace 'eq' with '===' or 'eqv', so that non-string keys can also be compared for equivalent values? If so, which operator should replace 'eq'? (I'm leaning toward '===', since S03 defines '$a eq $b' as '~$a === ~$b'.)
>
> Yes, it should be ===. But in revising the smartmatching tables for this and other ===nesses, and thinking about how hashes may or may not be implemented as ordered underneath, it seems to me that .keys.sort is suboptimal if sort has to second-guess the ordering provided by the underlying hash.

Not only is it suboptimal, it might not be possible. Sorting depends on cmp returning Order::Increase, Order::Same, or Order::Decrease in every case; but what if you're comparing Color::Blue to Bool::True? And what about classes that involve partial ordering, such as sets? Heck, how do you '[cmp] Color::Blue, Color::Gray', and what does "sqrt(-1) cmp 0" (or even "sqrt(-1) <=> 0") return?

-- Jonathan "Dataweaver" Lang
Re: Smart Matching clarification
On Thu, Nov 16, 2006 at 04:25:30PM -0800, Jonathan Lang wrote:
: Looking through the table provided, I ran across the following:
:
:     $_      $x      Type of Match Implied    Matching Code
:     ======  =====   =====================    =============
:     Hash    Hash    hash keys identical      match if $_.keys.sort »eq« $x.keys.sort
:
: My understanding is that at the time this was written, the working theory was that Hash keys would always be strings. I'm wondering: should this entry replace 'eq' with '===' or 'eqv', so that non-string keys can also be compared for equivalent values? If so, which operator should replace 'eq'? (I'm leaning toward '===', since S03 defines '$a eq $b' as '~$a === ~$b'.)

Yes, it should be ===. But in revising the smartmatching tables for this and other ===nesses, and thinking about how hashes may or may not be implemented as ordered underneath, it seems to me that .keys.sort is suboptimal if sort has to second-guess the ordering provided by the underlying hash. So maybe we have some or all of:

    .keys     .sortkeys
    .values   .sortvalues
    .kv       .sortkv
    .pairs    .sortpairs

Possible variations: .skeys, .ordkeys, etc. Also could flip the default and make .keys sort by default and then you use .rawkeys to get unordered--shades of PHP. (Note that arrays are already considered to be ordered when you use these methods.)

Or we stick with the methods we have now and give options for sorting and selecting as parameters to .keys etc.

In any case, making these methods able to sort allows sorted .kv lists, which would not be possible with .keys.sort. It also allows the hash or array itself to specify the ordering behavior declaratively, and perhaps optimize ordering in various ways that .keys.sort can't do without cheating.

Slicing modifiers like :kv and :pairs could also take an optional sort parameter, presumably.

Larry
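The point that a sorted .kv list cannot be obtained by sorting keys alone can be seen in a small Python sketch. The variable names `sorted_pairs` and `sorted_kv` are illustrative stand-ins for what the hypothetical .sortpairs and .sortkv methods might return; nothing here reflects an actual Perl 6 API.

```python
h = {"b": 2, "a": 1, "c": 3}

# Sorting the keys alone loses the association with the values.
sorted_keys = sorted(h)

# Sorting the pairs keeps each value attached to its key.
sorted_pairs = sorted(h.items())

# A flat key/value list in key order, in the spirit of a sorted .kv.
sorted_kv = [item for pair in sorted_pairs for item in pair]

print(sorted_keys)   # ['a', 'b', 'c']
print(sorted_pairs)  # [('a', 1), ('b', 2), ('c', 3)]
print(sorted_kv)     # ['a', 1, 'b', 2, 'c', 3]
```

Sorting a flattened key/value list directly would interleave keys with values, which is why the ordering has to be applied while the pairs are still intact.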
Smart Matching clarification
Looking through the table provided, I ran across the following:

    $_      $x      Type of Match Implied    Matching Code
    ======  =====   =====================    =============
    Hash    Hash    hash keys identical      match if $_.keys.sort »eq« $x.keys.sort

My understanding is that at the time this was written, the working theory was that Hash keys would always be strings. I'm wondering: should this entry replace 'eq' with '===' or 'eqv', so that non-string keys can also be compared for equivalent values? If so, which operator should replace 'eq'? (I'm leaning toward '===', since S03 defines '$a eq $b' as '~$a === ~$b'.)

-- Jonathan "Dataweaver" Lang