To start off, I should clarify that I see little value in the existence of a Bag type beyond certain matters of syntactic or semantic brevity, though those alone can still warrant its existence.

A Bag is for marking that your duplicate-allowing collection is conceptually unordered, and that is all it is for. This marker is useful for optimization in places where a Seq would otherwise be used, such as implicitly permitting hyperthreading (a Set can also hyperthread). It is also useful as a language-enforced stricture that prevents you from doing order-dependent operations on the collection, because they don't make sense there.

Aside from these optimizations and strictures afforded by a Bag type, I see no reason to provide too many operators for it ... in fact, I would argue that what one can do with a Bag should be defined as the intersection of what one can do with a Seq and a Set.

That said ...

At 4:16 PM +0100 11/23/06, TSa wrote:
> Adriano Rodrigues wrote:
>> And we may argue as well that being Bag a multiset, the set is a
>> special case where all the elements have the same multiplicity.

Or specifically, a multiplicity of 1.

> Yes, that would be a subset type. The thing I had in mind was
> 'role Seq does Bag' and 'role Bag does Set'. And classes with
> the same names for creating instances.

I think you have something backwards here. While the 3 collection types Seq, Bag, Set could be sequenced like that for some purposes of explanation, where each adjacent pair shares commonalities that the third type lacks, I don't see that it follows that .does() should also chain in the same direction all the way across.

Seq and Set are *both* more specific or restricted than Bag. So it would make more sense to say 'role Set does Bag' (and 'role Seq does Bag'), not 'role Bag does Set'. For illustrative purposes, replace "Set" with "Int" and "Bag" with "Num". Everything that is a valid Set|Seq is a valid Bag, but the reverse isn't true.

(That's not to say that we can't cast a Bag as a Set, but that would change the value, like doing round|floor|ceil|etc on a Num to get an Int, and this is external to a .does relationship.)
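As a rough illustration of that subtype relationship, here is a minimal sketch in Python rather than Perl 6, with the standard `collections.Counter` standing in for Bag. The helper names `is_valid_set` and `bag_to_set` are made up for this example and are not part of any proposed API.

```python
from collections import Counter

def is_valid_set(bag: Counter) -> bool:
    """A bag also qualifies as a set when every element has multiplicity 1."""
    return all(count == 1 for count in bag.values())

def bag_to_set(bag: Counter) -> Counter:
    """Cast a bag to a set by collapsing every multiplicity to 1.

    This is lossy, much like rounding a Num to get an Int.
    """
    return Counter(dict.fromkeys(bag, 1))

set_like = Counter([1, 2, 3])        # every multiplicity is 1
bag = Counter([1, 2, 2, 2, 3, 3])    # a genuine multiset

print(is_valid_set(set_like))        # True: a valid Set is a valid Bag
print(is_valid_set(bag))             # False: the reverse does not hold
print(bag_to_set(bag) == set_like)   # True: the cast changed the value
```

The cast is a separate, explicit operation here, which mirrors the point that it sits outside the .does relationship.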

This also allows us to reserve operators for Set that Bag can't or won't have (because they depend on all collection elements being distinct), as we can reserve operators for Seq that Bag can't have (because they depend on the order of elements being significant).

Now, there is a small handful of operations that could easily be ascribed to all 3 of those types, such as testing whether an element exists, counting how many occurrences there are, or iterating through all elements in an order-agnostic fashion. These can all have easily predictable and consistent behaviour.
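A minimal sketch of that shared core, again in Python as a stand-in: a `list` plays Seq, `collections.Counter` plays Bag, and `set` plays Set. The function names `contains`, `count_of` and `elements` are hypothetical, chosen only for this illustration.

```python
from collections import Counter

seq = [1, 2, 2, 2, 3, 3]   # ordered, duplicates allowed (Seq)
bag = Counter(seq)         # unordered, duplicates allowed (Bag)
st = set(seq)              # unordered, no duplicates (Set)

def contains(coll, item) -> bool:
    """Membership test; works the same way on all three."""
    return item in coll

def count_of(coll, item) -> int:
    """Number of occurrences; for a set this can only be 0 or 1."""
    if isinstance(coll, Counter):
        return coll[item]
    if isinstance(coll, (set, frozenset)):
        return 1 if item in coll else 0
    return list(coll).count(item)

def elements(coll):
    """Iterate over every element, repeats included, promising no order."""
    if isinstance(coll, Counter):
        return coll.elements()
    return iter(coll)

for coll in (seq, bag, st):
    print(contains(coll, 2), count_of(coll, 2), sorted(elements(coll)))
```

Nothing in these three functions depends on element order or on distinctness, which is why they behave consistently across all three collection kinds.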

Moreover, some operations are clearly useable with only the Seq type, such as iterating through elements in order or reading an element at a specific index.

The operators [union, intersection, difference, disjoint-union, etc] have clearly defined and predictable behaviour with a Set, since all inputs and outputs have no duplicates.

> The operational advantage of Set being a supertype of Seq is that
> all set operations are available for Seq out of the box. Mixed
> operations of Seq and Set would dispatch to the Set variant. The
> Seq operations like hypering are naturally precluded for Sets.

But I would ask whether it is desirable for those Set operators to be present in Bag|Seq, and if so, then what the desired semantics are. For example, what would these return:

  Bag(1,2,2,2,3,3) union Bag(1,2,2,4,4);
    # Bag(1,1,2,2,2,2,2,3,3,4,4) or Bag(1,2,2,2,3,3,4,4) ?

  Bag(1,2,2,2,3,3) intersection Bag(1,2,2,4,4);
    # Bag(1,1,2,2,2,2,2) or Bag(1,2,2) ?

  Bag(1,2,2,2,3,3) difference Bag(1,2,2,4,4);
    # Bag(2,3,3) or Bag(3,3) ?

  Bag(1,2,2,2,3,3) d_union Bag(1,2,2,4,4);
    # Bag(2,3,3,4,4) or Bag(3,3,4,4) ?

Then repeat with Seq in place of Bag.
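To make the two candidate readings of each operator concrete, here is a sketch in Python using `collections.Counter` as a stand-in for Bag. Which Counter expression corresponds to which reading is an interpretation of the alternatives listed above, not something any spec defines.

```python
from collections import Counter

a = Counter([1, 2, 2, 2, 3, 3])
b = Counter([1, 2, 2, 4, 4])

def as_list(bag):
    """Flatten a bag into a sorted list so results are easy to compare."""
    return sorted(bag.elements())

# union: add multiplicities, or take the larger multiplicity
union_sum = a + b
union_max = a | b

# intersection: add multiplicities of shared elements, or take the smaller
common = a.keys() & b.keys()
inter_sum = Counter({k: a[k] + b[k] for k in common})
inter_min = a & b

# difference: subtract multiplicities, or drop anything present in b at all
diff_sub = a - b
diff_drop = Counter({k: n for k, n in a.items() if k not in b})

# d_union: leftover multiplicities from both sides, or only elements
# that appear in exactly one of the two bags
sym_sub = (a - b) + (b - a)
sym_drop = Counter({k: n for k, n in (a + b).items() if (k in a) != (k in b)})

print(as_list(union_sum), as_list(union_max))
print(as_list(inter_sum), as_list(inter_min))
print(as_list(diff_sub), as_list(diff_drop))
print(as_list(sym_sub), as_list(sym_drop))
```

Each pair reproduces the two answers given in the comments above, so both readings are internally consistent; the question is only which one the operator names should mean.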

In my mind, it would be far simpler to reserve such operators for Set only, and to cast a Bag|Seq to a Set in order to use them on it if that is desired, whereupon the results are all distinct.
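The cast-first approach can be sketched as follows, once more in Python as a stand-in, with a `list` playing the Bag|Seq and the built-in `set` playing Set. The variable names are illustrative only.

```python
# Duplicate-allowing inputs (the same values as the Bag examples above).
coll_a = [1, 2, 2, 2, 3, 3]
coll_b = [1, 2, 2, 4, 4]

# Explicit cast: duplicates (and any ordering) are discarded here, once.
set_a = set(coll_a)
set_b = set(coll_b)

union = set_a | set_b          # {1, 2, 3, 4}
intersection = set_a & set_b   # {1, 2}
difference = set_a - set_b     # {3}
d_union = set_a ^ set_b        # {3, 4}

print(sorted(union), sorted(intersection), sorted(difference), sorted(d_union))
```

With the cast made explicit there is only one possible answer per operator, so the ambiguity never arises.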

But still, it is something that should be decided on, one way or the other.

-- Darren Duncan
