I just implemented Bag to the point where it passes the spectests
(https://github.com/masonk/rakudo/commit/2668178c6ba90863538ea74cfdd287684a20c520).
However, in doing so, I discovered that I'm no longer really sure what Bags
are for.

The more I think about Bags and Sets, the more my brain hurts.  They're half
an EnumMap and half an Iterable that does Associative but not Positional.
However, I'm starting to believe that they are more like Iterables than
EnumMaps.  When I imagine using them, I think of Sets as a cute way to operate
on the unique elements of an Iterable.  I think of Bags / KeyBags as a way to
remove ordering, which is a generally useful thing.  (Everything that I'm about
to say applies to both Bags and KeyBags, but I'm going to talk only about Bags
for the rest of this post.)  Most of the time, we don't care about ordering,
and having ordering on all of our collections even when we don't need it
increases program complexity in time, in much the same way that unnecessarily
global variables increased the complexity of Perl 5 programs in space.

I want to propose one major change to the Bag spec: When a Bag is used as an 
Iterable, you get an Iterator that has each key in proportion to the number of 
times it appears in the Bag.
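
As a rough illustration of the difference, here is a minimal sketch in Python
rather than Perl 6.  It assumes that collections.Counter is a fair stand-in
for a Bag; that mapping is an analogy made for this example, not anything from
the spec.  Iterating the Counter directly gives keys only, while its
elements() method repeats each key according to its count, which is the
behavior proposed above.

```python
from collections import Counter

# Counter is only a stand-in for a Perl 6 Bag (an assumption of this sketch).
bag = Counter(["a", "a", "b", "c"])

# Keys-only iteration: each distinct key appears once.
keys_only = sorted(bag)

# Proposed iteration: each key appears as many times as it occurs.
# sorted() is used only to get a stable order for display; a Bag would
# make no promise about ordering.
in_proportion = sorted(bag.elements())

print(keys_only)      # ['a', 'b', 'c']
print(in_proportion)  # ['a', 'a', 'b', 'c']
```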

With this one change to Bags, I could use them whenever I don't need ordering
in my lists - which is usually.  Even though some side effects don't rely on
ordering (e.g., incrementation), the majority of them do - so by using this new
kind of Bag, I would be reducing the complexity of my programs.  Also, since
Sets already give us the distinct values, having Bags iterate the same way is
redundant functionality where we could be getting novel functionality.

I'd like to anticipate one objection to this - the existence of the 'hyper' 
operator/keyword.  The hyper operator says, "I am taking responsibility for 
this particular code block and promising that it can execute out of order and 
concurrently".  Creating a Bag instead of an Array says, "there is no meaning 
to the ordering of this group of things, ever".  Basically, if I know at 
declaration time that my collection has no sense of ordering, then I shouldn't 
have to annotate every iteration of that collection as having no sense of 
ordering, which is nearly what hyper does (though, I readily admit, not quite, 
because there are unordered ways to create race conditions).

I also have some convenience syntax suggestions.  I do think this is important
because Bags and Sets are competing with Arrays.  If they aren't as convenient
as Arrays to use, they won't get used - even though they're closer,
semantically, to what the developer wants in a lot of cases.

First, we should besigil Bags and Sets with @ instead of $.  Without this
convenience, I'm not likely to replace my Arrays with Bags, because going
through them in a loop or map would be a pain compared to Arrays.  If I have to
say $bag.keys every single time, forgettaboutit.

This, however, probably requires a change to S03, which says that the @ sigil
is a means of coercing the object to the "Positional (or Iterable?)" role.
Given the guiding principle that Perl 6 should support functional idioms and
side-effect-free computing, it seems to me that the more fundamental and
important aspect of things with @ in front is that you can go through them one
by one, not that they're ordered (since ordering is irrelevant in functional
computing, but iterating is not).  My feeling is that we should reserve the
special syntax for the more fundamental of the two operations, so as not to
bias the programmer towards rigid sequentiality through syntax.

Second, I would be even more likely to replace my ordered lists with Bags if 
there were a convenient operator for constructing Bags.  I can't think of any 
good non-letter symbols that aren't taken right now (suggestions welcome), but, 
at least, &b and &s as aliases to bag and set would be convenient.

Bags and Sets thus updated would look like this in use:
C<
my @array = < a a b c >;
my @set = s...@array;
for s...@array { say $_ };
for @set { say $_ };    # same thing
# b«␤»a«␤»c«␤»
# ordering undefined
# most common use case for sets, I think, is "unique elements of @array",
# isn't it?

hyper for @bag { ... };
# a«␤»b«␤»c«␤»a«␤»
# ordering undefined => less-thinking-required hyper

b< a b c c > === b< c c b a >
# Wouldn't this be the best way to make a comparison with these semantics?
# By the way, this useful idiom works as currently specced, but doesn't work
# in my implementation

@bag{a}
# 2

@bag{<a b z>}
# 2, 1, 0

[+] @bag{<a b z>}
# 3
# this is also neat for "How many a's, b's, and z's do I have?"

+...@bag
# 4

@bag[2]
# I can't think of a meaning for this - not Positional - S03 needs a change?

@bag.WHAT
# Bag()

@bag.pairs
# a => 2, b => 1, c => 1
# ordering undefined

@bag.values
# 2, 1, 1
# ordering undefined
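
The lookups and the comparison above can be modelled the same way.  This is a
sketch in Python, not Perl 6, and it again assumes collections.Counter as a
stand-in for the hypothetical @bag; the variable names are made up for the
example.

```python
from collections import Counter

array = ["a", "a", "b", "c"]
bag = Counter(array)  # stand-in for the hypothetical @bag (an assumption)

count_a = bag["a"]                          # like @bag{a}
counts = [bag[k] for k in ("a", "b", "z")]  # like @bag{<a b z>}; absent keys count as 0
total_abz = sum(counts)                     # like [+] over that slice
size = sum(bag.values())                    # number of items, counting repeats

# Two bags built from the same items in a different order compare equal.
same = Counter("abcc") == Counter("ccba")

print(count_a, counts, total_abz, size, same)  # 2 [2, 1, 0] 3 4 True
```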

Junctions:

Junctions seem like one time when we care more about the values than the keys,
because C<any|all|none|one> on @array and b...@array will have the same
behavior (if my suggestion above is taken, so that @bag holds < a a b c > out
of order instead of < a b c > out of order).  For Sets it's the same story,
with the added proviso that C<one> degenerates to C<any>.  But @bag.any > $x
seems like a pretty useful idiom.  It would feel inconsistent for any(@bag) and
@bag.any to do different things, however.
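
A small sketch of these cases, in Python rather than Perl 6.  Python has no
junctions, so the hypothetical helpers any_eq and one_eq stand in for C<any>
and C<one>, and only equality tests are modelled.  Counter and set are assumed
stand-ins for Bag and Set.

```python
from collections import Counter

array = ["a", "a", "b", "c"]
bag = Counter(array)   # stand-in for a Bag (an assumption of this sketch)
uniq = set(array)      # stand-in for a Set

def any_eq(items, x):
    """True if at least one item equals x (models any(...) == x)."""
    return any(i == x for i in items)

def one_eq(items, x):
    """True if exactly one item equals x (models one(...) == x)."""
    return sum(1 for i in items if i == x) == 1

# With proportional iteration, the array and the bag agree.
any_array, any_bag = any_eq(array, "a"), any_eq(bag.elements(), "a")
one_array, one_bag = one_eq(array, "a"), one_eq(bag.elements(), "a")

# On a set every element is distinct, so "exactly one" matches "any".
any_set, one_set = any_eq(uniq, "a"), one_eq(uniq, "a")

# One possible reading of "@bag.any > $x": does any count exceed x?
any_count_gt_1 = any(n > 1 for n in bag.values())

print(any_array, any_bag, one_array, one_bag, any_set, one_set, any_count_gt_1)
```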

On Oct 26, 2010, at  12:57 AM, nore...@github.com wrote:

> Branch: refs/heads/master
> Home:   http://github.com/perl6/specs
> 
> Commit: 32511f7db34905c740ed1030a70995239f7cfb66
>    
> http://github.com/perl6/specs/commit/32511f7db34905c740ed1030a70995239f7cfb66
> Author: TimToady <la...@wall.org>
> Date:   2010-10-25 (Mon, 25 Oct 2010)
> 
> Changed paths:
>  M S02-bits.pod
> 
> Log Message:
> -----------
> [S02] be more explicit about iterating sets/bags
> 
> The intent has always been that when you use a set or bag as a list,
> it behaves as a list of its keys, regardless of any underlying hash
> interface it might also respond to.  You must use .pairs explicitly
> to get the hash pairs out of a set or bag as a list.
> 
> 
