Re: Bag / Set ideas - making them substitutable for Arrays makes them more useful

2010-11-16 Thread Darren Duncan

Jon Lang wrote:
> Darren Duncan wrote:
>> This said, I specifically think that a simple pair of curly braces is the
>> best way to mark a Set.
>>
>>  {1,2,3}  # a Set of those 3 elements
>>
>> ... and this is also how it is done in maths I believe (and in Muldis D).
>>
>> In fact, I strongly support this assuming that all disambiguation eg with
>> hashes can be specified.
>
> That would be great.

Glad you agree.

<snip>

> Sets built from multi-dimensional arrays might be a problem:
>
> {1, 2, 3: 4, 5, 6}

Does that even work?  I thought the colon, or is it a semicolon, only had that 
meaning in a delimited list like () or [].

In any event, I don't believe there is such a thing as a multi-dimensional set 
in that way.  Unless you have a concept of multi-dimensional Hash keys, and then 
there might be an analogy.

<snip>

>> As for bags, well I think that is where we could get fancier.
>>
>> But *no* doubling up, as we don't want to interfere with nesting.
>>
>> Instead, it is common in maths to associate a + with set syntax to refer
>> to bags instead.
>>
>> So, does Perl already ascribe a meaning to putting a + with various
>> bracketing characters, such as this:
>>
>>  +{1,2,2,5}  # a Bag of 4 elements with 2 duplicates
>>
>>  +{}  # an empty Bag, unless that already means something
>>
>> So would the above try to cast the collection as a number, or take the count
>> of its elements, or can we use something like that?
>
> I'd expect +{...} to count the elements.


Something else I just thought of, and my main reason for writing this reply, is 
other options.


Firstly, and I don't necessarily like this option, maybe we could use the simple 
curly-brace pair to mean something more general that can be treated as either 
a Set or a Bag depending on context.  At least from my brief look around, it 
appears that maths use the same {foo, bar, baz} syntax to denote both sets and 
bags.  In some ways it would be like how Perl has the generic (foo, bar, baz) 
syntax, which remembers order but isn't an Array.  We certainly can't use the 
presence of duplicates in the {...} to pick Set vs Bag because there could 
legitimately be duplicates or not duplicates in the literals for both, 
especially if any of the list items are variables and we won't know until 
runtime whether any duplicate each other or not.


I still think the better option is to have slightly different looking syntax for 
the two.  I still prefer Set being the plain brace pair and a Bag being that 
plus something extra.  It seems that a leading + or ~ or ? is out because those 
have established meanings as treating what they're next to in num/str/bool 
context, so something else.  But it really should be a leading symbolic.


The differentiator needs to be leading, not trailing; end-weight is bad.

I think that having the marker character /inside/ the curly braces actually 
gives us more choices and would cut down on syntactic conflicts, because then we 
can basically pick anything that isn't a symbolic prefix unary.


Barring a better suggestion, I suggest the greater-than symbol.

So:

  {1,2,3,3,4}  # 4-element Set

  {>1,2,3,3,4}  # 5-element Bag

I think that looks different than anything else we have, and the greater-than 
could be a mnemonic that there is more in here.
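
To make the element counts in the two comments above concrete, here is a minimal sketch in present-day Raku. It uses the built-in .Set and .Bag coercers rather than any literal syntax proposed in this thread, and the variable name @items is only illustrative. Because the duplicate lives in a variable, it is only known at run time, which is the case raised earlier in this message.

```raku
# The same five items, one of them duplicated.
my @items = 1, 2, 3, 3, 4;

# A Set keeps each distinct value once.
say @items.Set.elems;   # 4

# A Bag keeps a count per distinct value; .total sums those counts.
say @items.Bag.total;   # 5
```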


Moreover, the different appearance means we could use => to give the number of 
times an element occurs, as in {>1,2,3=>2,4}, without there being any confusion 
with a Hash.


That said, I like the + most when differentiating a Bag from a Set, but we 
have that symbolic unary + which could interfere with it.


-- Darren Duncan


Re: Bag / Set ideas - making them substitutable for Arrays makes them more useful

2010-11-13 Thread Brandon S Allbery KF8NH

On 11/7/10 23:19 , Jon Lang wrote:
 1 -- 2 -- 3
 
 Would be a Bag containing three elements: 1, 2, and 3.
 
 Personally, I wouldn't put a high priority on this; for my purposes,
 
Bag(1, 2, 3)
 
 works just fine.

Hm. Bag as [! 1, 2, 3 !] and Set as {! 1, 2, 3 !}, with the outer bracket by
analogy with arrays or hashes respectively and the ! having the mnemonic of
looking like handles?  (I have to imagine there are plenty of Unicode
brackets to match.)

-- 
brandon s. allbery [linux,solaris,freebsd,perl]  allb...@kf8nh.com
system administrator  [openafs,heimdal,too many hats]  allb...@ece.cmu.edu
electrical and computer engineering, carnegie mellon university  KF8NH


Re: Bag / Set ideas - making them substitutable for Arrays makes them more useful

2010-11-13 Thread Jon Lang
Brandon S Allbery KF8NH wrote:

 On 11/7/10 23:19 , Jon Lang wrote:
     1 -- 2 -- 3

 Would be a Bag containing three elements: 1, 2, and 3.

 Personally, I wouldn't put a high priority on this; for my purposes,

    Bag(1, 2, 3)

 works just fine.

 Hm. Bag as [! 1, 2, 3 !] and Set as {! 1, 2, 3 !}, with the outer bracket by
 analogy with arrays or hashes respectively and the ! having the mnemonic of
 looking like handles?  (I have to imagine there are plenty of Unicode
 brackets to match.)

That saves a single character over Bag( ... ) and Set( ... ),
respectively (or three characters, if you find decent unicode bracket
choices).  It still wouldn't be a big enough deal to me to bother with
it.

As well, my first impression upon seeing [! ... !] was to think
you're negating everything inside?  That said, I could get behind
doubled brackets:

[[1, 2, 3]] # same as Bag(1, 2, 3)
{{1, 2, 3}} # same as Set(1, 2, 3)

AFAIK, this would cause no conflicts with existing code.

Or maybe these should be reversed:

[[1, 1, 2, 3]] # a Set containing 1, 2, and 3
{{1, 1, 2, 3}} # a Bag containing two 1s, a 2, and a 3
{{1 => 2, 2 => 1, 3 => 1}} # another way of defining the same Bag,
                           # with explicit counts.

OTOH, perhaps the outermost character should always be a square brace,
to indicate that it operates primarily like a list; while the
innermost character should be either a square brace or a curly brace,
to hint at the kind of syntax that you might find inside:

[[1, 1, 2, 3]] # a Set containing 1, 2, and 3
[{1, 1, 2, 3}] # a Bag containing two 1s, a 2, and a 3
[{1 => 2, 2 => 1, 3 => 1}] # another way of defining the same Bag,
                           # with explicit counts.

Yeah; I could see that.  The only catch is that it might cause
problems with existing code that nests square or curly braces inside
of square braces:

[[1, 2, 3], [4, 5, 6], [7, 8, 9]] # fail; would try to create Set
                                  # from 1, 2, 3], [4, 5, 6], [7, 8, 9
[ [1, 2, 3], [4, 5, 6], [7, 8, 9] ] # creates 3-by-3 array

...so maybe not.

It should never be more than two characters on either side; and
there's some benefit to using square or curly braces as one of them,
to hint at proper syntax within.  Hmm... how about:

|[1, 2, 3]| # Set literal
|[1=>true, 2=>true, 3=>true]| # technically possible; but why do it?
|{1, 1, 2, 3}| # Bag literal
|{1=>2, 2=>1, 3=>1}| # counted Bag literal

-- 
Jonathan Dataweaver Lang


Re: Bag / Set ideas - making them substitutable for Arrays makes them more useful

2010-11-13 Thread Carl Mäsak
Jonathan Lang ():
 As well, my first impression upon seeing [! ... !] was to think
 you're negating everything inside?  That said, I could get behind
 doubled brackets:

    [[1, 2, 3]] # same as Bag(1, 2, 3)
    {{1, 2, 3}} # same as Set(1, 2, 3)

 AFAIK, this would cause no conflicts with existing code.

 Or maybe these should be reversed:

    [[1, 1, 2, 3]] # a Set containing 1, 2, and 3
    {{1, 1, 2, 3}} # a Bag containing two 1s, a 2, and a 3
    {{1 => 2, 2 => 1, 3 => 1}} # another way of defining the same Bag,
                               # with explicit counts.

 OTOH, perhaps the outermost character should always be a square brace,
 to indicate that it operates primarily like a list; while the
 innermost character should be either a square brace or a curly brace,
 to hint at the kind of syntax that you might find inside:

    [[1, 1, 2, 3]] # a Set containing 1, 2, and 3
    [{1, 1, 2, 3}] # a Bag containing two 1s, a 2, and a 3
    [{1 => 2, 2 => 1, 3 => 1}] # another way of defining the same Bag,
                               # with explicit counts.

 Yeah; I could see that.  The only catch is that it might cause
 problems with existing code that nests square or curly braces inside
 of square braces:

    [[1, 2, 3], [4, 5, 6], [7, 8, 9]] # fail; would try to create Set
                                      # from 1, 2, 3], [4, 5, 6], [7, 8, 9
    [ [1, 2, 3], [4, 5, 6], [7, 8, 9] ] # creates 3-by-3 array

 ...so maybe not.

 It should never be more than two characters on either side; and
 there's some benefit to using square or curly braces as one of them,
 to hint at proper syntax within.  Hmm... how about:

    |[1, 2, 3]| # Set literal
    |[1=>true, 2=>true, 3=>true]| # technically possible; but why do it?
    |{1, 1, 2, 3}| # Bag literal
    |{1=>2, 2=>1, 3=>1}| # counted Bag literal

After skimming all those suggestions, I have yet another proposal:
let's not add anything, creating marginal gain with lots of extra
syntax.

 That saves a single character over Bag( ... ) and Set( ... ),
 respectively (or three characters, if you find decent unicode bracket
 choices).  It still wouldn't be a big enough deal to me to bother with
 it.

+1. Let's leave it at that.

// Carl


Re: Bag / Set ideas - making them substitutable for Arrays makes them more useful

2010-11-13 Thread Jon Lang
Carl Mäsak wrote:
 Jonathan Lang ():
 That saves a single character over Bag( ... ) and Set( ... ),
 respectively (or three characters, if you find decent unicode bracket
 choices).  It still wouldn't be a big enough deal to me to bother with
 it.

 +1. Let's leave it at that.

That said, I do think that Bag( ... ) should be able to take pairs, so
that one can easily create a Bag that holds, say, twenty of a given
item, without having to spell out the item twenty times.  Beyond that,
the only other syntax being proposed is a set of braces to be used to
create Bags and Sets, as part of the initiative to make them nearly as
easy to use as lists.  In essence, you'd be introducing two operators:
circumfix:<|[ ]|> and circumfix:<|{ }|>, as aliases for the respective
Set and Bag constructors.  As I said, it's not a big deal - either
way.

Really, my main issue remains the choice of sigil for a variable
that's supposed to hold baggy containers.

-- 
Jonathan Dataweaver Lang


Re: Bag / Set ideas - making them substitutable for Arrays makes them more useful

2010-11-13 Thread Darren Duncan

Jon Lang wrote:

That saves a single character over Bag( ... ) and Set( ... ),
respectively (or three characters, if you find decent unicode bracket
choices).  It still wouldn't be a big enough deal to me to bother with
it.

As well, my first impression upon seeing [! ... !] was to think
you're negating everything inside?  That said, I could get behind
doubled brackets:

[[1, 2, 3]] # same as Bag(1, 2, 3)
{{1, 2, 3}} # same as Set(1, 2, 3)

<snip>

I prefer to have the mnemonic that {} means unordered and that [] means ordered, 
so please stick to [] meaning arrays or ordered collections, and {} meaning 
unordered collections, so set and bag syntax should be based around {} if either.


This said, I specifically think that a simple pair of curly braces is the best 
way to mark a Set.


So:

  {1,2,3}  # a Set of those 3 elements

... and this is also how it is done in maths I believe (and in Muldis D).

In fact, I strongly support this assuming that all disambiguation eg with hashes 
can be specified.


  {a=>1,b=>2}  # a Hash of 2 pairs

  {:a<1>, :a<2>}  # we'll have to pick a meaning

  {}  # we'll have to pick a meaning (Muldis D makes it a Set; %:{} is its Hash)

  {;}  # an anonymous sub or something

  {a=>1}  # Hash

  {1}  # Set

  {1;}  # anonymous sub or something

But keep that simple and let nesting work normally, so:

  {{1}}  # a Set of 1 element that is a Set of 1 element

  {{a=>1}}  # a Set with 1 Hash element

  {[1]}  # a Set with 1 Array element

  [{1}]  # an Array with 1 Set element

In certain cases, we can always still fall back to this:

  Set()  # empty Set

  Hash()  # empty Hash

  Set(:a<1>)  # if that's what we wanted

As for bags, well I think that is where we could get fancier.

But *no* doubling up, as we don't want to interfere with nesting.

Instead, it is common in maths to associate a + with set syntax to refer to 
bags instead.


So, does Perl already ascribe a meaning to putting a + with various bracketing 
characters, such as this:


  +{1,2,2,5}  # a Bag of 4 elements with 2 duplicates

  +{}  # an empty Bag, unless that already means something

So would the above try to cast the collection as a number, or take the count of 
its elements, or can we use something like that?


But I would recommend something along those lines.

I suppose then if +{} works for bags we could alternately use -{} for sets but I 
don't really like it.


-- Darren Duncan


Re: Bag / Set ideas - making them substitutable for Arrays makes them more useful

2010-11-13 Thread The Sidhekin
On Sun, Nov 14, 2010 at 12:12 AM, Jon Lang datawea...@gmail.com wrote:

 Carl Mäsak wrote:
  Jonathan Lang ():
  That saves a single character over Bag( ... ) and Set( ... ),
  respectively (or three characters, if you find decent unicode bracket
  choices).  It still wouldn't be a big enough deal to me to bother with
  it.
 
  +1. Let's leave it at that.

 That said, I do think that Bag( ... ) should be able to take pairs, so
 that one can easily create a Bag that holds, say, twenty of a given
 item, without having to spell out the item twenty times.


  Doesn't the xx operator cover this?
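
A minimal sketch of both approaches in present-day Raku, assuming today's built-ins rather than the 2010 draft under discussion: xx repeats an item, and the .Bag coercer reads a Pair as item => count. The item names are illustrative.

```raku
# Repetition: xx produces twenty copies, which .Bag then counts.
my $repeated = ('apple' xx 20).Bag;

# Pairs: .Bag treats each Pair as item => count.
my $weighted = ('apple' => 20, 'pear' => 1).Bag;

say $repeated<apple>;   # 20
say $weighted<apple>;   # 20
say $weighted.total;    # 21
```

Either spelling avoids writing the item out twenty times; the Pair form reads more naturally when the counts come from data.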


Eirik


Re: Bag / Set ideas - making them substitutable for Arrays makes them more useful

2010-11-13 Thread Jon Lang
Darren Duncan wrote:
 Jon Lang wrote:

 That saves a single character over Bag( ... ) and Set( ... ),
 respectively (or three characters, if you find decent unicode bracket
 choices).  It still wouldn't be a big enough deal to me to bother with
 it.

 As well, my first impression upon seeing [! ... !] was to think
 you're negating everything inside?  That said, I could get behind
 doubled brackets:

    [[1, 2, 3]] # same as Bag(1, 2, 3)
    {{1, 2, 3}} # same as Set(1, 2, 3)

 snip

 I prefer to have the mnemonic that {} means unordered and that [] means
 ordered, so please stick to [] meaning arrays or ordered collections, and {}
 meaning unordered collections, so set and bag syntax should be based around
 {} if either.

 This said, I specifically think that a simple pair of curly braces is the
 best way to mark a Set.

 So:

  {1,2,3}  # a Set of those 3 elements

 ... and this is also how it is done in maths I believe (and in Muldis D).

 In fact, I strongly support this assuming that all disambiguation eg with
 hashes can be specified.

That would be great.

  {a=>1,b=>2}  # a Hash of 2 pairs

  {:a<1>, :a<2>}  # we'll have to pick a meaning

My preference would be for this to be a Set that contains two items in
it, both of which are pairs.  IIRC, there's already behavior along
these lines when it comes to pairs.

  {}  # we'll have to pick a meaning (Muldis D makes it a Set; %:{} is its
 Hash)

Is there any difference between an empty Set and an empty Hash?  If
so, is one more general than the other?  Just as importantly, what
does {} do right now?

  {;}  # an anonymous sub or something

  {a=>1}  # Hash

  {1}  # Set

  {1;}  # anonymous sub or something

Sets built from multi-dimensional arrays might be a problem:

{1, 2, 3: 4, 5, 6}

 But keep that simple and let nesting work normally, so:

  {{1}}  # a Set of 1 element that is a Set of 1 element

  {{a=>1}}  # a Set with 1 Hash element

  {[1]}  # a Set with 1 Array element

  [{1}]  # an Array with 1 Set element

 In certain cases, we can always still fall back to this:

  Set()  # empty Set

  Hash()  # empty Hash

  Set(:a<1>)  # if that's what we wanted

 As for bags, well I think that is where we could get fancier.

 But *no* doubling up, as we don't want to interfere with nesting.

 Instead, it is common in maths to associate a + with set syntax to refer
 to bags instead.

 So, does Perl already ascribe a meaning to putting a + with various
 bracketing characters, such as this:

  +{1,2,2,5}  # a Bag of 4 elements with 2 duplicates

  +{}  # an empty Bag, unless that already means something

 So would the above try to cast the collection as a number, or take the count
 of its elements, or can we use something like that?

I'd expect +{...} to count the elements.



-- 
Jonathan Dataweaver Lang


Re: Bag / Set ideas - making them substitutable for Arrays makes them more useful

2010-11-09 Thread TSa (Thomas Sandlaß)
On Tuesday, 9. November 2010 01:45:52 Mason Kramer wrote:
 I have to disagree here.  Arrays and Hashes may be about storage (I don't
 think they are, though, since you can change the (storage) implementation of
 an Array or Hash via its metaclass and it can still remain an Array or
 Hash).

What I mean by storage is that you put some data into a numbered slot
in an array or a keyed slot in a hash. With the same index or key you
can retrieve your data at any time. This is the case irrespective of the
underlying implementation. A set is not about storage in this sense, because
there is no way of retrieving an element. The only operation is a membership
test which is of boolean nature like number comparison.
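
A minimal sketch of that distinction in present-day Raku, assuming today's (elem) membership operator, which postdates this thread; the variable names are illustrative:

```raku
my @array = <a b c>;
my %hash  = a => 1, b => 2;
my $set   = set <a b c>;

# Arrays and hashes hand back what was stored under an index or key.
say @array[1];          # b
say %hash<b>;           # 2

# A set only answers whether a value is a member.
say 'b' (elem) $set;    # True
say 'z' (elem) $set;    # False
```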


 The most important part of the @ sigil, and the reason I preferred it over
 $, is that @ flattens (moritz++'s word), when used in a list context such
 as for @blah,
 map {...}, @blah.

I wonder if it is not possible to bind flattening to Iterable. This of course
has the drawback that it is not syntactically distinguished. But doesn't

   my $x = (1,2,3);
   my $y = map {$^x * $^x}, $x;

result in $y containing the list (1,4,9)? And if $x happens to be a scalar
isn't it just squared? In the end we just need map:( closure, Set $data -->
Set) as an overload. Or perhaps map:( closure, Iterable ::T $data --> T).


Regards, TSa.
-- 
The unavoidable price of reliability is simplicity -- C.A.R. Hoare
Simplicity does not precede complexity, but follows it. -- A.J. Perlis
1 + 2 + 3 + 4 + ... = -1/12  -- Srinivasa Ramanujan


Re: Bag / Set ideas - making them substitutable for Arrays makes them more useful

2010-11-09 Thread Moritz Lenz


On 11/09/2010 09:26 PM, TSa (Thomas Sandlaß) wrote:
 But doesn't
 
my $x = (1,2,3);
my $y = map {$^x * $^x}, $x;
 
 result in $y containing the list (1,4,9)? 

Not at all. The $ sigil implies a scalar, so what you get is roughly

my $y = (1, 2, 3).item * (1, 2, 3).item;

so $y ends up with a single list item of 9.
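
The difference can be sketched in present-day Raku, assuming current Rakudo semantics (the 2010 implementation may have differed in detail). The placeholder is renamed to $^n here only to avoid confusion with $x, and the result variable names are illustrative.

```raku
my $x = (1, 2, 3);    # a List held in a scalar counts as one item

# map sees a single item; the List numifies to its length, 3.
my @item-result = map { $^n * $^n }, $x;

# .list exposes the elements, so map visits each one.
my @list-result = map { $^n * $^n }, $x.list;

say @item-result;   # [9]
say @list-result;   # [1 4 9]
```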

Cheers,
Moritz


Re: Bag / Set ideas - making them substitutable for Arrays makes them more useful

2010-11-08 Thread Solomon Foster
On Sun, Nov 7, 2010 at 11:19 PM, Jon Lang datawea...@gmail.com wrote:
 Mason Kramer wrote:
 I'd like to anticipate one objection to this - the existence of the 'hyper' 
 operator/keyword.  The hyper operator says, I am taking responsibility for 
 this particular code block and promising that it can execute out of order 
 and concurrently.  Creating a Bag instead of an Array says, there is no 
 meaning to the ordering of this group of things, ever.  Basically, if I 
 know at declaration time that my collection has no sense of ordering, then I 
 shouldn't have to annotate every iteration of that collection as having no 
 sense of ordering, which is nearly what hyper does (though, I readily admit, 
 not quite, because there are unordered ways to create race conditions).

 My understanding of the hyperoperator is that its primary use is to
 say operate on the individual elements of this collection, instead of
 on the collection itself.  In that regard, it's just as applicable to
 Bags and Sets as it is to lists.  Except...

 Except that the hyperoperator assumes that the collections are
 ordered.  It matches the first element on the left with the first
 element on the right; the second element on the left with the second
 on the right; and so on.  Bags and Sets don't have a useful notion of
 first, second, etc.  So what should happen if I try to apply a
 hyperoperator with a Bag or Set on one side?

Well, hyperoperators work fine on Hashes, they operate on the values,
paired up by key if needed.  (That is, %hash>>++ doesn't care about
the keys, %hash1 >>+<< %hash2 sums based on keys.)  I would assume
that Bag should work in the exact same way.  Dunno how Set should work
in this context, though.
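
A minimal sketch of the hash case in present-day Raku, where the hyper form is written with >> and << around the operator. The hash names and contents are illustrative, and both hashes deliberately share the same keys so that the handling of missing keys does not come into play.

```raku
my %stock = apples => 3, pears => 1;
my %order = apples => 2, pears => 4;

# The hyper operator pairs the values up by key, not by position.
my %total = %stock >>+<< %order;

say %total<apples>;   # 5
say %total<pears>;    # 5
```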

-- 
Solomon Foster: colo...@gmail.com
HarmonyWare, Inc: http://www.harmonyware.com


Re: Bag / Set ideas - making them substitutable for Arrays makes them more useful

2010-11-08 Thread Jon Lang
Solomon Foster wrote:
 Well, hyperoperators work fine on Hashes, they operate on the values,
 paired up by key if needed.  (That is, %hash++ doesn't care about
 the keys, %hash1 + %hash2 sums based on keys.)  I would assume
 that Bag should work in the exact same way.  Dunno how Set should work
 in this context, though.

I would hope that Bags would not work the same way.  If they do, then
you get things like:

Bag(1, 3, 2, 1) + Bag(2, 3, 1, 2) # same as Bag(1, 1, 1, 2, 2, 2, 3, 3)

I'm not sure how (or even if) Bags _should_ work in this context; but
the above is definitely not what I'd expect.
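
For comparison, the result described in the comment above (counts added per element) is what present-day Raku spells with a dedicated bag-addition operator, (+), rather than with a hyperoperator. This is a sketch against today's language, not the 2010 draft; $sum is an illustrative name.

```raku
my $sum = bag(1, 3, 2, 1) (+) bag(2, 3, 1, 2);

# Counts are added per element: three 1s, three 2s, two 3s.
say $sum{1};      # 3
say $sum{2};      # 3
say $sum{3};      # 2
say $sum.total;   # 8
```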

IMHO, a key point about Bags and Sets (no pun intended) is that the
values of the elements _are_ the keys; the existence of separate
values (unsigned integers in the case of Bags; booleans in the case of
Sets) are - or should be - mostly a bookkeeping tool that rarely shows
itself.

Incidentally, we might want to set up a role to define the shared
behavior of Bags, Sets, et al.  My gut instinct would be to call it
Baggy; Setty would make the stargazers happy, but otherwise
wouldn't mean much.  With this, you could do things like creating a
FuzzySet that stores a number between zero and one for each key, but
which otherwise behaves like a Set.

-- 
Jonathan Dataweaver Lang


Re: Bag / Set ideas - making them substitutable for Arrays makes them more useful

2010-11-08 Thread Mason Kramer
I'm honored that my letter generated so much activity, and thank you all for 
your thoughtful responses.  I'd like to address a few points.

 On Monday, 8. November 2010 17:20:43 Jon Lang wrote:
 Solomon Foster wrote:
 Well, hyperoperators work fine on Hashes, they operate on the values,
 paired up by key if needed.  (That is, %hash++ doesn't care about
 the keys, %hash1 + %hash2 sums based on keys.)  I would assume
 that Bag should work in the exact same way.  Dunno how Set should work
 in this context, though.
 
 I would hope that Bags would not work the same way.  If they do, then
 you get things like:
 
     Bag(1, 3, 2, 1) + Bag(2, 3, 1, 2) # same as Bag(1, 1, 1, 2, 2, 2, 3, 3)


With respect to Bags and » and «, the spec has something to say (somewhere in 
S03):  

in fact, an upgraded scalar is the only thing that will work for an unordered 
type such as a Bag:
 Bag(3,8,2,9,3,8) - 1;  # Bag(2,7,1,8,2,7) === Bag(1,2,2,7,7,8)


This makes sense to me.  I don't see how it could be otherwise.  This code 
snippet also makes it clear that » and « operate on the keys of a Bag, and not 
the counts or pairs of a Bag.  This also makes sense to me, since Bags ought to 
act much more like their keys than either their values or an EnumMap of k,v.  
Please note that my original post did not address » and «, but rather the 
hyper keyword / adverb, as in hyper for { ... }.


On Nov 8, 2010, at  04:25 PM, TSa (Thomas Sandlaß) wrote:
 snip
 I'm generally very happy with the choice of sigil for Sets and Bags
 because this is what they are: scalars as far as storage is concerned.
 More important is to have the right set of operators that automatically
 imply Bags:  (1,2,3,4) () (2,3) === Bag(1,2,2,3,3,4).
 
 Arrays and Hashes are about storage. In the abstract the memory of a computer
 is one big array! Sets and Bags are about operations on them like the numeric
 operations are on numbers or the string operators on strings. So it is very
 important to keep the domains nicely separated by means of disjoint operators.
 This is why we have ~ for concatenation and not overloaded +.
 
 It makes of course sense to iterate a Bag. But indexing it doesn't. We are
 also not indexing into strings: blah[2] is not 'a'.
 snip

I have to disagree here.  Arrays and Hashes may be about storage (I don't think 
they are, though, since you can change the (storage) implementation of an Array 
or Hash via its metaclass and it can still remain an Array or Hash).  But 
sigils are definitely not about the storage of the underlying data.  Your own 
statement gives the contradiction - In actual storage in the memory of a 
computer, everything is somewhere in a big array.  But yet, we don't prefix 
everything with an @ sigil.  So clearly, sigils are about something else.  
jnthn said today, in irc, that sigils are about an interface contract.  
Everyone seems to agree that they imply the Positional role (i.e., the 
postcircumfix:<[]> method), and that Rakudo heavily relies on this conflation, 
so I'm withdrawing the suggestion that @ means does Iterable instead of does 
Positional.

The most important part of the @ sigil, and the reason I preferred it over $, 
is that @ flattens (moritz++'s word), when used in a list context such as 
for @blah,
map {...}, @blah.

Having Bags flatten in list context is pretty crucial to their being as easy 
and terse to use as arrays, because flattening is fundamental to how Arrays 
are used, and Bags will be used like Arrays.  Luckily, however, %, which 
implies the Associative contract, also flattens in list context.  If Bags and 
Sets are sigiled with %, they should flatten, and all that remains is to make 
sure they flatten into a list of keys, and not a list of enums.  

Any thoughts on that?

Re: Bag / Set ideas - making them substitutable for Arrays makes them more useful

2010-11-08 Thread Jon Lang
This is going to be a rambling answer, as I have a number of questions
but no firm conclusions.  Please bear with me.

Mason Kramer wrote:
 Having Bags flatten in list context is pretty crucial to their being as
 easy and terse to use as arrays, because flattening is fundamental to
 how Arrays are used, and Bags will be used like Arrays.  Luckily,
 however, %, which implies the Associative contract, also flattens in list
 context.  If Bags and Sets are sigiled with %, they should flatten, and
 all that remains is to make sure they flatten into a list of keys, and
 not a list of enums.

The only qualm that I have with using % as a prefix for baggy things
is that % carries the connotation that you're dealing with key/value
pairs.  While baggy things _can_ be thought of as pairs, they're
value/membership pairs rather than key/value pairs; and the membership
side of the pair should be hidden from view unless explicitly
requested.  In short, a casual programmer ought to be encouraged to
think of a baggy thing as being a collection of values; the % sigil
implicitly encourages him to think of it as a collection of pairs.

That said, the problem with % is that baggies implement its features
in an unusual way; the problem with @ is that baggies don't implement
all of its features.  Conceptually, @ (a collection of values) is a
better fit  than % (a collection of pairs); but technically, the
reverse is true: % (does Associative) is a better fit than @ (does
Positional).

Query: should %x.k, %x.v, %x.kv, and %x.pair produce lists, bags, or
sets?  As I understand it, right now all four produce lists.  I could
see a case for having %x.k and %x.pair produce sets, while %x.kv
definitely should produce a list (since even though the overall order
doesn't matter, which even element follows which odd element _does_
matter); and %x.v might reasonably produce a bag.  OTOH, if this is
done then there will be no way to talk about, e.g., %x.k[0].

I'm wondering if bags and sets _should_ do Positional, but with the
caveat that the order is arbitrary.  After all, that's what currently
happens with %x.k: you get a list of the keys, but with the
understanding that the order in which you get them is ultimately
meaningless.  Or is it that the difference between Iterable and
Positional is that Positional provides random access to its
membership, whereas Iterable only guarantees that you can walk through
the members?

Another way to phrase my concern is this: one reason why Perl 6 has
gone with nominal typing rather than structural typing is that "does
x" can and does promise more than just "implements the same features
that x implements"; it also promises something about the way in which
said features will be implemented.  In this regard, I would argue that
baggies should not do Associative; because even though they implement
all of the same features that Associative promises to implement, they
don't do so in a way that's compatible with Associative's underlying
philosophy of keys and values.  And if they don't do Associative, it
doesn't make sense for them to use the % sigil.

I hesitate to suggest this; but might we want a special sigil for
Iterable, to be used when neither Positional nor Associative is quite
right?  Such a sigil might be useful for more than just baggies; for
instance, a stream is Iterable but neither Positional nor Associative.

--
Jonathan Dataweaver Lang


Bag / Set ideas - making them substitutable for Arrays makes them more useful

2010-11-07 Thread Mason Kramer
I just implemented Bag to the point where it passes the spectests.  
(https://github.com/masonk/rakudo/commit/2668178c6ba90863538ea74cfdd287684a20c520)
  However, in doing so, I discovered that I'm not really sure what Bags are 
for, anymore.

The more I think about Bags and Sets, the more my brain hurts.  They're a half 
an EnumMap and half an Iterable that does Associative but not Positional.  
However, I'm starting to believe that they are more like Iterables than 
EnumMaps.  When I imagine using them, I think of Sets as a cute way to operate 
on the unique elements of an Iterable.  I think of Bags / KeyBags as a way to 
remove ordering, which is a generally useful thing (everything that I'm about 
to say applies to both Bags and KeyBags, but I'm going to only talk about Bags 
for the rest of this post).  This is because, most of the time, we don't care 
about ordering, and having ordering on all of our collections even when we 
don't need it increases program complexity in time in a way that could be seen 
as analogous to the way in which unnecessarily global variables increased the 
space complexity of Perl 5.

I want to propose one major change to the Bag spec: When a Bag is used as an 
Iterable, you get an Iterator that has each key in proportion to the number of 
times it appears in the Bag.
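
For reference, present-day Raku offers this kind of iteration through a separate method, .kxxv, rather than as the default. The sketch below assumes today's Bag API, which postdates this post, and uses illustrative names.

```raku
my $bag = bag <a a b c>;

# .kxxv yields each key repeated by its count; the order is unspecified.
my @items = $bag.kxxv;

say @items.elems;            # 4
say @items.sort.join(' ');   # a a b c

# .keys yields each distinct key once.
say $bag.keys.elems;         # 3
```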

With this one change to Bags, I could use them whenever I don't need ordering 
in my lists - which is usually.  Even though there are some side effects that 
don't rely on ordering (e.g., incrementation), the majority of them do - so by 
using this new kind of Bag, I would be reducing the complexity of my programs.  
Now, since Sets already give us the distinct values, having Bags do the same 
thing seems like redundant functionality, where we could be getting novel 
functionality.  

I'd like to anticipate one objection to this - the existence of the 'hyper' 
operator/keyword.  The hyper operator says, I am taking responsibility for 
this particular code block and promising that it can execute out of order and 
concurrently.  Creating a Bag instead of an Array says, there is no meaning 
to the ordering of this group of things, ever.  Basically, if I know at 
declaration time that my collection has no sense of ordering, then I shouldn't 
have to annotate every iteration of that collection as having no sense of 
ordering, which is nearly what hyper does (though, I readily admit, not quite, 
because there are unordered ways to create race conditions).

I also have some convenience syntax suggestions.  I do think this is important 
because Bags and Sets are competing with Arrays.  If they aren't as convenient 
as Arrays to use, they won't get used - even though they're closer, 
semantically, to what the developer wants in a lot of cases.   First, we should 
besigil Bags and Sets with @ instead of $.  Without this convenience, I'm not 
likely to replace my Arrays with Bags, because going through them in a loop or 
map would be a pain compared to Arrays.  If I have to say $bag.keys every 
single time, forgettaboutit.  

This, however, probably requires a change to S03, which says that the @ sigil 
is a means of coercing the object to the Positional (or Iterable?) role.  It 
seems to me, based on the guiding principle that perl6 should support 
functional idioms and side-effect free computing, the more fundamental and 
important aspect of things with @ in front is that you can go through them one 
by one, and not that they're ordered (since ordering is irrelevant in 
functional computing, but iterating is not).  My feeling is that we should 
reserve the special syntax for the more fundamental of the two operations, so 
as not to bias the programmer towards rigid sequentiality through syntax.

Second, I would be even more likely to replace my ordered lists with Bags if 
there were a convenient operator for constructing Bags.  I can't think of any 
good non-letter symbols that aren't taken right now (suggestions welcome), but, 
at least, b and s as aliases to bag and set would be convenient.

Bags and Sets thus updated would look like this in use:
my @array = <a a b c>;
my @set = s@array;
for s@array { say $_ };
for @set { say $_ };# same thing
# b«␤»a«␤»c«␤»
# ordering undefined
# most common use case for sets, I think, is unique elements of @array, isn't
# it?

hyper for @bag { ... };
# a«␤»b«␤»c«␤» a«␤»
# ordering undefined = less-thinking-required hyper

b a b c c  === b c c b a 
# Wouldn't this be the best way to make a comparison with these semantics?
# By the way, this useful idiom works as currently specced, but doesn't work in
# my implementation

@bag{a}
# 2

@bag{a b z}
# 2, 1, 0

[+] bag @array{a b z}
# 3
# this is also neat for "How many a's, b's, and z's do I have?"

+@bag
# 4

@bag[2]
# I can't think of a meaning for this - not Positional - S03 needs a change?

@bag.WHAT
# Bag()

@bag.pairs
# a => 2, b => 1, c => 1
# ordering undefined

@bag.values
# 2, 1, 1
# ordering undefined
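
The lookups and counts listed above can be modelled roughly as follows. This is a Python sketch, not Perl 6, and it assumes that a Counter behaves closely enough to the Bag described here; all variable names are invented for the illustration.

```python
from collections import Counter

array = ["a", "a", "b", "c"]
bag = Counter(array)     # stand-in for a Bag built from the array
unique = set(array)      # stand-in for a Set built from the array

count_a = bag["a"]                           # count for one key
counts = [bag[k] for k in ("a", "b", "z")]   # several keys; a missing key counts as 0
total_abz = sum(counts)                      # how many a's, b's and z's in total
size = sum(bag.values())                     # total number of elements in the bag
pairs = dict(bag)                            # key/count pairs
values = sorted(bag.values())                # the counts alone (sorted for stable output)

# Two bags holding the same elements in a different order compare equal.
same = Counter("babcc") == Counter("bccba")

print(count_a, counts, total_abz, size, pairs, values, same)
```

Note that the total element count (4) differs from the number of distinct keys (3), which is the distinction between treating the collection as a Bag and treating it as a Set.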

Junctions:


Re: Bag / Set ideas - making them substitutable for Arrays makes them more useful

2010-11-07 Thread Darren Duncan

Mason Kramer wrote:
snip

I want to propose one major change to the Bag spec: When a Bag is used as an 
Iterable, you get an Iterator that has each key in proportion to the number of 
times it appears in the Bag.

snip

You present some interesting thoughts here.  But I don't have enough time to 
think about any implications to the point of agreeing or disagreeing with that 
change, other than to say that the proposal seems reasonable at first glance.


However, if the above proposal is done, I would still want an easy way to get 
the value-count pairs from a bag if I wanted them.


I do agree though with the principle that sets and bags should be just as easy 
and terse to use as arrays.


-- Darren Duncan


Re: Bag / Set ideas - making them substitutable for Arrays makes them more useful

2010-11-07 Thread Jon Lang
Mason Kramer wrote:
 I'd like to anticipate one objection to this - the existence of the 'hyper' 
 operator/keyword.  The hyper operator says, I am taking responsibility for 
 this particular code block and promising that it can execute out of order and 
 concurrently.  Creating a Bag instead of an Array says, there is no meaning 
 to the ordering of this group of things, ever.  Basically, if I know at 
 declaration time that my collection has no sense of ordering, then I 
 shouldn't have to annotate every iteration of that collection as having no 
 sense of ordering, which is nearly what hyper does (though, I readily admit, 
 not quite, because there are unordered ways to create race conditions).

My understanding of the hyperoperator is that its primary use is to
say "operate on the individual elements of this collection, instead of
on the collection itself".  In that regard, it's just as applicable to
Bags and Sets as it is to lists.  Except...

Except that the hyperoperator assumes that the collections are
ordered.  It matches the first element on the left with the first
element on the right; the second element on the left with the second
on the right; and so on.  Bags and Sets don't have a useful notion of
first, second, etc.  So what should happen if I try to apply a
hyperoperator with a Bag or Set on one side?
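
A small Python model may help show why the question is awkward. It assumes that a hyperoperator on two lists pairs elements strictly by position; hyper_add is a made-up helper, and rejecting operands of different lengths is a simplification for this sketch, not a claim about what the spec does.

```python
from itertools import permutations


def hyper_add(left, right):
    """Pair elements by position and add them, roughly as a hyperoperator would on lists."""
    if len(left) != len(right):
        raise ValueError("operands must have the same length")
    return [l + r for l, r in zip(left, right)]


# With ordered operands the pairing is well defined.
ordered = hyper_add([1, 2, 3], [10, 20, 30])

# Two equal sets carry no positional information at all.
sets_equal = {1, 2, 3} == {3, 2, 1}

# Without positions, any pairing of the right-hand elements is as valid as any other.
# Even if each result is itself treated as unordered (sorted here), the outcomes differ.
outcomes = {
    tuple(sorted(hyper_add([1, 2, 3], list(p))))
    for p in permutations([10, 20, 30])
}

print(ordered)
print(sets_equal)
print(len(outcomes))
```

So with an unordered operand on either side, the result depends on a pairing that the data itself does not determine.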

The cross operators should also be looked at in this regard, though I
anticipate fewer problems there.

 This, however, probably requires a change to S03, which says that the @ sigil 
 is a means of coercing the object to the Positional (or Iterable?) role.  
 It seems to me, based on the guiding principle that perl6 should support 
 functional idioms and side-effect free computing, the more fundamental and 
 important aspect of things with @ in front is that you can go through them 
 one by one, and not that they're ordered (since ordering is irrelevant in 
 functional computing, but iterating is not).  My feeling is that we should 
 reserve the special syntax for the more fundamental of the two operations, so 
 as not to bias the programmer towards rigid sequentiality through syntax.

I tend to agree here - though to be clear, my @x should still
normally result in a list, sans further embellishments (e.g., my Bag
@x).

 Second, I would be even more likely to replace my ordered lists with Bags if 
 there were a convenient operator for constructing Bags.  I can't think of any 
 good non-letter symbols that aren't taken right now (suggestions welcome), 
 but, at  least, b and s as aliases to bag and set would be convenient.

Such a character ought to be some sort of punctuation, preferably of a
type that's similar to the comma and semicolon.  For a Bag, you might
consider an emdash (—), with the ASCII equivalent being infix:<-->.
So:

1 -- 2 -- 3

Would be a Bag containing three elements: 1, 2, and 3.

Personally, I wouldn't put a high priority on this; for my purposes,

   Bag(1, 2, 3)

works just fine.

-- 
Jonathan Dataweaver Lang


Re: Bag / Set ideas - making them substitutable for Arrays makes them more useful

2010-11-07 Thread Moritz Lenz
On 11/08/2010 01:51 AM, Darren Duncan wrote:
 Mason Kramer wrote:
 snip
 I want to propose one major change to the Bag spec: When a Bag is used as an 
 Iterable, you get an Iterator that has each key in proportion to the number 
 of times it appears in the Bag.
 snip
 
 You present some interesting thoughts here.  But I don't have enough time to 
 think about any implications to the point of agreeing or disagreeing with
 that change, other than to say that the proposal seems reasonable at first
 glance.
 
 However, if the above proposal is done, I would still want an easy way to get 
 the value-count pairs from a bag if I wanted them.

There'd still be .kv and .pairs.

Cheers,
Moritz