Jon Lang wrote:
Darren Duncan wrote:
This said, I specifically think that a simple pair of curly braces is the
best way to mark a Set.
{1,2,3} # a Set of those 3 elements
... and this is also how it is done in maths I believe (and in Muldis D).
In fact, I strongly support this, assuming that all disambiguation, e.g. with
hashes, can be specified.
That would be great.
Glad you agree.
<snip>
Sets built from multi-dimensional arrays might be a problem:
{1, 2, 3: 4, 5, 6}
Does that even work? I thought the colon, or is it a semicolon, only had that
meaning in a delimited list like () or [].
In any event, I don't believe there is such a thing as a multi-dimensional set
in that way. Unless you have a concept of multi-dimensional Hash keys, and then
there might be an analogy.
<snip>
As for bags, well I think that is where we could get fancier.
But *no* doubling up, as we don't want to interfere with nesting.
Instead, it is common in maths to associate a "+" with set syntax to refer
to bags instead.
So, does Perl already ascribe a meaning to putting a + with various
bracketing characters, such as this:
+{1,2,2,5} # a Bag of 4 elements with 2 duplicates
+{} # an empty Bag, unless that already means something
So would the above try to cast the collection as a number, or take the count
of its elements, or can we use something like that?
I'd expect +{...} to count the elements.
Something else I just thought of, and my main reason for writing this reply, is
other options.
Firstly, and I don't necessarily like this option, maybe we could use the simple
curly-brace pair to mean something more general that can be treated as either
a Set or a Bag depending on context. At least from my brief look around, it
appears that maths use the same {foo, bar, baz} syntax to denote both sets and
bags. In some ways it would be like how Perl has the generic "(foo, bar, baz)"
syntax, which remembers order but isn't an Array. We certainly can't use the
presence of duplicates in the {...} to pick Set vs Bag because there could
legitimately be duplicates or no duplicates in the literals for both,
especially if any of the list items are variables and we won't know until
runtime whether any duplicate each other or not.
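To make that runtime point concrete, here is a rough sketch in Python, chosen only because it has ready-made set and multiset types; none of it is proposed Perl 6 syntax. The helper names as_set and as_bag are made up for illustration, and collections.Counter stands in for a Bag.

```python
from collections import Counter

def as_set(items):
    """Set reading of a list of elements: duplicates collapse."""
    return set(items)

def as_bag(items):
    """Bag reading of the same list: duplicates are counted."""
    return Counter(items)

# x and y stand for variables whose values are only known at runtime,
# so the literal itself cannot tell us whether they duplicate each other.
x, y = 3, 3
items = [1, 2, x, y, 4]

s = as_set(items)
b = as_bag(items)

print(len(s))           # number of distinct elements
print(sum(b.values()))  # number of elements counted with multiplicity
```

The same element list is legitimate input for both readings, which is why the choice has to come from the syntax or the context rather than from the contents.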
I still think the better option is to have slightly different looking syntax for
the two. I still prefer Set being the plain brace pair and a Bag being that
plus something extra. It seems that a leading + or ~ or ? is out because those
have established meanings as treating what they're next to in num/str/bool
context, so it would have to be something else. But it really should be a leading symbolic.
The differentiator needs to be leading, not trailing; end-weight is bad.
I think that having the marker character /inside/ the curly braces actually
gives us more choices and would cut down on syntactic conflicts, because then we
can basically pick anything that isn't a symbolic prefix unary.
Barring a better suggestion, I suggest the greater-than symbol.
So:
{1,2,3,3,4} # 4-element Set
{>1,2,3,3,4} # 5-element Bag
I think that looks different than anything else we have, and the greater-than
could be a mnemonic that there is "more" in here.
Moreover, the different appearance means we could use => to give an element's
multiplicity, i.e. how many times it occurs in the Bag, "{>1,2,3=>2,4}", without
there being any confusion with a Hash.
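As a sketch of what I have in mind for the counts, again in Python rather than any real Perl 6 syntax: assume a plain element contributes 1 and "element => n" contributes n. The make_bag helper is hypothetical, a (element, count) tuple stands in for the "=>" pair, and collections.Counter stands in for a Bag.

```python
from collections import Counter

def make_bag(*entries):
    """Build a bag from plain elements and (element, count) pairs.

    Assumption for this sketch: a tuple always means "element => count",
    so tuples cannot themselves be plain elements here.
    """
    bag = Counter()
    for entry in entries:
        if isinstance(entry, tuple):
            element, count = entry
        else:
            element, count = entry, 1
        bag[element] += count
    return bag

# Hypothetical "{>1,2,3=>2,4}"
with_counts = make_bag(1, 2, (3, 2), 4)

# Hypothetical "{>1,2,3,3,4}"
spelled_out = make_bag(1, 2, 3, 3, 4)

print(with_counts == spelled_out)
```

Under that reading the two literals denote the same 5-element Bag, and the pair form is just a shorthand for repeating an element.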
That said, I like the "+" most when differentiating a Bag from a Set, but we
have that symbolic unary "+" which could interfere with it.
-- Darren Duncan