Jon Lang wrote:
Darren Duncan wrote:
This said, I specifically think that a simple pair of curly braces is the
best way to mark a Set.

 {1,2,3}  # a Set of those 3 elements

... and this is also how it is done in maths I believe (and in Muldis D).

In fact, I strongly support this, assuming that all disambiguation, e.g. with
hashes, can be specified.

That would be great.

Glad you agree.

<snip>

Sets built from multi-dimensional arrays might be a problem:

    {1, 2, 3: 4, 5, 6}

Does that even work? I thought the colon, or is it a semicolon, only had that meaning in a delimited list like () or [].

In any event, I don't believe there is such a thing as a multi-dimensional set in that way. Unless you have a concept of multi-dimensional Hash keys, and then there might be an analogy.

<snip>
As for bags, well I think that is where we could get fancier.

But *no* doubling up, as we don't want to interfere with nesting.

Instead, it is common in maths to associate a "+" with set syntax to refer
to bags instead.

So, does Perl already ascribe a meaning to putting a + with various
bracketing characters, such as this:

 +{1,2,2,5}  # a Bag of 4 elements with 2 duplicates

 +{}  # an empty Bag, unless that already means something

So would the above try to cast the collection as a number, or take the count
of its elements, or can we use something like that?

I'd expect +{...} to count the elements.

Something else I just thought of, and my main reason for writing this reply, is other options.

Firstly, and I don't necessarily like this option, maybe we could use the simple curly-brace pair to mean something more general that can be treated as either a Set or a Bag depending on context. At least from my brief look around, it appears that maths uses the same {foo, bar, baz} syntax to denote both sets and bags. In some ways it would be like how Perl has the generic "(foo, bar, baz)" syntax, which remembers order but isn't an Array. We certainly can't use the presence of duplicates in the {...} to pick Set vs Bag, because the literals for both could legitimately contain duplicates or not, especially if any of the list items are variables, in which case we won't know until runtime whether any of them duplicate each other.
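
To illustrate the runtime point, here is a rough sketch in Python rather than Perl 6. Python's built-in set and collections.Counter stand in for Set and Bag, and build_both is a hypothetical helper of my own, so treat it as an analogy only:

```python
from collections import Counter

def build_both(items):
    """Interpret the same list of items once as a set and once as a bag."""
    return set(items), Counter(items)

# The literal's source text contains no visible duplicates; whether x and y
# duplicate each other is only known once they have values.
x, y = 3, 3
as_set, as_bag = build_both([1, 2, x, y])

set_size = len(as_set)             # duplicates collapse in a set
bag_size = sum(as_bag.values())    # a bag keeps every occurrence
```

The same item list is valid input for both collections, so the parser has nothing in the list itself to tell it which one was meant.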

I still think the better option is to have slightly different looking syntax for the two. I still prefer Set being the plain brace pair and a Bag being that plus something extra. It seems that a leading + or ~ or ? is out, because those have established meanings as treating what they're next to in num/str/bool context, so it needs to be something else. But it really should be a leading symbolic.

The differentiator needs to be leading, not trailing; end-weight is bad.

I think that having the marker character /inside/ the curly braces actually gives us more choices and would cut down on syntactic conflicts, because then we can basically pick anything that isn't a symbolic prefix unary.

Barring a better suggestion, I suggest the greater-than symbol.

So:

  {1,2,3,3,4}  # 4-element Set

  {>1,2,3,3,4}  # 5-element Bag

I think that looks different than anything else we have, and the greater-than could be a mnemonic that there is "more" in here.

Moreover, the different appearance means we could use => to give the number of times an element occurs, "{>1,2,3=>2,4}", without any confusion with a Hash.
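
As a sanity check on the element counts, here is a small Python sketch, again using set and collections.Counter as stand-ins for Set and Bag. It assumes that "3=>2" would mean "the element 3 occurs twice", and the variable names are just for illustration:

```python
from collections import Counter

literal = [1, 2, 3, 3, 4]

as_set = set(literal)      # analogue of {1,2,3,3,4}
as_bag = Counter(literal)  # analogue of {>1,2,3,3,4}

# Analogue of {>1,2,3=>2,4}: the count for 3 is spelled out explicitly.
paired_bag = Counter({1: 1, 2: 1, 3: 2, 4: 1})

set_size = len(as_set)            # distinct elements only
bag_size = sum(as_bag.values())   # every occurrence counted
same_bag = (paired_bag == as_bag)
```

Under that reading, the pair form and the repeated-element form describe the same Bag.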

That said, I like the "+" most when differentiating a Bag from a Set, but we have that symbolic unary "+" which could interfere with it.

-- Darren Duncan
