My 2c:

1. Simplicity is partially about having orthogonal primitives. A duplicate-removing collection factory cannot be sensibly used to implement a throw-on-duplicates collection factory, nor vice versa, so both seem equally primitive to me.
2. Given that both flavors are useful, they should both be provided, with explicit docs distinguishing them. This could involve new variants of the sorted collections for completeness.

3. People may differ about which flavor of constructor the literals should use, but I don't see any arguments here warranting a breaking change.

In short: can't this be fixed by fixing the docstrings?

Stu

P.S. I pre-disagree with Mark's recommendation that appeared as I was writing this. :-)

> On Sun, Aug 5, 2012 at 7:33 AM, Chas Emerick <c...@cemerick.com> wrote:
>> Note that the .createWithCheck variations of all of the collections in question are used by their "constructor" functions as well, e.g. hash-set, hash-map, and array-map:
>
> I hadn't noticed that, but I think that is good evidence that this createWithCheck concept has been taken to an unintended extreme. This is no longer about "literal" versions of maps/sets and "constructed" versions of maps/sets, so all the initial comments about how literal vectors/maps/sets should only have constants are no longer relevant here (and the decision to make the constructors call the same builder as the literal syntax is actually further evidence in support of my claim that literals and constructed versions aren't intended to be hugely different in their semantics). Instead, this is about how maps and sets are intended to be used, overall.
>
> As I recall, the way constructed maps used to work was that when there were duplicate keys, the right-most version took precedence. To the best of my knowledge, no one ever complained about this. It was intuitive, and consistent with the behavior of into. The only complaint was that very small maps, built with the literal syntax, didn't do the intuitive thing and behave like larger maps, because behind the scenes they used array maps, which didn't check for duplicate keys.
> If you said something like {:a 1 :a 2}, Clojure would happily go ahead and build something that wouldn't behave the way you'd expect (e.g., it would return the wrong count, and possibly the first key-value pair would take precedence over the last). But the solution isn't to go ahead and alter the semantics of all forms of maps and sets! All people really wanted was to bring ArrayMaps into accordance with the other forms of maps so that the behavior would be more predictable.
>
> Furthermore, there's a *long* history of sets being used to reduce a collection with duplicates into something that has no duplicates. You're creating a huge stumbling block for people if you create some arbitrary rule that with one particular set of syntaxes (i.e., (set [1 2 1]) or (sorted-set 1 2 1)), sets do what you expect and reduce collections with duplicates to one that has no duplicates, but that with another set of syntaxes (i.e., (hash-set 1 2 1) or #{1 2 1}), it just breaks. That's seriously confusing, and counter to the spirit of what sets are supposed to do, in my opinion.
>
> It sounds to me like when the dev team thought through the issue of how to fix the problem with array maps, they were thinking specifically about the cases where maps and sets are composed entirely of constants, and thought it would be a convenient place to try to catch human errors of the form {:apple 1, :apple 2}, where someone unintentionally duplicated a constant key. If, somehow, the checking could be limited to constants, I'd be perfectly happy with the error check. But I suspect there is no easy way to make Clojure behave this way.
>
> Therefore, it's important to be realistic and acknowledge that sets and maps are created in a variety of ways, both with constants and variables, and we want consistent semantics across these constructions.
> Run-time errors for duplicate keys in certain kinds of constructions of maps and sets strike me as running counter to Clojure's mantra of simplicity.
>
> You talk about whether it's fair to expect Clojure to be a mind-reader. Of course not. And there are all sorts of things that Clojure can't conceivably check for me when I'm typing in data literals. For example, if I type {:apple 1, :banano 2}, it's not going to warn me that I spelled banana wrong, nor should it (if I wanted that kind of protection, I'd use a statically typed language where each type of data was hardcoded to allow only certain fields). The reality is that any sort of constant data entered into Clojure has to be carefully double-checked for typos, because most things are not checkable or mind-readable by Clojure. Going to such error-checking extremes and creating complex semantics in order to protect the user from errors like {:apple 1 :apple 2} is unwarranted.
>
> If my code doesn't work properly because I typed {:apple 1 :apple 2}, I blame only myself. If my code breaks because somewhere in my code I legitimately built a dynamic set that had duplicate entries and I expected the duplicates to be removed rather than an error thrown, I blame Clojure.

-- 
You received this message because you are subscribed to the Google Groups "Clojure" group.
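[Editor's note: to make the distinction being debated concrete, here is a rough REPL sketch of the behaviors discussed, as observed around Clojure 1.4, the release current when this thread was written. Which constructor functions check for duplicates has shifted between releases, so treat this as illustrative rather than definitive.]

```clojure
;; Literal map and set syntax goes through createWithCheck, which
;; rejects duplicates at read time:
{:a 1 :a 2}        ; => IllegalArgumentException Duplicate key: :a
#{1 2 1}           ; => IllegalArgumentException Duplicate key: 1

;; Building a set from an existing collection silently de-duplicates,
;; consistent with into:
(set [1 2 1])      ; => #{1 2}
(into #{} [1 2 1]) ; => #{1 2}

;; sorted-set conj's its arguments one at a time, so duplicates are
;; simply absorbed:
(sorted-set 1 2 1) ; => #{1 2}

;; Per Chas's observation above, hash-set / hash-map / array-map were
;; routed through the same createWithCheck path at the time, so these
;; constructor calls threw as well:
(hash-set 1 2 1)      ; threw Duplicate key: 1
(array-map :a 1 :a 2) ; threw Duplicate key: :a
```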