On Sun, Aug 5, 2012 at 7:33 AM, Chas Emerick <c...@cemerick.com> wrote:

> Note that the .createWithCheck variations of all of the collections in
> question are used by their "constructor" functions as well, e.g. hash-set,
> hash-map, and array-map:
>

I hadn't noticed that, but I think that is good evidence that this
createWithCheck concept has been taken to an unintended extreme.  This is
no longer about "literal" versions of maps/sets and "constructed" versions
of maps/sets, so all the initial comments about how literal
vectors/maps/sets should only have constants are no longer relevant here
(and the decision to make constructors call the same builder as the literal
syntax is actually further evidence in support of my claim that literals
and constructed versions aren't intended to be hugely different in their
semantics).  Instead, this is about how maps and sets are intended to be
used, overall.

As I recall, the way constructed maps used to work was that when there were
duplicate keys, the right-most version took precedence.  To the best of my
knowledge, no one ever complained about this.  It was intuitive, and
consistent with the behavior of into.  The only complaint was that very
small maps, built with the literal syntax, didn't do the intuitive thing
and behave like larger maps, because behind the scenes they used array maps
which didn't check for duplicate keys.  If you said something like {:a 1 :a
2}, Clojure would happily go ahead and build something that wouldn't behave
the way you'd expect (e.g., it would return the wrong count and possibly
the first key-value pair would take precedence over the last).  But the
solution isn't to go ahead and alter the semantics of all forms of maps and
sets!  All people really wanted was to bring ArrayMaps into accordance with
the other forms of maps so that the behavior would be more predictable.
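
To make the "right-most wins" behavior concrete, here is a minimal sketch.
It uses only into and assoc, which as far as I know are unaffected by the
duplicate-key check under discussion; the names m and n are just for
illustration.  I've deliberately left out the literal and hash-map forms,
since what they do with duplicates depends on the Clojure version.

```clojure
;; Adding entries one at a time: a later entry for the same key
;; replaces the earlier one, so the right-most value wins.
(def m (into {} [[:a 1] [:a 2]]))

;; assoc onto an existing key behaves the same way.
(def n (assoc {:a 1} :a 2))

(println m (count m)) ; one entry, holding the right-most value
(println n)
```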

Furthermore, there's a *long* history of sets being used to reduce a
collection with duplicates into something that has no duplicates.  You're
creating a huge stumbling block for people if you create some arbitrary
rule that with one particular set of syntaxes (e.g., (set [1 2 1]) or
(sorted-set 1 2 1)), sets do what you expect and reduce collections with
duplicates to one that has no duplicates, but that with another set of
syntaxes (e.g., (hash-set 1 2 1) or #{1 2 1}), it just breaks.  That's
seriously confusing, and counter to the spirit of what sets are supposed to
do, in my opinion.
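
As a small sketch of the de-duplicating use I have in mind, the forms below
collapse duplicates without complaint (the var names are only for
illustration).  The hash-set call is left commented out because, per the
discussion above, it is the form that throws in the version at issue; I'm
not asserting how it behaves in any other release.

```clojure
;; Each of these reduces a collection with duplicates to a set without them.
(def from-coll (set [1 2 1]))
(def sorted    (sorted-set 1 2 1))
(def via-into  (into #{} [1 2 1]))

(println from-coll sorted via-into)

;; The form under discussion, deliberately not evaluated here:
;; (hash-set 1 2 1)
```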

It sounds to me like when the dev team thought through the issue about how
to fix the problem with array maps, they were thinking specifically about
the cases where maps and sets are composed entirely of constants, and
thought it would be a convenient place to try to catch human errors of the
form {:apple 1, :apple 2} where someone unintentionally duplicated a
constant key.  If the checking could somehow be limited to constants only,
I'd be perfectly happy with the error check.  But I suspect there is no
easy way to make Clojure behave this way.

Therefore, it's important to be realistic and acknowledge that sets and
maps are created in a variety of ways, both with constants and variables,
and we want consistent semantics across these constructions.  Run-time
errors for duplicate keys in certain kinds of constructions of maps and
sets strike me as running counter to Clojure's mantra of simplicity.

You talk about whether it's fair to expect Clojure to be a mind-reader.  Of
course not.  And there are all sorts of things that Clojure can't
conceivably check for me when I'm typing in data literals, for example, if
I type {:apple 1, :banano 2}, it's not going to warn me that I spelled
banana wrong, nor should it (if I wanted that kind of protection, I'd use a
statically typed language where each type of data was hardcoded to allow
only certain fields).  The reality is that any sort of constant data
entered into Clojure has to be carefully double-checked for typos, because
most things are not checkable or mind-readable by Clojure.  Going to such
error-checking extremes and creating complex semantics in order to protect
the user from errors like {:apple 1 :apple 2} is unwarranted.

If my code doesn't work properly because I typed {:apple 1 :apple 2}, I
blame only myself.  If my code breaks because somewhere in my code, I
legitimately built a dynamic set that had duplicate entries and I expected
the duplicates to be removed rather than an error thrown, I blame Clojure.
