Excellent points. +1 for orthogonality.
Sean

On Feb 22, 3:13 pm, Michał Marczyk <michal.marc...@gmail.com> wrote:
> On 22 February 2010 20:28, Sean Devlin <francoisdev...@gmail.com> wrote:
>
> > Then is the seq (1 :a a) guaranteed?  How do I know that I won't get
> > (2 :b b), (1 :b c), etc?  What if I want a specific combination
> > instead?  I've had to actually code this specific problem, and I found
> > that using group-by & some secondary mapping operation was the only
> > thing that gave me the flexibility I needed (manufacturing is fun!).
>
> The ordering guarantees distinct-by makes are exactly those that
> distinct makes, because it uses the same code (as mentioned
> previously, I lifted it all from clojure.core, then tweaked to take
> the keyfn / eqfn into account). Basically this means that if your
> collection has an intrinsic ordering, it will be preserved (the result
> will include, for each equivalence class of items from the sequence
> modulo the user-defined equivalence relation, the one earliest w.r.t.
> that ordering). If it's a hash-map or a hash-set instead, you'll get
> whatever ordering (seq coll) happens to produce.
>
> As for group-by giving you more flexibility -- well, it gives you a
> lot of flexibility where it's appropriate to use it, but because of
> its choice of data structure for the result, you can't use it to
> reimplement distinct-by directly:
>
> user=> (group-by class [1 2 3 :a :b :c 'a 'b 'c])
> java.lang.ClassCastException: java.lang.Class cannot be cast to
> java.lang.Comparable (NO_SOURCE_FILE:0)
>
> So no way to use non-Comparables as keys...
>
> And then there's the fact that you can't tell in which order the keys
> discovered by group-by appeared in the original collection, which is
> again because of its use of sorted-map, which has the consequence that
> order is being mangled on purpose! E.g.:
>
> user=> (seq (group-by #(- %) [1 2 3 4 5]))
> ([-5 [5]] [-4 [4]] [-3 [3]] [-2 [2]] [-1 [1]])
>
> In other words: (seq (group-by f coll)) has an ordering possibly
> completely unrelated to that of coll (so you'd have to make a separate
> traversal through the coll to discover the original ordering of the
> keys), whereas (distinct-by f coll), for either version of
> distinct-by, preserves the ordering of coll. That's a desirable
> property for when that's what you want to do, whereas group-by will, I
> suppose, be more useful on other occasions. ;-)
>
> To sum it up, (1) distinct-by actually behaves in a very predictable
> way (which may or may not be useful for any particular purpose), (2)
> it cannot be implemented directly in terms of group-by. I'd say it's
> pretty orthogonal to the existing library functions (that I know of)
> actually...
>
> Sincerely,
> Michał
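
For anyone following along, here is a minimal sketch of what a keyfn-only distinct-by might look like, assuming (as described above) that it follows the structure of clojure.core/distinct. This is not Michał's actual code: the name, the argument order, and the omission of the eqfn variant are all assumptions made for illustration.

```clojure
;; Hypothetical sketch, modelled on the shape of clojure.core/distinct.
;; Keeps the first element seen for each distinct value of (keyfn element),
;; so the order of the input seq is preserved.
(defn distinct-by
  [keyfn coll]
  (let [step (fn step [xs seen]
               (lazy-seq
                 ((fn [[x :as xs] seen]
                    (when-let [s (seq xs)]
                      (let [k (keyfn x)]
                        (if (contains? seen k)
                          ;; key already seen: skip this element
                          (recur (rest s) seen)
                          ;; first element with this key: keep it
                          (cons x (step (rest s) (conj seen k)))))))
                  xs seen)))]
    (step coll #{})))

(distinct-by class [1 2 3 :a :b :c 'a 'b 'c])
;; => (1 :a a)

(distinct-by #(mod % 3) [5 4 3 2 1])
;; => (5 4 3)
```

Because the seen keys live in a hash set rather than a sorted map, the keys need not be Comparable, and for an ordered input the result always holds the earliest representative of each key.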