On 22 February 2010 20:28, Sean Devlin <francoisdev...@gmail.com> wrote:
> Then is the seq (1 :a a) guaranteed? How do I know that I won't get
> (2 :b b), (1 :b c), etc? What if I want a specific combination
> instead? I've had to actually code this specific problem, and I found
> that using group-by & some secondary mapping operation was the only
> thing that gave me the flexibility I needed (manufacturing is fun!).

The ordering guarantees distinct-by makes are exactly those that distinct makes, because it uses the same code (as mentioned previously, I lifted it all from clojure.core, then tweaked to take the keyfn / eqfn into account). Basically this means that if your collection has an intrinsic ordering, it will be preserved (the result will include, for each equivalence class of items from the sequence modulo the user-defined equivalence relation, the one earliest w.r.t. that ordering). If it's a hash-map or a hash-set instead, you'll get whatever ordering (seq coll) happens to produce.

As for group-by giving you more flexibility -- well, it gives you a lot of flexibility where it's appropriate to use it, but because of its choice of data structure for the result, you can't use it to reimplement distinct-by directly:

  user=> (group-by class [1 2 3 :a :b :c 'a 'b 'c])
  java.lang.ClassCastException: java.lang.Class cannot be cast to java.lang.Comparable (NO_SOURCE_FILE:0)

So no way to use non-Comparables as keys... And then there's the fact that you can't tell in which order the keys discovered by group-by appeared in the original collection, which is again because of its use of sorted-map, which has the consequence that order is being mangled on purpose! E.g.:

  user=> (seq (group-by #(- %) [1 2 3 4 5]))
  ([-5 [5]] [-4 [4]] [-3 [3]] [-2 [2]] [-1 [1]])

In other words: (seq (group-by f coll)) has an ordering possibly completely unrelated to that of coll (so you'd have to make a separate traversal through the coll to discover the original ordering of the keys), whereas (distinct-by f coll), for either version of distinct-by, preserves the ordering of coll. That's a desirable property for when that's what you want to do, whereas group-by will, I suppose, be more useful on other occasions. ;-)

To sum it up, (1) distinct-by actually behaves in a very predictable way (which may or may not be useful for any particular purpose), (2) it cannot be implemented directly in terms of group-by.

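For reference, here is a minimal sketch of what a keyfn-only distinct-by with this ordering behaviour could look like. It follows the general shape of clojure.core/distinct (a lazy walk that carries a set of keys already seen), but it is an illustration written for this message, not the code posted earlier in the thread; the local names step and seen are assumptions, and the eqfn variant is not covered.

```clojure
(defn distinct-by
  "Returns a lazy seq of the items of coll, keeping only the first item
  encountered for each distinct value of (keyfn item)."
  [keyfn coll]
  (letfn [(step [xs seen]
            (lazy-seq
              (loop [s (seq xs)]
                (when s
                  (let [x (first s)
                        k (keyfn x)]
                    (if (contains? seen k)
                      ;; key already emitted: skip this item
                      (recur (next s))
                      ;; first item with this key: emit it, remember the key
                      (cons x (step (rest s) (conj seen k)))))))))]
    (step coll #{})))

(distinct-by class [1 2 3 :a :b :c 'a 'b 'c])
;; => (1 :a a)
```

Because the seen keys live in a hash set, they do not need to be Comparable, and the output keeps the order in which each key first appears in coll.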
I'd say it's pretty orthogonal to the existing library functions (that I know of) actually...

Sincerely,
Michał