I would like to argue that your code has to be pretty inefficient if it's creating a bigger performance impact than fetching from the database.
Erik. -- i farta > Den 12. nov. 2015 kl. 15.36 skrev Dave Tenny <dave.te...@gmail.com>: > > Thanks for the replies, as always lots of interesting ways to do it in > clojure. No zipper sugestions? > > For my own take, I happen to like the last suggestion (Gareth's) the most I > think, of the ones offered so far. It has a lispy elegance and is still > relatively efficient in traversals and memory allocation, compared to some > solutions (particular with a revision to the merge-roles implementation to > eliminate all the set creation, see next sentence). > > Some of you rewrote the problem a bit. For example, the 'merge-roles' > conversion to sets is very expensive, especially since I know that the input > in my original problem isn't going to offer duplicate project rules for a > given project ID. > > On efficiency in general, it is perhaps a generational thing that people say > "don't prematurely optimize" so often and then cons up intermediate objects > with wild abandon. And while a well crafted stop and copy garbage collector > can be have negligable GC time given sufficient memory free space, > allocations are still not free even in the best case, not to mention the time > to build the object being allocated (like hashing/comparing all the values in > a set). > > I'd like to suggest that there is a difference between premature optimization > and avoiding flagrantly inefficient code. > If you code so that you routinely copy sequences and nested data structures > without a care, then in a serious/large system > your performance will have death by a thousand cuts. The big O matters, at > least in some applications. > > As indicated in this problem, the input is coming from a database, i.e. a > potentially large record source. My bias is that per-record processing is > always best kept to a minimum. So inverting maps, copying sequences, > creating sets and such _for every record_ is just a big "nope" for me. And > that doesn't even consider what went into the retrieval of the records from > clojure.java.jdbc/query, where every record comes back as a map unless you do > something to prevent it. > > Meanwhile there were good suggestions for both abstraction and efficiency > here, thanks. I often forget about lazy-seq and have a tendency to lean > toward loop/recur, probably from too many years using Common Lisp. And of > course elimination of the mutable change detection hack is good too. So > nice to see the different ways people tackle the problem. > > >> On Wed, Nov 11, 2015 at 10:02 PM, Gareth Clapperton <gclapper...@live.com> >> wrote: >> Hey Daves >> >> I too like the map idea (and using sets for roles for that matter). But >> sticking with collections, I might do something like this >> >> (defn upsert >> "Updates or inserts v into coll" >> [key-fn update-fn v coll] >> (let [k (key-fn v)] >> (letfn [(step [v [x & xs]] >> (lazy-seq >> (cond >> (nil? x) (when v [v]) >> (and (some? v) (= (key-fn x) k)) (cons (update-fn x v) >> (step nil xs)) >> :default (cons x >> (step v xs)))))] >> (step v coll)))) >> >> (defn merge-roles [p1 {:keys [project_roles] :as p2}] >> (update-in p1 [:project_roles] #(vec (clojure.set/union (set %) (set >> project_roles))))) >> >> (upsert :project_id merge-roles prj-role prj-roles) >> >> Gareth >> >> >>> On Wednesday, November 11, 2015 at 4:25:51 PM UTC-5, Dave Tenny wrote: >>> A colleague and I are debating various things clojure as we were exploring >>> alternative ways to solve a problem. >>> >>> Here's the description of the problem that a particular function is trying >>> to solve, and the first implementation of it. >>> >>> (defn update-roles >>> "Given a vector of maps of the form {:project_id N :project_name S >>> :project_roles [...roles...]} >>> if there is already a map for the indicated project id/name, add new-role >>> to it and returned >>> a copy the updated input vector, otherwise return a vector with a new map >>> entry for the newly found >>> project and initial role. This function is basically aggregating tuples >>> from the database." >>> [projects project-id project-name new-role] >>> (let [updated? (atom nil) >>> >>> projects* (mapv (fn [m] >>> (if (= (:project_id m) project-id) >>> (do (reset! updated? true) >>> (assoc m :project_roles (conj >>> (:project_roles m) new-role))) >>> m)) >>> projects)] >>> (if @updated? >>> projects* >>> (conj projects {:project_id project-id :project_name project-name >>> :project_roles [new-role]})))) >>> >>> >>> ;; (update-roles [{:project_id 1 :project_name "One" :project_roles >>> [:own]}] 2 "Two" :edit) >>> ;; => [{:project_id 1, :project_name "One", :project_roles [:own]} >>> {:project_id 2, :project_name "Two", :project_roles [:edit]}] >>> ;; (update-roles [{:project_id 1 :project_name "One" :project_roles >>> [:own]}] 1 "Two" :edit) >>> ;; => [{:project_id 1, :project_name "One", :project_roles [:own :edit]}] >>> >>> >>> >>> Now here's another implementation: >>> >>> (defn update-or-insert-project-role >>> [prj-roles prj-role] >>> (let [to-insert-prj-id (:project_id prj-role) >>> by-pid (group-by :project_id prj-roles)] >>> (case (get by-pid to-insert-prj-id) >>> nil (conj prj-roles prj-role) >>> (->> (update-in by-pid [to-insert-prj-id 0 :project_roles] #(apply >>> conj % (:project_roles prj-role))) >>> (mapcat second) >>> (into []))))) >>> >>> ;; (def prj-roles [{:project_id 1, :project_name "One", :project_roles >>> [:own]} {:project_id 3 :project_name "Three" :project_roles [:edit]}]) >>> ;; (update-or-insert-project-role prj-roles {:project_id 2 :project_name >>> "Two" :project_roles [:edit]}) >>> ;; => [{:project_id 1, :project_name "One", :project_roles [:own]} >>> {:project_id 3, :project_name "Three", :project_roles [:edit]} {:project_id >>> 2, :project_name "Two", :project_roles [:edit]}] >>> ;; (update-or-insert-project-role prj-roles {:project_id 1 :project_name >>> "One" :project_roles [:edit]}) >>> ;; => [{:project_id 1, :project_name "One", :project_roles [:own :edit]} >>> {:project_id 3, :project_name "Three", :project_roles [:edit]}] >>> >>> >>> >>> The function is called in a loop to aggregate rows from a database, though >>> it isn't an overriding concern, we're not going millions of records in this >>> case. >>> >>> The first thing my colleague and I disagreed on was the calling sequence, >>> arguing over which is more readable. >>> The second thing was whether efficiency in this context is really >>> important, or whether it's all moot in clojure. >>> >>> Finally, I'm sure there's a better way, probably with Zippers or something, >>> but neither of us have used them. Suggestions for the stylistic and >>> operational epitome of clojure expression on this routine are welcome! >>> >>> Superficially, and probably incorrect in multiple ways, here is a poor >>> attempt at breaking down efficiency in terms of search/traversal and memory >>> allocations. This was done by someone with no knowledge of clojure >>> internals (including the library implementations of the called functions). >>> >>> ;; Comparing the two routines per function call, for existing project case >>> (i.e. where roles are updated) >>> ;; >>> ;; Assuming each routine allocates new vector for new-role placement in >>> existing project >>> ;; and MapEntry for assoc of new vector on project_roles, so they aren't >>> included in allocations >>> ;; below since both routines have to do it. >>> ;; >>> ;; Note that x-element map allocates storage for map and map-entries or >>> clojure equivalent. >>> ;; (and more expensive than an x-element vector, of course). >>> ;; >>> ;; n == length of input project list. >>> ;; m == average length of input project list role vectors. >>> ;; >>> ;; Object Allocations >>> ;; Function call: >>> ;; update-roles: >>> ;; 1 atom >>> ;; 1 O(n) vector for mapv >>> ;; update-or-insert-project-role: >>> ;; 1 3-entry map + 1 single-element vector for prj-role argument >>> input. >>> ;; 1 n-element map for group-by >>> ;; n vectors for group-by map values >>> ;; 1 n-element map for update-in >>> ;; 1 list/sequence for mapcat (+ n concat intermediaries?) >>> ;; 1 vector for into >>> ;; >>> ;; If we discard the second 'into' and first 'mapv' allocations the >>> update-or-insert-project-role routine allocates >>> ;; 3 additional maps (two of which are O(n)), n additional vectors, and 1 >>> additional list/sequence. >>> ;; >>> ;; Searches/traversals/copies >>> ;; update-roles: >>> ;; O(n) - mapv >>> ;; update-or-insert-project-role: >>> ;; O(n) - group-by >>> ;; O(n) - update-in >>> ;; O(n) - mapcat >>> ;; O(n) - into >>> ;; >>> ;; Here's what update-or-insert-project-role allocates (by way of >>> assistance in assessing the above) >>> ;;{1 [{:project_id 1, :project_name "One", :project_roles [:own]}], 3 >>> [{:project_id 3, :project_name "Three", :project_roles [:edit]}]} >>> -- group-by >>> ;;{1 [{:project_id 1, :project_name "One", :project_roles [:own :edit]}], 3 >>> [{:project_id 3, :project_name "Three", :project_roles [:edit]}]} -- >>> update-in >>> ;;({:project_id 1, :project_name "One", :project_roles [:own :edit]} >>> {:project_id 3, :project_name "Three", :project_roles [:edit]}) >>> -- mapcat >>> ;;[{:project_id 1, :project_name "One", :project_roles [:own :edit]} >>> {:project_id 3, :project_name "Three", :project_roles [:edit]}] >>> -- into >>> >>> Thanks for your thoughts and suggestions. >> >> -- >> You received this message because you are subscribed to the Google >> Groups "Clojure" group. >> To post to this group, send email to clojure@googlegroups.com >> Note that posts from new members are moderated - please be patient with your >> first post. >> To unsubscribe from this group, send email to >> clojure+unsubscr...@googlegroups.com >> For more options, visit this group at >> http://groups.google.com/group/clojure?hl=en >> --- >> You received this message because you are subscribed to a topic in the >> Google Groups "Clojure" group. >> To unsubscribe from this topic, visit >> https://groups.google.com/d/topic/clojure/nhzH_xl8IlE/unsubscribe. >> To unsubscribe from this group and all its topics, send an email to >> clojure+unsubscr...@googlegroups.com. >> For more options, visit https://groups.google.com/d/optout. > > -- > You received this message because you are subscribed to the Google > Groups "Clojure" group. > To post to this group, send email to clojure@googlegroups.com > Note that posts from new members are moderated - please be patient with your > first post. > To unsubscribe from this group, send email to > clojure+unsubscr...@googlegroups.com > For more options, visit this group at > http://groups.google.com/group/clojure?hl=en > --- > You received this message because you are subscribed to the Google Groups > "Clojure" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to clojure+unsubscr...@googlegroups.com. > For more options, visit https://groups.google.com/d/optout. -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups "Clojure" group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/d/optout.