Style, Efficiency, and updating nested structure

Dave Tenny Wed, 11 Nov 2015 13:26:39 -0800

A colleague and I are debating various things clojure as we were exploring 
alternative ways to solve a problem.


Here's the description of the problem that a particular function is trying 
to solve, and the first implementation of it.

(defn update-roles 
  "Given a vector of maps of the form {:project_id N :project_name S 
:project_roles [...roles...]}
  if there is already a map for the indicated project id/name, add new-role 
to it and returned
  a copy the updated input vector, otherwise return a vector with a new map 
entry for the newly found
  project and initial role.  This function is basically aggregating tuples 
from the database."
  [projects project-id project-name new-role]
  (let [updated? (atom nil)

        projects* (mapv (fn [m] 
                          (if (= (:project_id m) project-id)
                            (do (reset! updated? true)
                                (assoc m :project_roles (conj (:project_roles 
m) new-role)))
                            m))
                        projects)]
    (if @updated?
      projects*
      (conj projects {:project_id project-id :project_name project-name 
:project_roles 
[new-role]}))))


;; (update-roles [{:project_id 1 :project_name "One" :project_roles [:own]}] 
2 "Two" :edit)
;; => [{:project_id 1, :project_name "One", :project_roles [:own]} {:project_id 
2, :project_name "Two", :project_roles [:edit]}]
;; (update-roles [{:project_id 1 :project_name "One" :project_roles [:own]}] 
1 "Two" :edit)
;; => [{:project_id 1, :project_name "One", :project_roles [:own :edit]}]



Now here's another implementation:

(defn update-or-insert-project-role
  [prj-roles prj-role]
  (let [to-insert-prj-id (:project_id prj-role)
        by-pid           (group-by :project_id prj-roles)]
    (case (get by-pid to-insert-prj-id)
      nil (conj prj-roles prj-role)
      (->> (update-in by-pid [to-insert-prj-id 0 :project_roles] #(apply conj % 
(:project_roles prj-role)))
           (mapcat second)
           (into [])))))

;; (def prj-roles [{:project_id 1, :project_name "One", :project_roles [:own]} 
{:project_id 3 :project_name "Three" :project_roles [:edit]}])
;; (update-or-insert-project-role prj-roles {:project_id 2 :project_name "Two" 
:project_roles [:edit]})
;; => [{:project_id 1, :project_name "One", :project_roles [:own]} {:project_id 
3, :project_name "Three", :project_roles [:edit]} {:project_id 2, :project_name 
"Two", :project_roles [:edit]}]
;; (update-or-insert-project-role prj-roles {:project_id 1 :project_name "One" 
:project_roles [:edit]})
;; => [{:project_id 1, :project_name "One", :project_roles [:own :edit]} 
{:project_id 3, :project_name "Three", :project_roles [:edit]}]




The function is called in a loop to aggregate rows from a database, though 
it isn't an overriding concern, we're not going millions of records in this 
case.

The first thing my colleague and I disagreed on was the calling sequence, 
arguing over which is more readable.
The second thing was whether efficiency in this context is really 
important, or whether it's all moot in clojure.  

Finally, I'm sure there's a better way, probably with Zippers or something, 
but neither of us have used them. Suggestions for the stylistic and 
operational epitome of clojure expression on this routine are welcome!

Superficially, and probably incorrect in multiple ways, here is a poor 
attempt at breaking down efficiency in terms of search/traversal and memory 
allocations.  This was done by someone with no knowledge of clojure 
internals (including the library implementations of the called functions).

;; Comparing the two routines per function call, for existing project case 
(i.e. where roles are updated)
;;
;; Assuming each routine allocates new vector for new-role placement in 
existing project
;; and MapEntry for assoc of new vector on project_roles, so they aren't 
included in allocations 
;; below since both routines have to do it.
;;
;; Note that x-element map allocates storage for map and map-entries or 
clojure equivalent.
;; (and more expensive than an x-element vector, of course).
;;
;; n == length of input project list.
;; m == average length of input project list role vectors.
;; 
;; Object Allocations
;;   Function call:
;;     update-roles: 
;;       1 atom
;;       1 O(n) vector for mapv 
;;     update-or-insert-project-role: 
;;       1 3-entry map + 1 single-element vector for prj-role argument 
input.
;;       1 n-element map for group-by
;;       n vectors for group-by map values
;;       1 n-element map for update-in
;;       1 list/sequence for mapcat (+ n concat intermediaries?)
;;       1 vector for into
;;
;; If we discard the second 'into' and first 'mapv' allocations the 
update-or-insert-project-role routine allocates
;; 3 additional maps (two of which are O(n)), n additional vectors, and 1 
additional list/sequence.
;;   
;; Searches/traversals/copies
;;  update-roles: 
;;       O(n) - mapv
;;  update-or-insert-project-role: 
;;       O(n) - group-by
;;       O(n) - update-in
;;       O(n) - mapcat
;;       O(n) - into
;;
;; Here's what update-or-insert-project-role allocates (by way of 
assistance in assessing the above)
;;{1 [{:project_id 1, :project_name "One", :project_roles [:own]}], 3 
[{:project_id 3, :project_name "Three", :project_roles [:edit]}]}           
  -- group-by
;;{1 [{:project_id 1, :project_name "One", :project_roles [:own :edit]}], 3 
[{:project_id 3, :project_name "Three", :project_roles [:edit]}]}       -- 
update-in
;;({:project_id 1, :project_name "One", :project_roles [:own :edit]} 
{:project_id 3, :project_name "Three", :project_roles [:edit]})             
   -- mapcat
;;[{:project_id 1, :project_name "One", :project_roles [:own :edit]} 
{:project_id 3, :project_name "Three", :project_roles [:edit]}]             
   -- into

Thanks for your thoughts and suggestions. 

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Style, Efficiency, and updating nested structure

Reply via email to