Re: Benchmarking structural sharing

2014-08-12 Thread Linus Ericsson
You could likely use System/identityHashCode to count the similarity of
objects all the objects.

I created a small function that only honors the clojure-visible
structure, and exposes every item in a tree structure (apart from the
arrays in PersistentVectors and some PersistentMaps)

(defn steam-roller [l]
gives a representation of all object pointers in a
nested structure by both adding pointers to everything seqable
as well as its contents
it also takes out keys and vals of maps, as well as the whole map
(let [l [l]]
  (loop [l1 l l2 '()]
(cond
 (sequential? (first l1)) (recur (concat (first l1) (rest l1)) (cons
(first l1) l2))
 (map? (first l1)) (recur (concat (keys (first l1)) (vals (first l1))
(rest l1)) (cons (first l1) l2))
 (empty? l1) (reverse l2)
 :else (recur (rest l1) (cons (first l1) l2))

(inspired by slovic's version of flatten at
http://clojuredocs.org/clojure_core/clojure.core/flatten)

Example output is:

(steam-roller *[:a :b {:c 2 3 4}]*)
- (*[:a :b {:c 2, 3 4}]* :a :b {:c 2, 3 4} :c 3 2 4)

You could then make some trivial statistics collection like

(map #(System/identityHashCode %) (steam-roller {1 2 3 4}))
- (359344022 504432400 1778018115 1172818142 256714182)

where (set) gives the unique HashCodes.

The reasoning behind not taking out mapEntries of maps, but just the keys
and vals is that maps usually is k-v-pairs in an Object-array or a shallow
tree of those Objectarrays.

This does *not* take care about either implementation details in either
PersistentVector or rrb-vector. I guess one could use
https://github.com/arohner/clj-wallhack or https://github.com/zcaudate/iroh
to walk around inside the inners of PersistentVector and rrb-vector.

The conclusion of this is I think the easiest way to make this work is to
just run the algorithm in both versions and watch the object allocation
statistics closely in VisualVM or similar. My intuition is that the there
will be a lot of copied arrays, but that's quite quick (not as quick as
shuffling longs with sun.misc.Unsafe, though).

/Linus


2014-08-11 14:56 GMT+02:00 Paul Butcher p...@paulbutcher.com:

 Is there any way to benchmark the degree of structural sharing achieved by
 a Clojure algorithm? I'm evaluating two different implementations of an
 algorithm, one which uses zippers and one which uses rrb-vector. It would
 be great if there were some way to quantify the degree to which they both
 achieved (or didn't) structural sharing.

 --
 paul.butcher-msgCount++

 Silverstone, Brands Hatch, Donington Park...
 Who says I have a one track mind?

 http://www.paulbutcher.com/
 LinkedIn: http://www.linkedin.com/in/paulbutcher
 Skype: paulrabutcher

 Author of Seven Concurrency Models in Seven Weeks: When Threads Unravel
 http://pragprog.com/book/pb7con

 --
 You received this message because you are subscribed to the Google
 Groups Clojure group.
 To post to this group, send email to clojure@googlegroups.com
 Note that posts from new members are moderated - please be patient with
 your first post.
 To unsubscribe from this group, send email to
 clojure+unsubscr...@googlegroups.com
 For more options, visit this group at
 http://groups.google.com/group/clojure?hl=en
 ---
 You received this message because you are subscribed to the Google Groups
 Clojure group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to clojure+unsubscr...@googlegroups.com.
 For more options, visit https://groups.google.com/d/optout.


-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
Clojure group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Benchmarking structural sharing

2014-08-12 Thread Paul Butcher
On 12 August 2014 at 13:49:42, Linus Ericsson (oscarlinuserics...@gmail.com) 
wrote:

The conclusion of this is I think the easiest way to make this work is to just 
run the algorithm in both versions and watch the object allocation statistics 
closely in VisualVM or similar.

Yeah, that's exactly what I was hoping I might be able to avoid. Ah well.

 My intuition is that the there will be a lot of copied arrays

Indeed. But intuition often isn't a good guide when it comes to optimisation 
(hence my wish to measure and make decisions based on data instead of 
intuition).

--
paul.butcher-msgCount++

Silverstone, Brands Hatch, Donington Park...
Who says I have a one track mind?

http://www.paulbutcher.com/
LinkedIn: http://www.linkedin.com/in/paulbutcher
Skype: paulrabutcher

Author of Seven Concurrency Models in Seven Weeks: When Threads Unravel
http://pragprog.com/book/pb7con

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
Clojure group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Benchmarking structural sharing

2014-08-12 Thread Michał Marczyk
It seems hard to answer in a completely generic fashion. If there's a
certain collection of vectors which you'd like to share structure, it
may be possible to put a number on the degree of sharing achieved by
examining the internals of those vectors. That wouldn't address the
issue of intermediate allocations internal to your algorithms, but
that might be fine.

As for a possible approach to the above, I've just added a function
called count-nodes to clojure.core.rrb-vector.debug (the JVM version
only) which counts all vector nodes used by a collection of vectors
(PV / gvec / RRB). An example from my REPL:

(let [vs (mapv (comp vec range) [2048 2048 2048])]
  [(apply dv/count-nodes (apply fv/catvec vs) vs)
   (apply dv/count-nodes (reduce into [] vs) vs)])
;= [203 396]

I might tweak the way it works a little (for example, it could take a
collection rather than varargs and count internal arrays rather than
nodes; the latter change could be beneficial, as tails are just arrays
without a node wrapper).

Hope this helps.

Cheers,
Michał


On 11 August 2014 14:56, Paul Butcher p...@paulbutcher.com wrote:
 Is there any way to benchmark the degree of structural sharing achieved by a
 Clojure algorithm? I'm evaluating two different implementations of an
 algorithm, one which uses zippers and one which uses rrb-vector. It would be
 great if there were some way to quantify the degree to which they both
 achieved (or didn't) structural sharing.

 --
 paul.butcher-msgCount++

 Silverstone, Brands Hatch, Donington Park...
 Who says I have a one track mind?

 http://www.paulbutcher.com/
 LinkedIn: http://www.linkedin.com/in/paulbutcher
 Skype: paulrabutcher

 Author of Seven Concurrency Models in Seven Weeks: When Threads Unravel
 http://pragprog.com/book/pb7con

 --
 You received this message because you are subscribed to the Google
 Groups Clojure group.
 To post to this group, send email to clojure@googlegroups.com
 Note that posts from new members are moderated - please be patient with your
 first post.
 To unsubscribe from this group, send email to
 clojure+unsubscr...@googlegroups.com
 For more options, visit this group at
 http://groups.google.com/group/clojure?hl=en
 ---
 You received this message because you are subscribed to the Google Groups
 Clojure group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to clojure+unsubscr...@googlegroups.com.
 For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
Clojure group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Benchmarking structural sharing

2014-08-11 Thread Paul Butcher
Is there any way to benchmark the degree of structural sharing achieved by a 
Clojure algorithm? I'm evaluating two different implementations of an 
algorithm, one which uses zippers and one which uses rrb-vector. It would be 
great if there were some way to quantify the degree to which they both achieved 
(or didn't) structural sharing.

--
paul.butcher-msgCount++

Silverstone, Brands Hatch, Donington Park...
Who says I have a one track mind?

http://www.paulbutcher.com/
LinkedIn: http://www.linkedin.com/in/paulbutcher
Skype: paulrabutcher

Author of Seven Concurrency Models in Seven Weeks: When Threads Unravel
http://pragprog.com/book/pb7con

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
Clojure group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.