Re: Reducers reduce my performance

2012-09-14 Thread Tassilo Horn
Kevin Downey  writes:

Hi Kevin,

>> This is the new version using reducers (:as r).  The problem here is
>> that to stop the iteration, one has to reduce the intermediate result
>> in every step and check if new reachable vertices (n) could be found.
>> If so, we need to do another iteration.
>
> This is almost certainly the problem: if you are using reducers, you
> should not make intermediate results concrete.  Given the structure of
> your computation, it may not be a good fit for reducers.

I guessed that, but wanted to be extra-sure.

> You might be able to restructure things in such a way as to return
> something with a custom impl of CollReduce that performs this
> computation when reduced, which would get you out of returning concrete
> results.

Hm, yes, sounds good.  I think I need to have another look at the
reducers implementation...
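
For reference, clojure.core.reducers builds its reducibles with exactly
that pattern.  Roughly (paraphrased from memory, not a verbatim copy of
the source):

--8<---cut here---start->8---
(defn reducer [coll xf]
  ;; xf transforms the reducing fn; nothing is computed until the
  ;; result is actually reduced
  (reify clojure.core.protocols/CollReduce
    (coll-reduce [this f1]
      (clojure.core.protocols/coll-reduce this f1 (f1)))
    (coll-reduce [_ f1 init]
      (clojure.core.protocols/coll-reduce coll (xf f1) init))))
--8<---cut here---end--->8---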

Thanks for the hint,
Tassilo



Re: Reducers reduce my performance

2012-09-14 Thread Kevin Downey
On Fri, Sep 14, 2012 at 4:22 AM, Tassilo Horn  wrote:
> Hi all,
>
> I have some code which uses a lot of map/mapcat/filter stuff and is
> totally eager (result is always an ordered set).  That looked to me like
> a good candidate for reducers.
>
> Basically, my code enables me to write something like this:
>
> --8<---cut here---start->8---
> (reachables p [p-seq
>                [--> 'ContainsType]
>                [p-+ [p-seq
>                      [<-- 'Defines]
>                      [--> 'Imports]
>                      [p-opt [<-- 'ContainsType]]]]])
> --8<---cut here---end--->8---
>
> which calculates the set of vertices reachable from a vertex p (or a
> set of vertices p) by traversing a path matching the structure given as
> a vector.  So from p, we navigate a sequence (p-seq) of one forward
> ContainsType-edge, then traverse one or many times (p-+) the sequence
> of an incoming Defines-edge followed by an outgoing Imports-edge,
> optionally followed (p-opt) by an incoming ContainsType-edge.
>
> p-seq, p-+, p-opt, <--, and --> are all functions receiving a set of
> vertices and either the nested rest of the vector or an edge type
> symbol.  The current implementation simply recurses over the vector and
> applies the functions, combining their results.
>
> Ok, that works very well and performs well, but I thought reducers
> might give me some extra performance.  So I rewrote all those functions
> to use reducers and return reducibles instead of ordered sets.  The
> shape has changed to this: instead of recursively applying the
> functions in a vector, we create one big reducer function that does the
> job, and reachables simply applies that to p.
>
> --8<---cut here---start->8---
> (reachables p (p-seq
>                (--> 'ContainsType)
>                (p-+ (p-seq
>                      (<-- 'Defines)
>                      (--> 'Imports)
>                      (p-opt (<-- 'ContainsType))))))
> --8<---cut here---end--->8---
>
> The good thing is that it calculates exactly the same set.  The bad
> thing is that it's about a factor of 2 or 3 slower.
>
> I tried to track down the cause, and the main bottleneck is the p-+
> function, which got much slower than before.
>
> This is the new version using reducers (:as r).  The problem here is
> that to stop the iteration, one has to reduce the intermediate result in
> every step and check if new reachable vertices (n) could be found.  If
> so, we need to do another iteration.

This is almost certainly the problem: if you are using reducers, you
should not make intermediate results concrete.  Given the structure of
your computation, it may not be a good fit for reducers.

You might be able to restructure things in such a way as to return
something with a custom impl of CollReduce that performs this
computation when reduced, which would get you out of returning concrete
results.
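
Something roughly like this, maybe (a completely untested sketch: it
uses plain sets instead of your ordered sets, only covers the p-+ case,
and the names are mine):

--8<---cut here---start->8---
(require '[clojure.core.protocols :as proto]
         '[clojure.core.reducers :as r])

(defn p-+ [p]
  (fn [coll]
    ;; the returned value performs the whole fixpoint iteration only
    ;; when it is finally reduced, so no intermediate reducible has to
    ;; be poured into a concrete set along the way
    (reify proto/CollReduce
      (coll-reduce [this f]
        (proto/coll-reduce this f (f)))
      (coll-reduce [_ f init]
        (loop [acc      init
               seen     #{}
               frontier (into #{} (p coll))]
          (if (seq frontier)
            (let [seen (into seen frontier)]
              (recur (reduce f acc frontier)
                     seen
                     (into #{} (r/remove seen (p frontier)))))
            acc))))))
--8<---cut here---end--->8---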

> --8<---cut here---start->8---
> (defn ^:private p-*-or-+
>   [p include-coll]
>   (fn [coll] ;; coll is a reducible
>     (loop [ret (if include-coll
>                  (into (ordered.set/ordered-set) coll)
>                  (ordered.set/ordered-set))
>            n   (into (ordered.set/ordered-set)
>                      (if include-coll
>                        (r/remove ret (p coll))
>                        (p coll)))]
>       (if (seq n)
>         (recur (into ret n)
>                (into (ordered.set/ordered-set)
>                      (r/remove ret (p n))))
>         ret))))
>
> (defn p-+ [p]
>   (p-*-or-+ p false))
> --8<---cut here---end--->8---
>
> This is the original version, where the ordered set of vertices to
> start with is already given as the first parameter, and the fn does the
> job itself rather than returning a fn that can do it.  (As you can see,
> the old version uses reducers internally, too, but the functions
> themselves all receive and return sets.)
>
> --8<---cut here---start->8---
> (defn ^:private p-*-or-+
>   [v p ret]
>   (let [n (into (ordered.set/ordered-set)
>                 ;; oset is like set for ordered-sets
>                 (r/remove ret (oset (*p-apply* v p))))]
>     (if (seq n)
>       (recur n p (into-oset ret n))
>       ret)))
>
> (defn p-* [v p]
>   (p-*-or-+ v p (oset v)))
> --8<---cut here---end--->8---
>
> So is that simply a use case where reducers don't fit very well, or am
> I doing something wrong?
>
> Thanks for any hints,
> Tassilo
>

Reducers reduce my performance

2012-09-14 Thread Tassilo Horn
Hi all,

I have some code which uses a lot of map/mapcat/filter stuff and is
totally eager (result is always an ordered set).  That looked to me like
a good candidate for reducers.

Basically, my code enables me to write something like this:

--8<---cut here---start->8---
(reachables p [p-seq
               [--> 'ContainsType]
               [p-+ [p-seq
                     [<-- 'Defines]
                     [--> 'Imports]
                     [p-opt [<-- 'ContainsType]]]]])
--8<---cut here---end--->8---

which calculates the set of vertices reachable from a vertex p (or a set
of vertices p) by traversing a path matching the structure given as a
vector.  So from p, we navigate a sequence (p-seq) of one forward
ContainsType-edge, then traverse one or many times (p-+) the sequence of
an incoming Defines-edge followed by an outgoing Imports-edge,
optionally followed (p-opt) by an incoming ContainsType-edge.

p-seq, p-+, p-opt, <--, and --> are all functions receiving a set of
vertices and either the nested rest of the vector or an edge type
symbol.  The current implementation simply recurses over the vector and
applies the functions, combining their results.
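
To give a rough idea of the shape, the eager style looks something like
this (only a simplified sketch for illustration, not the real
implementation; successors stands in for whatever returns the
neighbours over one edge of the given type):

--8<---cut here---start->8---
;; hypothetical sketch only; not the real implementation
(defn --> [vs edge-type]
  ;; all vertices reachable from vs over one forward edge of edge-type
  (into (ordered.set/ordered-set)
        (mapcat #(successors % edge-type) vs)))

(defn p-seq [vs steps]
  ;; recurse over the vector: apply each step [f & args] to the result
  ;; of the previous one
  (reduce (fn [acc [f & args]] (apply f acc args)) vs steps))
--8<---cut here---end--->8---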

Ok, that works very well and performs well, but I thought reducers
might give me some extra performance.  So I rewrote all those functions
to use reducers and return reducibles instead of ordered sets.  The
shape has changed to this: instead of recursively applying the
functions in a vector, we create one big reducer function that does the
job, and reachables simply applies that to p.

--8<---cut here---start->8---
(reachables p (p-seq
               (--> 'ContainsType)
               (p-+ (p-seq
                     (<-- 'Defines)
                     (--> 'Imports)
                     (p-opt (<-- 'ContainsType))))))
--8<---cut here---end--->8---
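
In this style, the individual functions might look roughly like the
following (again only a simplified sketch for illustration, reusing the
made-up successors helper from above):

--8<---cut here---start->8---
;; hypothetical sketch only: each function now returns a fn from a
;; reducible of vertices to another reducible
(defn --> [edge-type]
  (fn [vs]
    (r/mapcat #(successors % edge-type) vs)))

(defn p-seq [& steps]
  (fn [vs]
    (reduce (fn [acc step] (step acc)) vs steps)))

(defn reachables [v pathfn]
  ;; only here is the final reducible poured into a concrete ordered set
  (into (ordered.set/ordered-set) (pathfn (oset v))))
--8<---cut here---end--->8---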

The good thing is that it calculates exactly the same set.  The bad
thing is that it's about a factor of 2 or 3 slower.

I tried to track down the cause, and the main bottleneck is the p-+
function, which got much slower than before.

This is the new version using reducers (:as r).  The problem here is
that to stop the iteration, one has to reduce the intermediate result in
every step and check if new reachable vertices (n) could be found.  If
so, we need to do another iteration.

--8<---cut here---start->8---
(defn ^:private p-*-or-+
  [p include-coll]
  (fn [coll] ;; coll is a reducible
    (loop [ret (if include-coll
                 (into (ordered.set/ordered-set) coll)
                 (ordered.set/ordered-set))
           n   (into (ordered.set/ordered-set)
                     (if include-coll
                       (r/remove ret (p coll))
                       (p coll)))]
      (if (seq n)
        (recur (into ret n)
               (into (ordered.set/ordered-set)
                     (r/remove ret (p n))))
        ret))))

(defn p-+ [p]
  (p-*-or-+ p false))
--8<---cut here---end--->8---

This is the original version, where the ordered set of vertices to start
with is already given as the first parameter, and the fn does the job
itself rather than returning a fn that can do it.  (As you can see, the
old version uses reducers internally, too, but the functions themselves
all receive and return sets.)

--8<---cut here---start->8---
(defn ^:private p-*-or-+
  [v p ret]
  (let [n (into (ordered.set/ordered-set)
                ;; oset is like set for ordered-sets
                (r/remove ret (oset (*p-apply* v p))))]
    (if (seq n)
      (recur n p (into-oset ret n))
      ret)))

(defn p-* [v p]
  (p-*-or-+ v p (oset v)))
--8<---cut here---end--->8---

So is that simply a use case where reducers don't fit very well, or am I
doing something wrong?

Thanks for any hints,
Tassilo
