Re: Transducers: sequence versus eduction

Tassilo Horn Thu, 02 Apr 2015 06:39:13 -0700

Alex Miller <a...@puredanger.com> writes:

Hi Alex,


> If you're going to use expanding transformations and not realize all of the
> results then I think sequences are likely a better choice for you.

Ok, I see.

>> However, at least I had expected that in the case where all elements
>> are realized the transducer version should have been faster than the
>> traditional version which also needs to fully realize all
>> intermediate lazy seqs.  Why is it still three times slower?
>
> I think my main suggestion here is that you are using a non-reducible
> source (range) throughout these timings, so transducers have no
> leverage on the input side. CLJ-1515 will make range reducible and
> should help a lot on this particular example.

Well, even if I revamp the (admittedly contrived) example to have
reducible vectors as both source and intermediates

  (let [v (vec (range 0 1000))
        vs (zipmap (range 0 1000)
                   (for [i (range 0 1000)]
                     (vec (range i 1000))))]
    (time (dorun (sequence (comp (mapcat (fn [i] (vs i)))
                                 (mapcat (fn [i] (vs i))))
                           v))))

it still takes 18 seconds (instead of 21 with the lazy seqs produced by
range), compared to just 7 seconds with normal lazy seq functions.
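
For comparison, here is a scaled-down sketch of the two variants side
by side.  The lazy seq version is an assumption about what "normal lazy
seq functions" means here (a plain ->> pipeline of mapcat calls), and
the names make-vs, lazy-version and xf-version are made up for
illustration:

```clojure
;; Hypothetical helper: same shape as `vs` above, but with a
;; configurable size so the comparison runs quickly.
(defn make-vs [n]
  (zipmap (range n)
          (for [i (range n)]
            (vec (range i n)))))

;; Traditional version: two nested lazy mapcat calls.
(defn lazy-version [v vs]
  (->> v
       (mapcat (fn [i] (vs i)))
       (mapcat (fn [i] (vs i)))))

;; Transducer version, as in the example above.
(defn xf-version [v vs]
  (sequence (comp (mapcat (fn [i] (vs i)))
                  (mapcat (fn [i] (vs i))))
            v))

(def n 50)
(def v (vec (range n)))
(def vs (make-vs n))

;; Both variants produce the same elements; only the timing differs.
(time (dorun (lazy-version v vs)))
(time (dorun (xf-version v vs)))
```

With n set back to 1000 this should reproduce the kind of comparison
described above, although the absolute timings will depend on the
machine and the Clojure version.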

In my real scenario, I think there are also no IReduce paths because the
mapcat functions either return normal lazy seqs or Java Collections
(which are not actually Clojure collections).  But usually, the
transformations are not so freaking expanding as the example above.  I
benchmarked a bit, and sometimes using transducers is faster and
sometimes it is not.  So I've made that configurable (with normal lazy
seqs as default) so users can benchmark and then decide, and I don't
need to choose for them. :-)

Oh, and actually *you* have made that possible by making me aware that

  (sequence (comp xform*) start-coll)

is almost identical to

  (->> start-coll xform*)

that is, when my macro computes xforms as if they were meant for
transducing, I can also use them "traditionally" with ->>.
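
A minimal sketch of that idea, assuming every xform is written in the
one-argument-short style such as (map inc) or (mapcat f).  The macro
name pipeline and its use-transducers? flag are hypothetical, not the
actual macro discussed here:

```clojure
;; Hypothetical macro: `use-transducers?` must be a literal true/false
;; because it is inspected at macroexpansion time.
(defmacro pipeline [use-transducers? start-coll & xforms]
  (if use-transducers?
    ;; (map inc) etc. are evaluated as transducers and composed.
    `(sequence (comp ~@xforms) ~start-coll)
    ;; ->> threads the collection in as the last argument, so the
    ;; very same forms become ordinary lazy seq calls.
    `(->> ~start-coll ~@xforms)))

(def via-transducers (pipeline true  [1 2 3] (map inc) (filter even?)))
(def via-lazy-seqs   (pipeline false [1 2 3] (map inc) (filter even?)))
```

Both definitions yield the same elements here; the "almost" matters for
things like how much of the input gets realized along the way.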

Until now, I've never used ->>.  Before I implemented the expansion
for transducers, I used a for with gensyms for the intermediates,
like:

  (for [G__1 start-coll
        G__2 (xform1 G__1)
        G__3 (xform2 G__2)]
    G__3)

That's quite different to generate.  But since the xforms for
transducers and ->> are the same, switching between lazy seq fns and
transducers is just changing how start-coll and xforms are composed.
Awesome!
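
To make the correspondence concrete, here is a small sketch with
made-up step functions (xform1 and xform2 below are placeholders that
each return a collection, so every step behaves like a mapcat):

```clojure
(def start-coll [1 2 3])

;; Placeholder expanding steps; each returns a collection.
(defn xform1 [x] [x (* 10 x)])
(defn xform2 [x] [x (inc x)])

;; The old generated shape: nested for bindings.
(def via-for
  (for [a start-coll
        b (xform1 a)
        c (xform2 b)]
    c))

;; The same pipeline with lazy seq functions ...
(def via-thread
  (->> start-coll (mapcat xform1) (mapcat xform2)))

;; ... and with transducers.
(def via-xf
  (sequence (comp (mapcat xform1) (mapcat xform2)) start-coll))

(println (= via-for via-thread via-xf))
```

The for shape only lines up this neatly when every step is an
expansion; a plain map or filter step would need :let or :when instead.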

>> So my conclusion is that you cannot use transducers as a kind of
>> drop-in replacement of traditional sequence manipulation functions.
>> They pay off only when you can make very strong assumptions about the
>> sizes and computation costs of intermediate collections, and I
>> think you cannot do that in general.  Or well, maybe you can when you
>> program an application but you almost certainly cannot when you
>> program a library and thus have no clue about how that's gonna be
>> used by users.
>
> Transducers make different trade offs than sequences and there will
> always be cases where one or the other is a better choice.  I really
> appreciate this thread as highlighting some of the nuances.

Yes, thanks a lot for your patience.  I appreciate that very much.

> Transducers break transformations into three parts - source iteration,
> composed transforms, and output collection.  In the case of reducible
> inputs, multiple transforms, and full realization, transducers can be
> much faster.  If not all of those are in play, then the results are
> more subtle.  One thing I've found in perf testing a lot of stuff is
> that chunked sequences continually surprise me at how fast they can
> be.

Then maybe I should experiment with chunked seqs.
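
As a starting point, a small sketch of what chunking means in
practice: map over a vector's seq works a chunk at a time, so asking
for the first element realizes the whole first chunk.  The counter
atom is only there for illustration:

```clojure
;; Counts how many elements the mapping function has been applied to.
(def realized (atom 0))

(def s (map (fn [x] (swap! realized inc) x)
            (vec (range 100))))

;; A vector's seq is chunked.
(println (chunked-seq? (seq (vec (range 100)))))

;; Taking only the first element still processes the first chunk,
;; which holds 32 elements for a vector.
(first s)
(println @realized)
```

That batching is part of why chunked lazy seqs can be competitive, and
also why they may do more work than strictly needed when only a few
elements are consumed.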

Bye,
Tassilo

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en