I have been playing a bit more with the concept of a data stream for
building computational pipelines. The result has replaced my previous
experimental stream-utils library in clojure.contrib:
http://code.google.com/p/clojure-contrib/source/browse/trunk/src/clojure/contrib/stream_utils.clj
Examples are provided as well:
http://code.google.com/p/clojure-contrib/source/browse/trunk/src/clojure/contrib/stream_utils/examples.clj
This new stream-utils library targets the now-standard Clojure branch,
not the experimental streams branch.
This module introduces a new way to represent streams in the form of
stream generator closures. It also contains an interface to streams
through a multimethod that is implemented for stream generator
closures and for lazy sequences. Next, there is a stream transformer
monad plus associated utility functions/macros.
The basic idea behind stream generator closures is that a stream is
not represented by its elements (as in Rich's experimental stream
implementation in the streams branch of Clojure), but by its current
state plus a function that converts the current state into the next
value and the new state. The state and the function are wrapped up
together in a closure. The closure is called with an arbitrary
end-of-stream value and returns a vector containing the next value in
the stream (or the end-of-stream value, if the stream is exhausted)
plus a stream generator closure encapsulating the new state.
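As a minimal sketch of the calling convention just described: range-gen
and gen->vec below are names made up for this illustration and are not
part of stream-utils. What an exhausted generator should return in the
second slot of the vector is also an assumption here; this sketch simply
returns a generator in the same exhausted state.

```clojure
;; Hypothetical generator closure: the integers from start (inclusive)
;; to end (exclusive). The state is just the pair of closed-over values.
(defn range-gen [start end]
  (fn [eos]
    (if (< start end)
      ;; next value plus a generator closure for the new state
      [start (range-gen (inc start) end)]
      ;; exhausted: hand back the caller's end-of-stream marker
      ;; (assumption: the second slot is a generator in the same state)
      [eos (range-gen start end)])))

;; Hypothetical helper: drain a finite generator into a vector.
(defn gen->vec [gen]
  (let [eos (Object.)]              ; marker that cannot collide with a value
    (loop [g gen, acc []]
      (let [[v next-g] (g eos)]
        (if (identical? v eos)
          acc
          (recur next-g (conj acc v)))))))

(gen->vec (range-gen 0 5))          ; → [0 1 2 3 4]
```

Because the caller picks the end-of-stream value, the stream itself can
contain any value, including nil, without ambiguity.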
The implementation of such a stream generator looks quite similar to
the implementation of a lazy seq. Compare for example the following
two functions implementing a simple random-number generator:
    (defn rng-gen [seed]
      (fn [eos]
        (let [m 259200
              value (/ (float seed) (float m))
              next (rem (+ 54773 (* 7141 seed)) m)]
          [value (rng-gen next)])))

    (defn rng-seq [seed]
      (lazy-seq
        (let [m 259200
              value (/ (float seed) (float m))
              next (rem (+ 54773 (* 7141 seed)) m)]
          (cons value (rng-seq next)))))
While generator closures are less convenient than lazy seqs because
they are less well integrated into Clojure's ecosystem, they do have
a couple of advantages:
1) No values are ever cached.
2) It is easy to keep a reference to a stream state and recreate the
stream from it at any time.
3) I expect them to be faster because they don't need to go through a
caching mechanism, but I have not done any timings yet.
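Point 2 can be illustrated with a small sketch. rng-gen is repeated from
above so that the snippet runs on its own; gen-split is a hypothetical
helper written for this illustration, not a stream-utils function, and it
assumes the stream has at least n values left.

```clojure
;; Repeated from the post so that this snippet is self-contained.
(defn rng-gen [seed]
  (fn [eos]
    (let [m 259200
          value (/ (float seed) (float m))
          next (rem (+ 54773 (* 7141 seed)) m)]
      [value (rng-gen next)])))

;; Hypothetical helper: take n values and return them together with the
;; generator closure for the rest of the stream. It never checks for
;; end-of-stream, so it is only suitable for streams with >= n values.
(defn gen-split [n gen]
  (loop [i 0, g gen, acc []]
    (if (= i n)
      [acc g]
      (let [[v next-g] (g ::eos)]
        (recur (inc i) next-g (conj acc v))))))

(let [[_ saved]      (gen-split 3 (rng-gen 1)) ; advance three steps, keep the state
      [first-run _]  (gen-split 4 saved)
      [second-run _] (gen-split 4 saved)]      ; replay from the same saved state
  (= first-run second-run))                    ; → true
```

Holding on to saved retains only the closed-over seed, not any of the
values already produced, which is also what point 1 amounts to.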
It would be nice if stream generators could be made seq-able,
removing the need for stream-as-seq, but this doesn't look simple,
requiring at least proxy if not gen-class. Any suggestions about how
to do this are welcome, as is any other feedback on the stream-utils
library.
Konrad.