I have been playing a bit more with the concept of a data stream for  
building computational pipelines. The result has replaced my previous  
experimental stream-utils library in clojure.contrib:

        http://code.google.com/p/clojure-contrib/source/browse/trunk/src/clojure/contrib/stream_utils.clj

Examples are provided as well:

        http://code.google.com/p/clojure-contrib/source/browse/trunk/src/clojure/contrib/stream_utils/examples.clj

This new stream-utils library targets what is now the standard Clojure
branch, not the experimental streams branch.

This module introduces a new way to represent streams: stream
generator closures. It also provides an interface to streams through a
multimethod that is implemented for both stream generator closures and
lazy sequences. Finally, there is a stream transformer monad plus
associated utility functions and macros.

The basic idea behind the stream generator closures is that a stream
is not represented by its elements (as in Rich's experimental stream
implementation in the streams branch of Clojure), but by its current
state plus a function that converts the current state into the next
value plus the new state. The state and the function are wrapped up
together in a closure. The closure is called with an arbitrary
end-of-stream value and returns a vector containing the next value in
the stream (or the end-of-stream value) plus a stream generator
closure encapsulating the new state.
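
To make the calling convention concrete, here is a minimal sketch of a finite generator and a loop that consumes it. The names range-gen and drain are made up for this illustration and are not part of stream-utils; returning nil as the follow-up generator at the end of the stream is likewise just an assumption of this sketch.

```clojure
;; Hypothetical names (not from stream-utils): range-gen, drain.
(defn range-gen
  "Generator closure yielding the integers from n (inclusive) to end (exclusive)."
  [n end]
  (fn [eos]
    (if (< n end)
      [n (range-gen (inc n) end)]   ; next value plus a generator for the new state
      [eos nil])))                  ; exhausted: hand back the caller's eos marker

(defn drain
  "Collects all values of a finite generator closure into a vector."
  [gen]
  (let [eos (Object.)]              ; fresh sentinel, cannot collide with a stream value
    (loop [g gen, acc []]
      (let [[v next-g] (g eos)]
        (if (identical? v eos)
          acc
          (recur next-g (conj acc v)))))))

(drain (range-gen 0 5))   ; => [0 1 2 3 4]
```

Because the caller chooses the end-of-stream value, nil and false remain usable as ordinary stream elements.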

The implementation of such a stream generator looks quite similar to  
the implementation of a lazy seq. Compare for example the following  
two functions implementing a simple random-number generator:

;; Stream generator closure version.
(defn rng-gen [seed]
   (fn [eos]                        ; eos is unused: this stream never ends
     (let [m          259200
           value      (/ (float seed) (float m))
           next-seed  (rem (+ 54773 (* 7141 seed)) m)]
       [value (rng-gen next-seed)])))

;; Lazy seq version.
(defn rng-seq [seed]
   (lazy-seq
     (let [m          259200
           value      (/ (float seed) (float m))
           next-seed  (rem (+ 54773 (* 7141 seed)) m)]
       (cons value (rng-seq next-seed)))))

While generator closures are less convenient than lazy seqs because  
they are less well integrated into Clojure's ecosystem, they do have  
a couple of advantages:
1) No values are ever cached.
2) It is easy to keep a reference to a stream state and recreate the  
stream from it at any time.
3) I expect them to be faster because they don't need to go through a
caching mechanism, but I have not done any timings yet.
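
Point 2 can be illustrated with a small sketch. The generator is repeated here so that the snippet runs on its own; take-gen is a hypothetical helper written for this example, not a stream-utils function.

```clojure
;; rng-gen as in the post, repeated so the snippet is self-contained.
(defn rng-gen [seed]
  (fn [eos]
    (let [m          259200
          value      (/ (float seed) (float m))
          next-seed  (rem (+ 54773 (* 7141 seed)) m)]
      [value (rng-gen next-seed)])))

;; Hypothetical helper (not from stream-utils).
(defn take-gen
  "Returns [vector-of-up-to-n-values remaining-generator]."
  [n gen]
  (let [eos (Object.)]
    (loop [i 0, g gen, acc []]
      (if (= i n)
        [acc g]
        (let [[v next-g] (g eos)]
          (if (identical? v eos)
            [acc g]
            (recur (inc i) next-g (conj acc v))))))))

;; Advance three steps, keep the resulting generator, then replay from it twice.
(let [[_ saved]  (take-gen 3 (rng-gen 1))
      [run-a _]  (take-gen 2 saved)
      [run-b _]  (take-gen 2 saved)]
  (= run-a run-b))   ; => true
```

Calling a generator never modifies it, so saved can be restarted any number of times without holding on to the values produced so far.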

It would be nice if stream generators could be made seq-able,  
removing the need for stream-as-seq, but this doesn't look simple,  
requiring at least proxy if not gen-class. Any suggestions about how  
to do this are welcome, as is any other feedback on the stream-utils  
library.
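
For reference, an explicit conversion needs nothing beyond lazy-seq. The following is a minimal sketch under my own assumptions, not the library's code; countdown-gen and gen->seq are names invented for this example.

```clojure
;; Hypothetical names (not from stream-utils): countdown-gen, gen->seq.
(defn countdown-gen [n]
  (fn [eos]
    (if (pos? n)
      [n (countdown-gen (dec n))]
      [eos nil])))

(defn gen->seq
  "Wraps a generator closure in a lazy seq. The seq caches its elements,
   the underlying generator does not."
  [gen]
  (let [eos (Object.)]              ; private end-of-stream marker
    (letfn [(step [g]
              (lazy-seq
                (let [[v next-g] (g eos)]
                  (when-not (identical? v eos)
                    (cons v (step next-g))))))]
      (step gen))))

(gen->seq (countdown-gen 3))   ; => (3 2 1)
```

What is missing is a way for seq itself to do this conversion when handed a closure, and that is the part that seems to require proxy or gen-class.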

Konrad.

