Re: Lazy-seq of a binary file
On Wed, Dec 14, 2011 at 11:47 PM, Simone Mosciatti mweb@gmail.com wrote:
> Ok thank you so much, i got it. Thanks again ;-)

You're welcome.

--
You received this message because you are subscribed to the Google Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/clojure?hl=en
Re: Lazy-seq of a binary file
On Wed, Dec 14, 2011 at 12:04 AM, Simone Mosciatti mweb@gmail.com wrote:
> Thank you so much, just one last thing, why you use a char-array ?

Reader returns chars.

> If I want use a byte-array, and no map all the whole sequence ?

Use an InputStream rather than a reader if you're reading binary files (or text files as binary).

If you're not consuming the whole sequence, then, again, have the part of the code that consumes some of it and then stops also create the stream and be responsible for closing it, passing the stream to the lazy sequence maker. Use with-open there. If that part of the code still emits a sequence, e.g. (take some-number (remove icky? (...))), rather than a single object (extracted, reduced, or whatever), wrap that sequence in (doall ...) inside the with-open so that all the needed stream I/O is actually performed before the stream gets closed.
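A minimal sketch of that advice, under a few assumptions: the names byte-seq and first-bytes are made up for illustration, and the source can be anything clojure.java.io/input-stream accepts (a file name, a File, a byte array, and so on). The caller opens the stream with with-open and uses doall to realize the part it needs before the stream is closed.

```clojure
(require '[clojure.java.io :as io])

;; Hypothetical helper: lazy seq of the bytes (as ints 0-255) left in `in`.
;; It never closes the stream; that is the caller's job.
(defn byte-seq [^java.io.InputStream in]
  (lazy-seq
    (let [b (.read in)]
      (when-not (neg? b)
        (cons b (byte-seq in))))))

;; Hypothetical caller: owns the stream and realizes only what it needs.
(defn first-bytes [source n]
  (with-open [in (io/input-stream source)]
    ;; doall forces the reads while the stream is still open
    (doall (take n (byte-seq in)))))
```

Without the doall, the take would be returned unrealized and the first read would hit an already closed stream.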
Re: Lazy-seq of a binary file
Ok, thank you so much, I got it. Thanks again ;-)

Simone

On Dec 14, 3:22 am, Cedric Greevey cgree...@gmail.com wrote:
> Use an InputStream rather than a reader if you're reading binary files (or text files as binary). [...]
Re: Lazy-seq of a binary file
Here's a version I hacked up a while ago: https://gist.github.com/1472163

-S
Re: Lazy-seq of a binary file
No, I'm sure I won't use the whole sequence, so I will follow your second piece of advice, but... because of my imperfect English I didn't really understand the last part. Who is the caller? Are you suggesting something like this:

    (let [fl (clojure.java.io/reader "path/filename")
          rd (lazy-reader fl)]
      (do-my-operation-with-lazy-reader)
      (.close fl))

??? Or what?

On Dec 13, 1:20 am, Cedric Greevey cgree...@gmail.com wrote:
> [...] If that could be an issue, I'd suggest modifying it to take a reader rather than a filename, and making the caller responsible for instantiating the reader, passing it in, (partially) consuming the sequence, and closing the reader if the sequence may not have been fully consumed.
Re: Lazy-seq of a binary file
On Tue, Dec 13, 2011 at 10:14 PM, Simone Mosciatti mweb@gmail.com wrote:
> [...] Who is the caller ? You suggest something like this:
>
>     (let [fl (clojure.java.io/reader path/filename) rd (lazy-reader fl)] (do-my-operation-with-lazy-reader) (.close fl))
>
> ???

Pretty much, yes. The lazy-reader function will of course not call clojure.java.io/reader on its argument, instead just passing it directly to its internal function lr. And if (do-my-operation-with-lazy-reader) returns a lazy sequence that's backed by the lazy-reader sequence, then either it needs to be doall'd before (.close fl) or the responsibility for creating and closing the reader needs to be pushed up to *its* user.

Generally, if dealing with data that might not fit in main memory, you'd need to push the reader's lifetime management up to the level where you're not just wrapping the seq but reducing over it, searching it for an item (with something like (first (filter ...)) or (first (remove ...)) or (last ...) or (some ...)), or picking a limited sample, e.g. (take some-finite-number (filter foo (some-wrapped-lazy-reader-seq rdr))).
Re: Lazy-seq of a binary file
Ok, by now I think I have understood... To do it right, I should build a macro similar to let, where I pass the filename and the stream gets closed after the body has executed, right?

On Dec 13, 9:42 pm, Cedric Greevey cgree...@gmail.com wrote:
> Pretty much, yes. The lazy-reader function will of course not call clojure.java.io/reader on its argument, instead just passing it directly to its internal function lr. [...]
Re: Lazy-seq of a binary file
On Tue, Dec 13, 2011 at 11:12 PM, Simone Mosciatti mweb@gmail.com wrote:
> Ok, now by now i think to have understand... To do right, I should build a macro similar to let where I pass the filename and after execute the body close the stream, right ?

Easier to just use the pre-existing one: with-open. Something like:

    (with-open [rdr (clojure.java.io/reader foo)]
      (reduce something (filter whatever (transmogrify (lazy-reader rdr)))))

will auto-close rdr after the reduction is done, and evaluate to the result of the reduction.
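For concreteness, here is one way the reduce-inside-with-open pattern could look. The names char-seq and count-char are hypothetical, and char-seq merely stands in for the lazy-reader discussed in this thread. Because reduce consumes the sequence eagerly and returns a single value, no doall is needed.

```clojure
(require '[clojure.java.io :as io])

;; Stand-in for the thread's lazy-reader: lazy seq of chars from an open reader.
(defn char-seq [^java.io.Reader rdr]
  (lazy-seq
    (let [c (.read rdr)]
      (when-not (neg? c)
        (cons (char c) (char-seq rdr))))))

;; Hypothetical example: count how often character `ch` occurs in `source`.
;; `source` is anything clojure.java.io/reader accepts (file name, File, Reader, ...).
(defn count-char [source ch]
  (with-open [rdr (io/reader source)]
    (reduce (fn [n _] (inc n))
            0
            (filter #(= ch %) (char-seq rdr)))))
```

Note that a plain string passed to clojure.java.io/reader is treated as a file name or URL, so in-memory text has to be wrapped first, e.g. (count-char (java.io.StringReader. "banana") \a).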
Re: Lazy-seq of a binary file
Where lazy-reader is, by now:

    (defn lazy-reader [fl]
      (assert ...)
      (lazy-seq (cons (.read fl) (lazy-reader fl))))

The first one? Does this mean that I haven't understood anything? (I'm so sorry for this stupid question... :embarrassed: )

On Dec 13, 10:23 pm, Cedric Greevey cgree...@gmail.com wrote:
> Easier to just use the pre-existing one: with-open. [...]
Re: Lazy-seq of a binary file
On Tue, Dec 13, 2011 at 11:33 PM, Simone Mosciatti mweb@gmail.com wrote:
> Where by now: (defn lazy-reader [fl] (assert ...) (lazy-seq (cons (.read fl) (lazy-reader fl)))) The first one ?

Er, buffering of the I/O is probably preferable, but that would probably work OK in many cases.

Just removing the (clojure.java.io/reader ...) wrapping the argument in the code I wrote will make it take the reader directly as its argument.
Re: Lazy-seq of a binary file
Thank you so much. Just one last thing: why do you use a char-array? What if I want to use a byte-array, and not map over the whole sequence?

On Dec 13, 10:39 pm, Cedric Greevey cgree...@gmail.com wrote:
> Er, buffering of the I/O is probably preferable, but that would probably work OK in many cases. [...]
Re: Lazy-seq of a binary file
If I just do something like this:

    (def fl (clojure.java.io/reader "/path/to/file"))
    (defn lazy-reader [fl]
      (lazy-seq (cons (.read fl) (lazy-reader fl))))

can it work? (0.03696 ms for 500 chars.) Any possible problems?

On Dec 11, 9:49 pm, Stephen Compall stephen.comp...@gmail.com wrote:
> [...] Byte-by-byte reading can be accomplished by writing a single iterate, a single take-while, and a single .read call.
Re: Lazy-seq of a binary file
On Mon, 2011-12-12 at 10:21 -0800, Simone Mosciatti wrote:
> (defn lazy-reader [fl] (lazy-seq (cons (.read fl) (lazy-reader fl))))
>
> Can work ? (0.03696 ms for 500 char) Possible problem ?

You need a termination case; your lazy-reader currently always yields an infinite sequence.

--
Stephen Compall
^aCollection allSatisfy: [:each|aCondition]: less is better
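As a sketch of what such a termination case might look like (this is one possible fix, not code posted in the thread): java.io.Reader's no-argument .read returns -1 at end of stream, so the sequence can simply end there instead of consing -1 forever.

```clojure
;; The lazy-reader from the question, with a termination case added.
;; Yields the chars as ints, exactly as .read returns them, and stops at -1.
(defn lazy-reader [^java.io.Reader fl]
  (lazy-seq
    (let [c (.read fl)]
      (when-not (= -1 c)
        (cons c (lazy-reader fl))))))
```

Returning nil from the body of lazy-seq is what ends the sequence; the reader itself is left open for the caller to close.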
Re: Lazy-seq of a binary file
I thought I would just put it inside a take...

    (take number-of-byte-necessary (lazy-reader (clojure.java.io/reader "path/to/file")))

On Dec 12, 12:34 pm, Stephen Compall stephen.comp...@gmail.com wrote:
> You need a termination case; your lazy-reader currently always yields an infinite sequence.
Re: Lazy-seq of a binary file
Ok, I found a possible problem. If I try to put it all together and write something like this:

    (defn lazy-reader [filename]
      (with-open [fl (clojure.java.io/reader filename)]
        (loop [bite (.read fl)]
          (lazy-seq (cons (bite) (recur (.read fl)))))))

it obviously doesn't work... Any suggestions on how to fix that?

Error: CompilerException java.lang.UnsupportedOperationException: Can only recur from tail position, compiling:(NO_SOURCE_PATH:42)

On Dec 12, 12:42 pm, Simone Mosciatti mweb@gmail.com wrote:
> I thought to just put it into a take... [...]
Re: Lazy-seq of a binary file
On Mon, 2011-12-12 at 20:03 -0800, Simone Mosciatti wrote:
> Any suggest of how fix that ?

In general, avoid loop. Specifically, try using letfn or (fn SOME-NAME-HERE [args...] ...) as your recursion target.

--
Stephen Compall
^aCollection allSatisfy: [:each|aCondition]: less is better
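To illustrate the letfn suggestion, here is a sketch that also covers the "N bytes at a time" need from the start of the thread. The name chunk-seq and its arguments are assumptions for illustration only. The local function step is the recursion target; it calls itself by name inside lazy-seq rather than using loop/recur.

```clojure
;; Hypothetical helper: lazy seq of byte arrays read from `in`,
;; each holding up to `size` bytes. Does not close the stream.
(defn chunk-seq [^java.io.InputStream in size]
  (letfn [(step []
            (lazy-seq
              (let [buf (byte-array size)
                    n   (.read in buf 0 size)]
                ;; .read returns -1 at end of stream, else the number of bytes read
                (when (pos? n)
                  ;; copy only the bytes actually read; a read may return fewer than `size`
                  (cons (java.util.Arrays/copyOf ^bytes buf (int n))
                        (step))))))]
    (step)))
```

Since .read may return fewer bytes than requested even before the end of the stream, a chunk can be shorter than size; code that needs exactly N bytes per chunk would have to keep reading until the buffer is full.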
Re: Lazy-seq of a binary file
You also probably want more efficiency. Try something closer to:

    (defn lazy-reader [filename]
      (let [rd (fn [rdr]
                 (let [buf (char-array 4096)
                       n (.read rdr buf 0 4096)]
                   (condp == n
                     -1 (.close rdr)
                     0 (recur rdr)
                     (take n buf))))
            lr (fn lr [rdr]
                 (lazy-seq
                   (if-let [b (rd rdr)]
                     (concat b (lr rdr)))))]
        (lr (clojure.java.io/reader filename))))

which should buffer the reads and terminate when the stream is consumed (untested). I'm not sure how to additionally make the seq chunked.

I avoided with-open to avoid the stream closing before the lazy seq is consumed. Instead, it closes only when the lazy seq is fully consumed, or when it's been discarded and the GC runs the stream's finalizer after discovering that it's become unreachable. The latter could take a while (and isn't guaranteed to happen short of the JVM exiting), so there's a risk of running out of file handles when using this a lot without fully consuming the lazy seqs.

If that could be an issue, I'd suggest modifying it to take a reader rather than a filename, and making the caller responsible for instantiating the reader, passing it in, (partially) consuming the sequence, and closing the reader if the sequence may not have been fully consumed.
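A sketch of the modification suggested in the last paragraph, assuming the caller creates and closes the reader. This is an illustrative variant, not the author's code: it keeps the 4096-char block reads but takes an open Reader and never closes it.

```clojure
;; Variant of lazy-reader that takes an already-open Reader.
;; Reads blocks of up to 4096 chars; closing `rdr` is left to the caller.
(defn lazy-reader [^java.io.Reader rdr]
  (letfn [(read-block []
            (let [buf (char-array 4096)
                  n   (.read rdr buf 0 4096)]
              (cond
                (neg? n)  nil            ; end of stream: stop the sequence
                (zero? n) (recur)        ; nothing read this time: try again
                :else     (take n buf)))) ; only the chars actually read
          (lr []
            (lazy-seq
              (when-let [block (read-block)]
                (concat block (lr)))))]
    (lr)))
```

A caller would typically wrap this in with-open and either reduce over the result or doall the part it needs before the reader is closed.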
Lazy-seq of a binary file
Hi guys,

I'm pretty new to Clojure, so sorry for the maybe stupid question...

Anyway, I'm looking to read a file byte by byte. Ideally I would get a lazy-seq of every byte in the file. It seems very weird to me that there isn't a built-in or some easy way to do that, but I haven't found anything. Is it possible that I haven't searched enough?

To be honest, I need to read N bytes at a time (where I know N before reading), so each time I can build a byte-array and put the necessary bytes into it with:

    (def stream (new java.io.FileInputStream filename))
    (def vettore (byte-array N))
    (.read stream vettore)

But I really don't like this way; the (.read ...) is horrible, I guess...

Any suggestions?
Re: Lazy-seq of a binary file
On Sat, 2011-12-10 at 23:13 -0800, Simone Mosciatti wrote:
> Anyway, i'm looking for read a file byte by byte, perfect would be get a lazy-seq of every byte in the file, it's looks, for me, very weird that there isn't a built-in or some easy way to do that

The tradeoffs aren't universal enough to justify one true way to do that. You might be willing to pay the overhead of one LazySeq, function, lock, and Byte per byte, but many can't afford it.

> To be honest, I need to read N byte at time (where I know N before to read) so I can built anytime a bite-array and put in the necessary byte which
>
>     (def stream (new java.io.FileInputStream filename))
>     (def vettore (byte-array N))
>     (.read stream vettore)
>
> But i really don't like this way, the (.read ) is horrible, I guess...

It's not Clojure style to replace absolutely everything you might do with Java interop with a standard Clojure-style library. Else, there would be no motivation to make interop so easy.

It is Clojure style, though, to use those Clojure libraries that do exist, when appropriate. In this case, see `reader' in the clojure.java.io module included with the standard library, as a replacement for your `new' call.

As for the reading part itself, I suggest careful usage of `iterate' and `take-while' (or `lazy-seq' and `cons') from the standard library, with appropriate .read calls. Byte-by-byte reading can be accomplished by writing a single iterate, a single take-while, and a single .read call.

--
Stephen Compall
^aCollection allSatisfy: [:each|aCondition]: less is better
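One way to read that last suggestion, as a hedged sketch: a single take-while around a single .read call. It swaps in repeatedly where the message says iterate, because the function given to iterate is documented as needing to be free of side effects and .read is not. The name byte-seq is an assumption for illustration.

```clojure
;; Hypothetical helper: lazy seq of the remaining bytes in `in`, as ints 0-255.
;; The no-argument .read returns -1 at end of stream, which ends the sequence.
;; The stream is not closed here; the caller owns it.
(defn byte-seq [^java.io.InputStream in]
  (take-while #(not= -1 %)
              (repeatedly #(.read in))))
```

This pays the per-byte overhead mentioned above (one lazy cell and one boxed number per byte), so it suits small inputs better than large files.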