On Tuesday, May 5, 2015 at 11:18:56 PM UTC-4, Sam Raker wrote:
>
> I've got two really big CSV files that I need to compare. Stay tuned for
> the semi-inevitable "how do I optimize over this M x N space?" question,
> but for now I'm still trying to get the data into a reasonable format--I'm
> planning on converting each line into a map, with keys coming from either
> the first line of the file, or a separate list I was given. Non-lazy
> approaches run into memory limitations; lazy approaches run into "Stream
> closed" exceptions while trying to coordinate `with-open` and `line-seq`.
> Given that memory is already tight, I'd like to avoid leaving open
> files/file descriptors/readers/whatever-the-term-in-clojure-is lying
> around. I've tried writing a macro, I've tried transducers, I've tried
> passing around the open reader along with the lazy seq, none successfully,
> albeit none necessarily particularly well. Any suggestions on streaming
> such big files?
>
Something like this didn't work?

    (with-open [rdr1 ... rdr2 ...]
      (let [l1 (line-seq rdr1)
            l2 (line-seq rdr2)]
        (->> (map something l1 l2)
             (filter whatever)
             (first))))

For instance, to check whether two text files are the same, `something` would be `not=` and `whatever` would be `identity`; the result would be nil if they were the same, and something truthy otherwise.

The call to `first` short-circuits as soon as the result is known, and neither line-seq's head should be held. It also ensures that the `with-open` scope is not exited until as much of both line-seqs has been consumed as will be needed. `reduce`, and transducers or reducers that get `reduced`, would have the same effect.
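To make the pattern above concrete, here is a minimal sketch of both variants: the `first`-based comparison and an eager reduction over header-keyed maps. The function names `first-difference` and `count-matching-rows` are hypothetical, and the CSV handling is an assumption for illustration only (a naive split on commas with no support for quoted fields, and a non-empty file whose first line is the header). The point is only that all consumption of the lazy seqs happens inside `with-open`.

```clojure
(require '[clojure.java.io :as io]
         '[clojure.string :as str])

(defn first-difference
  "Returns the first pair [line-a line-b] of differing lines, or nil if
  none is found. Hypothetical helper; `first` forces the lazy seqs only
  as far as needed, while both readers are still open."
  [path-a path-b]
  (with-open [rdr-a (io/reader path-a)
              rdr-b (io/reader path-b)]
    (->> (map vector (line-seq rdr-a) (line-seq rdr-b))
         (filter (fn [[a b]] (not= a b)))
         first)))

(defn count-matching-rows
  "Counts the rows of a CSV file whose header-keyed map satisfies pred.
  Hypothetical helper; assumes a header line and a naive comma split.
  `transduce` is eager, so the file is fully read before the reader closes."
  [path pred]
  (with-open [rdr (io/reader path)]
    (let [[header & rows] (line-seq rdr)
          ks (map keyword (str/split header #","))]
      (transduce (comp (map #(zipmap ks (str/split % #",")))
                       (filter pred))
                 (completing (fn [n _] (inc n)))
                 0
                 rows))))
```

One caveat with the comparison: `map` over two seqs stops at the end of the shorter one, so if one file is a strict prefix of the other, `first-difference` returns nil. A length check would have to be added separately if that case matters.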