I mostly revert to good ole loop/recur for these large-file processing
exercises. Here's a template you could use (it includes a try/catch so
you can see errors as you go):

(import '(java.io BufferedReader FileReader PrintWriter File))

(defn process-log-file
  "Read a log file, extracting lines matching regx."
  [in-fp out-fp regx]
  (with-open [rdr (BufferedReader. (FileReader. (File. in-fp)))
              wtr (PrintWriter. (File. out-fp))]
    ;; i counts lines so errors can be reported with a position
    (loop [line (.readLine rdr) i 0]
      (when line
        ;; recur is not allowed inside try, so try wraps only the per-line work
        (try
          ;; re-matches needs the whole line to match; use re-find for a partial match
          (when (re-matches regx line)
            (.println wtr line)) ; or whatever
          (catch Exception e (prn i line e)))
        (recur (.readLine rdr) (inc i))))))
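
If you would rather stay with sequences, here is a minimal sketch of the
same job done with line-seq. It is only an illustration, not a tested
recipe: copy-matching-lines is a hypothetical name, it assumes Clojure 1.2
or later (for clojure.java.io), and it uses re-find, so a line is kept when
any part of it matches. The sequence is consumed inside with-open and is
never bound to a var, so lines that have already been processed should be
free to be garbage collected.

```clojure
(require '[clojure.java.io :as io])

(defn copy-matching-lines
  "Hypothetical helper: write every line of in-fp that contains a match
  for regx to out-fp, and return the number of lines written."
  [in-fp out-fp regx]
  (with-open [rdr (io/reader in-fp)
              ^java.io.BufferedWriter wtr (io/writer out-fp)]
    ;; reduce walks the lazy seq one line at a time; nothing keeps a
    ;; reference to the head of the seq while it runs.
    (reduce (fn [n ^String line]
              (if (re-find regx line)
                (do (.write wtr line)
                    (.newLine wtr)
                    (inc n))
                n))
            0
            (line-seq rdr))))
```

A call might look like (copy-matching-lines "access.log" "errors.log" #"ERROR"),
where both file names are made up for the example.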

Regards, Adrian.



On Mon, Aug 31, 2009 at 4:44 PM, wangzx<wangzaixi...@gmail.com> wrote:
>
> I just want to learn Clojure by using it to parse log files and
> generate reports. One question is: for a large text file, can we
> use it as a sequence effectively? For example, for a 100M log file, we
> need to check each line for some pattern match.
>
> I am just using (line-seq rdr), but it causes an
> OutOfMemoryError.
>
> demo code
>
> (defn buffered-reader [file]
>        (new java.io.BufferedReader
>                (new java.io.InputStreamReader
>                        (new java.io.FileInputStream file))))
>
> (def -reader (buffered-reader "test.txt"))
> (filter #(= "some" %) (line-seq -reader))
>
> Even when no lines match "some", the filter operation causes an
> OutOfMemoryError.
>
> Is there another API, like the sequence one, that provides stream-like access?

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google
Groups "Clojure" group.
-~----------~----~----~----~------~----~------~--~---