My first thought is that you need to tweak your JVM settings.  Try
allocating at least 512MB to the heap.
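
For example, since you're on Leiningen, something like the project.clj
below might do it.  This is only a sketch: the project name and version
are made up, and I'm assuming your Leiningen release honors :jvm-opts
(if it doesn't, passing -Xmx512m to java directly has the same effect).

```clojure
;; project.clj (hypothetical project name and version)
(defproject twitter "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.2.0-beta1"]]
  ;; -Xmx sets the maximum JVM heap size; 512m = 512 megabytes.
  ;; Assumes this Leiningen version reads :jvm-opts.
  :jvm-opts ["-Xmx512m"])
```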

My second thought is that you need to use laziness to your advantage.
Remove the print expression from the mapping operation.  It's useful
for debugging/prototyping, but shouldn't be in the final version.
Spit the processed json-seq into a file when you're done instead.
This way you can process one input file at a time, and simply append
your results to the output file.
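
Roughly what I have in mind, as an untested sketch.  The names
split-statuses, append-results! and process-status are mine, not
anything from your code, and process-status stands in for whatever you
actually do to each status.

```clojure
(require '[clojure.string :as str])

;; Split one file's contents into individual statuses, using the same
;; delimiter as the original json-seq.
(defn split-statuses [file]
  (str/split (slurp file) #"\nStatusJSONImpl"))

;; Process the input files one at a time, appending each file's results
;; to results-file.  Only one file's statuses are held at once, because
;; doseq does not keep earlier iterations around.
(defn append-results! [in-files results-file process-status]
  (doseq [f in-files]
    (let [results (map process-status (split-statuses f))]
      (spit results-file
            (str (str/join "\n" results) "\n")
            :append true))))

;; Example call (out-files is the seq of input files from your post;
;; "results.txt" is an assumed output path):
;; (append-results! out-files "results.txt" identity)
```

Each file is slurped whole, so memory use is bounded by your largest
single file rather than by all forty together.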

My $.02
Sean

On Jul 26, 9:53 am, atucker <agjf.tuc...@googlemail.com> wrote:
> Hi all!  I have been trying to use Clojure on a student project, but
> it's becoming a bit of a nightmare.  I wonder whether anyone can
> help?  I'm not studying computer science, and I really need to be
> getting on with the work I'm actually supposed to be doing :)
>
> I am trying to work from a lot of Twitter statuses that I saved to
> text file.  (Unfortunately I failed to escape quotes and such, so the
> JSON is not valid.  Anyone know a good way of coping with that?)
>
> Here is my function:
>
> (defn json-seq []
>   (apply concat
>          (map #(do (print "f") (str/split (slurp %) #"\nStatusJSONImpl"))
>               out-files)))
>
> Now there are forty files and five thousand statuses per file, which
> sounds like a lot, and I don't suppose I can hope to hold them all in
> memory at the same time.  But I had thought that my function might
> produce a lazy sequence that would be more manageable.  However I
> typically get:
>
> twitter.core> (nth (json-seq dir-name) 5)
> ffff"{createdAt=Fri .... etc.   GOOD
>
> twitter.core> (nth (json-seq dir-name) 5000)
> ffff
> Java heap space
>   [Thrown class java.lang.OutOfMemoryError]   BAD
>
> And at this point my REPL is done for.  Any further instruction will
> result in another OutOfMemoryError.  (Surely that has to be a bug just
> there?  Has the garbage collector just given up?)
>
> Anyway I am thinking that the sequence is not behaving as lazily as I
> need it to.  It's not reading one file at a time, and it's not reading
> thirty-two as I might expect from "chunks", but something in the
> middle.  I did try the "dechunkifying" code from page 339 of "Joy of
> Clojure", but that doesn't compile at all :(
>
> I do seem to keep running into memory problems with Clojure.  I have
> 2GB RAM and am using Snow Leopard, Aquamacs 2.0, Clojure 1.2.0 beta1
> and Leiningen 1.2.0.
>
> Cheers
> Alistair
