Thanks Sean, your first suggestion was a very good one :)

Tweaking JVM settings feels like advanced magic, and I am a little
surprised that it is necessary at such an early stage in my Clojure
journey.  But googling confirms that the default JVM settings are
miserly in the extreme, and that I need at least to insert
:jvm-opts ["-server"] in my project.clj.  I would suggest to the author
of Leiningen that perhaps this should be made the default?
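
A minimal sketch of what such a project.clj entry might look like.  The
project name and version are placeholders, and the "-Xmx512m" flag is an
assumption added to follow Sean's suggestion of 512MB; "-server" selects
the server VM and may not by itself raise the heap limit on every JVM.

```clojure
;; project.clj (sketch; name and version are placeholders)
(defproject twitter "0.1.0-SNAPSHOT"
  ;; existing :dependencies and other keys stay as they are
  :jvm-opts ["-server"     ; select the server VM, as in the text above
             "-Xmx512m"])  ; assumption: let the heap grow to 512MB
```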

I am getting a lot further now, but still running into OutOfMemory
errors sometimes.  And it is still the case that once I have suffered
an OutOfMemoryError, they keep coming.  It does feel as if there must
be some large memory leak in the emacs/lein swank repl.  Is this a
recognised issue?
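
One way to check from the REPL whether the JVM settings actually took
effect is to ask the runtime how much heap it has.  This is only a
sketch: heap-mb is a hypothetical helper name, and the numbers it
reports will vary by machine and JVM.

```clojure
;; Report heap figures in megabytes, using java.lang.Runtime.
(defn heap-mb []
  (let [rt (Runtime/getRuntime)
        mb #(quot % (* 1024 1024))]
    {:max   (mb (.maxMemory rt))     ; upper limit the heap may grow to
     :total (mb (.totalMemory rt))   ; heap currently reserved by the JVM
     :free  (mb (.freeMemory rt))})) ; unused part of the reserved heap

(heap-mb)
```

If :max stays at its old value after editing project.clj, the new
options are probably not reaching the JVM that runs the swank REPL.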

The (print "f") is indeed there only for debugging purposes.  I don't
think it affects the laziness?  And unfortunately I am not quite sure
how to act on your other suggestions regarding processing workflow,
since at the moment this is more of an exploratory project.
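
As a possible starting point on the laziness side, here is a sketch of
a variant of the json-seq function quoted below that reads strictly one
file at a time.  The names file-statuses and lazy-statuses are
hypothetical, and the assumption is that (apply concat (map ...)) may
realize the first few files up front, which would fit the "ffff" in the
quoted output.

```clojure
(require '[clojure.string :as str])

;; Split one file's contents into status strings, using the same
;; separator as the original json-seq.
(defn file-statuses [file]
  (str/split (slurp file) #"\nStatusJSONImpl"))

;; Each file is slurped only when the consumer reaches its first
;; status; the rest of the files stay untouched until then.
(defn lazy-statuses [files]
  (lazy-seq
    (when-let [[f & more] (seq files)]
      (concat (file-statuses f) (lazy-statuses more)))))
```

It could then be called as (nth (lazy-statuses out-files) 5000).
Statuses already passed over can only be garbage collected if nothing
holds on to the head of the sequence, so binding the whole sequence
with def would still keep everything in memory.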

I shall read the other suggestions regarding laziness later, and
hopefully get somewhere with those.  Thanks all!

Alistair.


On Jul 26, 3:18 pm, Sean Devlin <francoisdev...@gmail.com> wrote:
> My first thought is that you need to tweak your JVM settings.  Try
> allocating a minimum of 512MB of heap.
>
> My second thought is that you need to use laziness to your advantage.
> Remove the print expression from the mapping operation.  It's useful
> for debugging/prototyping, but shouldn't be in the final version.
> Spit the processed json-seq into a file when you're done instead.
> This way you can process one input file at a time, and simply append
> your results to the output file.
>
> My $.02
> Sean
>
> On Jul 26, 9:53 am, atucker <agjf.tuc...@googlemail.com> wrote:
>
> > Hi all!  I have been trying to use Clojure on a student project, but
> > it's becoming a bit of a nightmare.  I wonder whether anyone can
> > help?  I'm not studying computer science, and I really need to be
> > getting on with the work I'm actually supposed to be doing :)
>
> > I am trying to work from a lot of Twitter statuses that I saved to
> > text files.  (Unfortunately I failed to escape quotes and such, so the
> > JSON is not valid.  Anyone know a good way of coping with that?)
>
> > Here is my function:
>
> > (defn json-seq []
> >   (apply concat
> >          (map #(do (print "f") (str/split (slurp %) #"\nStatusJSONImpl"))
> >               out-files)))
>
> > Now there are forty files and five thousand statuses per file, which
> > sounds like a lot, and I don't suppose I can hope to hold them all in
> > memory at the same time.  But I had thought that my function might
> > produce a lazy sequence that would be more manageable.  However I
> > typically get:
>
> > twitter.core> (nth (json-seq dir-name) 5)
> > ffff"{createdAt=Fri .... etc.   GOOD
>
> > twitter.core> (nth (json-seq dir-name) 5000)
> > ffff
> > Java heap space
> >   [Thrown class java.lang.OutOfMemoryError]   BAD
>
> > And at this point my REPL is done for.  Any further instruction will
> > result in another OutOfMemoryError.  (Surely that has to be a bug just
> > there?  Has the garbage collector just given up?)
>
> > Anyway I am thinking that the sequence is not behaving as lazily as I
> > need it to.  It's not reading one file at a time, and it's not reading
> > thirty-two as I might expect from "chunks", but something in the
> > middle.  I did try the "dechunkifying" code from page 339 of "Joy of
> > Clojure", but that doesn't compile at all :(
>
> > I do seem to keep running into memory problems with Clojure.  I have
> > 2GB RAM and am using Snow Leopard, Aquamacs 2.0, Clojure 1.2.0 beta1
> > and Leiningen 1.2.0.
>
> > Cheers
> > Alistair
