The one thing I'm aware of holding on to is a filtered file-seq:
(def the-files (filter #(s/ends-with? (.getName %) ".xml") (rest (file-seq ...))))
There are 7,000+ files, but I'm assuming the elements there are just
file references and shouldn't take much space.
The rest of the process is a transducer sequence:
(def requirement-seq (sequence ...))
Those functions are admittedly space-inefficient (lots of work with
zippers), but they are pure. What comes out the other end is a sequence of
Clojure maps. Could holding on to the file references prevent all that
processing effluvia from being collected?
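A minimal sketch of the retention question, using stand-in data and a hypothetical `parse-file` in place of the real zipper work: holding the input seq only keeps the (small) input elements alive; each element's intermediate processing state becomes garbage as soon as its output map is produced. What does accumulate is the output itself, because a def'd `sequence` caches every element it realizes.

```clojure
(def input-seq (range 7000))          ; stand-in for the-files

(defn parse-file [x]                  ; hypothetical; the real version walks zippers
  ;; any intermediate structures built here are collectible
  ;; as soon as this returns
  {:id x})

(def requirement-seq (sequence (map parse-file) input-seq))

(count requirement-seq)               ; realizes and caches all 7000 maps
```

So the file references themselves shouldn't pin the processing effluvia, but the def'd result seq will hold every gleaned map once realized.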
The original files add up to 1.3 GB altogether. I'd expect the gleaned
data to be significantly smaller, but I'd better check how close
that's getting to the default heap size.
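A quick way to do that check from the REPL (just plain java.lang.Runtime, nothing Clojure-specific):

```clojure
(let [rt (Runtime/getRuntime)
      mb #(quot % (* 1024 1024))]
  {:max-mb   (mb (.maxMemory rt))    ; the -Xmx ceiling
   :total-mb (mb (.totalMemory rt))  ; currently allocated heap
   :free-mb  (mb (.freeMemory rt))}) ; unused portion of the allocated heap
```

Comparing `:total-mb` against `:max-mb` after realizing the gleaned data should show how much headroom is left.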
On Tuesday, August 8, 2017 at 1:20:21 AM UTC-7, Peter Hull wrote:
> On Tuesday, 8 August 2017 06:20:56 UTC+1, Nathan Smutz wrote:
>> Does this message sometimes present because the non-garbage data is
>> getting too big?
> Yes, it's when most of your heap is non-garbage, so the GC has to keep
> running but doesn't succeed in freeing much memory each time.
> You can increase the heap, but that might only defer the problem.
> As you process all your files, are you holding on to references to objects
> that you don't need any more?
You received this message because you are subscribed to the Google
Groups "Clojure" group.