On Mon, Feb 22, 2010 at 5:30 AM, Johann Kraus <johann.kr...@gmail.com> wrote:

> However, when loading with read-lines from
> clojure.contrib.duck-streams and (map #(Double/parseDouble %) (.split
> line ",")) clojure requires several GB of RAM. Any suggestions for how
> to get this down to 400MB? And what would be the overhead if reading
> into a clojure vector, which I really would prefer to using java
> arrays?
>

You could consider using transients. Building the vector this way avoids
allocating a new persistent vector for every element you add. The following
is untested:

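;; conj! returns the updated transient, so thread it through reduce
;; rather than relying on in-place mutation.
(persistent!
 (reduce (fn [v n] (conj! v (Double/parseDouble n)))
         (transient [])
         (.split line ",")))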

But just using Java arrays will certainly make your code more memory
efficient when you're dealing with such a large number of values. You also
might want to pick a different strategy for parsing the string instead of
splitting the string immediately into 50000000 parts.
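For what it's worth, here is a rough sketch of both ideas combined: walk the
line token by token and write straight into a primitive double array, so the
50000000 values take about 8 bytes each (roughly the 400MB you mention)
instead of one boxed Double per element. The name parse-doubles is just
something I made up for illustration, and using java.util.StringTokenizer
rather than .split is my own choice here, not something you have to do. Note
that StringTokenizer silently skips empty fields, which may or may not be
what you want.

```clojure
(defn parse-doubles
  "Parses a comma-separated line into a primitive double array.
  Hypothetical helper, shown only as a sketch."
  [^String line]
  (let [tok (java.util.StringTokenizer. line ",")
        ;; countTokens reports the remaining tokens without consuming them
        out (double-array (.countTokens tok))]
    (loop [i 0]
      (if (.hasMoreTokens tok)
        (do (aset ^doubles out i (Double/parseDouble (.nextToken tok)))
            (recur (inc i)))
        out))))
```

If you still want a vector at the end, (vec (parse-doubles line)) will give
you one, but that boxes every element again, so I would only do that for
small slices of the data.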

David


> Thanks
> Johann
>
