On Jul 31, 4:45 am, Andy Fingerhut <[email protected]>
wrote:
> I thought I'd follow up my own question with some programs that I
> should have already known about for memory profiling, which were
> already installed on my Mac as part of the standard Java installation
> from Apple (who are just passing them on from Sun, I'm sure), but I
> didn't know about them:
>
> jconsole -- Good for seeing how fast a java process is allocating
> memory, and garbage collecting.
>
> jmap -- Good for a quick summary of the above, or with the "-histo"
> option, a much more detailed list of what kinds of objects are taking
> up the most memory.
>
> I also learned that Clojure's cons and lazy cons structures take up 48
> bytes per element (at least on a Mac with 'java -client' and java
> 1.6.0_<foo>), which gets significant when your program has sequences
> of several millions of elements about.
>
> I've updated my github repo with a pretty decent version of the
> reverse-complement benchmark in Clojure.  It isn't as sequence-y as it
> could be, but the more sequence-y version generates and collects
> garbage so fast that it really slows things down significantly.  Same
> lesson from other flavors of Lisp, I guess -- you can write the
> straightforward easy-to-write-and-test-and-understand code that conses
> a lot (i.e. allocates memory quickly that typically becomes garbage
> quite soon), or you can write the more loopy code that doesn't, but
> typically starts to merge many things that you'd otherwise prefer to
> separate into different functions.  Just compare revcomp.clj-5.clj and
> revcomp.clj-6.clj in my git repo for an example.
>
> The nice thing is that when you don't need the "uglier" code, Clojure
> and other Lisps usually let you write code much more concisely than
> lower level languages.  Get it working first, then optimize it.  Since
> I'm comparing run times of the Clojure programs versus those submitted
> to the language shootout benchmark web site, some of which appear
> quite contorted in order to gain performance, I wanted to do some
> optimizations that you wouldn't necessarily want to do otherwise.
>
> git://github.com/jafingerhut/clojure-benchmarks.git
>
> You can see my latest run time results here.  I've got 4 benchmarks
> written in Clojure so far, with my current versions being 6x, 8x, 12x,
> and 15x more CPU time than the Java programs submitted to the language
> shootout benchmark web site.
>
> http://github.com/jafingerhut/clojure-benchmarks/blob/20d21bc169d52ca...
>
> I could make some of these significantly closer in speed to the Java
> versions, but I suspect that they will start looking more and more
> like the Java versions if I do, except with Clojure syntax for Java
> calls.  I'm happy to be proved wrong on that, if someone finds better
> Clojure versions than I've got.
>
> Thanks,
> Andy
>
> On Jul 30, 11:00 am, Andy Fingerhut <[email protected]>
> wrote:
>
> > I'm gradually adding a few more Clojure benchmark programs to my
> > repository here:
>
> > git://github.com/jafingerhut/clojure-benchmarks.git
>
> > The one I wrote for the "reverse-complement" benchmark is here:
>
> > http://github.com/jafingerhut/clojure-benchmarks/blob/4ab4f41c6f96344...
>
> > revcomp.clj-4.clj is the best I've got so far, but it runs out of
> > memory on the full size benchmark.
>
> > If you clone the repository, and successfully run the init.sh script
> > to generate the big input and expected output files, the file rcomp/
> > long-input.txt contains 3 DNA sequences in FASTA format. The first is
> > 50,000,000 characters long, the second is 75,000,000 characters long,
> > and the third is 125,000,000 characters long. Each needs to be
> > reversed, have each character replaced with a different one, and
> > printed out, so we need to store each of the strings one at a time,
> > but it is acceptable to deallocate/garbage-collect the previous one
> > when starting on the next. I think my code should be doing that, but I
> > don't know how to verify that.
>
> > I've read that a Java String takes 2 bytes per character, plus about
> > 38 bytes of overhead per string. That is about 250 Mbytes for the
> > longest one. I also read in a seq of lines, and these long strings are
> > split into lines with 60 characters (plus a newline) each. Thus the
> > string's data needs to be stored at least twice temporarily -- once
> > for the many 60-character strings, plus the final long one.  Also, the
> > Java StringBuilder that Clojure's (str ...) function uses probably
> > needs to be copied and reallocated periodically as it outgrows its
> > current allocation. So I could imagine needing about 3 * 250 Mbytes
> > temporarily, but that doesn't explain why my 1536 Mbytes of JVM memory
> > are being exhausted.
>
> > It would be possible to improve things by not creating all of the
> > separate strings, one for each line, and then concatenating them
> > together. But first I'd like to explain why it is using so much,
> > because I must be missing something.
>
> > Thanks,
> > Andy
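
For what it's worth, the memory estimate in the quoted message can be
written down as a few lines of Java.  This is only a back-of-the-envelope
sketch: the constants (2 bytes per character, about 38 bytes of overhead
per String, 48 bytes per seq element) are the figures quoted above, not
values I measured, and the class and method names are made up for
illustration.  I also assume each line holds 60 characters without the
newline.

```java
// Back-of-the-envelope heap estimate for the longest input sequence.
// Every constant is a figure quoted in the thread, not a measured value.
class MemoryEstimate {
    static final long BYTES_PER_CHAR = 2;    // quoted: 2 bytes per character
    static final long STRING_OVERHEAD = 38;  // quoted: ~38 bytes per String
    static final long SEQ_CELL_BYTES = 48;   // quoted: 48 bytes per seq element

    // Bytes for one String holding the given number of characters.
    static long stringBytes(long chars) {
        return chars * BYTES_PER_CHAR + STRING_OVERHEAD;
    }

    // Number of lines needed for the characters (the last line may be short).
    static long lineCount(long chars, long lineLength) {
        return (chars + lineLength - 1) / lineLength;
    }

    // Bytes for all of the per-line Strings taken together.
    static long lineStringsBytes(long chars, long lineLength) {
        return chars * BYTES_PER_CHAR
                + lineCount(chars, lineLength) * STRING_OVERHEAD;
    }

    // Bytes for the seq cells that hold the lines, one cell per line.
    static long seqCellsBytes(long chars, long lineLength) {
        return lineCount(chars, lineLength) * SEQ_CELL_BYTES;
    }

    public static void main(String[] args) {
        long chars = 125000000L;  // longest sequence in long-input.txt
        long lineLength = 60;     // assumption: newline not stored

        long joined = stringBytes(chars);
        long lines = lineStringsBytes(chars, lineLength);
        long cells = seqCellsBytes(chars, lineLength);

        System.out.println("final joined string: " + joined + " bytes");
        System.out.println("per-line strings:    " + lines + " bytes");
        System.out.println("seq cells for lines: " + cells + " bytes");
        System.out.println("total if all live:   " + (joined + lines + cells) + " bytes");
    }
}
```

This leaves out the StringBuilder growing while the lines are joined
(the old and the new buffer both exist for a moment each time it
grows), so the real peak is higher than the total printed here.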

Thanks, Andy, for your earlier help.  I wrote a blog entry with the
results of the benchmark, both mine and yours.

I mention 3 profilers: JRat, the NetBeans profiler, and Eclipse's MAT.
I say, use all 3.  The NetBeans profiler was the most prolific.

But just a word of warning: if your code takes 1 minute to run
standalone, it may take an hour to run to completion with a profiler
analyzing all the results.  That is just my experience with the
applications I run, and it is why I tend to favor something like JRat,
which reports its results after the application has run rather than
live.

And just a note: I am not a language design person and I am still new
to Clojure, but my results say that Clojure has a 'memory' problem, not
necessarily a raw speed problem.  Every statistic says that Clojure
creates MANY objects, even with just the default core library.  I
wonder if creating fewer new objects would reduce how much garbage
collection the JVM has to do.
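
One cheap way to check that idea without a full profiler is to ask the
JVM how many collections it has run before and after a piece of code.
The sketch below uses the standard java.lang.management API for that.
The class name and the two toy workloads are made up for illustration:
one version allocates a new String per iteration, the other computes
the same answer without allocating.  Whether you actually see a
difference in the counts depends on the heap size and on what the JIT
optimizes away, so treat it as a starting point, not a benchmark.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Compares GC activity for an allocating loop and a non-allocating loop.
class GcProbe {
    // Total collections run so far, summed over all collectors.
    // getCollectionCount() returns -1 when the count is undefined.
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Counts decimal digits of 0..n-1 by creating a String per number.
    static long digitsAllocating(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += Integer.toString(i).length();
        }
        return sum;
    }

    // Same result, computed with plain arithmetic and no new objects.
    static long digitsInPlace(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            int digits = 1;
            for (int v = i; v >= 10; v /= 10) {
                digits++;
            }
            sum += digits;
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 20000000;

        long before = totalGcCount();
        long a = digitsAllocating(n);
        long afterAllocating = totalGcCount();
        long b = digitsInPlace(n);
        long afterInPlace = totalGcCount();

        System.out.println("results equal: " + (a == b));
        System.out.println("collections during allocating loop: " + (afterAllocating - before));
        System.out.println("collections during in-place loop:   " + (afterInPlace - afterAllocating));
    }
}
```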

http://berlinbrowndev.blogspot.com/2009/07/jvm-notebook-basic-clojure-java-and-jvm.html

(Sorry, I am not much of a writer, but I print out a lot of stuff.)