On Jul 31, 12:06 pm, Berlin Brown <[email protected]> wrote:
> On Jul 31, 4:45 am, Andy Fingerhut <[email protected]>
> wrote:
>
> > I thought I'd follow up my own question with some programs that I
> > should have already known about for memory profiling, which were
> > already installed on my Mac as part of the standard Java installation
> > from Apple (who are just passing them on from Sun, I'm sure), but I
> > didn't know about them:
>
> > jconsole -- Good for seeing how fast a java process is allocating
> > memory, and garbage collecting.
>
> > jmap -- Good for a quick summary of the above, or with the "-histo"
> > option, a much more detailed list of what kinds of objects are taking
> > up the most memory.
>
> > I also learned that Clojure's cons and lazy cons structures take up 48
> > bytes per element (at least on a Mac with 'java -client' and java
> > 1.6.0_<foo>), which gets significant when your program has sequences
> > of several million elements.
>
> > I've updated my github repo with a pretty decent version of the
> > reverse-complement benchmark in Clojure. It isn't as sequence-y as it
> > could be, but the more sequence-y version generates and collects
> > garbage so fast that it really slows things down significantly. Same
> > lesson from other flavors of Lisp, I guess -- you can write the
> > straightforward easy-to-write-and-test-and-understand code that conses
> > a lot (i.e. allocates memory quickly that typically becomes garbage
> > quite soon), or you can write the more loopy code that doesn't, but
> > typically starts to merge many things that you'd otherwise prefer to
> > separate into different functions. Just compare revcomp.clj-5.clj and
> > revcomp.clj-6.clj in my git repo for an example.
>
> > The nice thing is that when you don't need the "uglier" code, Clojure
> > and other Lisps usually let you write code much more concisely than
> > lower level languages. Get it working first, then optimize it. Since
> > I'm comparing run times of the Clojure programs versus those submitted
> > to the language shootout benchmark web site, some of which appear
> > quite contorted in order to gain performance, I wanted to do some
> > optimizations that you wouldn't necessarily want to do otherwise.
>
> > git://github.com/jafingerhut/clojure-benchmarks.git
>
> > You can see my latest run time results here. I've got 4 benchmarks
> > written in Clojure so far, with my current versions taking 6x, 8x, 12x,
> > and 15x the CPU time of the Java programs submitted to the language
> > shootout benchmark web site.
>
> > http://github.com/jafingerhut/clojure-benchmarks/blob/20d21bc169d52ca...
>
> > I could make some of these significantly closer in speed to the Java
> > versions, but I suspect that they will start looking more and more
> > like the Java versions if I do, except with Clojure syntax for Java
> > calls. I'm happy to be proved wrong on that, if someone finds better
> > Clojure versions than I've got.
>
> > Thanks,
> > Andy
>
> > On Jul 30, 11:00 am, Andy Fingerhut <[email protected]>
> > wrote:
>
> > > I'm gradually adding a few more Clojure benchmark programs to my
> > > repository here:
>
> > > git://github.com/jafingerhut/clojure-benchmarks.git
>
> > > The one I wrote for the "reverse-complement" benchmark is here:
>
> > > http://github.com/jafingerhut/clojure-benchmarks/blob/4ab4f41c6f96344...
>
> > > revcomp.clj-4.clj is the best I've got so far, but it runs out of
> > > memory on the full size benchmark.
>
> > > If you clone the repository, and successfully run the init.sh script
> > > to generate the big input and expected output files, the file rcomp/
> > > long-input.txt contains 3 DNA sequences in FASTA format. The first is
> > > 50,000,000 characters long, the second is 75,000,000 characters long,
> > > and the third is 125,000,000 characters long. Each needs to be
> > > reversed, have each character replaced with a different one, and
> > > printed out, so we need to store each of the strings one at a time,
> > > but it is acceptable to deallocate/garbage-collect the previous one
> > > when starting on the next. I think my code should be doing that, but I
> > > don't know how to verify that.
>
> > > I've read that a Java String takes 2 bytes per character, plus about
> > > 38 bytes of overhead per string. That is about 250 Mbytes for the
> > > longest one. I also read in a seq of lines, and these long strings are
> > > split into lines with 60 characters (plus a newline) each. Thus the
> > > string's data needs to be stored at least twice temporarily -- once
> > > for the many 60-character strings, plus the final long one. Also, the
> > > Java StringBuilder that Clojure's (str ...) function uses probably
> > > needs to be copied and reallocated periodically as it outgrows its
> > > current allocation. So I could imagine needing about 3 * 250 Mbytes
> > > temporarily, but that doesn't explain why my 1536 Mbytes of JVM memory
> > > are being exhausted.
>
> > > It would be possible to improve things by not creating all of the
> > > separate strings, one for each line, and then concatenating them
> > > together. But first I'd like to explain why it is using so much,
> > > because I must be missing something.
>
> > > > Thanks,
> > > > Andy
>
> Thanks Andy for your earlier help. I created a blog entry with the
> results of the benchmark, mine and yours.
>
> I mention 3 profilers: jrat, the NetBeans profiler, and Eclipse's MAT.
> I'd say use all 3. The NetBeans profiler produced the most output.
>
> But just a word of warning: if your code takes 1 minute to run
> standalone, it may take an hour to run to completion with a profiler
> analyzing all the results. That is just my experience with the
> applications I run. That is why I tend to favor something like jrat,
> which reports its results after the application has run, as opposed to
> live results.
>
> And just a note: I am not a language design person and am still new to
> Clojure, but my results say that Clojure has a 'memory' problem, not
> necessarily a raw speed problem. Every statistic says that Clojure
> creates MANY objects even with the default core library. I wonder if
> creating fewer new objects would reduce the time the JVM spends
> garbage collecting them.
>
> http://berlinbrowndev.blogspot.com/2009/07/jvm-notebook-basic-clojure...
>
> (Sorry, I am not much of a writer, but I print out a lot of stuff.)
On the rcomp tests: I noticed that those tests did really well as far as
Clojure is concerned. Interesting.
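
For what it's worth, here is a rough sketch in Java of the memory
arithmetic quoted above for the longest (125,000,000-character) sequence.
It only adds up the figures mentioned in this thread: 2 bytes per
character, about 38 bytes of overhead per string, 48 bytes per seq
element, and 60-character lines. Those are estimates from the thread, not
measurements, and the class and method names are made up for illustration.

```java
// Back-of-the-envelope estimate only; every constant is a figure quoted
// in this thread, not something measured on a real JVM.
class RevcompMemoryEstimate {
    static final long BYTES_PER_CHAR = 2;     // Java String, per character
    static final long STRING_OVERHEAD = 38;   // approximate, per String
    static final long SEQ_ELEMENT_BYTES = 48; // cons / lazy cons, per element
    static final long LINE_LENGTH = 60;       // characters per input line

    // Estimated bytes for one String holding `chars` characters.
    static long stringBytes(long chars) {
        return BYTES_PER_CHAR * chars + STRING_OVERHEAD;
    }

    // Number of input lines needed for `chars` characters (rounded up).
    static long lineCount(long chars) {
        return (chars + LINE_LENGTH - 1) / LINE_LENGTH;
    }

    // Estimated bytes for the per-line Strings plus one seq element per line.
    static long lineSeqBytes(long chars) {
        long lines = lineCount(chars);
        return BYTES_PER_CHAR * chars
                + lines * (STRING_OVERHEAD + SEQ_ELEMENT_BYTES);
    }

    public static void main(String[] args) {
        long chars = 125000000L; // longest sequence in rcomp/long-input.txt
        long longString = stringBytes(chars);
        long lines = lineSeqBytes(chars);
        System.out.println("final long string:  " + longString + " bytes");
        System.out.println("line strings + seq: " + lines + " bytes");
        System.out.println("both held at once:  " + (longString + lines) + " bytes");
    }
}
```

If these figures hold, the seq of line strings costs more than the final
long string itself, and that is before counting any StringBuilder copies.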