"retard" <[email protected]> wrote in message news:[email protected]... > Sat, 19 Feb 2011 14:32:27 -0500, dsimcha wrote: > >> On 2/19/2011 12:50 PM, Ulrik Mikaelsson wrote: >>> Just a thought; I guess the references to the non-GC-scanned strings >>> are held in GC-scanned memory, right? Are the number of such references >>> also increased linearly? >> >> Well, first of all, the benchmark I posted seems to indicate otherwise. >> Second of all, I was running this program before on yeast DNA and it >> was ridiculously fast. Then I tried to do the same thing on human DNA >> and it became slow as molasses. Roughly speaking, w/o getting into the >> biology much, I've got one string for each gene. Yeast have about 1/3 >> as many genes as humans, but the genes are on average about 100 times >> smaller. Therefore, the difference should be at most a small constant >> factor and in actuality it's a huge constant factor. >> >> Note: I know I could make the program in question a lot more space >> efficient, and that's what I ended up doing. It works now. It's just >> that it was originally written for yeast, where space efficiency is >> obviously not a concern, and I would have liked to just try a one-off >> calculation on the human genome without having to rewrite portions of >> it. > > Probably one reason for this behavior is the lack of testing. My desktop > only has 24 GB of DDR3. I have another machine with 16 GB of DDR2, but > don't know how to combine the address spaces via clustering. This would > also horribly drag down GC performance. Even JVM is badly tuned for > larger systems, they might use the Azul Java runtimes instead..
*Only* 24GB of DDR3, huh? :) Makes me feel like a pauper: I recently upgraded from 1GB to 2GB of DDR1 ;) (It actually had been 2GB a few years ago, but I cannibalized half of it to build my Linux box.) Out of curiosity, what are you running on that? (Multiple instances of Crysis? High-definition voxels?)
