Hello !
Very relevant question considering the work I have done over the past weeks. > ________________________________________ > De : Matthew MacManes [macma...@gmail.com] > Date d'envoi : 11 juillet 2011 11:00 > À : Ray Mailing List > Objet : [Denovoassembler-users] The system is out of memory > > Hi All, > > I am hoping to get some help troubleshooting a problem. I am using Ray-1.6.0 > on one of the TeraGrid resources, and having problems with a memory issue. I > can run Ray successfully with k=31, but not with K=41 or k=51 (I am trying > k=21 right now).. When I try, I get "Critical exception: The system is out of > memory, returned NULL." usually during the "computing vertices" steps, I am > assembling a genome of size ~1Gb and have 384Gb of Illumina reads in 16 > libraries. > Over the last weeks, I attacked this very problem of memory usage in Ray. Basically, the k-mer counting fragment the memory in a very bad way. The problems (in Ray <=1.6.0) were: (1) K-mers occurring once are basically useless. Fix: Therefore, they are now destroyed (and their memory is freed) after the graph is built. (2) The hash table I was using was basically worthless because collisions were managed with a linked list on a per-bucket basis. Fix: Now I am using open addressing with double hashing so Ray does not use linked list for collisions now. \see http://en.wikipedia.org/wiki/Double_hashing (3) At some point an hash table needs to grow, but this is slow if done all at once. Fix: incremental resizing wherein growing the hash table is done in numerous slices. \see http://en.wikipedia.org/wiki/Hash_table#Incremental_resizing (this allows constant time insertion) (4) Hash tables basically always have a lot of empty buckets which will never be used because of (1) capacity watchdog and (2) bucket probing statistically never hits some buckets at all. Fix: I implemented a sparse hash table (very similar to Google Sparse Hash) but my implementation has 64 elements per group, not 48. Also, I use double hashing probing instead of quadratic probing. \see http://google-sparsehash.googlecode.com/svn/trunk/doc/implementation.html \see https://github.com/sebhtml/ray/blob/master/code/structures/MyHashTable.h (5) Finally, the culprit of high memory usage: memory fragmentation. Fix: I implemented a memory compactor. When an object allocates memory, the allocator returns a smart pointer instead of a standard void*. Thus, the underlying location of any smart pointer *will* change during memory compaction. Firefox also has problems with memory fragmentation, see: https://wiki.mozilla.org/Performance/MemShrink \see https://github.com/sebhtml/ray/blob/master/code/memory/ChunkAllocatorWithDefragmentation.cpp \see https://github.com/sebhtml/ray/blob/master/code/memory/DefragmentationLane.cpp \see https://github.com/sebhtml/ray/blob/master/code/memory/DefragmentationGroup.cpp For a DefragmentationGroup, allocation is constant time and deallocation is constant time except if there are at least 2048 freed elements in the fragmented slice. In that case, it is O(65536) for compacting. So the amortized time complexity of deallocate is pretty good overall. You can download Ray v1.6.1-rc3 at https://github.com/sebhtml/ray/zipball/v1.6.1-rc3 to compile for low-memory usage: make MAXKMERLENGTH=64 FORCE_PACKING=y PREFIX=build-1.6.1-rc3 make install FORCE_PACKING=y basically does not align addresses on a 8-byte boundary. Address alignment is required on some architectures such as UltraSparc, Itanium and some Intel Xeon processors. FORCE_PACKING=y will cause bus errors on some Intel Xeon processors and all Intel Itanium processors. > I have tried running with more cores (as many as 512) to distribute memory > requirements over more cores and with less "wayness", which basically > allocates more RAM per process... When I look at RAM usage, it does not look > like I max out-- the most I have seen is ~9Gb, out of ~30Gb available per > node. > > Can you please help me troubleshoot this issue? > What exactly is "wayness" ? Also on an unrelated note, Ray now does a network test for like 5 seconds to measure the network latency. Check for the file PREFIX.NetworkTest.txt Example: [sboisver12@colosse1 ~]$ head parrot-BGI-Assemblathon2-k41-20110711.NetworkTest.txt # average latency in microseconds (10^-6 seconds) when requesting a reply for a message of 4000 bytes # Message passing interface rank Name Latency in microseconds 0 r104-n13 152 1 r104-n13 155 2 r104-n13 155 3 r104-n13 155 4 r104-n13 155 5 r104-n13 154 6 r104-n13 154 7 r104-n13 155 > Command: > /work/01301/mmacmane/Ray-1.6.0/build/Ray -k 41 \ > -i > /scratch/01301/mmacmane/spider_genome/clean.reads/ray.reads/091223HHMI_Tg_hfs1_36PE_270_s_7_12_sequence.fq.trimmed.fastq > \ > -i > /scratch/01301/mmacmane/spider_genome/clean.reads/ray.reads/091223HHMI_Tg_hfs2_36PE_317_s_8_12_sequence.fq.trimmed.fastq > \ > -i > /scratch/01301/mmacmane/spider_genome/clean.reads/ray.reads/100503GA1_Tg_PJC1_76PE_334bp_s_3_12_sequence.fq.trimmed.fastq > \ > -i > /scratch/01301/mmacmane/spider_genome/clean.reads/ray.reads/100503GA1_Tg_PJC1_76PE_334bp_s_4_12_sequence.fq.trimmed.fastq > \ > -i > /scratch/01301/mmacmane/spider_genome/clean.reads/ray.reads/100503GA1_Tg_PJC1_76PE_334bp_s_5_12_sequence.fq.trimmed.fastq > \ > -i > /scratch/01301/mmacmane/spider_genome/clean.reads/ray.reads/100527GA2_Tg_PJC2_101PE_384bp_s_1_12_sequence.fq.trimmed.fastq > \ > -i > /scratch/01301/mmacmane/spider_genome/clean.reads/ray.reads/100527GA2_Tg_PJC2_101PE_384bp_s_2_12_sequence.fq.trimmed.fastq > \ > -i > /scratch/01301/mmacmane/spider_genome/clean.reads/ray.reads/100819GA2_Tg_PJC1_76PE_334bp_s_6_12_sequence.fq.trimmed.fastq > \ > -i > /scratch/01301/mmacmane/spider_genome/clean.reads/ray.reads/100820GA3_Tg_PJC2_101PE_384bp_s_3_12_sequence.fq.trimmed.fastq > \ > -i > /scratch/01301/mmacmane/spider_genome/clean.reads/ray.reads/110324HS1A_Tg_PJC09_100PE_455bp_s_7_12_sequence.fq.trimmed.fastq > \ > -i > /scratch/01301/mmacmane/spider_genome/clean.reads/ray.reads/110324HS1A_Tg_PJC10_100PE_526bp_s_8_12_sequence.fq.trimmed.fastq > \ > -i > /scratch/01301/mmacmane/spider_genome/clean.reads/ray.reads/110331HS2A_Tg_PJC11_100PE_923bp_s_5_12_sequence.fq.trimmed.fastq > \ > -i > /scratch/01301/mmacmane/spider_genome/clean.reads/ray.reads/110331HS2A_Tg_PJC12_100PE_1213bp_s_6_12_sequence.fq.trimmed.fastq > \ > -s > /scratch/01301/mmacmane/spider_genome/clean.reads/ray.reads/100506GA2_Tg_PJC2_101PE_384bp_s_8_sequence.fq.fastq > \ > -s > /scratch/01301/mmacmane/spider_genome/clean.reads/ray.reads/110316HS3B_Tg_PJC11_100PE_923bp_s_5_sequence.fq.fastq > \ > -s > /scratch/01301/mmacmane/spider_genome/clean.reads/ray.reads/110316HS3B_Tg_PJC12_100PE_1213bp_s_6_sequence.fq.fastq > \ > -show-memory-usage -o /scratch/01301/mmacmane/spider.runs/Rayk41/k41 > Tip: you can add comment in your bash submission script: `: library DTAAPE, Illumina sequences from BGI, insert size: 10000` \ -p \ /rap/nne-790-ab/Datasets/Assemblathon-2-Bird/BGI_illumina_data/PARprgDAADTAAPE/110514_I263_FC81P81ABXX_L5_PARprgDAADTAAPE_1.fq.fastq \ /rap/nne-790-ab/Datasets/Assemblathon-2-Bird/BGI_illumina_data/PARprgDAADTAAPE/110514_I263_FC81P81ABXX_L5_PARprgDAADTAAPE_2.fq.fastq \ `: library DTAAPE, Illumina sequences from BGI, insert size: 10000` is a comment. > Thanks, Matt > ____________________________________________ > Matthew MacManes, Ph.D. > University of California- Berkeley > Museum of Vertebrate Zoology > Phone: 510-495-5833 > Lab Website: http://ib.berkeley.edu/labs/lacey > Personal Website: http://macmanes.com/ > > Sébastien http://github.com/sebhtml ------------------------------------------------------------------------------ All of the data generated in your IT infrastructure is seriously valuable. Why? It contains a definitive record of application performance, security threats, fraudulent activity, and more. Splunk takes this data and makes sense of it. IT sense. And common sense. http://p.sf.net/sfu/splunk-d2d-c2 _______________________________________________ Denovoassembler-users mailing list Denovoassembler-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/denovoassembler-users