[Denovoassembler-users] RE : The system is out of memory

Sébastien Boisvert Mon, 11 Jul 2011 09:21:18 -0700

Hello !


Very relevant question considering the work I have done over the past weeks.


> ________________________________________
> De : Matthew MacManes [macma...@gmail.com]
> Date d'envoi : 11 juillet 2011 11:00
> À : Ray Mailing List
> Objet : [Denovoassembler-users] The system is out of memory
> 
> Hi All,
> 
> I am hoping to get some help troubleshooting a problem. I am using Ray-1.6.0 
> on one of the TeraGrid resources, and having problems with a memory issue. I 
> can run Ray successfully with k=31, but not with K=41 or k=51 (I am trying 
> k=21 right now).. When I try, I get "Critical exception: The system is out of 
> memory, returned NULL." usually during the "computing vertices" steps, I am 
> assembling a genome of size ~1Gb and have 384Gb of Illumina reads in 16 
> libraries.
>




Over the last weeks, I attacked this very problem of memory usage in Ray.




Basically, the k-mer counting fragment the memory in a very bad way.


The problems (in Ray <=1.6.0) were:

(1) K-mers occurring once are basically useless. 

  Fix: Therefore, they are now destroyed (and their memory is freed) after the 
graph is built.


(2) The hash table I was using was basically worthless because collisions were 
managed with a linked list on a per-bucket basis.

  Fix: Now I am using open addressing with double hashing so Ray does not use 
linked list for collisions now.


   \see http://en.wikipedia.org/wiki/Double_hashing




(3) At some point an hash table needs to grow, but this is slow if done all at 
once.

  Fix: incremental resizing wherein growing the hash table is done in numerous 
slices.

    \see http://en.wikipedia.org/wiki/Hash_table#Incremental_resizing
 

    (this allows constant time insertion)


(4) Hash tables basically always have a lot of empty buckets which will never 
be used because of (1) capacity watchdog and (2) bucket probing statistically 
never hits some buckets at all.

  Fix: I implemented a sparse hash table (very similar to Google Sparse Hash) 
but my implementation has 64 elements per group, not 48. Also, I use double 
hashing probing instead of
        quadratic probing.

    
    \see 
http://google-sparsehash.googlecode.com/svn/trunk/doc/implementation.html

    \see 
https://github.com/sebhtml/ray/blob/master/code/structures/MyHashTable.h


(5) Finally, the culprit of high memory usage: memory fragmentation.

  Fix: I implemented a memory compactor. When an object allocates memory, the 
allocator returns a smart pointer instead of a standard void*. Thus, the 
underlying location of any smart pointer 
*will* change during memory compaction. 


Firefox also has problems with memory fragmentation, see: 
https://wiki.mozilla.org/Performance/MemShrink


   \see 
https://github.com/sebhtml/ray/blob/master/code/memory/ChunkAllocatorWithDefragmentation.cpp

   \see 
https://github.com/sebhtml/ray/blob/master/code/memory/DefragmentationLane.cpp

   \see 
https://github.com/sebhtml/ray/blob/master/code/memory/DefragmentationGroup.cpp

 For a DefragmentationGroup, allocation is constant time and deallocation is 
constant time except if there are at least 2048 freed elements in the 
fragmented slice. In that case, it is O(65536) for compacting.

So the amortized time complexity of deallocate is pretty good overall.


You can download Ray v1.6.1-rc3 at 
https://github.com/sebhtml/ray/zipball/v1.6.1-rc3


to compile for low-memory usage:


make MAXKMERLENGTH=64 FORCE_PACKING=y PREFIX=build-1.6.1-rc3
make install
 

FORCE_PACKING=y basically does not align addresses on a 8-byte boundary.

Address alignment is required on some architectures such as UltraSparc, Itanium 
and some Intel Xeon processors.


FORCE_PACKING=y will cause bus errors on some Intel Xeon processors and all 
Intel Itanium processors.



> I have tried running with more cores (as many as 512) to distribute memory 
> requirements over more cores and with less "wayness", which basically 
> allocates more RAM per process... When I look at RAM usage, it does not look 
> like I max out-- the most I have seen is ~9Gb, out of ~30Gb available per 
> node.
> 
> Can you please help me troubleshoot this issue?
> 


What exactly is "wayness" ?


Also on an unrelated note, Ray now does a network test for like 5 seconds to 
measure the network latency.

Check for the file PREFIX.NetworkTest.txt


Example:

[sboisver12@colosse1 ~]$ head 
parrot-BGI-Assemblathon2-k41-20110711.NetworkTest.txt
# average latency in microseconds (10^-6 seconds) when requesting a reply for a 
message of 4000 bytes
# Message passing interface rank        Name    Latency in microseconds
0       r104-n13        152
1       r104-n13        155
2       r104-n13        155
3       r104-n13        155
4       r104-n13        155
5       r104-n13        154
6       r104-n13        154
7       r104-n13        155



> Command:
> /work/01301/mmacmane/Ray-1.6.0/build/Ray -k 41 \
> -i 
> /scratch/01301/mmacmane/spider_genome/clean.reads/ray.reads/091223HHMI_Tg_hfs1_36PE_270_s_7_12_sequence.fq.trimmed.fastq
>  \
> -i 
> /scratch/01301/mmacmane/spider_genome/clean.reads/ray.reads/091223HHMI_Tg_hfs2_36PE_317_s_8_12_sequence.fq.trimmed.fastq
>  \
> -i 
> /scratch/01301/mmacmane/spider_genome/clean.reads/ray.reads/100503GA1_Tg_PJC1_76PE_334bp_s_3_12_sequence.fq.trimmed.fastq
>  \
> -i 
> /scratch/01301/mmacmane/spider_genome/clean.reads/ray.reads/100503GA1_Tg_PJC1_76PE_334bp_s_4_12_sequence.fq.trimmed.fastq
>  \
> -i 
> /scratch/01301/mmacmane/spider_genome/clean.reads/ray.reads/100503GA1_Tg_PJC1_76PE_334bp_s_5_12_sequence.fq.trimmed.fastq
>  \
> -i 
> /scratch/01301/mmacmane/spider_genome/clean.reads/ray.reads/100527GA2_Tg_PJC2_101PE_384bp_s_1_12_sequence.fq.trimmed.fastq
>  \
> -i 
> /scratch/01301/mmacmane/spider_genome/clean.reads/ray.reads/100527GA2_Tg_PJC2_101PE_384bp_s_2_12_sequence.fq.trimmed.fastq
>  \
> -i 
> /scratch/01301/mmacmane/spider_genome/clean.reads/ray.reads/100819GA2_Tg_PJC1_76PE_334bp_s_6_12_sequence.fq.trimmed.fastq
>  \
> -i 
> /scratch/01301/mmacmane/spider_genome/clean.reads/ray.reads/100820GA3_Tg_PJC2_101PE_384bp_s_3_12_sequence.fq.trimmed.fastq
>  \
> -i 
> /scratch/01301/mmacmane/spider_genome/clean.reads/ray.reads/110324HS1A_Tg_PJC09_100PE_455bp_s_7_12_sequence.fq.trimmed.fastq
>  \
> -i 
> /scratch/01301/mmacmane/spider_genome/clean.reads/ray.reads/110324HS1A_Tg_PJC10_100PE_526bp_s_8_12_sequence.fq.trimmed.fastq
>  \
> -i 
> /scratch/01301/mmacmane/spider_genome/clean.reads/ray.reads/110331HS2A_Tg_PJC11_100PE_923bp_s_5_12_sequence.fq.trimmed.fastq
>  \
> -i 
> /scratch/01301/mmacmane/spider_genome/clean.reads/ray.reads/110331HS2A_Tg_PJC12_100PE_1213bp_s_6_12_sequence.fq.trimmed.fastq
>  \
> -s 
> /scratch/01301/mmacmane/spider_genome/clean.reads/ray.reads/100506GA2_Tg_PJC2_101PE_384bp_s_8_sequence.fq.fastq
>  \
> -s 
> /scratch/01301/mmacmane/spider_genome/clean.reads/ray.reads/110316HS3B_Tg_PJC11_100PE_923bp_s_5_sequence.fq.fastq
>  \
> -s 
> /scratch/01301/mmacmane/spider_genome/clean.reads/ray.reads/110316HS3B_Tg_PJC12_100PE_1213bp_s_6_sequence.fq.fastq
>  \
> -show-memory-usage -o /scratch/01301/mmacmane/spider.runs/Rayk41/k41
>

Tip: you can add comment in your bash submission script:

`: library DTAAPE, Illumina sequences from BGI, insert size: 10000` \
-p \
   
/rap/nne-790-ab/Datasets/Assemblathon-2-Bird/BGI_illumina_data/PARprgDAADTAAPE/110514_I263_FC81P81ABXX_L5_PARprgDAADTAAPE_1.fq.fastq
 \
   
/rap/nne-790-ab/Datasets/Assemblathon-2-Bird/BGI_illumina_data/PARprgDAADTAAPE/110514_I263_FC81P81ABXX_L5_PARprgDAADTAAPE_2.fq.fastq
 \


`: library DTAAPE, Illumina sequences from BGI, insert size: 10000` is a 
comment.

 
> Thanks, Matt
> ____________________________________________
> Matthew MacManes, Ph.D.
> University of California- Berkeley
> Museum of Vertebrate Zoology
> Phone: 510-495-5833
> Lab Website: http://ib.berkeley.edu/labs/lacey
> Personal Website: http://macmanes.com/
> 
> 

                                                     Sébastien

http://github.com/sebhtml
------------------------------------------------------------------------------
All of the data generated in your IT infrastructure is seriously valuable.
Why? It contains a definitive record of application performance, security 
threats, fraudulent activity, and more. Splunk takes this data and makes 
sense of it. IT sense. And common sense.
http://p.sf.net/sfu/splunk-d2d-c2
_______________________________________________
Denovoassembler-users mailing list
Denovoassembler-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/denovoassembler-users

[Denovoassembler-users] RE : The system is out of memory

Reply via email to