I have the same concern: samtools sort uses much more RAM than the requested 
size. I guess this is because samtools is reserving excessive RAM to reduce 
malloc() calls. It would be good to improve this, for example by periodically 
freeing unused memory.

I sometimes use sambamba for sorting. It also uses more RAM than the requested 
size, but not as much.
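
As a rough sanity check on the numbers in the report quoted below, here is a small Python sketch of the arithmetic. It assumes the VSZ figure is in KiB (as ps normally reports it) and that "-m 5G" means 5 GiB per thread. The helper suggested_limit_gib and its 10% margin are hypothetical, for illustration only; a single observed ratio is not a guarantee for other inputs.

```python
# Figures quoted in the original report; nothing here is measured by this script.
THREADS = 4                    # samtools sort -@ 4
MEM_PER_THREAD_GIB = 5         # samtools sort -m 5G (assumed to mean GiB)
OBSERVED_VSZ_KIB = 30441316    # reported VSZ (assumed to be in KiB)


def gib_from_kib(kib):
    """Convert KiB to GiB."""
    return kib / (1024 * 1024)


def suggested_limit_gib(threads, mem_per_thread_gib, ratio, margin=1.1):
    """Hypothetical helper: scale the requested total by an observed
    overhead ratio, then add a safety margin (10% by default)."""
    return threads * mem_per_thread_gib * ratio * margin


requested_gib = THREADS * MEM_PER_THREAD_GIB
observed_gib = gib_from_kib(OBSERVED_VSZ_KIB)
overhead_ratio = observed_gib / requested_gib
limit_gib = suggested_limit_gib(THREADS, MEM_PER_THREAD_GIB, overhead_ratio)

print(f"requested total: {requested_gib} GiB")
print(f"observed VSZ:    {observed_gib:.1f} GiB")
print(f"overhead ratio:  {overhead_ratio:.2f}")
print(f"limit to try:    {limit_gib:.1f} GiB")
```

Under those assumptions the observed VSZ comes to roughly 1.45 times the requested total, which matches the "about 150%" estimate in the report.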

Heng

On Jan 19, 2015, at 7:52, Devon Ryan <dpr...@dpryan.com> wrote:

> Hi Matei,
> 
> Just as a point of reference, the -m option sets an approximate size limit on 
> the memory needed to hold the alignments that are to be sorted. That doesn't 
> include the memory needed to do the actual sorting, which depends on how many 
> alignments it took to reach the -m limit. I would guess that it's this 
> additional memory that's causing the problem (someone would have to look into 
> how the actual merge sort is implemented to see if the additional 50% is 
> reasonable).
> 
> Devon
> 
> --
> Devon Ryan, Ph.D.
> Email: dpr...@dpryan.com
> Laboratory for Molecular and Cellular Cognition
> German Centre for Neurodegenerative Diseases (DZNE)
> Ludwig-Erhard-Allee 2
> 53175 Bonn
> Germany
> 
> On Fri, Jan 16, 2015 at 6:35 PM, Matei David <ma...@cs.toronto.edu> wrote:
> Hi,
> 
> I'm using samtools 1.0, and I'm seeing a process started with
> "samtools sort -@ 4 -m 5G ..."
> reach VSZ 30441316 (and get killed by SGE or the kernel because of
> reaching ulimit). 4 threads x 5G should be 20G at most, right?
> 
> The samtools sort documentation mentions that "-m" is "approximately"
> max mem per thread. I'm not sure what to make of that, is there a
> range we could expect? In my case, the approximation seems to be quite
> far off: 30/4 = 7.5, which is 150% of what I specified as max.
> 
> If that might matter, the reads I'm dealing with are quite big (>4Kbp
> average length), could that be causing problems with RAM usage
> estimation?
> 
> What h_vmem should I ask for to safely sort with "-@ 4 -m 5G"?
> 
> Thanks,
> Matei
> 
> ------------------------------------------------------------------------------
> New Year. New Location. New Benefits. New Data Center in Ashburn, VA.
> GigeNET is offering a free month of service with a new server in Ashburn.
> Choose from 2 high performing configs, both with 100TB of bandwidth.
> Higher redundancy.Lower latency.Increased capacity.Completely compliant.
> http://p.sf.net/sfu/gigenet
> _______________________________________________
> Samtools-help mailing list
> Samtools-help@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/samtools-help
> 

