Hi Fauna,
samtools already has functionality for splitting and subsorting your
original BAM; you just don't know you're using it!
The `-m` parameter sets the maximum memory allocated to each sorting
thread. Once that limit is reached, samtools writes the sorted chunk to a
temporary file on disk, and at the end it merges all of the temporary
files into the final output. If you set this limit higher, fewer temporary
files are created, so you should get around having too many files open at
once. The error you've pasted tells me you've got at least 252 temporary
files open, so increasing the value of `-m` (by default it's 768M) to
something around 2G should reduce this considerably!
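
As a rough sketch of the arithmetic and the command, assuming one temporary
file per full `-m` buffer, a single sorting thread, and temporary files
numbered from 0000 (so tmp.0252 would be the 253rd file). The variable names
are only for illustration, and the real count depends on your data:

```shell
#!/bin/sh
# Back-of-the-envelope estimate, not an exact prediction.
old_m=768      # default -m, in MiB
new_m=2048     # proposed -m 2G, in MiB
old_files=253  # assumption: at least this many, since tmp.0252 failed to open

# The same amount of data split into larger chunks (rounded up).
new_files=$(( (old_files * old_m + new_m - 1) / new_m ))
echo "estimated temporary files with -m 2G: at least $new_files"

# Show the per-process open-file limit of the current shell for comparison.
echo "open-file limit: $(ulimit -n)"

# Run the sort only if samtools and the input file are actually present.
if command -v samtools >/dev/null 2>&1 && [ -f Sample_aligned.bam ]; then
    samtools sort -m 2G -o Sample_aligned_sorted.bam Sample_aligned.bam
fi
```

Keep in mind that `-m` applies per thread, so if you also pass `-@` the total
memory used is roughly the `-m` value multiplied by the number of threads.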

Sam

On Wed, Oct 31, 2018 at 10:02 PM Yarza, Fauna <fauna.ya...@ucsf.edu> wrote:

> Hi all,
>
> I am very new to working with sequencing data and have run into a
> problem when using SAMtools to sort some of my larger files. When running
> the following
>
> samtools sort Sample_aligned.bam > Sample_aligned_sorted.bam
>
> this error will be returned
>
> Failed to open file ./samtools.7711.2619.tmp.0252.bam samtools sort: fail
> to open "./samtools.7711.2619.tmp.0252.bam": Too many open files
>
> This error only occurs when trying to sort larger .bam files. I am working
> locally to avoid running on a cluster, and was wondering if there is a way
> to split the large .bam files that does not rely on chromosomal information
> (I do not have this information in my reference transcriptome). My initial
> idea is to split the original .bam file into 2, sort each file, and then
> concatenate them with the goal of preserving the sort. Is there a way to do
> this without creating additional problems downstream (example: not
> splitting the files by line number and instead using a different metric)?
>
> Best,
> Fauna
> _______________________________________________
> Samtools-help mailing list
> Samtools-help@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/samtools-help
>