Hello Ryan,

I'm in the exact same situation with my bowtie/tophat tools,
going back and forth between outputting a SAM, sorted SAM, BAM or sorted BAM,
and I'm still not sure what the best method is.

Storage-wise - you're correct, just saving the sorted BAM is the best (even 
more so given that processing SAM files as text is so horrendous that I 
think almost no tool uses them directly, always requiring intervals or sorted 
BAM).
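
For reference, the pipe suggested in the original message could look roughly
like the sketch below. The function name map_to_bam and the file names in the
usage comment are made up for illustration; the only real tool call is
"samtools view -bS -", which reads SAM from stdin and writes BAM to stdout.
The result is an unsorted BAM, so the sort step still has to happen afterwards.

```shell
# Sketch: run a mapper and convert its SAM output to BAM on the fly,
# so that no intermediate SAM file is ever written to disk.
# Hypothetical usage: map_to_bam out.bam bwa samse ref.fa reads.sai reads.fq
map_to_bam() {
    out_bam=$1
    shift
    # "$@" is the mapper command line; it must write SAM to stdout.
    # -b: write BAM, -S: input is SAM (needed by older samtools), -: read stdin.
    "$@" | samtools view -bS - > "$out_bam"
}
```

Note that the exit status of a plain pipe is that of the last command, so a
wrapper built this way would need extra care to notice a failing mapper.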

But one annoyance (for me) is that samtools (the program) is very inefficient 
- it uses only a single thread (and the sort part isn't doing a great job at 
that).

So if I give the "mapping" tool as a whole 20 threads or more, and a part of 
the running time (the samtools sort part) is only using a single thread - I'm 
wasting the other threads, as they sit idle waiting for the sort to finish.

I also tried sorting the SAM file directly, using GNU sort (version 8.10 can 
use multiple threads, and the memory management actually works, as opposed to 
"samtools sort -m") - but I'm not sure it's worth the effort.
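
A minimal sketch of what sorting the SAM text directly can look like, assuming
GNU sort with --parallel and -S support. The function name sort_sam and the
default thread and memory values are placeholders. One caveat: this orders
reference names bytewise, which is not necessarily the order of the @SQ lines
in the header that samtools follows.

```shell
# Sketch: coordinate-sort a SAM file with GNU sort, keeping the header on top.
# Arguments: input SAM, output SAM, threads (default 4), sort buffer (default 1G).
sort_sam() {
    in_sam=$1
    out_sam=$2
    threads=${3:-4}
    mem=${4:-1G}
    {
        # Header lines start with '@' and are copied through unsorted.
        grep '^@' "$in_sam" || true
        # Alignment records: sort by reference name (column 3),
        # then numerically by leftmost position (column 4).
        grep -v '^@' "$in_sam" |
            LC_ALL=C sort -t "$(printf '\t')" -k3,3 -k4,4n \
                --parallel="$threads" -S "$mem"
    } > "$out_sam"
}
```

Something like "sort_sam in.sam sorted.sam 20 8G" would then use up to 20
threads and an 8 GB buffer, with the output still needing a conversion to BAM.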

I didn't find an optimal solution that I like, and I'm interested to hear what 
others think.

-gordon

Ryan Golhar wrote, On 04/05/2011 01:08 PM:
> Hi all - I find it redundant to hold on to SAM output from NGS
> Mapping tools when I end up converting the SAM files to BAM
> files anyway. The cleanup scripts require the history items to be
> deleted, but I don't want to delete them yet as I want the entire
> workflow to be kept until we are done analyzing our data.
> 
> So, I was thinking of a way to remove the intermediate SAM files and
> thought how I would do this on the command line...simply pipe the
> output of BWA to samtools to create a BAM file and never have a SAM
> file to deal with.
> 
> The BWA tool runner can be modified to pipe BWA output directly to
> samtools so a SAM file is never physically stored on disk.  Has anyone
> done this?  Does this seem like a good idea?
> 
> Ryan
> 
