Hi Johannes,

The sort functions aren't in the header, the main sort function isn't
really constructed in a way to be conducive to use in a library (i.e., you
can't just give it an htsFile* and have it run, you basically give it the
command line options for "samtools sort").

Doing the sorting isn't actually that difficult. You can either look at how
things are done in bam_sort.c within samtools or instead just look at a
slimmed down take on that that I use here:
https://github.com/dpryan79/bison/blob/master/sort.c There are some small
differences between "samtools sort" and the one in the link, but you'll get
the gist either way.

The general idea is to copy a bunch of alignments to a heap (ideally
keeping track of its size, unless you know that you have sufficient RAM),
sort the heap (possibly using multiple threads, with each thread just
sorting a subset of the heap), write that to a temp file, and finally merge
the temp files. You could also do all of that in memory, presuming you have
enough given the size of your BAM/CRAM files.

Best,
Devon


--
Devon Ryan, Ph.D.
Email: dpr...@dpryan.com
Laboratory for Molecular and Cellular Cognition
German Centre for Neurodegenerative Diseases (DZNE)
Ludwig-Erhard-Allee 2
53175 Bonn
Germany
<devon.r...@dzne.de>

On Thu, Jan 29, 2015 at 4:17 PM, Johannes Helmuth <helm...@molgen.mpg.de>
wrote:

> Dear list,
>
> I would like to programmatically access the samtools library for a C++
> program. Unfortunately, I
> cannot figure out how to sort a bam file. Indeed, bam.h states that it
> offers ways for sorting but I
> did not find a matching function:
>
> ------------------------------------------------------------------------
> //streams
> samfile_t *in = 0, *out = 0;
>
> //open samfile for reading
> const char* csampath = sampath.c_str();
> in = samopen(csampath, "r", 0);
> if (in == 0) {
>         stop("Fail to open SAM file " + sampath);
> }
>
> //open bamfile for writing
> const char* cbampath = bampath.c_str();
> out = samopen(cbampath, "wb", in->header);
> if (out == 0) {
>         stop("Fail to open BAM file ." + bampath);
> }
>
> //read sam and write to bam: adapted from sam_view.c:232-244
> bam1_t *b = bam_init1();
> while (samread(in, b) >= 0) { // read one alignment from `in'
>         samwrite(out, b); // write the alignment to `out'
> }
> bam_destroy1(b);
>
> //close streams
> samclose(in);
> samclose(out);
>
> //sort bam and build the index
> //bam_sort(); //
> bam_index_build(bampath.c_str()); //fails if bam is not correctly sorted
>
> ------------------------------------------------------------------------
>
> Thanks in advance,
> --
> Dipl.-Bioinf. Johannes Helmuth, PhD Student
> OWL Epigenomics
> Max Planck Institute for Molecular Genetics
> Ihnestraße 63-73
> D-14195 Berlin, Germany
> Fon: (+4930) 8413 1173
> Web: http://www.molgen.mpg.de/22701/Computational_Epigenomics
>
>
> ------------------------------------------------------------------------------
> Dive into the World of Parallel Programming. The Go Parallel Website,
> sponsored by Intel and developed in partnership with Slashdot Media, is
> your
> hub for all things parallel software development, from weekly thought
> leadership blogs to news, videos, case studies, tutorials and more. Take a
> look and join the conversation now. http://goparallel.sourceforge.net/
> _______________________________________________
> Samtools-help mailing list
> Samtools-help@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/samtools-help
>
------------------------------------------------------------------------------
Dive into the World of Parallel Programming. The Go Parallel Website,
sponsored by Intel and developed in partnership with Slashdot Media, is your
hub for all things parallel software development, from weekly thought
leadership blogs to news, videos, case studies, tutorials and more. Take a
look and join the conversation now. http://goparallel.sourceforge.net/
_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help

Reply via email to