I am implementing a parser that reads the BAM file in pairs: whenever I 
read a record where pos < mpos, I search ahead for its mate and create a 
pair structure.
Once I have found the mate, I have to roll back to the second read and 
continue building the pairs.
I could instead keep a circular array of BAM records, fill it 
sequentially and build the pairs from it, but I cannot be 100% sure that 
the array is large enough to contain the mate record, which might be 
very far away in the sorted BAM.
Do you see what I mean?
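
To make the trade-off concrete, here is a minimal sketch of a different 
approach from the seek/rollback one described above: a single forward pass 
that parks each unmatched read in a dictionary keyed by read name until its 
mate shows up. Unlike a fixed-size circular array, the dictionary grows as 
needed, so the distance to the mate does not matter. The Read tuple and the 
pair_reads function are hypothetical names for illustration, not part of 
htslib, and the sketch assumes only primary alignments are passed in (so 
each read name occurs at most twice).

```python
from collections import namedtuple

# Hypothetical stand-in for a BAM record; a real parser would fill these
# fields from each alignment (read name, position, mate position).
Read = namedtuple("Read", ["qname", "pos", "mpos"])


def pair_reads(records):
    """Pair mates in one forward pass, without seeking backwards.

    Assumes `records` contains only primary alignments, so a read name
    appears at most twice. Returns (pairs, orphans).
    """
    pending = {}  # qname -> first-seen read still waiting for its mate
    pairs = []
    for rec in records:
        first = pending.pop(rec.qname, None)
        if first is None:
            # First time we see this name: park it until the mate arrives.
            pending[rec.qname] = rec
        else:
            # Mate found: emit the pair; the entry is already removed.
            pairs.append((first, rec))
    # Anything left never met its mate in this input.
    return pairs, list(pending.values())


reads = [
    Read("a", 100, 500),
    Read("b", 150, 200),
    Read("b", 200, 150),
    Read("c", 300, 900),
    Read("a", 500, 100),
]
pairs, orphans = pair_reads(reads)
```

Note that pairs come out in the order in which the second mate is 
encountered, not in the order of the first read, and that memory use grows 
with the number of reads still waiting for a mate.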
cheers,
Claudio



On 07/12/2015 19:12, James Bonfield wrote:
> On Mon, Dec 07, 2015 at 06:03:45PM +0100, Claudio Alberti wrote:
>> it seems that bgzf_seek and bgzf_tell work better even if I get a
>> crash after about 43000 reads...
> "Better" isn't good though if it still crashes.
>
> Fundamentally, if you're using bgzf_tell/seek then you have to do all
> I/O at the bgzf level and never call something on the higher level
> hts_file API.  This means losing some other functionality.
>
> What is it you're trying to achieve here?
>
> I guess the reason this isn't a problem for most people is that
> tell/seek isn't that useful for most applications.  Yes it offers a
> way of going to specific file locations, but usually tools want
> specific genomic locations instead.  Eg fetch me all the data for this
> gene or split by chromosome.
>
> These are dealt with using iterators and the bam/cram indices.
> They're well tested and used by many samtools components, unlike
> tell/seek which are more internal things.
>
> I can see some benefits to the file level slicing too (eg in some
> parallel processing techniques), so I think it's something we need to
> get working robustly, but not at top priority right now.
>
> James
>

-- 
Claudio Alberti
----------------------------------------------
EPFL SCI STI MM
ELG 140 (ELG Building)
Station 11
CH-1015 Lausanne - Switzerland
Tel. +41 21 6936869
----------------------------------------------


_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help
