On Thu, Dec 14, 2017 at 1:57 PM, John Marshall
<john.w.marsh...@glasgow.ac.uk> wrote:
> On 15 Dec 2017, Brent Pedersen <bpede...@gmail.com> wrote:
>> With bam/tabix, we can recognize the stats bin by its bin number, 37450.
>>
>> It is not documented how to find this bin for CSI, which can have real
>> data in bin 37450.
>
> There exists a draft of a fleshed-out CSI document [1], but alas it still
> needs to be rescued from the back burner. CSI was introduced in HTSlib, so
> the appropriate pseudo-bin number for CSI can be gleaned from the HTSlib
> source code; the relevant information from that draft is also given below.
>
>> What is the way to find it?
>
> The information and layout inside the pseudo-bin is the same as in BAI, and 
> it appears as bin number bin_limit+1, where bin_limit() is the function 
> below. This is a generalisation of BAI's 37450, so this calculation produces 
> the right bin number for BAI too. The "+1" is an accident of history; there 
> was one single slot left vacant between the largest populatable BAI bin 
> number (37448) and 37450, but there's no particular discernible reason for 
> this and it doesn't affect anything in practice.
>
>     John
>
>
> /* calculate the bin number limit -- valid bin numbers range within
> [0,bin_limit); min_shift does not affect the result */
> int bin_limit(int min_shift, int depth)
> {
>     return ((1 << ((depth+1)*3)) - 1) / 7;
> }
>
> [1] https://sourceforge.net/p/samtools/mailman/message/33475986/


Thanks for the clarification.
-Brent
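
For anyone finding this thread later, here is a minimal sketch of the calculation John describes. It assumes the standard BAI/tabix depth of 5. The names csi_bin_limit() and csi_stats_bin() are made up for illustration and are not HTSlib functions, and the assert is only a guard I added so the shift stays within a 32-bit int.

```c
#include <assert.h>

/* One past the largest valid bin number for an index of the given depth.
 * Same formula as the quoted bin_limit(), without the unused min_shift.
 * It is the total bin count over all levels: 1 + 8 + ... + 8^depth. */
static int csi_bin_limit(int depth)
{
    /* assumption: int is at least 32 bits, so (depth+1)*3 must be <= 30 */
    assert(depth >= 0 && depth <= 9);
    return ((1 << ((depth + 1) * 3)) - 1) / 7;
}

/* Hypothetical helper: bin number of the statistics pseudo-bin,
 * i.e. bin_limit + 1 as described in the quoted message. */
static int csi_stats_bin(int depth)
{
    return csi_bin_limit(depth) + 1;
}
```

With depth 5 this gives a limit of 37449 and a pseudo-bin number of 37450, matching BAI. A CSI index with depth 6 would have its pseudo-bin at 299594 instead.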
