On Thu, Dec 14, 2017 at 1:57 PM, John Marshall <john.w.marsh...@glasgow.ac.uk> wrote:
> On 15 Dec 2017, Brent Pedersen <bpede...@gmail.com> wrote:
>> With bam/tabix, we can recognize the stats bin in index 37450
>>
>> It is not documented how to find this bin for CSI which can have real
>> data in 37450.
>
> There exists a draft of a fleshed-out CSI document [1], but alas it still
> needs to be rescued from the back burner. CSI was introduced in HTSlib and
> the appropriate bin number for these bins in CSI can be gleaned from the
> HTSlib source code, or the relevant information from that draft is below.
>
>> What is the way to find it?
>
> The information and layout inside the pseudo-bin is the same as in BAI, and
> it appears as bin number bin_limit+1, where bin_limit() is the function
> below. This is a generalisation of BAI's 37450, so this calculation produces
> the right bin number for BAI too. The "+1" is an accident of history; there
> was one single slot left vacant between the largest populatable BAI bin
> number (37448) and 37450, but there's no particular discernible reason for
> this and it doesn't affect anything in practice.
>
>     John
>
>
> /* calculate maximum bin number -- valid bin numbers range within
>    [0,bin_limit) */
> int bin_limit(int min_shift, int depth)
> {
>     return ((1 << (depth+1)*3) - 1) / 7;
> }
>
> [1] https://sourceforge.net/p/samtools/mailman/message/33475986/

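For the archives, here is a minimal sketch of how I read the calculation above. bin_limit() is copied from John's message with extra parentheses only; pseudo_bin() and is_pseudo_bin() are names made up for illustration and are not HTSlib functions. It assumes depth is small enough that the shift fits in an int.

```c
/* Copied from the message above. Valid bin numbers lie in [0, bin_limit).
   min_shift does not enter the calculation; it is kept only so the
   signature matches the quoted function. */
int bin_limit(int min_shift, int depth)
{
    (void)min_shift;
    return ((1 << ((depth + 1) * 3)) - 1) / 7;
}

/* Hypothetical helper: the stats pseudo-bin sits at bin_limit + 1. */
int pseudo_bin(int min_shift, int depth)
{
    return bin_limit(min_shift, depth) + 1;
}

/* Hypothetical helper: non-zero if `bin` is the pseudo-bin for this index. */
int is_pseudo_bin(int bin, int min_shift, int depth)
{
    return bin == pseudo_bin(min_shift, depth);
}
```

With the BAI parameters (min_shift 14, depth 5) bin_limit() is 37449 and pseudo_bin() is 37450, matching the numbers in the message. With depth 6 the pseudo-bin moves to 299594 and 37450 is an ordinary bin, which is the case I was asking about.
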
Thanks for the clarification.
-Brent