On 15 Dec 2017, Brent Pedersen <bpede...@gmail.com> wrote:
> With bam/tabix, we can recognize the stats bin in index 37450
>
> It is not documented how to find this bin for CSI which can have real
> data in 37450.

There exists a draft of a fleshed-out CSI document [1], but alas it still needs to be rescued from the back burner. CSI was introduced in HTSlib, and the appropriate number for this pseudo-bin in CSI can be gleaned from the HTSlib source code; alternatively, the relevant information from that draft is below.

> What is the way to find it?

The information and layout inside the pseudo-bin are the same as in BAI, and it appears as bin number bin_limit(min_shift, depth) + 1, where bin_limit() is the function below. This is a generalisation of BAI's 37450, so the calculation produces the right bin number for BAI too (depth 5 gives bin_limit = 37449, hence 37450).

The "+1" is an accident of history: a single slot (37449) was left vacant between the largest populatable BAI bin number (37448) and 37450. There is no discernible reason for this, and it doesn't affect anything in practice.

    John

/* calculate maximum bin number -- valid bin numbers range within [0,bin_limit) */
/* (min_shift is accepted for symmetry but does not affect the result) */
int bin_limit(int min_shift, int depth)
{
    return ((1 << ((depth + 1) * 3)) - 1) / 7;
}

[1] https://sourceforge.net/p/samtools/mailman/message/33475986/
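
For anyone wanting to check the arithmetic, here is a minimal sketch of the pseudo-bin calculation described above. The names csi_bin_limit() and csi_meta_bin() are hypothetical and are not HTSlib API; the min_shift argument is dropped because it plays no part in the formula, and a 64-bit type is used purely as a precaution against overflow at larger depths.

```c
#include <stdint.h>

/* Number of bins in a binning index with `depth` levels below the root:
 * 1 + 8 + 8^2 + ... + 8^depth = (8^(depth+1) - 1) / 7.
 * Valid bin numbers lie in [0, csi_bin_limit(depth)).
 * Hypothetical helper name, same formula as bin_limit() in the message. */
static uint64_t csi_bin_limit(int depth)
{
    return ((UINT64_C(1) << ((depth + 1) * 3)) - 1) / 7;
}

/* Bin number of the statistics pseudo-bin: bin_limit + 1, as stated in
 * the message. Hypothetical helper name. */
static uint64_t csi_meta_bin(int depth)
{
    return csi_bin_limit(depth) + 1;
}
```

With depth 5 (the BAI layout) csi_bin_limit() gives 37449 and csi_meta_bin() gives 37450; with depth 6 the pseudo-bin would be 299594. A reader of a CSI file would take depth from the index header and treat the bin with that number as metadata rather than as a real bin.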