Re: [Genome] Regarding bin calculation

Hiram Clawson Tue, 19 Jul 2011 14:15:45 -0700

Good Afternoon Eli:

The bin calculations use the 0-based, half-open coordinate system
(the standard UCSC start/end convention).
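
As an illustration of what "0-based" means for the bin assigned to a single
feature, here is a minimal Python sketch of the standard bit-shifting scheme.
It is not the browser's C code. The function name bin_from_range, the offsets,
and the shift values are assumptions based on the published standard binning
scheme, which only covers coordinates up to 512 Mb.

```python
# Assumed constants of the standard binning scheme (not copied from this thread).
BIN_OFFSETS = [512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0]  # finest level first
BIN_FIRST_SHIFT = 17        # finest bins span 2**17 bases (128 kb)
BIN_NEXT_SHIFT = 3          # each coarser level is 8 times larger
MAX_END = 512 * 1024 * 1024  # upper limit of the standard scheme


def bin_from_range(start, end):
    """Return the smallest bin that fully contains the range [start, end).

    start is 0-based; end is exclusive (half-open), so end - 1 is the
    last base actually covered by the feature.
    """
    if start < 0 or end <= start or end > MAX_END:
        raise ValueError("range is outside the standard binning scheme")
    start_bin = start >> BIN_FIRST_SHIFT
    end_bin = (end - 1) >> BIN_FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    raise ValueError("range does not fit in any bin")


print(bin_from_range(0, 131072))       # 585: ends exactly on a bin boundary
print(bin_from_range(131071, 131073))  # 73: crosses the boundary, next level up
```

Because the end is exclusive, a feature ending exactly at 131072 stays in the
finest-level bin 585. If 1-based inclusive values were passed unchanged, features
near bin boundaries could land in a different bin.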


The browser has a C function, hAddBinToQuery(), in src/hg/lib/hdb.c
that generates SQL statements with the bin calculations included.
You can see these statements with the test exerciser program
in src/hg/lib/tests/binTest.c

For example:
$ ./bin/x86_64/binTest -verbose=3 -start=1000000 -end=2000000 -sqlOnly
# (1000000, 2000000):
sql:"(bin>=592 and bin<=600 or bin>=73 and bin<=74 or bin=9 or bin=1 or bin=0 or bin=4681 ) and ",

You can experiment with the test program.  See also the discussion
in the src/hg/lib/tests/binTest.sh script file.
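
For readers without a source tree at hand, the clause shown above can be
reproduced with a short Python sketch. This is a hedged re-creation, not the
code of hAddBinToQuery(): the function name bin_clause, the offsets, and the
shift values are assumptions based on the standard binning scheme, and the
extra bin 4681 is simply taken from the binTest output above. It applies only
to ranges below 512 Mb.

```python
# Assumed constants of the standard binning scheme (not copied from hdb.c).
BIN_OFFSETS = [512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0]  # finest level first
BIN_FIRST_SHIFT = 17    # finest bins span 2**17 bases (128 kb)
BIN_NEXT_SHIFT = 3      # each coarser level is 8 times larger
EXTRA_BIN = 4681        # additional bin seen in the binTest output


def bin_clause(start, end, field="bin"):
    """Build an SQL fragment selecting every bin that may hold a feature
    overlapping the 0-based, half-open range [start, end)."""
    start_bin = start >> BIN_FIRST_SHIFT
    end_bin = (end - 1) >> BIN_FIRST_SHIFT
    terms = []
    for offset in BIN_OFFSETS:
        low, high = start_bin + offset, end_bin + offset
        if low == high:
            terms.append(f"{field}={low}")
        else:
            terms.append(f"{field}>={low} and {field}<={high}")
        # Move up one level: each coarser bin covers 8 finer bins.
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    terms.append(f"{field}={EXTRA_BIN}")
    return "(" + " or ".join(terms) + ")"


print(bin_clause(1000000, 2000000))
# (bin>=592 and bin<=600 or bin>=73 and bin<=74 or bin=9 or bin=1 or bin=0 or bin=4681)
```

The bin test only narrows the candidate rows. A query built this way would
still need the usual overlap condition on the start and end columns.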

--Hiram

Eli Roberson wrote:
> The bin field is extremely useful for indexing purposes. I have found 
> the appropriate documentation and code for bit-shifting bin 
> calculations. I have two questions regarding bin implementation.
> 
> [1] Are the bin calculations in the UCSC genome-browser based on the 
> standard 0-based UCSC start position, half-open? Or are they calculated 
> based on 1-based fields, inclusive? I am using this field to intersect 
> with a custom field that uses bin and therefore the difference is 
> important. The code subtracting 1 from the end position makes me think 
> the coordinates are half-open, but I am still unsure of 0-base versus 
> 1-base.
> 
> [2] How is the calculation of which bins to search determined at query 
> time for the UCSC genome browser, i.e. is there a join of bin to a bin 
> table (or a subquery), is there an SQL function / stored proc to return 
> possible bins, or is something done at the level of AJAX/PHP to decide 
> which bins to look in for overlapping features?
> 
> Thanks for the clarifications.
> 
> Eli Roberson
_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
