Can someone explain to me the standards for denoting intervals in UCSC  
BED format?  I'm struggling with what seem to me to be inconsistencies:

1. From the UCSC FAQ describing the BED format
(http://genome.ucsc.edu/FAQ/FAQformat#format1), it sounds as if
intervals are LEFT-CLOSED/RIGHT-OPEN.  For instance, a feature spanning
bases 0-99 (inclusive) is denoted chromStart 0, chromEnd 100:
"The first three required BED fields are:

chrom - The name of the chromosome (e.g. chr3, chrY, chr2_random) or  
scaffold (e.g. scaffold10671).
chromStart - The starting position of the feature in the chromosome or  
scaffold. The first base in a chromosome is numbered 0.
chromEnd - The ending position of the feature in the chromosome or  
scaffold. The chromEnd base is not included in the display of the  
feature. For example, the first 100 bases of a chromosome are defined  
as chromStart=0, chromEnd=100, and span the bases numbered 0-99."

2. On the other hand, when I download lists of exon start/stop
positions from hg18>UCSC Genes>knownGene>Exons in BED format, the
resulting intervals appear to be the opposite: LEFT-OPEN, RIGHT-CLOSED.
Here is a representative entry.  Looking at the browser, the actual
exon encompasses bases 2476-2584:
chr1    2475    2584    uc001aaa.2_exon_1_0_chr1_2476_f 0       +

3. I wondered if it was a strand issue, but here is an entry on the
minus (-) strand, which is also LEFT-OPEN, RIGHT-CLOSED:
chr1    4832    4901    uc001aab.2_exon_1_0_chr1_4833_r 0       -

4. Finally, what convention does dbSNP use?  Many entries seem to use
closed-interval notation, e.g. rs5887673, listed at
chr7:133624600-133624603.
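
For concreteness, here is a minimal Python sketch of the arithmetic I am
trying to reconcile.  It assumes (my assumption, not something the FAQ
text quoted above states) that the positions shown in the browser, and
the numbers embedded in the exon names, are 1-based and fully closed.
The function names are hypothetical, made up for illustration only.

```python
def bed_to_one_based_closed(chrom_start, chrom_end):
    """Convert a 0-based, half-open BED interval [chrom_start, chrom_end)
    to a 1-based, closed interval [first, last].

    ASSUMPTION: the target display convention is 1-based and closed.
    """
    return chrom_start + 1, chrom_end


def one_based_closed_to_bed(first, last):
    """Inverse conversion: 1-based closed [first, last] to BED fields."""
    return first - 1, last


def bed_length(chrom_start, chrom_end):
    """Number of bases covered by a half-open BED interval."""
    return chrom_end - chrom_start


# FAQ example: the first 100 bases of a chromosome.
print(bed_to_one_based_closed(0, 100), bed_length(0, 100))

# The two exon entries from items 2 and 3.
print(bed_to_one_based_closed(2475, 2584), bed_length(2475, 2584))
print(bed_to_one_based_closed(4832, 4901), bed_length(4832, 4901))
```

If that assumption holds, the first exon converts to 2476-2584, which
matches both the browser and the "2476" in its name field, so both
readings would describe the same 109 bases.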

Thanks for any help.

Tim
_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
