I've loaded SNP130 into a local database (thank you very much for the data 
files, etc.) and have some questions about the data.

To start, my understanding is that chromosome positions are [start, end), i.e. 
from start (inclusive) to end (exclusive).  Or, to put it another way, if start 
= 5 and end = 6, then you have a 1 bp feature at position 5.

No?

Because I got these results from some searches:
mysql> select count(*) from snp130 where chromStart = chromEnd;
+----------+
| count(*) |
+----------+
| 2,632,502|
+----------+

mysql> select count(*) from snp130 where chromStart = chromEnd - 1;
+----------+
| count(*) |
+----------+
|15,322,316|
+----------+

The fact that there are roughly 6x as many SNPs where chromEnd - chromStart = 1 
says to me that my understanding is correct, but that leaves me wondering why 
there are 2.6 million "SNPs" that don't cover any bases.
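
To make my reading of the coordinates concrete, here is a small Python sketch 
of how I am interpreting them.  The function names are just mine for 
illustration, and the whole thing assumes the 0-based, half-open [start, end) 
convention I described above, which is exactly what I'm asking you to confirm.

```python
def feature_length(chrom_start: int, chrom_end: int) -> int:
    """Bases covered by an interval, ASSUMING 0-based half-open [start, end)."""
    if chrom_end < chrom_start:
        raise ValueError("chromEnd must not be less than chromStart")
    return chrom_end - chrom_start


def covered_positions(chrom_start: int, chrom_end: int) -> list:
    """0-based positions covered by the interval under the same assumption."""
    return list(range(chrom_start, chrom_end))


# start = 5, end = 6: a 1 bp feature at position 5
print(feature_length(5, 6), covered_positions(5, 6))

# start = end: under this reading the feature covers no bases at all
print(feature_length(5, 5), covered_positions(5, 5))
```

Under this reading, every row counted by the first query above has a length 
of zero, which is the part I can't make sense of.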

Also, IIRC, the first base of a chromosome is base 0, yes?

TIA,

Greg
_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
