Hi Chirag,

The items in the first category you listed are actually zero-length items. 
These represent point insertions into the reference genome according to 
dbSNP's mapping of flanking sequences. Besides the length of 0, the 
'class' and 'exceptions' columns may also be helpful in identifying 
these cases. For example:

mysql> select chrom,chromStart,chromEnd,name,score,strand,class,exceptions
       from snp135Common where name in ('rs11466671', 'rs34348110', 'rs33986159',
       'rs70937055', 'rs9657973', 'rs2307539');
+-------+------------+----------+------------+-------+--------+-----------+--------------------------------------+
| chrom | chromStart | chromEnd | name       | score | strand | class     | exceptions                           |
+-------+------------+----------+------------+-------+--------+-----------+--------------------------------------+
| chr1  |    1142403 |  1142403 | rs11466671 |     0 | -      | insertion |                                      |
| chr1  |    8099134 |  8099134 | rs2307539  |     0 | -      | insertion | NonIntegerChromCount                 |
| chr1  |    1147985 |  1147985 | rs33986159 |     0 | -      | mixed     | ObservedTooLong                      |
| chr1  |    1147978 |  1147978 | rs34348110 |     0 | -      | single    | SingleClassZeroSpan,ObservedMismatch |
| chr1  |    1986733 |  1986733 | rs70937055 |     0 | +      | insertion | NonIntegerChromCount                 |
| chr1  |    7996478 |  7996478 | rs9657973  |     0 | -      | insertion |                                      |
+-------+------------+----------+------------+-------+--------+-----------+--------------------------------------+

mysql> select chrom,chromStart,chromEnd,name,score,strand,class,exceptions
       from snp135Common where name in ('rs55998931', 'rs58108140', 'rs75454623',
       'rs71262674', 'rs71262673', 'rs75468675');
+-------+------------+----------+------------+-------+--------+--------+------------+
| chrom | chromStart | chromEnd | name       | score | strand | class  | exceptions |
+-------+------------+----------+------------+-------+--------+--------+------------+
| chr1  |      10491 |    10492 | rs55998931 |     0 | +      | single |            |
| chr1  |      10582 |    10583 | rs58108140 |     0 | +      | single |            |
| chr1  |      20303 |    20304 | rs71262673 |     0 | -      | single |            |
| chr1  |      20244 |    20245 | rs71262674 |     0 | -      | single |            |
| chr1  |      14929 |    14930 | rs75454623 |     0 | +      | single |            |
| chr1  |      33494 |    33495 | rs75468675 |     0 | +      | single |            |
+-------+------------+----------+------------+-------+--------+--------+------------+

Note that in the second category (the majority of SNPs), with length=1, 
the class is single and the exceptions column is often empty, which is good.
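
If you are working from a flat file rather than the MySQL table, the same 
distinction can be drawn from the coordinates alone, since chromStart is 
0-based and chromEnd is exclusive. The following is only a sketch in Python, 
assuming whitespace-separated BED-style lines like the ones in your message; 
span_length and split_by_length are illustrative names, not part of any 
UCSC tool:

```python
def span_length(chrom_start, chrom_end):
    """Length of a 0-based, half-open interval [chromStart, chromEnd)."""
    return chrom_end - chrom_start


def split_by_length(bed_lines):
    """Separate zero-length items (point insertions) from all other items.

    Returns two lists of rs names: (zero_length, other).
    """
    zero_length, other = [], []
    for line in bed_lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        start, end, name = int(fields[1]), int(fields[2]), fields[3]
        if span_length(start, end) == 0:
            zero_length.append(name)
        else:
            other.append(name)
    return zero_length, other


# A few of the rows quoted in the original message.
bed_lines = [
    "chr1  1142403 1142403 rs11466671      0       -",
    "chr1  1147978 1147978 rs34348110      0       -",
    "chr1  10491   10492   rs55998931      0       +",
    "chr1  10582   10583   rs58108140      0       +",
]

zero_length, other = split_by_length(bed_lines)
print("zero-length:", zero_length)
print("other:", other)
```

Under this reading, chr1 10491 10492 covers exactly one base and 
chr1 1142403 1142403 covers none, so both sets of rows use the same 
coordinate convention.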

In the first category, with length=0, the class is usually insertion, 
perhaps mixed, and possibly single -- but when the class is single we 
add the exception SingleClassZeroSpan to flag that something is odd with 
the annotation. It is up to you to decide how to treat items with exceptions 
-- for example, you might want to exclude them from analysis, or drill 
down by clicking through the Genome Browser's item details page link to 
dbSNP's report page for that variant. Sometimes annotations that were a 
little off in the last release have been updated on dbSNP's web site.
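
One possible way to act on the exceptions column, sketched in Python. This 
assumes you have already extracted (name, class, exceptions) for each item, 
for instance from the query output above; the helper name needs_review and 
the keep-or-flag policy are illustrative assumptions, not a recommendation 
from dbSNP or UCSC:

```python
def needs_review(exceptions):
    """True when the exceptions field lists at least one exception."""
    return bool(exceptions.strip())


# (name, class, exceptions) values copied from the query output above.
rows = [
    ("rs11466671", "insertion", ""),
    ("rs2307539", "insertion", "NonIntegerChromCount"),
    ("rs33986159", "mixed", "ObservedTooLong"),
    ("rs34348110", "single", "SingleClassZeroSpan,ObservedMismatch"),
    ("rs55998931", "single", ""),
]

# Items with no exceptions are kept as they are.
clean = [name for name, _cls, exc in rows if not needs_review(exc)]

# Items with exceptions are set aside, with the comma-separated
# exceptions field split into a list for easier inspection.
flagged = {
    name: exc.split(",") for name, _cls, exc in rows if needs_review(exc)
}

print("clean:", clean)
for name, exc_list in flagged.items():
    print("review:", name, exc_list)
```

Whether flagged items are dropped or checked by hand against dbSNP's 
report page remains your call.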

I hope this information is useful and answers your question. Please 
contact us again at [email protected] if you have any further questions.

---
Luvina Guruvadoo
UCSC Genome Bioinformatics Group


On 6/22/2012 4:30 AM, Chirag Nepal wrote:
> Hi there,
>
> I was wondering if there has been a mix-up of 0 base and 1 base in dbsnp-135 files.
>
>
> Most data are 0 based, while there are around 5473 entries which seem to be 
> in 1 base
>
> For example, with 1 base:
> chr1  1142403 1142403 rs11466671      0       -
> chr1  1147978 1147978 rs34348110      0       -
> chr1  1147985 1147985 rs33986159      0       -
> chr1  1986733 1986733 rs70937055      0       +
> chr1  7996478 7996478 rs9657973       0       -
> chr1  8099134 8099134 rs2307539       0       -
>
>
> While the remaining > 3 million SNPs are 0 base
> chr1  10491   10492   rs55998931      0       +
> chr1  10582   10583   rs58108140      0       +
> chr1  14929   14930   rs75454623      0       +
> chr1  20244   20245   rs71262674      0       -
> chr1  20303   20304   rs71262673      0       -
> chr1  33494   33495   rs75468675      0       +
>
> Please let me know why there is this discrepancy, if there is any reason for it.
>
>
>
> cheers
> CN
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
