Hi Chirag,
The items in the first category you listed are actually zero-length items.
These represent point insertions into the reference genome according to
dbSNP's mapping of flanking sequences. Besides their length of 0
(chromStart equals chromEnd), the 'class' and 'exceptions' columns may
also help identify these cases. For example:
mysql> select chrom,chromStart,chromEnd,name,score,strand,class,exceptions
from snp135Common where name in ('rs11466671', 'rs34348110', 'rs33986159',
'rs70937055', 'rs9657973', 'rs2307539');
+-------+------------+----------+------------+-------+--------+-----------+--------------------------------------+
| chrom | chromStart | chromEnd | name       | score | strand | class     | exceptions                           |
+-------+------------+----------+------------+-------+--------+-----------+--------------------------------------+
| chr1  |    1142403 |  1142403 | rs11466671 |     0 | -      | insertion |                                      |
| chr1  |    8099134 |  8099134 | rs2307539  |     0 | -      | insertion | NonIntegerChromCount                 |
| chr1  |    1147985 |  1147985 | rs33986159 |     0 | -      | mixed     | ObservedTooLong                      |
| chr1  |    1147978 |  1147978 | rs34348110 |     0 | -      | single    | SingleClassZeroSpan,ObservedMismatch |
| chr1  |    1986733 |  1986733 | rs70937055 |     0 | +      | insertion | NonIntegerChromCount                 |
| chr1  |    7996478 |  7996478 | rs9657973  |     0 | -      | insertion |                                      |
+-------+------------+----------+------------+-------+--------+-----------+--------------------------------------+
mysql> select chrom,chromStart,chromEnd,name,score,strand,class,exceptions
from snp135Common where name in ('rs55998931', 'rs58108140', 'rs75454623',
'rs71262674', 'rs71262673', 'rs75468675');
+-------+------------+----------+------------+-------+--------+--------+------------+
| chrom | chromStart | chromEnd | name       | score | strand | class  | exceptions |
+-------+------------+----------+------------+-------+--------+--------+------------+
| chr1  |      10491 |    10492 | rs55998931 |     0 | +      | single |            |
| chr1  |      10582 |    10583 | rs58108140 |     0 | +      | single |            |
| chr1  |      20303 |    20304 | rs71262673 |     0 | -      | single |            |
| chr1  |      20244 |    20245 | rs71262674 |     0 | -      | single |            |
| chr1  |      14929 |    14930 | rs75454623 |     0 | +      | single |            |
| chr1  |      33494 |    33495 | rs75468675 |     0 | +      | single |            |
+-------+------------+----------+------------+-------+--------+--------+------------+
Note that in the second category (the majority of SNPs), with length=1,
the class is single and the exceptions column is often empty, which is good.
In the first category, with length=0, the class is usually insertion,
perhaps mixed, and possibly single. When the class is single but the
length is 0, we add the exception SingleClassZeroSpan to flag that
something is odd with the annotation. It is up to you to decide how to
treat items with exceptions: for example, you might want to exclude them
from analysis, or drill down by clicking through the Genome Browser's item
details page link to dbSNP's report page for that variant. Sometimes
annotations that were a little off in the last release have been updated
on dbSNP's web site.
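If it helps, here is a rough Python sketch of the distinction described
above. It is only an illustration: the function names (span_length,
categorize) and the dict layout are hypothetical, and it assumes the usual
convention for these tables that an item's length is chromEnd minus
chromStart. The three sample rows are copied from the query results above.

```python
# Hypothetical helpers for illustration; not part of any UCSC tool.

def span_length(chrom_start, chrom_end):
    """Length of an item, assuming 0-based start and exclusive end."""
    return chrom_end - chrom_start


def categorize(row):
    """Return (kind, exceptions) for one snp135Common-style row (a dict)."""
    length = span_length(row["chromStart"], row["chromEnd"])
    # The exceptions column is a comma-separated list; empty means none.
    exceptions = [e for e in row["exceptions"].split(",") if e]
    if length == 0:
        kind = "zero-length"   # point insertion between two bases
    elif length == 1:
        kind = "single-base"
    else:
        kind = "multi-base"
    return kind, exceptions


# Sample rows taken from the query output above.
rows = [
    {"name": "rs11466671", "chromStart": 1142403, "chromEnd": 1142403,
     "class": "insertion", "exceptions": ""},
    {"name": "rs34348110", "chromStart": 1147978, "chromEnd": 1147978,
     "class": "single", "exceptions": "SingleClassZeroSpan,ObservedMismatch"},
    {"name": "rs55998931", "chromStart": 10491, "chromEnd": 10492,
     "class": "single", "exceptions": ""},
]

for row in rows:
    kind, exceptions = categorize(row)
    # Whether to exclude flagged items is the analyst's decision.
    status = "review" if exceptions else "ok"
    print(row["name"], kind, row["class"], status)
```

Whether flagged items are dropped or checked by hand is left to you; the
sketch only separates the zero-length items from the rest and marks rows
that carry any exception.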
I hope this information is useful and answers your question. Please
contact us again at [email protected] if you have any further questions.
---
Luvina Guruvadoo
UCSC Genome Bioinformatics Group
On 6/22/2012 4:30 AM, Chirag Nepal wrote:
> Hi there,
>
> I was wondering if there has been a mix-up of 0-based and 1-based coordinates in the dbSNP 135 files.
>
>
> Most data are 0-based, while there are around 5473 entries which seem to be
> 1-based.
>
> For example, these appear to be 1-based:
> chr1 1142403 1142403 rs11466671 0 -
> chr1 1147978 1147978 rs34348110 0 -
> chr1 1147985 1147985 rs33986159 0 -
> chr1 1986733 1986733 rs70937055 0 +
> chr1 7996478 7996478 rs9657973 0 -
> chr1 8099134 8099134 rs2307539 0 -
>
>
> While the remaining >3 million SNPs are 0-based:
> chr1 10491 10492 rs55998931 0 +
> chr1 10582 10583 rs58108140 0 +
> chr1 14929 14930 rs75454623 0 +
> chr1 20244 20245 rs71262674 0 -
> chr1 20303 20304 rs71262673 0 -
> chr1 33494 33495 rs75468675 0 +
>
> Please let me know why there is this discrepancy, if there is a reason for it.
>
>
>
> cheers
> CN
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome