Hello, Dr. Srebniak,
That is a bit of a complicated question. Any data in a database
should be viewed skeptically and should not be relied upon without
corroboration from other sources. That is one of the main reasons
we host so many data sets: so that the professionals who use the data
can use =all= of it to reach informed decisions based on their own
knowledge and experience.
The annotations on hg19 are as correct as we can make them. We
rely on data contributors to curate and map their own data and
also run our own sanity checks and quality assurance on the data.
We do not release any data with known errors, but much of the data
comes from automated pipelines, and even a 1 in 10,000 error rate
could mean hundreds or thousands of errors in a table with millions
of rows.
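As a rough check on that arithmetic, the short sketch below computes the
expected number of erroneous rows at a 1 in 10,000 error rate. The table
sizes are hypothetical round numbers chosen for illustration, not row
counts from any actual Browser table.

```python
from fractions import Fraction

# Error rate quoted above: 1 erroneous row in 10,000.
ERROR_RATE = Fraction(1, 10_000)


def expected_errors(row_count: int, error_rate: Fraction = ERROR_RATE) -> int:
    """Expected number of erroneous rows, rounded to the nearest integer."""
    return round(row_count * error_rate)


# Hypothetical table sizes, for illustration only.
for rows in (1_000_000, 10_000_000, 50_000_000):
    print(f"{rows:>11,} rows -> about {expected_errors(rows):,} expected errors")
```

Even with a very low error rate, the absolute number of bad rows grows
in step with the size of the table.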
We have put some tracks on hg19 that have been remapped by the data
contributors, such as the Database of Genomic Variants (DGV). We
reflect what they produce. Because some researchers believe that some
of the data sources in DGV are less reliable than others (because of
the low-resolution probe sets used to produce them), we provide a
configuration page so you can select for display the datasets in
which you have the most confidence.
The DECIPHER track has been lifted to hg19 by the DECIPHER team
in collaboration with colleagues at NCBI. Its annotations are likely
as reliable as the annotations for this track on hg18. The OMIM
track has been through our pipeline independently for hg19, and
our mappings are as accurate for hg19 as for hg18. We are
working with the team at OMIM to release their data in several
subtracks that should be more useful than the existing track.
We expect those tracks to be released within a few months.
Other datasets have been lifted to hg19 by our staff. These are
designated by a black ball with "18" on the track controls. If
you click on this ball, you can read about precautions to keep
in mind when using these data. In particular, sequence that is new in
hg19 (absent from hg18) cannot carry annotations in a lifted
track. We are hoping that data contributors for these tracks will
provide us with their direct mappings to hg19. There is a track
in the top track group (Hg18 Diff) that allows you to visualize
where these differences lie. There is a similar track on hg18.
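To illustrate why a lifted track cannot cover sequence that is new in
the later assembly, here is a toy sketch. It is only a simplified model
under stated assumptions: the aligned blocks, the coordinates, and the
names Block and lift are all hypothetical, and this is not the procedure
our staff actually use to lift tracks.

```python
from typing import List, NamedTuple, Optional, Tuple


class Block(NamedTuple):
    """One gap-free aligned stretch between two assemblies (toy values)."""
    old_start: int  # start on the old assembly (0-based, half-open)
    old_end: int    # end on the old assembly
    new_start: int  # where old_start lands on the new assembly


def lift(start: int, end: int, blocks: List[Block]) -> Optional[Tuple[int, int]]:
    """Map an old-assembly interval onto the new assembly.

    Returns None unless the interval lies entirely inside one aligned block.
    """
    for b in blocks:
        if b.old_start <= start and end <= b.old_end:
            offset = b.new_start - b.old_start
            return start + offset, end + offset
    return None


# Hypothetical alignment: new-assembly positions 5000-7000 are newly added
# sequence with no counterpart on the old assembly.
BLOCKS = [Block(0, 5000, 0), Block(5000, 9000, 7000)]

# Hypothetical annotations on the old assembly.
annotations = [(100, 200), (4900, 5100), (6000, 6500)]
lifted = [lift(s, e, BLOCKS) for s, e in annotations]
print(lifted)
```

In this toy model every lifted annotation starts from an old-assembly
position, so nothing can land in the newly added stretch, and an
annotation that spans a break between blocks is dropped rather than
guessed at.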
We follow the same convention for microarray tracks, separating
lifted tracks from tracks mapped by the array producers. We are
hoping that all data producers will provide us with probe mappings
to hg19 that they have produced themselves.
Finally, we have just released a new set of tracks representing
dbSNP release 132 for hg19. If you are using sequencing for any of
your analyses, you should carefully read the track descriptions
for the three tracks to determine how best to use them. We believe
separating the dbSNP data into these three tracks is an improvement
over the way we have represented snp131 and the earlier releases on
hg18.
> We would like to be sure that we will not miss any information.
You will have to compare for yourself to determine whether all the
data tracks you would like to use are present on hg19.
I cannot emphasize enough how important it is to use all available
data and your professional judgment before reaching any diagnostic
conclusions.
best wishes, and thank you for your interest in the Browser.
--b0b kuhn
ucsc genome bioinformatics group
On 4/13/2011 8:34 AM, M.I. Srebniak wrote:
> Dear Madam or Sir,
> I would like to ask whether all annotations are now correct in both Hg18
> and Hg19. Some time ago we heard from a colleague of ours that not all
> annotations available in Hg18 are already in Hg19. We are a diagnostic
> laboratory and we are considering using Hg19 now. We would like to be sure
> that we will not miss any information. Is there any difference in
> annotations between Hg18 and Hg19 (of course there are some regions with a
> different localisation)? Are there arguments against using only Hg19? Does
> DGV also appear correctly in Hg19?
> Best regards,
> Gosia Srebniak
>
_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome