One clarification: you mentioned that you saw outdated data in the hg18 database (specifically in the UCSC Genes track data). This version of the UCSC Genes track has not been updated for some time and will not be updated in the future.
Instead, the UCSC Genes track in the most current human assembly, hg19, is 1) the track to use for the most current UCSC Genes information and 2) the track that would (potentially) be updated in the future, although no update is planned at this time. The same is true for most tracks present in hg18: no updates. Some important exceptions are the dbSNP track (although dbSNP130 will be the final update), ENCODE tracks (ENCODE at this time is still based primarily on hg18), and GenBank tracks. Tracks based on GenBank data are updated daily for all active assemblies, as I originally described. GenBank data is not updated for assemblies in the older assembly archive.

Apologies for any confusion,
Jennifer

On 4/28/10 6:28 PM, Jennifer Jackson wrote:
> Hello Ivan,
>
> This is a good question.
>
> Some tracks are updated daily - specifically those based solely on
> GenBank sequences (RefSeq Genes, ESTs, mRNAs).
>
> Others are updated at certain times; the last update time is always
> noted on the track description page.
>
> Tracks often come from contributors that send the data over once per
> reference genome release. Others are based on public datasets that are
> under continual revision, and we work on updating these on a priority
> basis determined by the team, linked to the amount of change and the
> available staff resource at any particular time.
>
> Some of the tables you note belong to a special track, UCSC Genes. This
> is a complicated, conclusion layer track created by UCSC that is only
> updated periodically, due to the amount of work it takes to create it. I
> cannot give you an estimate about when the UCSC Genes track would next
> be updated.
>
> The dbSNP track is also special. When there is an official release, we
> try to update the track as quickly as possible, but it too takes time to
> be processed and quality checked before release. You can expect a dbSNP
> update fairly soon, as we have been keeping an eye out for the final,
> complete db130 release. The current db130 SNP track in the browser was
> provisional, based on hg18, and we lifted to hg19 to help users until
> dbSNP had the data available.
>
> I hope this helps to explain our methods a bit more,
> Jennifer
>
> ---------------------------------
> Jennifer Jackson
> UCSC Genome Informatics Group
> http://genome.ucsc.edu/
>
> On 4/28/10 6:12 PM, Ivan Adzhubey wrote:
>> Hi,
>>
>> I was wondering if there is a defined policy for Genome Browser track
>> updates? I am most interested in MySQL versions of the tracks available
>> for download. We maintain a local mirror of many UCSC tracks and use
>> them extensively. However, some of the tracks seem to be out of sync
>> with the various external databases and other sources of information
>> they utilize.
>>
>> I understand updating tracks based on results of previously published
>> research can be difficult without some form of collaboration with the
>> authors. But what about cross-references to external databases? For
>> instance, our validation scripts indicate that currently about 12% of
>> the UniProt identifiers in the hg18.kgXref table are wrong, with the
>> majority of errors coming from incorrectly assigned protein isoforms.
>> The dbSNP track also shows a substantial number of errors. Considering
>> that the dbSNP, RefSeq and UniProtKB databases are all fast-moving
>> targets, this accumulation of cross-referencing errors is not
>> surprising. So, what is the Genome Browser maintainers' position in
>> regard to track updates?
>>
>> Best,
>> Ivan
>>
>>
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
