Hello, Laura.

As you mentioned, the refGene table does not list the GenBank version
number.  The hg19.gbStatus.version field does, but it holds the current
version number, which is not necessarily the version that was current on
June 23, 2011.  There is also a field called hg19.gbStatus.modDate that
lists the last modified date, but it has two problems.  First, our modDate
does not necessarily coincide with the official GenBank version date (e.g.,
our modDate for NM_021219.2 is March 21, 2012, while GenBank lists it as
April 21, 2012).  Second, the gbStatus table does not keep a history of
previous versions and modDates, so if the transcript you are looking at is
now at version 3 (e.g., NR_001458.3), there is no way to know whether it was
NR_001458.1 or NR_001458.2 on June 23, 2011.
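
One partial check is still possible under a stated assumption: if a record's
modDate falls on or before your snapshot date, the record has presumably not
changed since, so its current version would also be the version you
downloaded.  The Python sketch below illustrates that reasoning only.  The
function name and the slack_days parameter are hypothetical, and the
assumption that a version can never change without modDate also changing is
ours for illustration, not something the gbStatus table guarantees.

```python
from datetime import date, timedelta


def version_if_unchanged(acc, version, mod_date, snapshot, slack_days=0):
    """Return 'acc.version' if the record looks unmodified since snapshot.

    Assumption (not guaranteed by gbStatus): the version cannot change
    unless modDate changes too.  slack_days pushes modDate later to allow
    for it lagging behind the official GenBank date.
    """
    if mod_date + timedelta(days=slack_days) <= snapshot:
        return f"{acc}.{version}"
    return None  # modified after the snapshot: version on that date unknown


snapshot = date(2011, 6, 23)

# modDate quoted above for NM_021219 (current version 2): undetermined.
print(version_if_unchanged("NM_021219", 2, date(2012, 3, 21), snapshot))

# Hypothetical row, made up for illustration, last modified in 2009.
print(version_if_unchanged("NM_EXAMPLE", 1, date(2009, 1, 5), snapshot))
```

Any accession for which this returns None would still need to be checked
against GenBank directly.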

We do not keep histories of the refGene table, so there is no June 23, 2011
version of refGene that we can direct you to, and there is no easy way to
reconstruct a snapshot of the data as it existed on that date.  It is
possible to look directly at GenBank to find the dates corresponding to the
various transcript versions (e.g.,
http://www.ncbi.nlm.nih.gov/nuccore/NM_021219.1 shows that NM_021219.1 was
released on April 24, 2002, and
http://www.ncbi.nlm.nih.gov/nuccore/NM_021219.2 shows that NM_021219.2 was
released on April 21, 2012), but if you have a large number of IDs, this
would be very tedious without some kind of custom script.
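
As a rough idea of what the core of such a script might look like, here is a
minimal Python sketch.  It assumes you have already collected each
accession's release dates from GenBank by some other means (that retrieval
step is not shown), and the function name version_on is hypothetical.  It
simply picks the latest version released on or before the snapshot date.

```python
from bisect import bisect_right
from datetime import date


def version_on(history, snapshot):
    """Return the version that was current on snapshot, or None.

    history is a list of (version, release_date) pairs for one accession.
    """
    ordered = sorted(history, key=lambda item: item[1])
    release_dates = [released for _, released in ordered]
    # Count how many versions had been released on or before the snapshot.
    released_count = bisect_right(release_dates, snapshot)
    if released_count == 0:
        return None  # nothing had been released yet
    return ordered[released_count - 1][0]


# Release dates quoted above for NM_021219.
history = [(1, date(2002, 4, 24)), (2, date(2012, 4, 21))]
snapshot = date(2011, 6, 23)

print(f"NM_021219.{version_on(history, snapshot)}")  # prints NM_021219.1
```

The result is only as reliable as the release dates that are fed into it.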

Please contact us again at [email protected] if you have any further
questions.

---
Steve Heitner
UCSC Genome Bioinformatics Group

-----Original Message-----
From: [email protected] [mailto:[email protected]] On
Behalf Of Laura Smith
Sent: Tuesday, May 29, 2012 1:56 PM
To: [email protected]
Subject: [Genome] Downloading old refseq and ensemble transcripts with the
"version numbers" in the accession IDs.

Hello, 

I have been using the RefSeq transcripts and Ensembl transcripts downloaded
from the UCSC Genome Browser tables on June 23, 2011. The transcript IDs in
these datasets do not have the version numbers (such as NM_134564.2, where
".2" is the version number after the period).

However, it recently turned out that I need the version number of each
transcript.  I tried to look them up and download them using the info
provided here, but there is no way for me to choose the RefSeq transcripts
as of June 23, 2011:

https://lists.soe.ucsc.edu/pipermail/genome/2011-September/027099.html


Would it be possible for you to please send me the RefSeq and Ensembl
transcripts for June 23, 2011 from your archives, including the version
number for each transcript?


Or if there is a way that I could access this data myself, if you could
please let me know I would very much appreciate it. 


Thank you,
Laura
_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome

