Hi Jennifer,

What I have found is that the locations of almost all human NM transcripts in 
the RefGene file differ from NCBI's, usually by at least thousands of bp.  For 
example, according to NCBI's Entrez Gene, NM_152486's transcript starts at 
850984 and ends at 869824.  But in the RefGene file that I downloaded 
yesterday, it starts at 861120 and ends at 879961.  However, when I looked this 
gene up in a RefGene file that I downloaded 2 weeks ago, the data matched 
what is in NCBI currently.


Thank you,
Phoenix

-----Original Message-----
From: Jennifer Jackson [mailto:[email protected]]
Sent: Tuesday, May 05, 2009 6:41 PM
To: Kwan, Phoenix
Cc: [email protected]
Subject: Re: [Genome] genomic locations in RefGene file

Hello,

RefSeq sequences are independently aligned using BLAT and the
coordinates are based on complete chromosomes. The genomic
version/source is noted on the gateway page for each assembly. Some
differences in alignment position are known and expected for this track,
but most should be the same or very similar for the same version of an
assembly and query RefSeq.

The UCSC Browser uses a different method of storing coordinates than
NCBI. This may be the source of the discrepancy. Please read the
documentation below and if you still have some questions, send a few
examples (database, refseqID, NCBI coordinates as you interpret them,
UCSC coordinates as you interpret them) for review and feedback.

The main table for the RefSeq Genes track is called refGene and is in
genePred format
http://genome.ucsc.edu/FAQ/FAQformat#format9
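
As a rough illustration of reading transcript positions out of a genePred-style row, here is a minimal Python sketch. It assumes the downloaded refGene file is tab-separated and carries a leading bin column before the genePred fields (name, chrom, strand, txStart, txEnd, ...). The helper name parse_refgene_line and the sample row are hypothetical: only the transcript ID and the two positions come from this thread, and the chromosome and strand values are placeholders.

```python
from typing import NamedTuple


class Transcript(NamedTuple):
    name: str      # RefSeq accession, e.g. an NM_ identifier
    chrom: str
    strand: str
    tx_start: int  # as stored in the table (0-based start)
    tx_end: int    # as stored in the table (end is not 0-based shifted)


def parse_refgene_line(line: str) -> Transcript:
    """Parse the first few columns of one refGene row.

    Assumed column order: bin, name, chrom, strand, txStart, txEnd, ...
    """
    fields = line.rstrip("\n").split("\t")
    return Transcript(
        name=fields[1],
        chrom=fields[2],
        strand=fields[3],
        tx_start=int(fields[4]),
        tx_end=int(fields[5]),
    )


# Illustrative row: "chr1" and "+" are placeholders, not taken from the thread.
sample = "\t".join(["0", "NM_152486", "chr1", "+", "861120", "879961"])
record = parse_refgene_line(sample)
print(record.name, record.tx_start, record.tx_end)
```

A real row has more columns (CDS bounds, exon counts and lists); the sketch ignores them because only the transcript bounds are under discussion here.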

Description of UCSC Browser coordinate system
http://genomewiki.ucsc.edu/index.php/Coordinate_Transforms
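
To make the coordinate difference concrete, the following Python sketch converts between the 0-based, half-open intervals stored in UCSC tables and 1-based, fully closed intervals. The function names are hypothetical, and treating the NCBI numbers reported in this thread as 1-based closed positions is an assumption.

```python
def ucsc_to_one_based(tx_start: int, tx_end: int) -> tuple[int, int]:
    """Convert a 0-based, half-open interval to a 1-based, closed one.

    Only the start shifts by one; the end value stays the same.
    """
    return tx_start + 1, tx_end


def one_based_to_ucsc(start: int, end: int) -> tuple[int, int]:
    """Inverse conversion: 1-based, closed to 0-based, half-open."""
    return start - 1, end


# Positions reported in this thread for NM_152486.
ucsc = (861120, 879961)   # from the refGene file
ncbi = (850984, 869824)   # from Entrez Gene, assumed 1-based closed

converted = ucsc_to_one_based(*ucsc)
offsets = (converted[0] - ncbi[0], converted[1] - ncbi[1])
print(converted)  # (861121, 879961)
print(offsets)    # (10137, 10137)
```

The convention accounts for one base at the start only. After conversion both ends still differ by the same 10137 bp, so the storage convention alone does not explain a discrepancy of that size; a uniform shift like this could instead point to different assembly versions, though that is not confirmed here.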

Jennifer Jackson
UCSC Genome Bioinformatics Group



Kwan, Phoenix wrote:
> Hi,
>
> I have found that most of the locations for the NM transcripts in the RefGene 
> file do not match what is in NCBI.  Are the positions in the RefGene 
> file genomic locations relative to the full length of a chromosome?  Or 
> something else?
>
> Thank you very much for your time,
> Phoenix Kwan
>
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>

