Hi Jennifer,

What I have found is that the locations of almost all human NM transcripts in the RefGene file are off from NCBI's, usually by at least thousands of bp. For example, according to NCBI's Entrez Gene, the NM_152486 transcript starts at 850984 and ends at 869824. But in the RefGene file that I downloaded yesterday, it starts at 861120 and ends at 879961. However, when I looked this gene up in a RefGene file that I downloaded 2 weeks ago, the data matched what is currently in NCBI.

Thank you,
Phoenix

-----Original Message-----
From: Jennifer Jackson [mailto:[email protected]]
Sent: Tuesday, May 05, 2009 6:41 PM
To: Kwan, Phoenix
Cc: [email protected]
Subject: Re: [Genome] genomic locations in RefGene file

Hello,

RefSeq sequences are independently aligned using BLAT and the coordinates are based on complete chromosomes. The genomic version/source is noted on the gateway page for each assembly. Some differences in alignment position are known and expected for this track, but most should be the same or very similar for the same version of an assembly and query RefSeq.

The UCSC Browser uses a different method of storing coordinates than NCBI. This may be the source of the discrepancy. Please read the documentation below and if you still have some questions, send a few examples (database, refseqID, NCBI coordinates as you interpret them, UCSC coordinates as you interpret them) for review and feedback.

The main table for the RefSeq Genes track is called refGene and is in genePred format
http://genome.ucsc.edu/FAQ/FAQformat#format9

Description of UCSC Browser coordinate system
http://genomewiki.ucsc.edu/index.php/Coordinate_Transforms

Jennifer Jackson
UCSC Genome Bioinformatics Group

Kwan, Phoenix wrote:
> Hi,
>
> I have found that most of the locations for the NM transcripts in the RefGene
> file do not match with what are in NCBI. Are the positions in the RefGene
> file genomic locations relative to the full length of a chromosome? Or
> something else?
>
> Thank you very much for your time,
> Phoenix Kwan
>
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
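
As a rough illustration of the coordinate conventions discussed in this thread, the sketch below converts the refGene values for NM_152486 to NCBI-style coordinates and compares them with the Entrez Gene values quoted above. It assumes that refGene txStart/txEnd form a 0-based, half-open interval and that the NCBI values are 1-based and fully closed. The function names are hypothetical and are not part of any UCSC or NCBI tool.

```python
def ucsc_to_one_based(tx_start, tx_end):
    """Convert a 0-based, half-open interval (assumed refGene convention)
    to a 1-based, fully closed interval (assumed NCBI convention)."""
    # Only the start shifts by one; the end value is numerically unchanged.
    return tx_start + 1, tx_end


def offsets(ucsc_interval, ncbi_interval):
    """Return (start offset, end offset) after putting both intervals
    into the same 1-based, closed convention."""
    start, end = ucsc_to_one_based(*ucsc_interval)
    return start - ncbi_interval[0], end - ncbi_interval[1]


# Values for NM_152486 as quoted in this thread.
refgene = (861120, 879961)  # from the recently downloaded RefGene file
ncbi = (850984, 869824)     # from NCBI's Entrez Gene

print(ucsc_to_one_based(*refgene))  # (861121, 879961)
print(offsets(refgene, ncbi))       # (10137, 10137)
```

Under these assumptions, the 0-based versus 1-based convention accounts for only 1 bp at the start position. The remaining shift is the same at both ends, which is the pattern one might expect if the two sources refer to different versions of the assembly, although nothing in this thread confirms that.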
