Hi Yunfei,

The txStart, txEnd, cdsStart, and cdsEnd values are always given in reference to the positive strand. To find the transcription start site, use txStart for genes on the positive strand and txEnd for genes on the negative strand. The coding region follows the same rule: it begins at cdsStart on the positive strand and at cdsEnd on the negative strand. In your first example, NM_031497, txStart and cdsStart happen to be equal, so both give the same position. Your second example, NR_024227, is a special case: it is a non-coding gene, so it has no coding region. This is indicated by cdsStart being equal to cdsEnd.
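In case it helps, here is a minimal Python sketch of the strand rule applied to the two rows you sent. It assumes the usual refGene.txt column order (bin, name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, ...), reads the transcript bounds from txStart/txEnd and the coding bounds from cdsStart/cdsEnd, and assumes zero-based, half-open table coordinates. The class and function names are made up for illustration and are not part of any UCSC tool.

```python
from typing import NamedTuple, Optional


class RefGeneRow(NamedTuple):
    """The handful of refGene.txt columns needed here (hypothetical helper type)."""
    name: str
    chrom: str
    strand: str
    tx_start: int   # zero-based, inclusive
    tx_end: int     # zero-based, exclusive (half-open interval)
    cds_start: int
    cds_end: int


def parse_refgene_line(line: str) -> RefGeneRow:
    # Assumed column order: bin, name, chrom, strand, txStart, txEnd,
    # cdsStart, cdsEnd, ... (the remaining columns are ignored).
    f = line.split()
    return RefGeneRow(f[1], f[2], f[3], int(f[4]), int(f[5]), int(f[6]), int(f[7]))


def is_noncoding(row: RefGeneRow) -> bool:
    # A non-coding gene is flagged by cdsStart == cdsEnd.
    return row.cds_start == row.cds_end


def tss_zero_based(row: RefGeneRow) -> int:
    """Zero-based index of the first transcribed base, counted on the + strand."""
    if row.strand == "+":
        return row.tx_start
    if row.strand == "-":
        # txEnd is exclusive, so the last base of the interval is txEnd - 1.
        return row.tx_end - 1
    raise ValueError(f"unexpected strand: {row.strand!r}")


def coding_start_zero_based(row: RefGeneRow) -> Optional[int]:
    """Zero-based index of the first coding base, or None for non-coding genes."""
    if is_noncoding(row):
        return None
    return row.cds_start if row.strand == "+" else row.cds_end - 1


if __name__ == "__main__":
    lines = [
        "1654 NM_031497 chr5 + 140180782 140183255 140180782 140183255 "
        "1 140180782, 140183255, 0 PCDHA3 cmpl incmpl 0,",
        "971 NR_024227 chr19 - 50595745 50595866 50595866 50595866 "
        "1 50595745, 50595866, 0 SNAR-A6 unk unk -1,",
    ]
    for line in lines:
        row = parse_refgene_line(line)
        tss = tss_zero_based(row)
        # Adding 1 converts the zero-based index to the one-based display position.
        print(row.name, row.chrom, row.strand,
              "TSS zero-based:", tss, "one-based:", tss + 1,
              "non-coding:", is_noncoding(row))
```

Note that positions are always counted from the start of the positive-strand sequence, so there is no need to reverse-complement chr19 to locate a gene on the negative strand.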
As for where to start numbering the bases, note that the UCSC Genome Browser uses a zero-based coordinate system in its internal databases, including the file you mention below, and a one-based coordinate system for display. In the tables, intervals are half-open: the start is zero-based and the end is not included, so only the start value differs by one between the two systems. For more information, please see this FAQ: http://genome.ucsc.edu/FAQ/FAQtracks#tracks1.

I hope this information is helpful. Please feel free to contact the mail list again if you require further assistance.

Best,
Mary

------------------
Mary Goldman
UCSC Bioinformatics Group

On 6/24/10 1:23 AM, Li, Yunfei wrote:
> Hello,
>
> I would like to ask how to interpret the "Refgene.txt" file.
> For example:
> "
> 1654 NM_031497 chr5 + 140180782 140183255 140180782 140183255 1 140180782, 140183255, 0 PCDHA3 cmpl incmpl 0,
> 971 NR_024227 chr19 - 50595745 50595866 50595866 50595866 1 50595745, 50595866, 0 SNAR-A6 unk unk -1,
> "
> If I would like to locate the TSS of the first gene, I go through the
> sequence of chr5, starting from the first base, which counts as "0", until the
> base whose count equals "cdsStart". Is this correct?
> Besides, how can I locate the second gene? Do I search from the
> beginning of the reverse complement of the chr19 sequence? I mean, is the first
> base in the 5' direction of the other strand of chr19 now viewed as the first
> base and counted as "0"?
>
> Best,
>
> Yunfei Li
> --------------------------------------------------------------------------------------
> Research Assistant
> Department of Statistics &
> School of Molecular Biosciences
> Biotechnology Life Sciences Building 427
> Washington State University
> Pullman, WA 99164-7520
> Phone: 509-339-5096
> http://www.wsu.edu/~ye_lab/people.html
>
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
