Thanks Mary. Your information is very helpful. Bests,
D.

On 3/23/11 7:52 PM, Mary Goldman wrote:
> Hi Duke,
>
> For non-coding genes (which, by definition, have a coding region size of
> 0), cdsStart will always equal cdsEnd in the genePred format. Since
> there is no coding region to indicate, it doesn't matter what the
> actual genome coordinates are for the cdsStart and cdsEnd (just that
> they are equal to each other). As a convention to help with
> standardization, we have made the cdsStart equal the txStart for
> non-coding genes. Likewise, there are no UTRs (UnTranslated Regions)
> for non-coding genes because there is no translated region (or coding
> sequence).
>
> I hope this information is helpful. Please feel free to contact the
> mail list again if you require further assistance.
>
> Best,
> Mary
> ------------------
> Mary Goldman
> UCSC Bioinformatics Group
>
> On 3/23/11 6:59 AM, Duke wrote:
>> Hi Bob,
>>
>> Thanks. Yes, after actually doing some math, I also realized that it
>> is more complicated than I thought, especially in the case of UTR
>> introns (introns inside UTR regions). This also applies to coding
>> regions, if they contain any introns. One thing I also found out
>> (and do not quite understand) is the case of non-coding genes, for
>> example Mrpl15 - NR_033530 in mouse mm9:
>>
>> Mrpl15 NR_033530 chr1 - 4763278 4775807 4775807
>> 4775807 4 4763278,4767605,4772648,4775653,
>> 4764597,4767729,4772814,4775807,
>>
>> I understand that this is a non-coding gene, so there is no coding region
>> for it. But instead of two empty coordinates at cdsStart and cdsEnd, we
>> have two identical coordinates, 4775807. Does that mean the coding region
>> size = 0 at 4775807, or is it just a convention of the genePred format? In
>> this case, how do I distinguish between the 3' UTR and the 5'
>> UTR? Does that mean the 5' UTR size = 0 and the 3' UTR is (4763278, 4775807), or
>> are both of them the same, (4763278, 4775807)?
>>
>> Thanks,
>>
>> D.
>>
>> On 3/23/11 3:25 AM, robert kuhn wrote:
>>> Hi, again, Duke,
>>>
>>> I would additionally point out that what you have would not work for
>>> the size of the UTRs if the UTR was split by an intron. In that case,
>>> you would have to account for the intron as well.
>>>
>>> --b0b
>>>
>>>
>>> On 3/22/2011 4:05 PM, robert kuhn wrote:
>>>> Hello, Duke,
>>>>
>>>> It looks as if you understand it correctly, though I would offer that
>>>> if you actually perform the subtractions you show, then you would
>>>> get the size, not the coordinates. Though if you interpret the "-"
>>>> in your message to mean "through", then you have defined the
>>>> interval properly, though in reverse. E.g., txEnd-cdsEnd should read
>>>> "cdsEnd through txEnd" if you mean the interval, as the txEnd should
>>>> always be greater than the cdsEnd.
>>>>
>>>> best wishes,
>>>>
>>>> --b0b kuhn
>>>> ucsc genome bioinformatics group
>>>>
>>>> On 3/21/2011 7:29 AM, Duke wrote:
>>>>> Hi folks,
>>>>>
>>>>> Please correct me if I am wrong. I am working out how to get the
>>>>> coordinates of different genome regions such as
>>>>> UTR/intergenic/intragenic etc... and from the genePred format
>>>>> (http://genome.ucsc.edu/FAQ/FAQformat.html#format9), I think I can
>>>>> get them as follows:
>>>>>
>>>>> If Strand = '+':
>>>>>
>>>>> 3UTR = txEnd-cdsEnd
>>>>> 5UTR = cdsStart-txStart
>>>>> Intragenic(i) = exonEnds(i)-exonStarts(i)
>>>>> Intergenic = all regions that do not overlap with gene coordinates
>>>>> (between txStart and txEnd)
>>>>>
>>>>> For Strand = '-', everything should be reversed, such as 3UTR =
>>>>> cdsStart-txStart etc...
>>>>>
>>>>> Thank you very much in advance,
>>>>>
>>>>> D.
>>>>> _______________________________________________
>>>>> Genome maillist - [email protected]
>>>>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
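
For anyone reading this thread later, here is a minimal Python sketch of the points made above: UTRs have to be clipped to exons (so an intron inside a UTR is not counted), the 5'/3' labels swap on the '-' strand, and a record with cdsStart equal to cdsEnd is treated as non-coding with no UTRs. It is only an illustration, not UCSC code. It assumes the column order of the Mrpl15 line quoted in the thread (gene name, accession, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds) and 0-based, half-open coordinates. The helper names and the "GeneX" record are made up for the example.

```python
from typing import Dict, List, NamedTuple, Tuple

Interval = Tuple[int, int]  # assumed 0-based, half-open (start, end)


class Gene(NamedTuple):
    name: str
    chrom: str
    strand: str
    tx_start: int
    tx_end: int
    cds_start: int
    cds_end: int
    exons: List[Interval]


def parse_line(line: str) -> Gene:
    """Parse one whitespace-separated record in the column order assumed above."""
    f = line.split()
    starts = [int(x) for x in f[9].rstrip(",").split(",")]
    ends = [int(x) for x in f[10].rstrip(",").split(",")]
    return Gene(f[1], f[2], f[3], int(f[4]), int(f[5]),
                int(f[6]), int(f[7]), list(zip(starts, ends)))


def clip(exons: List[Interval], lo: int, hi: int) -> List[Interval]:
    """Keep only the parts of each exon that fall inside [lo, hi)."""
    out = []
    for s, e in exons:
        s2, e2 = max(s, lo), min(e, hi)
        if s2 < e2:
            out.append((s2, e2))
    return out


def utrs(g: Gene) -> Dict[str, List[Interval]]:
    """Exon-clipped UTR blocks; empty for non-coding records (cdsStart == cdsEnd)."""
    if g.cds_start == g.cds_end:
        return {"5utr": [], "3utr": []}
    left = clip(g.exons, g.tx_start, g.cds_start)   # txStart through cdsStart
    right = clip(g.exons, g.cds_end, g.tx_end)      # cdsEnd through txEnd
    if g.strand == "+":
        return {"5utr": left, "3utr": right}
    return {"5utr": right, "3utr": left}            # reversed on the '-' strand


def introns(g: Gene) -> List[Interval]:
    """Gaps between consecutive exons."""
    return [(g.exons[i][1], g.exons[i + 1][0]) for i in range(len(g.exons) - 1)]


def total(blocks: List[Interval]) -> int:
    """Summed length of a list of blocks."""
    return sum(e - s for s, e in blocks)


# The non-coding record quoted in the thread.
MRPL15 = ("Mrpl15 NR_033530 chr1 - 4763278 4775807 4775807 4775807 4 "
          "4763278,4767605,4772648,4775653, 4764597,4767729,4772814,4775807,")

# Hypothetical coding record (not a real gene) with an intron inside the 5' UTR.
GENEX = "GeneX NM_000000 chr1 + 100 1000 350 800 3 100,300,700, 200,500,1000,"

if __name__ == "__main__":
    for line in (MRPL15, GENEX):
        g = parse_line(line)
        u = utrs(g)
        print(g.name, "5'UTR", u["5utr"], "3'UTR", u["3utr"], "introns", introns(g))
```

For the made-up GeneX record, the plain subtraction cdsStart - txStart gives 250, while the exon-clipped 5' UTR is only 150 bases long, which is the intron effect Bob describes.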
