Hello, Duke,
It looks as if you understand it correctly. One caveat: if you actually
perform the subtractions you show, you get the size of each region, not
its coordinates. If you mean the "-" in your message to read as "through",
then you have defined the intervals properly, but in reverse. E.g.,
txEnd-cdsEnd should read "cdsEnd through txEnd" if you mean the interval,
as the txEnd should always be at least as large as the cdsEnd.
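
To make the size-versus-interval distinction concrete, here is a minimal
Python sketch. The field names follow the genePred columns; the Gene
class, the function names and the sample numbers are only illustrative.
It assumes 0-based, half-open coordinates, and it returns the whole span
between the transcript end and the coding end, so any introns inside a
UTR are still included.

```python
from typing import Dict, NamedTuple, Tuple

Interval = Tuple[int, int]  # (start, end), assumed 0-based and half-open


class Gene(NamedTuple):
    """Illustrative subset of the genePred columns (not a full parser)."""
    strand: str
    txStart: int
    txEnd: int
    cdsStart: int
    cdsEnd: int


def utr_spans(g: Gene) -> Dict[str, Interval]:
    """Return the 5' and 3' UTR spans as (start, end) intervals.

    Intervals are always written low coordinate first, on either strand.
    Introns falling inside a UTR are NOT removed here.
    """
    left = (g.txStart, g.cdsStart)   # txStart through cdsStart
    right = (g.cdsEnd, g.txEnd)      # cdsEnd through txEnd
    if g.strand == "+":
        return {"5UTR": left, "3UTR": right}
    # On the minus strand the roles of the two ends are swapped.
    return {"5UTR": right, "3UTR": left}


def size(iv: Interval) -> int:
    """Subtracting the two ends gives the size, not the coordinates."""
    return iv[1] - iv[0]


# Made-up example values, for illustration only.
plus_gene = Gene("+", 100, 1000, 200, 900)
minus_gene = Gene("-", 100, 1000, 200, 900)
```

Note that the interval is written the same way on both strands; only the
label (5' or 3') changes with the strand.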
best wishes,
--b0b kuhn
ucsc genome bioinformatics group
On 3/21/2011 7:29 AM, Duke wrote:
> Hi folks,
>
> Please correct me if I am wrong. I am dealing with how to get the
> coordinates of different genome regions such as UTR/intergenic/intragenic
> etc., and from the genePred format
> (http://genome.ucsc.edu/FAQ/FAQformat.html#format9), I think I can get
> them as follows:
>
> If Strand = '+':
>
> 3UTR = txEnd-cdsEnd
> 5UTR = cdsStart-txStart
> Intragenic(i) = exonEnds(i)-exonStarts(i)
> Intergenic = all regions that do not overlap with gene coordinates
> (between txStart and txEnd)
>
> For Strand = '-', everything should be reversed, such as 3UTR =
> cdsStart-txStart etc...
>
> Thank you very much in advance,
>
> D.
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome