Hi again, Duke,
I would additionally point out that what you have would not give the
size of a UTR that is split by an intron. In that case, you would have
to subtract the length of the intron as well.
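
A minimal sketch of that adjustment in Python, offered as an illustration rather than a definitive implementation. It assumes genePred-style coordinates (0-based start, end not included, so size = end - start) and that the exon lists have already been parsed into integers. The function names utr_blocks and total_size are made up for this example.

```python
def utr_blocks(cds_start, cds_end, exon_starts, exon_ends, strand):
    """Return (five_prime, three_prime) lists of (start, end) exonic UTR blocks.

    Assumes 0-based, half-open coordinates and exons sorted by start.
    Introns are skipped automatically because only exon pieces are kept.
    """
    if cds_start >= cds_end:
        # No coding region (e.g. a non-coding transcript): no UTRs to report.
        return [], []

    left, right = [], []  # exon pieces before cdsStart / after cdsEnd
    for start, end in zip(exon_starts, exon_ends):
        if start < cds_start:
            left.append((start, min(end, cds_start)))
        if end > cds_end:
            right.append((max(start, cds_end), end))

    # On the minus strand the genomic left side is the 3' end.
    if strand == "+":
        return left, right
    return right, left


def total_size(blocks):
    """Sum of block lengths, i.e. the UTR size with introns excluded."""
    return sum(end - start for start, end in blocks)


# Hypothetical transcript: three exons, 5' UTR split by the first intron.
exon_starts = [100, 300, 800]
exon_ends = [200, 600, 1000]
five, three = utr_blocks(350, 850, exon_starts, exon_ends, "+")
print(five, total_size(five))    # exonic 5' UTR pieces and their total size
print(three, total_size(three))  # exonic 3' UTR pieces and their total size
```

For this made-up transcript the plain subtraction cdsStart - txStart gives 250, while the exonic 5' UTR is only 150 bases, because the 100-base intron falls inside the UTR.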
--b0b
On 3/22/2011 4:05 PM, robert kuhn wrote:
> Hello, Duke,
>
> It looks as if you understand it correctly, though I would offer that
> if you actually perform the subtractions you show, you would get the
> size, not the coordinates. If you interpret the "-" in your message to
> mean "through", then you have defined the interval properly, though in
> reverse. E.g., txEnd-cdsEnd should read "cdsEnd through txEnd" if you
> mean the interval, as txEnd should never be less than cdsEnd.
>
> best wishes,
>
> --b0b kuhn
> ucsc genome bioinformatics group
>
> On 3/21/2011 7:29 AM, Duke wrote:
>> Hi folks,
>>
>> Please correct me if I am wrong. I am working out how to get the
>> coordinates of different genome regions such as UTR/intergenic/intragenic
>> etc., and from the genePred format
>> (http://genome.ucsc.edu/FAQ/FAQformat.html#format9), I think I can get
>> them as follows:
>>
>> If Strand = '+':
>>
>> 3UTR = txEnd-cdsEnd
>> 5UTR = cdsStart-txStart
>> Intragenic(i) = exonEnds(i)-exonStarts(i)
>> Intergenic = all regions that do not overlap with gene coordinates
>> (between txStart and txEnd)
>>
>> For Strand = '-', everything should be reversed, such as 3UTR =
>> cdsStart-txStart etc...
>>
>> Thank you very much in advance,
>>
>> D.
>> _______________________________________________
>> Genome maillist - [email protected]
>> https://lists.soe.ucsc.edu/mailman/listinfo/genome