Hi Schragi,

I asked one of our engineers about this case. Here is what she said:

----
A subtlety of the UCSC Genes pipeline is that it doesn't put two genes 
in the same cluster if they have different translation frames. In this 
case, these isoforms have different translation frames. The easiest way 
to see it is in the protein translations, which are totally different. 
So that's why these isoforms don't cluster together, even though they 
seem to overlap each other perfectly.
----
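
To make the frame rule concrete, here is a rough Python sketch of the idea. It is not the actual UCSC Genes pipeline code. The names Transcript, codon_phases, same_frame and cluster, as well as the example transcripts, are made up for illustration. It assumes two transcripts share a frame when every coding base they have in common falls at the same codon position in both.

```python
from dataclasses import dataclass


@dataclass
class Transcript:
    # Hypothetical minimal record, not a UCSC table schema.
    name: str
    strand: str          # "+" or "-"
    cds_blocks: list     # sorted, half-open (start, end) coding intervals


def codon_phases(tx):
    """Map each coding genomic position to its codon position (0, 1 or 2)."""
    positions = [p for start, end in tx.cds_blocks for p in range(start, end)]
    if tx.strand == "-":
        positions.reverse()  # translation runs right to left on the minus strand
    return {pos: i % 3 for i, pos in enumerate(positions)}


def same_frame(a, b):
    """True if a and b share coding bases and agree on the phase of all of them."""
    if a.strand != b.strand:
        return False
    pa, pb = codon_phases(a), codon_phases(b)
    shared = pa.keys() & pb.keys()
    return bool(shared) and all(pa[p] == pb[p] for p in shared)


def cluster(transcripts):
    """Group transcripts so that only frame-compatible ones end up together."""
    parent = list(range(len(transcripts)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(transcripts)):
        for j in range(i + 1, len(transcripts)):
            if same_frame(transcripts[i], transcripts[j]):
                parent[find(j)] = find(i)

    groups = {}
    for i, tx in enumerate(transcripts):
        groups.setdefault(find(i), []).append(tx.name)
    return sorted(groups.values())


# Made-up example: txA and txC agree on the frame, txB is shifted by one base.
tx_a = Transcript("txA", "+", [(100, 109)])
tx_b = Transcript("txB", "+", [(101, 110)])
tx_c = Transcript("txC", "+", [(100, 106), (200, 203)])
print(cluster([tx_a, tx_b, tx_c]))
```

In this toy example txB overlaps txA almost completely, but it still lands in its own cluster because its codons are offset by one base. That is the same kind of situation as the two isoforms in question.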

If you have further questions, please feel free to contact us again at 
[email protected].

--
Brooke Rhead
UCSC Genome Bioinformatics Group


On 01/17/11 01:08, Schragi Schwartz wrote:
> Hi,
> I was under the impression that the canonical genes set was a dataset of
> non-redundant genes, with different isoforms clustered as a single,
> representative transcript. However, I have now come across the two ids
> "uc002hra.1" and "uc010cvq.1", which are obviously two splice isoforms, and
> yet they are both independently represented in the canonical gene dataset.
> I would be very grateful if you could clarify what's going on.
> Best,
> Schragi
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome