Hi John,

I spoke to two engineers here about TMSL3 on chr4, and they agreed that it does look like a retroposed gene, but it may or may not be a pseudogene.

TMSL3 on chr4 does appear in a track that is not yet up on the main UCSC Genome Browser site: Retroposed Genes. You can see this track on our test site (but be aware that many tracks on the site, including this one, have not been through our QA process):

http://genome-test.cse.ucsc.edu

Here is a link to the relevant paper:

http://www.biomedcentral.com/1471-2164/9/466

Baertsch R, Diekhans M, Kent J, Haussler D, Brosius J. Retrocopy contributions to the evolution of the human genome. BMC Genomics 2008 Oct 8;9:466.

> Am I correct in assuming its validation depends on three mRNA
> sequences which on the UCSC browser mRNA information page elicit a
> warning about sequence correction to pseudogenes? - e.g.

RefSeq originally alerted us to the problem that prompted us to display the warning. They probably don't use these mRNAs in their validations.

Additionally, one of the engineers said:

---
This does look like it is a retroposed gene whose parent is TMSB4X. However, the coding region encodes a 44 aa protein in both cases, and both have a Thymosin beta actin-binding motif. Thymosins are known to be short proteins, since they are peptide hormones, which are often short. The 5' end of the gene is in a CpG island, which could indicate that it is transcribed. It could be protein-coding, but there is not enough evidence to indicate that it is even transcribed. The only mRNAs at this locus are from the Invitrogen/Genoscope project, and these are not trustworthy. Based on that, I do not have confidence that there is enough evidence that this is a functional protein-coding gene. I will start a discussion with RefSeq and the Gencode group to get their opinions.
---

So, thank you for bringing this gene (or pseudogene) to our attention. It has sparked some discussion among the annotators.

--
Brooke Rhead
UCSC Genome Bioinformatics Group

On 05/02/11 01:38, John Edwards wrote:
> Hi,
>
> I have a concern about the validation by ENCODE and REFSEQ of the
> gene:
>
> TMSL3 chr4:91,759,560-91,760,450
>
> It appears to be a processed pseudogene of the highly expressed
> TMSB4X
>
> Am I correct in assuming its validation depends on three mRNA sequences which on the UCSC browser mRNA information page elicit a warning about sequence correction to pseudogenes? - e.g.
>
> http://genome.ucsc.edu/cgi-bin/hgchgsid=193876573&o=91759642&t=91760216&g=mrna&i=CR605307
>
> The ESTs which align with TMSL3 appear at least equally, if not more likely to be transcripts from TMSB4X. I'm not clear why they appear aligned against chr4
>
> Does this perhaps reflect failure of some pipeline step to detect the
> short (11 aa) exon 3, therefore signalling a longer match to the pseudogene?
>
> John Edwards
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
