Dear UCSC genome folks:

I've made a "protein-coding homology" track for some of your genomes.
I think it might be of general interest: might you consider including
it in your browser?

The track shows regions that are homologous to protein coding DNA.
This usually means that the regions are themselves protein-coding, or
they used to be (i.e. they are now pseudogenes).  I constructed the
track simply by finding local alignments between the genome and all
known protein sequences.

To see why this is useful, look at this example for mouse:

http://genome.ucsc.edu/cgi-bin/hgTracks?db=mm9&position=chr10:57776301-57778100&hgt.customText=http://seq.cbrc.jp/~martin/cds-homology/mm9/cds-homology.psl.gz

This shows a region of chr10 that has high conservation, especially
PhastCons Vertebrate conservation.  This region has no other
annotations: no gene predictions, etc.  Without the new track, it
looks like it could be an interesting conserved enhancer element or
RNA gene.  But the new track shows it has protein-coding homology.
The alignment has frame disruptions, so the region is a pseudogene.
(It would be nice if you could click and see the alignment, but I
don't know how to do that.)  The region appears conserved because the
parent gene is conserved and the pseudogene is recent.
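
To make "frame disruptions" concrete, here is a minimal Python sketch of
one way to flag them.  It is only an illustration, not the method used
by the scripts: the function name, the block representation
(genome start, length), the assumption that every aligned block starts
on a codon boundary, and the max_gap threshold of 30 bases are all
hypothetical.

```python
def frame_disruptions(blocks, max_gap=30):
    """Return indices i where block i+1 is out of frame relative to block i.

    blocks: (genome_start, length) pairs for one alignment on the forward
    strand, sorted by genome_start.  Each block is assumed (hypothetically)
    to start on a codon boundary of the aligned protein.
    """
    hits = []
    for i in range(len(blocks) - 1):
        start, length = blocks[i]
        next_start, _ = blocks[i + 1]
        gap = next_start - (start + length)
        # Long gaps could be introns, so only short gaps are considered.
        # max_gap is an arbitrary illustrative threshold.
        if 0 <= gap <= max_gap and (next_start - start) % 3 != 0:
            hits.append(i)
    return hits


# Toy coordinates, not taken from the real track.
print(frame_disruptions([(100, 30), (131, 30), (164, 30)]))
```

In this toy input the second block starts 31 bases after the first, so
it is reported as out of frame, while the third block starts 33 bases
after the second and is not.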

In short, this track explains lots of apparently
evolutionarily-conserved elements that lack any other annotation.

Here are tracks for cb3, ce6, dm3, human (hg19), and rat (rn4):

http://genome.ucsc.edu/cgi-bin/hgTracks?db=cb3&hgt.customText=http://seq.cbrc.jp/~martin/cds-homology/cb3/cds-homology.psl.gz

http://genome.ucsc.edu/cgi-bin/hgTracks?db=ce6&hgt.customText=http://seq.cbrc.jp/~martin/cds-homology/ce6/cds-homology.psl.gz

http://genome.ucsc.edu/cgi-bin/hgTracks?db=dm3&hgt.customText=http://seq.cbrc.jp/~martin/cds-homology/dm3/cds-homology.psl.gz

http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg19&hgt.customText=http://seq.cbrc.jp/~martin/cds-homology/hg19/cds-homology.psl.gz

http://genome.ucsc.edu/cgi-bin/hgTracks?db=rn4&hgt.customText=http://seq.cbrc.jp/~martin/cds-homology/rn4/cds-homology.psl.gz
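
All of these links follow one pattern, so they can be generated rather
than typed.  A minimal Python sketch, assuming the files stay at the
same locations; the function name custom_track_url is just for
illustration, while the db, position and hgt.customText parameters are
the ones used in the links above:

```python
BROWSER = "http://genome.ucsc.edu/cgi-bin/hgTracks"
TRACK_BASE = "http://seq.cbrc.jp/~martin/cds-homology"


def custom_track_url(db, position=None):
    """Build a browser link that loads the cds-homology track for one assembly."""
    params = ["db=" + db]
    if position is not None:
        # Optional region to display, e.g. "chr10:57776301-57778100".
        params.append("position=" + position)
    # The track file is assumed to live under TRACK_BASE/<db>/, as in the links above.
    params.append("hgt.customText=%s/%s/cds-homology.psl.gz" % (TRACK_BASE, db))
    return BROWSER + "?" + "&".join(params)


for db in ("cb3", "ce6", "dm3", "hg19", "rn4"):
    print(custom_track_url(db))
```

The values are joined without percent-encoding so that the result
matches the hand-written links exactly.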

Here are the scripts for making the tracks automatically:
http://seq.cbrc.jp/~martin/cds-homology/cds-homology.zip

The track details page has some more info.

Have a nice weekend,
Martin Frith
http://www.cbrc.jp/~martin/
_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
