Hello, Dan. One of our engineers had this to say:
We do our own mapping of RefSeq sequences to the genome. This gene maps to 19 different places in hg19 using our mapping procedure (some on chr4, some on chr10, others on random chromosomes). The procedure is documented here:

http://genome.ucsc.edu/cgi-bin/hgTrackUi?c=chr4&g=refGene

Blat will introduce gaps in the alignment if the sequence doesn't match the genome. We close blocks of 8bp or less when we make the gene models.

Please contact us again at [email protected] if you have any further questions.

---
Steve Heitner
UCSC Genome Bioinformatics Group

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Dan Richards
Sent: Wednesday, January 25, 2012 1:59 PM
To: [email protected]
Subject: [Genome] How does UCSC hg19 gene model add exons to RefSeqs?

Hi,

when using the human reference hg19 gene model, there are entries like this (in GTF output format from hgTable, Genes and Gene Prediction group, RefSeq Genes track):

chr10 hg19_refGene start_codon 135480472 135480474 0.000000 + . gene_id "NM_012147"; transcript_id "NM_012147";
chr10 hg19_refGene CDS 135480472 135481677 0.000000 + 0 gene_id "NM_012147"; transcript_id "NM_012147";
chr10 hg19_refGene exon 135480432 135481677 0.000000 + . gene_id "NM_012147"; transcript_id "NM_012147";
chr10 hg19_refGene CDS 135484982 135485230 0.000000 + 0 gene_id "NM_012147"; transcript_id "NM_012147";
chr10 hg19_refGene stop_codon 135485231 135485233 0.000000 + . gene_id "NM_012147"; transcript_id "NM_012147";
chr10 hg19_refGene exon 135484982 135485275 0.000000 + . gene_id "NM_012147"; transcript_id "NM_012147";

where the hg19 model has an exon that does not exist in the RefSeq accession (or any historical version of the RefSeq accession).

How/why does the alignment introduce an intron in this case? Does it ensure there are plausible flanking splice junctions before inserting an intron into a RefSeq sequence that lacks one but that it maps to?

Dan

_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
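
To make the block-closing step in the reply concrete, here is a minimal Python sketch. It is not UCSC's actual code. It assumes that "close blocks of 8bp or less" means merging adjacent alignment blocks when the gap between them is 8 bases or fewer, and the function name `close_small_gaps` and its `max_gap` parameter are hypothetical. The coordinates are the two exon lines from the GTF output quoted above (1-based, inclusive).

```python
from typing import List, Tuple

# (start, end) in 1-based inclusive coordinates, as in GTF
Block = Tuple[int, int]


def close_small_gaps(blocks: List[Block], max_gap: int = 8) -> List[Block]:
    """Merge adjacent blocks separated by max_gap bases or fewer.

    Assumed interpretation of the 8bp rule; not UCSC's implementation.
    """
    merged: List[Block] = []
    for start, end in sorted(blocks):
        if merged:
            prev_start, prev_end = merged[-1]
            gap = start - prev_end - 1  # bases strictly between the blocks
            if gap <= max_gap:
                # Gap is small enough: extend the previous block
                merged[-1] = (prev_start, max(prev_end, end))
                continue
        merged.append((start, end))
    return merged


# The two exon records for NM_012147 from the GTF lines in the question
exons = [(135480432, 135481677), (135484982, 135485275)]

# Size of the gap that appears as an intron in the gene model
gap = exons[1][0] - exons[0][1] - 1
print(gap)                      # 3304
print(close_small_gaps(exons))  # both blocks kept; gap exceeds 8 bases
```

Under this reading, the 3304-base gap between the two blocks is far above the 8-base threshold, so it stays open and shows up as an intron in the gene model.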
