Hi, I understand that the UCSC Genes annotation process consults a range of sources to assign a "genesymbol" to each transcript.
In an earlier post (https://lists.soe.ucsc.edu/pipermail/genome/2006-April/010350.html), Fan outlined the general strategy: use the RefSeq symbol if a transcript's representative mRNA is a RefSeq, otherwise consult UniProt, etc. Finally, if no symbol can be assigned, use the ID of the mRNA.

I'm looking at "uc004ftj.2" (hg19), which has a gene symbol of 'LOC401629'. This transcript seems to be based on a RefSeq: the 'refseq' column in kgXref is NR_002161, and the exon structure matches very closely. That RefSeq record has a gene name of 'NCRNA00230A', which is a HUGO symbol.

I'd appreciate it if someone could explain why the RefSeq symbol was not used in this case, and where LOC401629 was derived from.

Thanks for your time.
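
P.S. To make my reading of that strategy concrete, here is a minimal Python sketch of the priority order as I understand it from Fan's post. The function name, its parameters, and the way the inputs are passed in are all my own hypothetical choices for illustration; this is not the actual UCSC Genes pipeline code, and the real process consults more sources than the two shown here.

```python
def pick_gene_symbol(mrna_id, is_refseq, refseq_symbol=None, uniprot_symbol=None):
    """Hypothetical sketch of the symbol fallback order described in the post.

    mrna_id        -- ID of the transcript's representative mRNA
    is_refseq      -- True if that representative mRNA is a RefSeq
    refseq_symbol  -- gene symbol from the RefSeq record, if any
    uniprot_symbol -- gene symbol from UniProt, if any
    """
    # 1. Prefer the RefSeq symbol when the representative mRNA is a RefSeq.
    if is_refseq and refseq_symbol:
        return refseq_symbol
    # 2. Otherwise consult UniProt.
    if uniprot_symbol:
        return uniprot_symbol
    # (The "etc." in the post implies further sources would be tried here.)
    # 3. Last resort: fall back to the mRNA ID itself.
    return mrna_id


# What I expected for uc004ftj.2, given the values I see for NR_002161:
expected = pick_gene_symbol("NR_002161", True, refseq_symbol="NCRNA00230A")
print(expected)
```

Under that reading I would have expected 'NCRNA00230A' for this transcript rather than 'LOC401629', which is why I suspect there is a step in the process that I am missing.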
