I've found some things in refFlat that I don't understand. Perhaps somebody can help shed some light on this.

Intuitively, it seemed to me that in most circumstances all of the records with the same geneName should be in about the same place, and certainly in the same orientation on the same chromosome. However, I have found several situations where this is not the case. Some of these make sense to me: for example, genes in the PARs have records on both chrX and chrY. Also, there are several that have some records on the "hap" sequences. These I can understand. Others truly puzzle me. Maybe somebody can help me interpret them.

The first example is MAGEA2. This gene has two locations on chrX:

    MAGEA2  chrX  -  151918388  151922364  3
    MAGEA2  chrX  +  151883119  151887095  3

I don't understand how the same gene could be in two different places. In some cases they are even on different chromosomes.

In many cases, there seem to be duplicates with different geneName/names. For example:

    MIR4509-1  NR_039732  chr15  -  22675147  22675241
    MIR4509-2  NR_039733  chr15  -  22675147  22675241
    MIR4509-3  NR_039734  chr15  -  22675147  22675241
    MIR4509-1  NR_039732  chr15  +  28671636  28671730
    MIR4509-2  NR_039733  chr15  +  28671636  28671730
    MIR4509-3  NR_039734  chr15  +  28671636  28671730
    MIR4509-1  NR_039732  chr15  -  28735897  28735991
    MIR4509-2  NR_039733  chr15  -  28735897  28735991
    MIR4509-3  NR_039734  chr15  -  28735897  28735991

In this case, there are three geneName/name combinations and three loci, and each geneName/name has a record at each locus. There are hundreds of these that I've found.

I get the impression that I'm not using this data correctly, and that perhaps there is a better table to use for locating genes and annotated transcripts on the genome. Can anybody explain this to me?

Michael
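
P.S. In case anyone wants to reproduce this, a check along the following lines should list the affected genes. It is only a minimal sketch. It assumes a tab-separated refFlat dump with no header line and the standard column order (geneName, name, chrom, strand, txStart, txEnd, ...). The function name is made up for illustration, and the sample records are the MIR4509-1 rows quoted in this message, trimmed to the first six columns.

```python
from collections import defaultdict


def multi_placement_genes(lines):
    """Return {geneName: sorted [(chrom, strand), ...]} for every geneName
    whose records fall on more than one chromosome/strand combination."""
    placements = defaultdict(set)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # skip blank or truncated lines
        # Assumed refFlat column order: geneName, name, chrom, strand, ...
        gene_name, _name, chrom, strand = fields[:4]
        placements[gene_name].add((chrom, strand))
    return {g: sorted(p) for g, p in placements.items() if len(p) > 1}


# MIR4509-1 records quoted in this message (first six columns only).
sample = [
    "MIR4509-1\tNR_039732\tchr15\t-\t22675147\t22675241",
    "MIR4509-1\tNR_039732\tchr15\t+\t28671636\t28671730",
    "MIR4509-1\tNR_039732\tchr15\t-\t28735897\t28735991",
]

for gene, combos in multi_placement_genes(sample).items():
    print(gene, combos)
```

To run it on a whole table, pass an open file handle instead of the list, e.g. `multi_placement_genes(open("refFlat.txt"))`, where `refFlat.txt` stands in for wherever the uncompressed dump lives. Note that this only catches differences in chromosome or strand; two copies on the same strand of the same chromosome would need a comparison of txStart/txEnd as well.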
