Hello David,

As an alternative, try using the UCSC Genes track. RefSeq is included along with other inputs.

To focus on a single transcript per gene locus, use the primary table (knownGene) together with knownCanonical, the table that clusters the transcripts and marks the canonical transcript of each cluster. You can link the internal transcript identifiers back to alternate identifiers (including RefSeq) using the table kgAlias. The table kgXref can also be used, but be aware that the linked IDs in this table may be merely "associated" identifiers, not this transcript's specific alternate identifiers.

To see the linkage between tables for this or any other track, use the Table browser (perhaps the one at UCSC, if it was not included in your mirror). Bring up the Table browser, navigate to the genome & track of interest, and, leaving the primary table selected, click the button beside it called "describe table schema". The resulting page will define that primary table and list all associated tables (along with notes about how they are linked). Clicking on any of those tables brings it to the "top": its own schema is defined and its related tables are listed. This is how we share the overall schema design with users.

I hope this information is helpful. Please feel free to contact the help mailing list again if you require further assistance.

Best regards,
Jen
UCSC Genome Browser Support
http://genome.ucsc.edu/contacts.html
[email protected]
[email protected]

On 6/25/10 8:38 AM, David Alexander wrote:
> Hi all,
>
> I am trying to partition the human genome into a set of disjoint
> regions, such that any particular SNP marker belongs to a single
> region. As a first cut I would like these regions to correspond to
> known protein-coding genes and their upstream regions.
>
> I wrote some Python&SQL code to query my personal mirror of the hg19
> database, but have had trouble using the refGene table. It seems that
> refGene is a table of transcripts with information on where they align
> in the genome. In particular this means that some of the rows in
> refGene align to multiple places in the genome. Is there a way to
> extract a canonical set of known, coding, uniquely-aligned genes from
> refGene or some other table in your database?
>
> Thank you!
> David Alexander
> UCLA Department of Biomathematics
> http://dalexander.bol.ucla.edu/
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
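
For readers following this thread, the knownGene/knownCanonical join suggested in the reply can be sketched roughly as below. This is only an illustration under stated assumptions, not the UCSC-recommended query: the column names (name, txStart, cdsStart, transcript, and so on) are taken from memory of the hg19 schema pages and should be checked with "describe table schema", the rule "cdsStart < cdsEnd means coding" is an assumed convention, and the rows are invented toy data loaded into an in-memory SQLite database so the example runs without a mirror. On a real mirror the same SELECT would be sent through a MySQL client connection instead.

```python
import sqlite3

# One canonical transcript per gene cluster, restricted to transcripts with a
# non-empty coding region. Column names are ASSUMED from the hg19 schema
# pages; verify them against "describe table schema" before real use.
CANONICAL_CODING_SQL = """
SELECT kg.name, kg.chrom, kg.strand, kg.txStart, kg.txEnd
FROM knownGene AS kg
JOIN knownCanonical AS kc ON kc.transcript = kg.name
WHERE kg.cdsStart < kg.cdsEnd
ORDER BY kg.chrom, kg.txStart
"""


def build_toy_db():
    """Create an in-memory database holding invented rows (not real hg19 data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE knownGene (
            name TEXT, chrom TEXT, strand TEXT,
            txStart INTEGER, txEnd INTEGER,
            cdsStart INTEGER, cdsEnd INTEGER
        );
        CREATE TABLE knownCanonical (
            chrom TEXT, chromStart INTEGER, chromEnd INTEGER,
            clusterId INTEGER, transcript TEXT
        );
        """
    )
    conn.executemany(
        "INSERT INTO knownGene VALUES (?, ?, ?, ?, ?, ?, ?)",
        [
            ("txA", "chr1", "+", 1000, 5000, 1200, 4800),  # canonical, coding
            ("txB", "chr1", "+", 1000, 4500, 1200, 4300),  # alternate isoform
            ("txC", "chr1", "-", 9000, 9500, 9500, 9500),  # canonical, noncoding
            ("txD", "chr2", "+", 200, 900, 250, 850),      # canonical, coding
        ],
    )
    conn.executemany(
        "INSERT INTO knownCanonical VALUES (?, ?, ?, ?, ?)",
        [
            ("chr1", 1000, 5000, 1, "txA"),
            ("chr1", 9000, 9500, 2, "txC"),
            ("chr2", 200, 900, 3, "txD"),
        ],
    )
    return conn


def canonical_coding_transcripts(conn):
    """Return (name, chrom, strand, txStart, txEnd) for canonical coding transcripts."""
    return conn.execute(CANONICAL_CODING_SQL).fetchall()


if __name__ == "__main__":
    for row in canonical_coding_transcripts(build_toy_db()):
        print(row)
```

In the toy data, the alternate isoform txB is dropped because it is not listed in knownCanonical, and txC is dropped because its coding region is empty. Mapping the surviving identifiers to RefSeq accessions would be a further join against kgAlias, as described in the reply.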
