Hi all,

I am trying to partition the human genome into a set of disjoint
regions, such that any particular SNP marker belongs to a single
region.  As a first cut I would like these regions to correspond to
known protein-coding genes and their upstream regions.
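
To make the goal concrete, here is a minimal Python sketch of the
partitioning step.  It assumes the gene intervals (already extended to
include their upstream regions) are available as (chrom, start, end)
tuples in 0-based, half-open coordinates.  The function names and the
rule of merging overlapping genes into one region are illustrative
assumptions, not anything provided by the UCSC database.

```python
from bisect import bisect_right
from collections import defaultdict


def merge_into_disjoint(intervals):
    """Merge overlapping (chrom, start, end) intervals into disjoint,
    sorted regions per chromosome (0-based, half-open coordinates)."""
    by_chrom = defaultdict(list)
    for chrom, start, end in intervals:
        by_chrom[chrom].append((start, end))

    regions = {}
    for chrom, spans in by_chrom.items():
        merged = []
        for start, end in sorted(spans):
            if merged and start < merged[-1][1]:
                # Overlaps the previous region: extend it instead of
                # starting a new one.
                merged[-1] = (merged[-1][0], max(merged[-1][1], end))
            else:
                merged.append((start, end))
        regions[chrom] = merged
    return regions


def region_of(regions, chrom, pos):
    """Return the (start, end) region containing 0-based position pos,
    or None if the position falls between regions."""
    spans = regions.get(chrom, [])
    # Index of the last region whose start is <= pos.
    i = bisect_right(spans, (pos, float("inf"))) - 1
    if i >= 0 and spans[i][0] <= pos < spans[i][1]:
        return spans[i]
    return None
```

Because the merged regions never overlap, each SNP position maps to at
most one region; SNPs outside every region return None.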

I wrote some Python and SQL code to query my personal mirror of the
hg19 database, but have had trouble using the refGene table.  It seems
that refGene is a table of transcripts with information on where they
align in the genome.  In particular, this means that some transcripts
in refGene align to multiple places in the genome.  Is there a way to
extract a canonical set of known, coding, uniquely-aligned genes from
refGene or some other table in your database?
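
One possible way to approximate such a set from refGene rows is
sketched below in Python.  It rests on several assumptions: rows have
been fetched as dicts keyed by the refGene column names (name, chrom,
cdsStart, cdsEnd); non-coding transcripts are recognizable by
cdsStart == cdsEnd, as I understand the genePred convention; and
chromosome names containing an underscore (random or haplotype
sequences) should be ignored.  The function name is hypothetical.

```python
from collections import Counter


def unique_coding_transcripts(rows):
    """Keep refGene-style rows for coding transcripts that align exactly
    once on the primary chromosomes.

    rows: iterable of dicts with keys name, chrom, cdsStart, cdsEnd.
    """
    # Assumption: drop alignments to random/haplotype sequences first,
    # so they do not count as extra placements of a transcript.
    primary = [r for r in rows if "_" not in r["chrom"]]

    # Number of alignments (rows) per transcript accession.
    placements = Counter(r["name"] for r in primary)

    kept = []
    for r in primary:
        if placements[r["name"]] != 1:
            continue  # transcript aligns to more than one place
        if r["cdsStart"] >= r["cdsEnd"]:
            continue  # empty coding region: treated as non-coding
        kept.append(r)
    return kept
```

The surviving rows could then be grouped by the name2 column (the gene
symbol) to get one span per gene.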

Thank you!
David Alexander
UCLA Department of Biomathematics
http://dalexander.bol.ucla.edu/
_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome