Hi Abhi,

UCSC Genes has a table called knownCanonical, which contains one transcript per gene cluster (more information about how this table is constructed can be found here: https://lists.soe.ucsc.edu/pipermail/genome/2010-September/023611.html). Using the Table Browser (http://genome.ucsc.edu/cgi-bin/hgTables) you can obtain the HUGO IDs, symbols and names that accompany each gene in this table:

1) Select your clade, genome and assembly of interest. UCSC Genes should be automatically selected as the track.
2) Select "knownCanonical" as the table.
3) Select "selected fields from primary and related tables" as the output format, enter a file name if desired, and click "get output".
4) Scroll down, select proteome.hgnc and click "Allow Selection From Checked Tables".
5) Select chrom, chromStart, chromEnd from hg19.knownCanonical and hgncId, symbol and name from proteome.hgnc and click "get output".

Please keep in mind that some genes in the knownCanonical table may not have HUGO equivalents.

I hope this information is helpful. Please feel free to contact the mail list again if you require further assistance.

Best,
Mary

------------------
Mary Goldman
UCSC Bioinformatics Group

On 9/28/10 10:52 AM, Pratap, Abhishek wrote:
> Hi
>
> I would like to know if there is a straight forward way to retrieve human
> genes with start and end position. I want the gene names to be in HUGO format
> meaning actual gene names and not the refseq/ucsc genes. Also I am interested
> in genes and not the transcripts. I got the bed file for refseq genes but I
> guess it has all the possible annotated transcripts for a gene and also the
> gene name is in NCBI NM_** format.
>
> Thanks!
> -Abhi
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
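
For the Table Browser steps described above, the downloaded file still needs a little cleanup before it reads as "one gene, one name, one start and end". The Python sketch below is one possible way to do that, not part of the Table Browser itself. It assumes the output is tab-separated with a single header line that starts with "#" and uses table-qualified column names (for example hg19.knownCanonical.chrom), and that genes without a HUGO equivalent show up with an empty or "n/a" symbol; please check both assumptions against your own file. The function names and the sample rows are hypothetical and only serve as illustration.

```python
# Assumption: placeholders used for a missing HGNC symbol in the download.
MISSING = {"", "n/a"}


def parse_canonical_genes(text):
    """Yield (chrom, start, end, symbol) for rows that carry an HGNC symbol."""
    header = None
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        if header is None:
            # Assumed header form: "#hg19.knownCanonical.chrom<TAB>...".
            # Keep only the last dotted component so "chrom", "symbol", etc.
            # can be looked up regardless of the database prefix.
            header = [name.lstrip("#").split(".")[-1] for name in fields]
            continue
        record = dict(zip(header, fields))
        symbol = record.get("symbol", "").strip()
        if symbol in MISSING:
            # Some knownCanonical genes may have no HUGO equivalent: skip them.
            continue
        yield (
            record["chrom"],
            int(record["chromStart"]),
            int(record["chromEnd"]),
            symbol,
        )


def to_bed_lines(text):
    """Format the parsed genes as four-column, tab-separated BED-like lines."""
    return [
        f"{chrom}\t{start}\t{end}\t{symbol}"
        for chrom, start, end, symbol in parse_canonical_genes(text)
    ]


# Hypothetical sample in the assumed layout (made-up coordinates and names).
SAMPLE = (
    "#hg19.knownCanonical.chrom\thg19.knownCanonical.chromStart\t"
    "hg19.knownCanonical.chromEnd\tproteome.hgnc.hgncId\t"
    "proteome.hgnc.symbol\tproteome.hgnc.name\n"
    "chr1\t100\t200\t1\tGENE_A\thypothetical gene A\n"
    "chr1\t300\t400\tn/a\tn/a\tn/a\n"
)

if __name__ == "__main__":
    for bed_line in to_bed_lines(SAMPLE):
        print(bed_line)
```

To use it on a real download, read the saved file into a string and pass it to to_bed_lines in place of SAMPLE. Rows without a symbol are dropped rather than given a substitute name, so the result contains only genes that actually have a HUGO name.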
