Hi Kurinji,

The IDs in your data set appear to be Entrez Gene IDs, not UCSC gene IDs. I wasn't familiar with DAVID, but I took a look and it lists both Entrez and UCSC gene IDs as supported. In any case, you can use these Entrez IDs to look up the gene symbols with our Table Browser: http://genome.ucsc.edu/cgi-bin/hgTables

Use the following settings (I believe these are mm9 IDs?):

group: all tables
db: mm9?
table: knownToLocusLink (note: LocusLink = Entrez)
filter: create a filter and paste a space-separated list of your Entrez Gene IDs in the value field (e.g. value [does] match [22410 22411 22638 ....]). I know the field is small, but you can paste a long list in there.
output format: selected fields from primary and related tables

Click "get output". In the linked tables list, check "kgXref", then scroll to the bottom and click "Allow Selection From Checked Tables". Scroll back up and check "name" and "value" from knownToLocusLink, and "geneSymbol" from kgXref. Click "get output" again.

Please let us know if you have any additional questions: [email protected]

- Greg Roe
UCSC Genome Bioinformatics Group

On 8/8/11 11:40 AM, Kurinji Pandiyan wrote:
> Hello,
>
> I am trying to extract gene symbols from UCSC gene IDs I got from some
> analysis in R.
>
> Here is the data output I got from a package DESeq in R - just the head
> of the table: I just want to extract the gene symbol from the id.
>
>              id baseMean  baseMeanA baseMeanB foldChange log2FoldChange         pval
> 22410    653048 1820.968   0.000000  3641.937        Inf            Inf 2.183608e-96
> 22411    653067 1820.968   0.000000  3641.937        Inf            Inf 2.183608e-96
> 22412    653219 1820.968   0.000000  3641.937        Inf            Inf 2.183608e-96
> 22413    653220 1820.968   0.000000  3641.937        Inf            Inf 2.183608e-96
> 22638      9503 1820.968   0.000000  3641.937        Inf            Inf 2.183608e-96
> 22554      8277 1587.346   2.304498  3172.388 1376.60684      10.426901 3.572313e-84
>  1434    283120 2777.587 118.681666  5436.493   45.80735       5.517507 7.881047e-75
> 21708 100132399 1687.382  21.892735  3352.871  153.14994       7.258801 6.876206e-74
> 22485    729422 1682.980  21.892735  3344.067  152.74780       7.255008 9.705880e-74
> 22487    729431 1682.980  21.892735  3344.067  152.74780       7.255008 9.705880e-74
>
> Where id refers to the UCSC ID. I have 256 elements in the table so would
> prefer not to do it manually!
>
> Looks like this is a custom UCSC reference so I am unable to use online
> conversion tools such as DAVID to convert. I have been able to look up gene
> symbols one by one but this is tedious.
>
> I would appreciate it if you would tell me -
>
> 1. how to extract the GENE symbols from these IDs in a high throughput
> fashion
> 2. what exactly these identifiers are - do they refer to specific
> transcripts? how do I determine that?
>
> Thanks!
> Kurinji
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
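
P.S. Once the Table Browser output is saved, the symbols can be joined back to the DESeq ids with a short script instead of by hand. The following is only a minimal sketch in Python, under some assumptions that are not from the steps above: the output is tab-separated text, any header line starts with "#", and the three columns appear in the order name, value (the Entrez ID), geneSymbol. The function names and the sample rows are hypothetical placeholders, not real UCSC or Entrez records.

```python
import csv
import io
from collections import defaultdict


def entrez_to_symbols(table_text):
    """Map each Entrez Gene ID to the sorted gene symbols listed for it.

    Assumes tab-separated rows in the order: name, value, geneSymbol,
    where "value" holds the Entrez ID (an assumption about the saved output).
    """
    symbols = defaultdict(set)
    reader = csv.reader(io.StringIO(table_text), delimiter="\t")
    for row in reader:
        # Skip blank lines, '#'-prefixed header lines, and short rows.
        if not row or row[0].startswith("#") or len(row) < 3:
            continue
        entrez_id, symbol = row[1], row[2]
        # Several transcripts can share one Entrez ID, so collect a set.
        symbols[entrez_id].add(symbol)
    return {eid: sorted(found) for eid, found in symbols.items()}


def annotate(ids, mapping):
    """Pair each id with its symbols; ids without a match get an empty list."""
    return [(i, mapping.get(str(i), [])) for i in ids]


if __name__ == "__main__":
    # Hypothetical sample standing in for the saved Table Browser output.
    sample = (
        "#name\tvalue\tgeneSymbol\n"
        "uc_x.1\t111\tGeneA\n"
        "uc_x.2\t111\tGeneA\n"
        "uc_y.1\t222\tGeneB\n"
    )
    mapping = entrez_to_symbols(sample)
    for gene_id, found in annotate([111, 222, 333], mapping):
        print(gene_id, ",".join(found) or "no match")
```

In practice you would read the saved file's contents in place of the sample string and pass the id column of the DESeq table to annotate. IDs that come back with no match are worth checking against a different assembly.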
