Hi Schragi,
I am assuming that you want to be able to match the human genes to the
mouse genes and not end up with just a list of common genes without the
association to the original human refSeq identifier.
You can convert the human refSeq IDs to mouse UCSC gene identifiers with
the Table Browser, which you can reach by clicking "Tables" in the blue
navigation bar.
Step 1: Find mouse UCSC known genes from human refSeq
Use the Table Browser to translate your human refSeq IDs to mouse
knownGene IDs:
- Select the human genome that you are using (e.g. hg19), then select the
following:
- For group: "All Tables"
- For table: "refGene" (this is the table your human refSeq identifiers
belong to; it is linked to the knownGene tables, among others).
- For identifiers (names/accessions): Click on "paste list" and then
paste your list of human refSeq identifiers. Click "submit". (NOTE: some
identifiers may have no match. If so, remove the unmatched ones one by
one, based on the message displayed on the screen, and submit again.)
- For output type: "selected fields from primary and related tables".
- Click "get output".
- Select fields: 'name', and then under the Linked Tables list, select
"mmBlastTab" and click "Allow Selection from Linked Tables".
- Scroll up to the hg19.mmBlastTab fields, select 'target', and then
click "get output".
You should have a list that looks like this:
#hg19.refGene.name hg19.mmBlastTab.target
NM_001001740 uc007ddz.1,uc007ddz.1,
NM_001080397 uc008vxt.1,uc008vxt.1,
NM_001145277 uc008vnu.1
NM_001145278 uc008vnu.1
NM_001195683 n/a
NM_001195684 n/a
NM_001918 uc008rcg.1
NM_002744 uc008wdd.1
NM_003243 uc008ylz.1,uc008ylz.1,uc008ylz.1,
NM_013943 uc008vgd.1,uc008vgd.1,
In the right column, delete the repeated IDs and trailing commas so that
each row holds a single ID. Rows showing n/a have no mouse match and can
be dropped. Then copy the right column.
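If your list is long, the cleanup above can be scripted instead of done
by hand. Here is a minimal Python sketch, assuming you saved the Table
Browser output as whitespace- or tab-separated text. The function name
parse_blast_tab and the variable names are only illustrative; they are
not part of any UCSC tool.

```python
def parse_blast_tab(text):
    """Map each human refSeq ID to its unique mouse knownGene IDs."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "#..." header line.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        human_id, targets = fields[0], fields[1]
        unique = []
        for target in targets.split(","):
            # Drop empty pieces (trailing commas), n/a, and repeats.
            if target and target != "n/a" and target not in unique:
                unique.append(target)
        mapping[human_id] = unique
    return mapping


# Sample rows copied from the output shown above.
sample = """#hg19.refGene.name hg19.mmBlastTab.target
NM_001001740 uc007ddz.1,uc007ddz.1,
NM_001080397 uc008vxt.1,uc008vxt.1,
NM_001145277 uc008vnu.1
NM_001145278 uc008vnu.1
NM_001195683 n/a
NM_001195684 n/a
NM_001918 uc008rcg.1
NM_002744 uc008wdd.1
NM_003243 uc008ylz.1,uc008ylz.1,uc008ylz.1,
NM_013943 uc008vgd.1,uc008vgd.1,
"""

human_to_mouse_kg = parse_blast_tab(sample)

# Unique mouse IDs, one per line, ready to paste into the Table Browser.
mouse_ids = sorted({kg for ids in human_to_mouse_kg.values() for kg in ids})
print("\n".join(mouse_ids))
```

The printed list is what you would paste into "paste list" in Step 2,
and human_to_mouse_kg keeps the link back to each human refSeq ID.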
Step 2: Find mouse refSeq from mouse knownGenes
Use the Table Browser again:
- Select mouse for the genome and then select the following group: "All
Tables", and table: "knownToRefSeq"
- For identifiers (names/accessions): Click on "paste list" and then
paste your list of mouse knownGene IDs from Step 1. Click "submit".
- Click "get output", select "name" and "value" at the top (deselect any
others), and click "get output" again.
You should now have a list that maps each mouse knownGene ID to the
mouse refSeq ID corresponding to your human genes:
#name value
uc008rcg.1 NM_010022
uc008vgd.1 NM_013885
uc008vnu.1 NM_025383
uc008vxt.1 NM_173774
uc008wdd.1 NM_008860
uc008ylz.1 NM_011578
To complete the map, manually join the two results on the mouse
knownGene ID: the "hg19.mmBlastTab.target" column from the first query
matches the "#name" column from the second query.
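For a long list, this join can also be scripted. Below is a rough Python
sketch, assuming the first query has already been reduced to one set of
unique mouse IDs per human refSeq ID. The names human_to_kg,
parse_known_to_refseq and join_maps are hypothetical helpers for this
example only, and the data is copied from the two outputs above.

```python
# Human refSeq -> mouse knownGene IDs (first query, repeats removed).
human_to_kg = {
    "NM_001001740": ["uc007ddz.1"],
    "NM_001080397": ["uc008vxt.1"],
    "NM_001145277": ["uc008vnu.1"],
    "NM_001145278": ["uc008vnu.1"],
    "NM_001195683": [],
    "NM_001195684": [],
    "NM_001918": ["uc008rcg.1"],
    "NM_002744": ["uc008wdd.1"],
    "NM_003243": ["uc008ylz.1"],
    "NM_013943": ["uc008vgd.1"],
}

# Output of the second query (knownToRefSeq), as shown above.
known_to_refseq_text = """#name value
uc008rcg.1 NM_010022
uc008vgd.1 NM_013885
uc008vnu.1 NM_025383
uc008vxt.1 NM_173774
uc008wdd.1 NM_008860
uc008ylz.1 NM_011578
"""


def parse_known_to_refseq(text):
    """Map each mouse knownGene ID to a list of mouse refSeq IDs."""
    mapping = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the header, and malformed rows.
        if len(fields) < 2 or fields[0].startswith("#"):
            continue
        mapping.setdefault(fields[0], []).append(fields[1])
    return mapping


def join_maps(human_to_kg, kg_to_mouse):
    """Join the two maps on the mouse knownGene ID."""
    result = {}
    for human_id, kg_ids in human_to_kg.items():
        mouse_refseqs = []
        for kg_id in kg_ids:
            # IDs missing from the second query contribute nothing.
            for refseq in kg_to_mouse.get(kg_id, []):
                if refseq not in mouse_refseqs:
                    mouse_refseqs.append(refseq)
        result[human_id] = mouse_refseqs
    return result


kg_to_mouse = parse_known_to_refseq(known_to_refseq_text)
human_to_mouse = join_maps(human_to_kg, kg_to_mouse)

for human_id, mouse_refseqs in human_to_mouse.items():
    print(human_id, ",".join(mouse_refseqs) or "n/a")
```

Human IDs whose mouse knownGene ID has no row in the second query (such
as uc007ddz.1 in the example above) end up with an empty list, printed
here as n/a.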
You might also want to look into Galaxy (http://main.g2.bx.psu.edu/),
which may provide a simpler way to go about this. If you have any
questions about Galaxy, please contact the Galaxy team for help.
Let us know if you have any additional questions.
-
Greg Roe
QA Team
UCSC Genome Browser
> ----- Original Message -----
> From: "Schragi Schwartz"<[email protected]>
> To: [email protected]
> Sent: Thursday, November 18, 2010 5:25:50 AM GMT -08:00 US/Canada Pacific
> Subject: [Genome] Finding refseq homologs
>
> Hi,
> I have a list of all refseq ids in human (in the format of NM_XXXX).
> I was wondering if there's any simple way to find the orthologous refseq
> genes in mouse for each of these human refseq ids.
> Thank you very much,
> Schragi
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome