Dear UCSC team,

I have a list of 2343 unique Gene Symbols for mouse.

What I want are two lists:

One list with at least one entry of chromosome, strand, start, end for each
of the 2343 Gene Symbols on mm9.

Another list with *all* Gene Symbols for mouse, each again with at least
one set of coordinates (chromosome, strand, start, end) on mm9.

What is the easiest way to retrieve that?

I thought I knew how to do this using the Table Browser, but I ran into
inconsistencies that make me unsure whether I'm doing it correctly:

What I tried: in the Table Browser I select the kgXref table for mouse
(mm9) and upload my list of 2343 unique Gene Symbols.
Then I get a box:
*Error(s):*

   - Note: 596 of the 2343 given identifiers (e.g. Tm4sf12) have no match in
   table kgXref, field kgID or in alias table kgAlias, field alias. Try the
   "describe table schema" button for more information about the table and
   field.

So I would expect 2343 - 596 = 1747 Gene Symbols to be left (this seems to
be wrong, see below).

I select "selected fields from primary and related tables" and press "get
output".

On the new page, under "Linked Tables", I check "mm9 knownGene Genes based
on RefSeq, GenBank, and UniProt." and press "Allow selection from checked
tables". I do this because I don't see a way to directly retrieve the
coordinates of the Gene Symbols, so I try to use the known genes as a kind
of intermediate (is there a direct way?).

Under "Select Fields from mm9.kgXref" I check geneSymbol, and under
"mm9.knownGene fields" I check name, chrom, strand, txStart, txEnd. Now I
press "Get Output".
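
For reference, the join described above can also be sketched offline. This
is only a minimal sketch: it assumes the two tables have already been loaded
into lists of tuples, and that kgXref.kgID corresponds to knownGene.name
(an assumption here; the "describe table schema" page should confirm it).
The function name and the toy rows are hypothetical, not real UCSC data.

```python
from collections import defaultdict


def join_symbols_to_coords(kgxref_rows, knowngene_rows, wanted_symbols=None):
    """Join kgXref-like rows to knownGene-like rows.

    kgxref_rows:    iterable of (kgID, geneSymbol)
    knowngene_rows: iterable of (name, chrom, strand, txStart, txEnd)
    wanted_symbols: optional set of symbols to keep; None keeps all symbols

    Assumption: kgID in kgXref matches name in knownGene.
    Returns a dict mapping geneSymbol -> list of knownGene rows.
    """
    # Index the coordinate rows by their identifier for fast lookup.
    coords_by_id = {row[0]: row for row in knowngene_rows}

    result = defaultdict(list)
    for kg_id, symbol in kgxref_rows:
        if wanted_symbols is not None and symbol not in wanted_symbols:
            continue
        if kg_id in coords_by_id:
            result[symbol].append(coords_by_id[kg_id])
    return dict(result)


# Toy rows, made up for illustration only.
KGXREF = [("tx1", "GeneA"), ("tx2", "GeneA"), ("tx3", "GeneB")]
KNOWNGENE = [
    ("tx1", "chr1", "+", 100, 500),
    ("tx2", "chr1", "+", 120, 480),
    ("tx3", "chr2", "-", 1000, 2000),
]

joined = join_symbols_to_coords(KGXREF, KNOWNGENE, wanted_symbols={"GeneA"})
all_joined = join_symbols_to_coords(KGXREF, KNOWNGENE)
```

Note that one symbol can map to several rows (one per transcript), so the
number of output lines can be larger than the number of symbols. Taking the
first row per symbol would give "at least one" coordinate set per symbol.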

The result is a file with 3866 entries, 1809 of them unique. Why 1809 and
not the expected 1747?
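
One way to narrow down the difference would be to compare the uploaded
symbols with the geneSymbol column of the output. The sketch below assumes
both are available as plain Python lists; the function name and the toy
symbols are hypothetical. Symbols that appear only in the output might point
to identifiers that were matched through something other than geneSymbol
(for example the alias table named in the error message), but that is only
a guess and not a confirmed explanation.

```python
def compare_symbol_sets(input_symbols, output_symbols):
    """Compare uploaded symbols with the symbols found in the output file.

    Returns three sets:
      matched: symbols present in both lists
      missing: uploaded symbols that never appear in the output
      extra:   output symbols that were not in the uploaded list
    """
    uploaded = set(input_symbols)
    returned = set(output_symbols)
    return {
        "matched": uploaded & returned,
        "missing": uploaded - returned,
        "extra": returned - uploaded,
    }


# Toy symbols, made up for illustration only.
uploaded_list = ["GeneA", "GeneB", "OldName"]
output_column = ["GeneA", "GeneA", "GeneB", "NewName"]

report = compare_symbol_sets(uploaded_list, output_column)
```

If "extra" is non-empty, the unique count in the output can exceed the
number of uploaded symbols that were reported as matched.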

Thanks,
Anton
_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
