Dear Anton,

To extract the data in tab-delimited format, click on the "table browser" 
link in the blue navigation bar, then follow these steps:

1) Set the drop downs to the proper clade, genome, and assembly.

2) Set group: "Genes and Gene Prediction Tracks"

3) Set track: "Known Genes"

4) Set table: "kgXref"

5) Next to identifiers (names/accessions) click: "paste list" and follow the 
instructions. You can paste your gene symbols here.

6) Set output format: "Selected fields from primary tables and related tables"

7) To save the output to a file, enter a file name in "output file:"

8) Click on "get output"

9) Select the fields from kgXref in which you are interested (kgID is the 
KnownGene unique identifier, so it may be of use to you):
             kgID
             geneSymbol
             description

10) Scroll down to Linked Tables to select tables from which you can get the 
other information:
            rn4.refSeqSummary
            uniProt.comment
and click on "Allow selection from Checked Tables." There aren't actually any 
fields from the uniProt.comment table that you will want to open (according to 
your request) but it reveals more Linked Tables that you do need.

11) Scroll down to Linked Tables and select uniProt.commentType and 
uniProt.commentVal and click "Allow selection from Checked Tables."

12) Select the following fields (you will need to scroll up to find the 
uniProt.commentVal fields):
            from rn4.refSeqSummary:     summary
            from uniProt.commentType:   val
            from uniProt.commentVal:    val (text of comment)
13) Scroll up and then click "get output"
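
If you later want to post-process the saved file in a script, the following 
is a minimal sketch in Python (the same logic ports easily to Perl or Awk). 
It assumes the output is tab-delimited with a single header line that starts 
with "#", and that the column names look like the ones in SAMPLE; both are 
assumptions, so check the first line of your own file. The helper names 
read_rows and index_by_symbol and the sample text are hypothetical and only 
serve as illustration.

```python
import csv
import io


def read_rows(text):
    """Parse tab-delimited text whose first line is a '#'-prefixed header.

    Returns a list of dicts keyed by the full column names from the header.
    """
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader, None)
    if header is None:
        return []
    # Drop the leading '#' from the first column name.
    header[0] = header[0].lstrip("#")
    return [dict(zip(header, row)) for row in reader if row]


def index_by_symbol(rows, symbol_field):
    """Group rows by gene symbol, since one symbol may yield several rows."""
    index = {}
    for row in rows:
        index.setdefault(row[symbol_field], []).append(row)
    return index


# Hypothetical sample; the column names and text are assumptions, not real output.
SAMPLE = (
    "#rn4.kgXref.geneSymbol\tuniProt.commentType.val\tuniProt.commentVal.val\n"
    "Irs1\tFUNCTION\texample function text\n"
    "Irs1\tSUBUNIT\texample subunit text\n"
)

if __name__ == "__main__":
    by_symbol = index_by_symbol(read_rows(SAMPLE), "rn4.kgXref.geneSymbol")
    for symbol, rows in by_symbol.items():
        for row in rows:
            print("\t".join(row.values()))
```

To use it on a real file, read the file contents into a string and pass that 
to read_rows in place of SAMPLE. Keeping the full column names avoids a clash 
between the two "val" fields from uniProt.commentType and uniProt.commentVal.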


If you have further questions, please feel free to contact the mailing list 
again.


Vanessa Kirkup Swing
UCSC Genome Bioinformatics Group


----- Original Message -----
From: "Anton Kratz" <[email protected]>
To: "UCSC Genome Browser Help Desk" <[email protected]>
Sent: Monday, August 9, 2010 10:02:18 PM GMT -08:00 US/Canada Pacific
Subject: [Genome] how to get certain fields for HUGO Gene Symbols, rat, 
UniProtKB etc

Dear UCSC team,

I have a very long list of HUGO Gene Symbols of rat (rn4).

I would like to expand each HUGO Gene Symbol into a tab-separated line
containing certain fields which I can see when I click such a gene in the
genome browser in the UCSC Known Genes track. The fields which I want are:

From the "Rat Gene (gene name here) Description and Page Index" box the
fields: "Description" and "RefSeq Summary".
From the "Comments and Description Text from UniProtKB" box the field:
FUNCTION (and maybe others).

So for example the HUGO Gene Symbol "Irs1" needs to expand to:

Irs1    insulin receptor substrate 1    a docking protein; may act to link
the insulin receptor kinase with enzymes regulating cellular growth and
metabolism [RGD].     May mediate the control of various cellular processes
by insulin. When phosphorylated by the insulin receptor binds specifically
to various cellular proteins containing SH2 domains such as
phosphatidylinositol 3-kinase p85 subunit or GRB2. Activates
phosphatidylinositol 3-kinase when bound to the regulatory p85 subunit.

The items into which I want to expand each Gene Symbol may change later; e.g.,
I might also want to include other pieces of information like those from the
microarray box or the KEGG data.

Could you please give me some hints on how to get which data from which
table? I am starting from HUGO Gene Symbol only, so I think I need some
unique identifier, or even several identifiers?! Basically I would like to
ask how to get the data when having only the Gene Symbol. I am able to
program in Perl and Awk, but I don't know which data is in which table or how
to retrieve it by which identifier. Thank you.

best regards,
Anton
_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome