Hi Adam,

That URL didn't work for organisms such as chicken, zebrafish, and 
drosophila because these organisms don't have a UCSC Genes track. 
You are therefore attempting to create a URL to a page that doesn't 
exist.

You should be able to get the RefSeq summary and description from the 
RefSeq details page for most of the organisms using your web page 
parsing method. To verify, compare the description and summary on the 
UCSC Gene details page example you provided with the RefSeq details for 
the same gene; you'll see that RefSeq also has a summary and description:
http://genome.cse.ucsc.edu/cgi-bin/hgGene?org=Human&hgg_gene=NM_007294&hgg_chrom=none&db=hg18
http://genome.cse.ucsc.edu/cgi-bin/hgc?org=Human&g=refGene&i=NM_007294
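
As a rough illustration, a RefSeq details URL of the second form could be 
built from an organism name and accession as sketched below. The function 
name refseq_details_url and the constant HGC_BASE are made up for this 
example; only the "org", "g", and "i" parameters come from the URL above, 
and whether a given organism name resolves to the assembly you want is 
something to check yourself.

```python
from urllib.parse import urlencode

# Base of the RefSeq details page, taken from the example URL above.
HGC_BASE = "http://genome.cse.ucsc.edu/cgi-bin/hgc"


def refseq_details_url(organism, accession):
    """Build a RefSeq details URL (hypothetical helper, not a UCSC API)."""
    # g=refGene selects the RefSeq Genes track; i is the RefSeq accession.
    query = urlencode({"org": organism, "g": "refGene", "i": accession})
    return f"{HGC_BASE}?{query}"


print(refseq_details_url("Human", "NM_007294"))
```

For "Human" and "NM_007294" this reproduces the second URL shown above.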

For more information on using "org=" in your URL, please see this 
previously answered MLQ: 
https://lists.soe.ucsc.edu/pipermail/genome/2007-May/013437.html

Alternatively, you can use our Table Browser to obtain this 
information, so you don't have to parse the HTML page. To do this, 
click on "Tables" on the top blue bar, select the clade, genome, and 
assembly of your choice, and make the following selections:

group: Genes and Gene Prediction Tracks
track: RefSeq Genes
table: refGene
region: genome
identifiers (names/accessions): click one of the buttons and paste or 
upload the identifiers of your genes of interest; click submit
output format: selected fields from primary and related tables
output file: to save the results to a file instead of displaying them 
in the browser window, enter a name for the output file; otherwise, 
leave blank
file type returned: choose one

Click "get output". Select all the fields from the refGene table that 
you wish to see in your results, then scroll down to "Linked Tables" 
and check the boxes next to the "gbCdnaInfo" table (selecting this table 
reveals additional linked tables) and the "refSeqSummary" table; click 
"Allow Selection From Checked Tables". Scroll back down to "Linked 
Tables", check the box next to the "description" table, and click 
"Allow Selection From Checked Tables" again. Now, from the description 
table, select the "name" field, and from the refSeqSummary table, select 
"summary". Finally, scroll back to the top section and click "get 
output".
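
If you save the results to a file, they could then be read with a few 
lines of Python instead of parsing HTML. The sketch below assumes the 
output is tab-separated with a single header line that starts with "#" 
and names each column as database.table.field; please check this against 
your own output file. The function name read_table_browser_output and the 
sample row (with placeholder description and summary text) are made up 
for illustration.

```python
import csv
import io


def read_table_browser_output(handle):
    """Read tab-separated output into a list of dicts keyed by column name.

    Assumes the first non-empty line is a header beginning with '#'.
    """
    # QUOTE_NONE keeps any quote characters inside summaries as plain text.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = None
    rows = []
    for fields in reader:
        if not fields:
            continue  # skip blank lines
        if header is None:
            fields[0] = fields[0].lstrip("#")  # drop the leading '#'
            header = fields
            continue
        rows.append(dict(zip(header, fields)))
    return rows


# Placeholder data standing in for a saved output file.
sample = (
    "#hg18.refGene.name\thg18.description.name\thg18.refSeqSummary.summary\n"
    "NM_007294\tplaceholder description\tplaceholder summary\n"
)
rows = read_table_browser_output(io.StringIO(sample))
print(rows[0]["hg18.refGene.name"])
```

With a real file, you would pass an open file handle instead of the 
in-memory sample.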

For more information about using the Table Browser, see the "Using the 
Table Browser" section below the Table Browser form. It provides 
brief descriptions of the Table Browser controls. You can also see the 
"User's Guide" at http://genome.ucsc.edu/goldenPath/help/hgTablesHelp.html

I hope this information is helpful! Please don't hesitate to contact the 
mail list again if you have any further questions.

Katrina Learned
UCSC Genome Bioinformatics Group



Adam Wasserstrom wrote, On 01/26/11 13:16:
> Dear Brooke,
> I asked you a few months ago a question regarding obtaining a list of genes
> for various organisms. Your answer then was very helpful, so I hope that
> it's okay I am replying back to you, as I have a follow-up question on the
> same matter.
> You previously suggested that I use RefSeq genes rather than KnownGenes (as
> it applies to a wider variety of organisms), and indeed I did so. However,
> I'm interested in obtaining more information per gene than what is present
> in the table schema. For example, I want the summary and description of a
> gene, as can be obtained in the page of the following sort:
> http://genome.cse.ucsc.edu/cgi-bin/hgGene?org=Human&hgg_gene=NM_007294&hgg_chrom=none&db=hg18
> I want to be able to automatically obtain such a url given organism, gene and
> build, so that I can then automatically parse the web-page. For example in
> human and mouse I successfully create the url as follows:
> "http://genome.ucsc.edu/cgi-bin/hgGene?org=" + ORGANISM_NAME + "&hgg_gene="
> + GENE_NAME + "&hgg_chrom=none&db=" + BUILD
> However, this didn't work in other organisms, such as chicken, zebrafish,
> drosophila, etc.
> Could you please suggest how I can obtain this information in a manner
> applicable to all organisms?
> I hope my explanation was clear enough :)
> Thank you very much for your help,
> Best regards,
> Adam
> _______________________________________________
> Genome maillist  [email protected]  <mailto:[email protected]>
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>    