Hi Katrina,

Thank you very much for your prompt and detailed reply. Indeed, the link you suggested:
http://genome.cse.ucsc.edu/cgi-bin/hgc?org=Human&g=refGene&i=NM_007294
is of the sort I require, and I now know how to create this URL automatically given an organism and RefSeq details.

However, this link does not work in all browsers: it opens well in Google Chrome, but Internet Explorer (I have version 8) returns an error: "hashMustFindVal: 'c' not found". This is important for me because I am automatically parsing web pages using MATLAB's 'urlread' command, which reads the web page with the same error as Internet Explorer.

Could you please suggest an alternative URL that also opens properly in Internet Explorer (and hence probably in MATLAB as well)?

Cheers,
Adam
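For reference, the URL construction mentioned above can be sketched as follows. This is only an illustration in Python rather than MATLAB; the function name refseq_details_url and the constant UCSC_CGI_BASE are made up for this example, and the parameter layout is simply copied from the hgc link quoted in this thread. It builds the address only and does not address the browser error.

```python
from urllib.parse import urlencode

# Base address copied from the links quoted in this thread (an assumption
# that it applies unchanged to other organisms).
UCSC_CGI_BASE = "http://genome.cse.ucsc.edu/cgi-bin"


def refseq_details_url(organism, accession):
    """Build a RefSeq details (hgc) URL from an organism name and a RefSeq accession.

    Hypothetical helper: it mirrors the org/g/i parameters of the example link.
    """
    # urlencode escapes characters that are not safe inside a query string.
    query = urlencode({"org": organism, "g": "refGene", "i": accession})
    return f"{UCSC_CGI_BASE}/hgc?{query}"


print(refseq_details_url("Human", "NM_007294"))
# prints http://genome.cse.ucsc.edu/cgi-bin/hgc?org=Human&g=refGene&i=NM_007294
```

Passing the parameters through urlencode rather than concatenating strings by hand keeps organism names that contain spaces or other special characters from producing a malformed URL.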
On Thu, Jan 27, 2011 at 8:34 PM, Katrina Learned <[email protected]> wrote:

> Hi Adam,
>
> The reason such a URL didn't work for organisms such as chicken, zebrafish,
> and drosophila is that these organisms don't have a UCSC Genes track.
> Therefore, you are attempting to create a URL to a page that doesn't exist.
>
> You should be able to get the RefSeq Summary and Description from the
> RefSeq details page for most of the organisms using your web page parsing
> method. To verify, compare the description and summary of the UCSC Gene
> details page example you provided to the RefSeq details for the same gene,
> you'll see that RefSeq also has summary & description:
>
> http://genome.cse.ucsc.edu/cgi-bin/hgGene?org=Human&hgg_gene=NM_007294&hgg_chrom=none&db=hg18
> http://genome.cse.ucsc.edu/cgi-bin/hgc?org=Human&g=refGene&i=NM_007294
>
> For more information on using "org=" in your URL, please see this
> previously answered MLQ:
> https://lists.soe.ucsc.edu/pipermail/genome/2007-May/013437.html
>
> Alternatively, you can also use our table browser to obtain this
> information (so you don't have to parse the HTML page). To do this, click on
> "Tables" on the top blue bar, select your clade, genome, and assembly of
> your choice and make the following selections:
>
> group: Genes and Gene Prediction Tracks
> track: RefSeq Genes
> table: refGene
> region: genome
> identifiers (names/accessions): click one of the buttons and paste or
>   upload the identifiers of your genes of interest; click submit
> output format: selected fields from primary and related tables
> output file: to have the results saved to a file instead of displaying in
>   the browser window, enter the name you would like the output file to have,
>   otherwise, leave blank
> file type returned: choose one
>
> Click "get output".
> Select all the fields from the refGene table that you
> wish to see in your results, and then scroll down to "Linked Tables" and
> check the boxes next to the "gbCdnaInfo" (selecting this table reveals
> additional linked tables) and "refSeqSummary" tables; click "Allow Selection
> From Checked Tables." Now scroll back down to "Linked Tables" and select the
> check box next to the "description" table and click "Allow Selection From
> Checked Tables" again. Now, from the description table, select the "name"
> field and from the refSeqSummary table, select "summary;" scroll back to the
> top section and click "get output."
>
> For more information about using the Table Browser see "Using the Table
> Browser" by scrolling down past the Table Browser form. It provides brief
> descriptions of the Table Browser controls. You can also see the "User's
> Guide" at http://genome.ucsc.edu/goldenPath/help/hgTablesHelp.html.
>
> I hope this information is helpful! Please don't hesitate to contact the
> mail list again if you have any further questions.
>
> Katrina Learned
> UCSC Genome Bioinformatics Group
>
> Adam Wasserstrom wrote, On 01/26/11 13:16:
>
> Dear Brooke,
> A few months ago I asked you a question regarding obtaining a list of genes
> for various organisms. Your answer then was very helpful, so I hope that
> it's okay that I am replying back to you, as I have a follow-up question on
> the same matter.
> You previously suggested that I use RefSeq genes rather than KnownGenes (as
> it applies to a wider variety of organisms), and indeed I did so. However,
> I'm interested in obtaining more information per gene than what is present
> in the table schema.
> For example, I want the summary and description of a
> gene, as can be obtained in a page of the following sort:
> http://genome.cse.ucsc.edu/cgi-bin/hgGene?org=Human&hgg_gene=NM_007294&hgg_chrom=none&db=hg18
> I want to be able to automatically obtain such a URL given organism, gene and
> build, so that I can then automatically parse the web page. For example, in
> human and mouse I successfully create the URL as follows:
> "http://genome.ucsc.edu/cgi-bin/hgGene?org=" + ORGANISM_NAME + "&hgg_gene="
> + GENE_NAME + "&hgg_chrom=none&db=" + BUILD
> However, this didn't work in other organisms, such as chicken, zebrafish,
> drosophila, etc.
> Could you please suggest how I can obtain this information in a manner
> applicable to all organisms?
> I hope my explanation was clear enough :)
> Thank you very much for your help,
> Best regards,
> Adam
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
