Hello,

You will need to decide on a track (UCSC Genes, RefSeq Genes, etc.), then either click into the track description page or navigate to the track in the Table browser. The default table for the track will be selected in the table list, and the associated tables are listed there as well. Also read the FAQ entry on the genePred table format; it explains which columns hold the exon and CDS coordinates from the BLAT alignment. The same approach works for any track: navigate the Table browser, find the tables of interest, look up the format in the FAQ, find the linked tables under "view schema", then go to Downloads to ftp the data.
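
Once you have a genePred file in hand, pulling out the exon coordinates could look roughly like the Python sketch below. It is only an illustration, not an official tool. It assumes the column order name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds, with some tables carrying an extra leading "bin" column, so please confirm the actual layout with "view schema" for the table you pick. The function name, the has_bin flag, and the sample row are made up for the example.

```python
from typing import List, NamedTuple, Tuple


class Transcript(NamedTuple):
    name: str
    chrom: str
    strand: str
    exons: List[Tuple[int, int]]  # (start, end) pairs as stored in the table


def parse_genepred_line(line: str, has_bin: bool = False) -> Transcript:
    """Parse one tab-separated genePred row (assumed column order, see above)."""
    fields = line.rstrip("\n").split("\t")
    if has_bin:
        # Some tables have a leading "bin" index column; skip it.
        fields = fields[1:]
    name, chrom, strand = fields[0], fields[1], fields[2]
    exon_count = int(fields[7])
    # exonStarts and exonEnds are comma-separated lists with a trailing comma.
    starts = [int(x) for x in fields[8].split(",") if x]
    ends = [int(x) for x in fields[9].split(",") if x]
    if not (len(starts) == len(ends) == exon_count):
        raise ValueError(f"exon count mismatch for {name}")
    return Transcript(name, chrom, strand, list(zip(starts, ends)))


# A made-up row for illustration only (not real annotation data).
row = "exampleTx\tchr1\t+\t100\t900\t150\t850\t3\t100,400,700,\t200,500,900,\n"
tx = parse_genepred_line(row)
for start, end in tx.exons:
    print(tx.name, tx.chrom, start, end, tx.strand)
```

For a real file you would open the downloaded .txt.gz with gzip.open(path, "rt") and call the function on each line.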

Working out which files in the database directory belong to which track can be confusing, which is why I suggested using the Table browser to navigate the data first and going to Downloads only once you know what you want to ftp. The Table browser can show you the schema, data types, and table relationships (common keys), which are not available through other methods. You have to do these steps yourself. I cannot just tell you which table to use, since there are several to choose from and the content for each track is different. Some tables are also labeled with UCSC identifiers that you will need to link to other tables (the kgXref and kgAlias tables I suggested) to obtain the gene name, gene symbol, and other common names. The external identifier data is normalized in specific ways to make it useful (that is, so it can be joined with other tables) for as many tracks as possible.

Once you have done this, the tables you end up with have the same names as the files you should ftp from Downloads. I explained below that the table name and file name are the same, with an added .txt.gz for the data and an added .sql for the MySQL formatted schema. You probably only need to ftp the .txt.gz file.

Good luck,
Jennifer

------------------------------------------------
Jennifer Jackson
UCSC Genome Bioinformatics Group

----- "Peng Yu" <[email protected]> wrote:

> From: "Peng Yu" <[email protected]>
> To: [email protected]
> Sent: Monday, November 30, 2009 12:31:03 PM GMT -08:00 US/Canada Pacific
> Subject: Re: [Genome] How to download the exon regions (start and end positions) of all genes? (mouse)
>
> On Mon, Nov 30, 2009 at 12:22 PM, Jennifer Jackson <[email protected]> wrote:
> > Hello,
> >
> > There are a few choices: ftp files from Downloads, extract file (table) using the Table browser, or output the file (table) using the public mySQL server.
> >
> > For most gene tracks, the primary table is in genePred format. This includes the coordinates of exons and notes the utr and cds regions.
> > http://genome.ucsc.edu/FAQ/FAQformat#format9
> >
> > Downloads:
> > The name of a table in the database is the same as a file on the Downloads server (with an added .txt.gz for the data and an added .sql for the mySQL schema. All database files on the Downloads server are located by following the links Downloads -> common name -> genomic assembly version -> annotation database directory.
> > http://genome.ucsc.edu/FAQ/FAQdownloads#download1
>
> I'm on the following page. There are many files. Which file I should download to get the exon region information?
> http://hgdownload.cse.ucsc.edu/goldenPath/mm9/database/
>
> > Table browser:
> > Probably the best option. Open the tool, set the controls to the mouse assembly of interest, then track group "Gene and Gene Predictions" and then the gene track of interest (UCSC Genes or RefSeq or another of your choice). Then, leaving the primary table (the genePred table) selected, click on the view schema button. The page has three sections - the table/file schema, a list of associated tables along with the linking keys, and sample data or the track description (same description found by clicking on the track name from the Assembly browser graphic page view). With this option, you can link in related tables, customize the format of the output, save results back into the browser as a custom track, and other functions.
> > Help with examples:
> > http://genome.ucsc.edu/goldenPath/help/hgTablesHelp.html
> >
> > MySQL:
> > Instructions -> in the link download29 you had in your original email (below). Using the table browser to understand the table names, the schema, what linked tables are also a part of the track, and example data can be helpful before building a query if it is complex.
> >
> > Useful tables to link in alternate/gene names or symbols: kgXref, kgAlias
> >
> > To find out more about the output formats, go into the FAQ section and click into "Data file formats".
> >
> > Hopefully this will get you started,
> > Jennifer
> >
> > ------------------------------------------------
> > Jennifer Jackson
> > UCSC Genome Bioinformatics Group
> >
> > ----- "Peng Yu" <[email protected]> wrote:
> >
> >> From: "Peng Yu" <[email protected]>
> >> To: [email protected]
> >> Sent: Monday, November 30, 2009 9:04:30 AM GMT -08:00 US/Canada Pacific
> >> Subject: [Genome] How to download the exon regions (start and end positions) of all genes? (mouse)
> >>
> >> http://genome.ucsc.edu/FAQ/FAQdownloads#download29
> >> http://genome.ucsc.edu/cgi-bin/hgTables?org=Mouse
> >>
> >> I feel that I might be able to do the query by sql or tab browser to find the start and end positions of all the exons (mouse). But I'm not sure how to do it. Would you please give me some detailed instructions on this?
> >> _______________________________________________
> >> Genome maillist - [email protected]
> >> https://lists.soe.ucsc.edu/mailman/listinfo/genome
