Hello Ernando,

Schema information starts with the track description. To read about a track's methods, open the Genome Browser and click on the track name. The description page has a "view schema" link for the primary table. On the schema page the table schema is defined, linked tables are listed, and the track description is shown again.
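To show how the field names on a schema page line up with a downloaded file, here is a minimal sketch in Python. It assumes a tab-separated dump whose columns are the ones quoted in the question further down this thread. The function name read_table and the two sample rows are made up for illustration and are not real alignment data.

```python
import csv
import io

# Column names copied from the header line quoted in the question below.
TABLE_COLUMNS = ["bin", "score", "tName", "tSize", "tStart", "tEnd",
                 "qName", "qSize", "qStrand", "qStart", "qEnd", "id"]


def read_table(handle, columns=TABLE_COLUMNS):
    """Yield one dict per data row, skipping blank lines and '#' header lines."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row or row[0].startswith("#"):
            continue
        if len(row) != len(columns):
            raise ValueError(f"expected {len(columns)} fields, got {len(row)}")
        yield dict(zip(columns, row))


# Hypothetical sample: a header line plus two invented rows.
SAMPLE = (
    "#" + "\t".join(TABLE_COLUMNS) + "\n"
    "585\t5000\tchr1\t1000000\t100\t900\tchr4\t2000000\t+\t300\t1100\t1\n"
    "585\t3000\tchr1\t1000000\t2000\t2500\tchr4\t2000000\t-\t5000\t5500\t2\n"
)

rows = list(read_table(io.StringIO(SAMPLE)))
for row in rows:
    # Fields are kept as strings; convert only the ones you need.
    span = int(row["tEnd"]) - int(row["tStart"])
    print(row["tName"], row["qName"], row["qStrand"], span)
```

For a real file, pass an open file handle instead of the StringIO object. If the table you download has a different layout, replace TABLE_COLUMNS with the field list shown on that table's schema page.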
File names are the same as table names. First, use the Table browser to find the table associated with your track (and the fields that join it to related tables). Once you understand the layout, you can download the text files, use mySQL, or stay in the Table browser.

The format is defined in several places:
1) in the Table browser, by clicking on the "describe table schema" button
2) http://genome.ucsc.edu/FAQ/FAQformat.html
3) in mySQL, by typing "desc <table_name>"
4) in the files named .sql on the Downloads server

For the Net and Chain tracks, the comparisons are genome to genome (2 genomes). For the Conservation track, the comparisons are across multiple genomes (> 40). The methods are different.

For your last question, you can use text files from the Downloads server, or the Table browser along with Galaxy.

1) Flat files
ftp the gene track file and the alignment file of interest. Then use your own tools, or some from the kent source, to filter the data.
http://genomewiki.cse.ucsc.edu/index.php/Kent_source_utilities
http://genome.ucsc.edu/FAQ/FAQdownloads.html#download1
http://genome.ucsc.edu/FAQ/FAQdownloads.html#download27
http://hgdownload.cse.ucsc.edu/admin/exe/

2) Table browser and Galaxy
Create a custom track that contains only the protein-coding regions of your genes of interest (by filtering UCSC Genes and saving the result as a custom track). Then send that custom track and either the chain/net or the conservation track over to Galaxy (unless it is already saved there in the library). Once there, perform an interval intersection to slice out the alignment data that corresponds to the coordinates of your genes.
http://genome.ucsc.edu/goldenPath/help/hgTracksHelp.html#TableBrowser

Galaxy help is available at their web site.

Thanks,
Jennifer

---------------------------------
Jennifer Jackson
UCSC Genome Informatics Group
http://genome.ucsc.edu/

On 4/19/10 7:36 AM, Ernando Faddeev wrote:
> Hello Jennifer. Thank you for the fast response.
> I have been looking at the options you gave me and I feel I am a bit
> further along, though not quite there yet.
>
> 1) Table browser option. The human net track seems to align the whole
> human genome with homologous lab mouse nucleotide sequences. After
> selecting that table I got a 279 MB tab-separated file with these headers:
>
> #bin score tName tSize tStart tEnd qName qSize qStrand qStart qEnd id
>
> I can guess what some of the columns are, but is there a description of
> those names so that I can be sure I got it right?
>
> With the genomes aligned at the nucleotide level, how can I extract only
> the protein-coding regions that correspond to the genes that interest me?
> (In the end I only need aligned mouse and human protein sequences of the
> genes involved in DNA repair, let's say those from the Gene Ontology
> database that show up after a search on the terms "DNA"+"repair" in the
> human genome.)
>
> 2) MySQL option. I connected to the public server and found a lot of
> databases and tables, but I didn't find any ER diagrams or field
> descriptions on the site. Which databases should I look at, and where can
> I find the field descriptions?
>
> Greetings,
> Hernando Sanchez
>
> On Fri, Apr 16, 2010 at 7:20 PM, Jennifer Jackson <[email protected]> wrote:
>
>     Hello Hernando,
>
>     There are a few options for you:
>
>     1) Use the Table browser for the batch query, possibly in
>     combination with Galaxy to perform full intersections. The
>     intermediate mapping tables could be the Conservation track or the
>     Chain/Net tracks between Human and Mouse.
>
>     2) Download the text files representing the tables in the database
>     for the datasets in #1, then create scripts to process your query.
>     Tools from the UCSC utility set may be helpful.
>
>     3) Use the public mySQL server to gain direct access to the
>     database and use mySQL, utilities from our source tree, your own
>     tools, etc. to process the query.
>
>     4) Create a local mirror of the Browser and do the same as #3, but
>     locally in your own instance.
>
>     Do you have a preference? #4 would be the most private option, if
>     that is a concern for you, but would require the most up-front work
>     and may not be necessary.
>
>     Please write back and let us know your preference and we can send
>     full details about suggested tables & utilities, file download/ftp
>     help, and mirroring assistance.
>
>     Thank you,
>     Jennifer
>
>     ---------------------------------
>     Jennifer Jackson
>     UCSC Genome Informatics Group
>     http://genome.ucsc.edu/
>
>     On 4/16/10 8:10 AM, Ernando Faddeev wrote:
>
>         I want to compile a list of conserved protein-coding
>         transcripts (/genes) between mouse
>         (mm9 <http://genome.ucsc.edu/cgi-bin/hgGateway?db=mm9>) and
>         human (hg19) that are involved in DNA repair. Basically I want
>         a table with names next to the aligned sequences of the
>         transcripts. I have the names and accession numbers of around
>         300 human genes that I am interested in, and now I want to find
>         the sequences of their mouse homologs. GBrowser appears to do
>         so to some extent, though only gene by gene and graphically. I
>         have the computational skills required to install the database
>         and manage scripts, though I do not know where to start and
>         what tools to use, hence the question: what tools can I use to
>         build this DB?
>
>         Greetings,
>         Hernando Sanchez
>         _______________________________________________
>         Genome maillist - [email protected]
>         https://lists.soe.ucsc.edu/mailman/listinfo/genome
