Hello again,

To link data, use the Table browser:
http://genome.ucsc.edu/goldenPath/help/hgTablesHelp.html
For UCSC Genes, the key tables are listed in this prior response from today:
https://lists.soe.ucsc.edu/pipermail/genome/2009-June/019345.html

How to find other linked tables and understand their contents is covered in the answer to your earlier question.

Using the Table browser, navigate to the clade/genome/assembly/group/UCSC Genes track and, with the primary table selected (knownGene), set output to "selected fields from primary and related tables". Name the file so it will download (gzip recommended). Click on "get output" - this will take you to a page that looks very similar to the "describe table schema" link, but that lets you choose the tables/fields to include in the output file.

The data will merge as an outer join with respect to the primary table. This means that tables with completely normalized data (such as kgAlias) can produce a large amount of output in which lines are duplicated except for the two fields from that table. The output may even become too large to extract from the Table browser (unless you split the extraction into separate runs per chromosome). You may want to leave this data out of the link (along with other large normalized tables - examine the number of lines and the format in the schema) and instead download or ftp the file, manipulate it locally to create something more compact, then join it to the other data using an online tool like Galaxy or your own unix tools/methods.

Some experimentation will be necessary on your part to produce a file that meets your exact needs, as the data you mention is not readily available as a single file.

Gene descriptions are in the tables above. For the main track description, ftp the goldenPath/mm9/database/trackDb.txt.gz file from the Downloads server. The 18th column is the html version. This can also be retrieved from the Table browser (group=All Tables, database=mm9, table=trackDb). But it might be easiest to just cut and paste this from the browser window.
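As a rough illustration of the "make it more compact, then join" step described above, here is a minimal Python sketch. It assumes a tab-separated kgAlias download with two columns (gene ID, then alias) and a primary file whose first column is the same gene ID. The function names, the comma separator, and the placeholder IDs in the demo are my own choices for illustration, not part of any UCSC tool, so adjust them to the files you actually download.

```python
import csv
import io
from collections import defaultdict


def read_tsv(text):
    """Parse tab-separated text into a list of rows, skipping blank lines."""
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    return [row for row in reader if row]


def collapse_aliases(alias_rows):
    """Collapse (gene ID, alias) pairs into one comma-separated string per gene ID.

    Duplicate aliases for the same ID are dropped; first-seen order is kept.
    """
    grouped = defaultdict(list)
    for kg_id, alias in alias_rows:
        if alias not in grouped[kg_id]:
            grouped[kg_id].append(alias)
    return {kg_id: ",".join(aliases) for kg_id, aliases in grouped.items()}


def left_join(primary_rows, aliases_by_id, key_index=0):
    """Append the collapsed alias string to each primary row.

    Every primary row is kept exactly once; rows without aliases get "".
    """
    for row in primary_rows:
        yield list(row) + [aliases_by_id.get(row[key_index], "")]


if __name__ == "__main__":
    # Placeholder data only: geneA/geneB/geneC are hypothetical IDs.
    alias_text = "geneA\ta1\ngeneA\ta2\ngeneB\tb1\n"
    primary_text = "geneA\tchr1\ngeneC\tchr2\n"
    aliases = collapse_aliases(read_tsv(alias_text))
    for joined in left_join(read_tsv(primary_text), aliases):
        print("\t".join(joined))
```

Collapsing first keeps one output line per primary row, which avoids the line duplication that the Table browser's outer join produces for normalized tables.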
Jennifer Jackson
UCSC Genome Bioinformatics Group

Jennifer Jackson wrote:
> ----------------------------------
> Re-send of bounced message
> ----------------------------------
>
> Subject: USCS Known Gene Tables
> From: David L Klinkebiel <[email protected]>
> Date: Thu, 25 Jun 2009 15:57:54 -0500
> To: [email protected]
>
> You have a list of Mouse UCSC known genes on the website assembled in
> many individual pages. The annotation contains: Gene Symbol, Known Gene
> ID, mRNA, UniProt, RefSeq Protein, and description. How do I get this
> information in one table (text file). Thanks
>
> David Klinkebiel, Ph.D.
> Department of Biochemistry and Molecular Biology
> University of Nebraska Medical Center
> 985870 Nebraska Medical Center
> Omaha, NE 68198-5870
> Office Phone: 402-559-3842
> Lab Phone: 402-559-9303
> FAX: 402-559-6650
>
> ***The University of Nebraska Medical Center E-Mail Confidentiality
> Disclaimer***
> The information in this e-mail may be privileged and confidential, intended
> only for the use of the addressee(s) above.
> Any unauthorized use or disclosure of this information is prohibited. If you
> have received this email by mistake,
> please delete and immediately contact the sender.
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
