Hello Denis,

Many of the datasets in the "Genes and Gene Prediction Tracks" group would be appropriate sources. Any of them can be retrieved via the Table browser if the output is less than 100k lines. Otherwise, use our Downloads files, obtained via ftp, and parse them locally using your own tools.
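For the Downloads route, local parsing could look roughly like the Python sketch below. It is only an illustration, not an official UCSC tool. It assumes you have downloaded the tab-separated dumps of the knownGene and kgXref tables (the file names knownGene.txt.gz and kgXref.txt.gz are assumptions), that the transcript ID and strand are the first and third columns of knownGene, and that the gene symbol is the fifth column of kgXref. Please check the table schema for your assembly before relying on those positions. The function names are made up for this example.

```python
import gzip


def read_strands(known_gene_lines):
    """Map transcript ID -> strand from tab-separated knownGene rows.

    Assumes column 1 is the transcript ID ("name") and column 3 is the strand.
    """
    strands = {}
    for line in known_gene_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3:
            strands[fields[0]] = fields[2]
    return strands


def read_symbols(kg_xref_lines, symbol_col=4):
    """Map transcript ID -> gene symbol from tab-separated kgXref rows.

    symbol_col is the 0-based index of the gene symbol column (an assumption;
    verify it against the kgXref schema for your assembly).
    """
    symbols = {}
    for line in kg_xref_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) > symbol_col and fields[symbol_col]:
            symbols[fields[0]] = fields[symbol_col]
    return symbols


def gene_strand_pairs(strands, symbols):
    """Return sorted, unique (gene, strand) pairs.

    Transcripts without a gene symbol keep their transcript ID as the name.
    """
    pairs = {(symbols.get(tx, tx), strand) for tx, strand in strands.items()}
    return sorted(pairs)


def write_gene_strand(known_gene_path, kg_xref_path, out_path):
    """Read both gzipped table dumps and write a two-column gene/strand file."""
    with gzip.open(known_gene_path, "rt") as fh:
        strands = read_strands(fh)
    with gzip.open(kg_xref_path, "rt") as fh:
        symbols = read_symbols(fh)
    with open(out_path, "w") as out:
        for gene, strand in gene_strand_pairs(strands, symbols):
            out.write(f"{gene}\t{strand}\n")
```

Calling write_gene_strand("knownGene.txt.gz", "kgXref.txt.gz", "gene_strand.txt") would then write one gene/strand pair per line. A gene symbol that has transcripts on both strands will appear twice, once for each strand.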
The basic process is to bring up the track, link to associated tables to obtain alternate names if needed (RefSeq ID, UniProt, etc.), and output the result as a text file for download. Using "UCSC Genes" in hg18 as the example, the specific steps would be to:

1) Open the Table browser: http://genome.ucsc.edu/cgi-bin/hgTables (more help, tutorials, and guides are linked from the top paragraph)
2) Set the following:
   clade=mammal
   genome=human
   assembly=Mar.2006
   group=Genes and Gene Prediction Tracks
   track=UCSC Genes
   table=knownGene (default)
3) Click on "describe table schema" to view the table contents. Note that the "name" is actually a UCSC-formatted transcript ID. Strand is the third column. The next sections on the page list associated tables to review.**
4) Go back to the main form.
5) Make certain region=genome and all other filters are cleared.
6) Select output format=selected fields from primary & related tables
7) Name the file for download (gzip is recommended).
8) Submit with "get output". The next page will allow you to link to associated tables, limit output to specific fields (for example, gene symbol & strand), group transcripts by gene cluster, or limit output to canonical transcripts only.

**Tables to link in may include: kgXref, kgSpAlias, kgTxInfo, knownCanonical, knownIsoforms, plus others.

For older assemblies and non-human species, the process will be very similar, but the type and availability of some linked tables may be limited.

We hope this helps!

Jennifer Jackson
UCSC Genome Bioinformatics Group

Rybin, Denis V wrote:
> Hello,
> I work at Boston University on several genetics projects and would like
> to get genome wide strand information for human build 36. Basically I
> just want to have 2 columns of data: gene and strand. Could you please
> help me with this? Thank you very much.
>
> Denis V. Rybin, MS
> Research Data Analyst
> BUSPH Data Coordinating Center
> Biostatistics Consulting Group
> (617)414-3795
> [email protected]
>
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
