Hi Yuval,

You can use our Table Browser to obtain these sequences. Click on 
"Tables" in the blue bar at the top of the page. Choose your preferred 
cow assembly (it will default to the most recent assembly: bosTau4, Oct. 
2007) and then make the following additional selections:

group: Genes and Prediction Tracks
track: RefSeq Genes (Note: MGC Genes is equally useful; if you choose 
to use it instead, select mgcGenes as the table rather than refGene)
table: refGene
region: position; type "chr1" in the field
output format: sequence
output file: enter the name of the file that will be created
file type returned: plain text

Click "get output"
Select "genomic" & click "submit" (this step will be skipped if you use 
the mgcGenes table)
Select all of the following:
  Promoter/Upstream by <type in '1000'> bases
  5' UTR Exons
  CDS Exons
  3' UTR Exons
  Introns
  One FASTA record per gene.
  Exons in upper case, everything else in lower case.
Click "get sequence"

Repeat for each chromosome (there is too much data to do the entire 
genome at once).

In the output files, each gene will have a brief header line that starts 
with ">". The header is followed by the 1000 bases upstream of the TSS 
(in lower case) and then by the bases that make up the gene. The genes 
are ordered by position on the chromosome.

You will then need to parse the data to truncate each sequence 200 bases 
downstream of the TSS. Because only exons are in upper case, the first 
uppercase base in each record marks the TSS (the actual start of the gene).
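
As a rough sketch of that parsing step (not an official UCSC tool), 
something along the following lines in Python might work. It assumes the 
output was saved with the options listed above, so that the first 
uppercase base in each record is the TSS. The function names and the 
file names in the usage note are made up for illustration.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one first.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def tss_window(seq, upstream=1000, downstream=200):
    """Return the bases around the first uppercase base (assumed to be the TSS).

    Keeps up to `upstream` bases before the TSS and `downstream` bases
    starting at the TSS. Returns None if the record has no uppercase base.
    """
    tss = next((i for i, base in enumerate(seq) if base.isupper()), None)
    if tss is None:
        return None
    return seq[max(0, tss - upstream):tss + downstream]


def truncate_file(in_path, out_path, upstream=1000, downstream=200):
    """Write one truncated FASTA record per gene from in_path to out_path."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for header, seq in read_fasta(src):
            window = tss_window(seq, upstream, downstream)
            if window is not None:
                dst.write(header + "\n" + window + "\n")
```

For example, truncate_file("chr1.fa", "chr1.tss.fa") would process one 
chromosome's output file. Note that the 200 downstream bases are genomic, 
so they may include lowercase intron sequence if the first exon is short.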

Please don't hesitate to contact the mail list again if you require 
further assistance.

Katrina Learned
UCSC Genome Bioinformatics Group

Yuval Tabach wrote:
> Dear all
>
> I would like to ask how to get the sequence flanking all the
> transcription start sites (TSS) of the cow genes. I would like to get
> 1000 bp upstream and 200 bp downstream of the TSS. How can I get this
> sequence?
>
> Thanks 
>
>  
>
>  
>
> Yuval Tabach, Ph.D.
> Ruvkun Laboratory
> Department of Molecular Biology
> Massachusetts General Hospital
> Department of Genetics
> Harvard Medical School
>
>  
>
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>   
