Dear Arek and Arnaud,

I am cc'ing Rob Nash from SGD who has provided the information below regarding 
how to retrieve information on yeast 3' UTRs. Is this information sufficient or 
would you still need Rob and his colleagues to make their data available 
through the BioMart interface? If the latter is the case, would you mind 
initiating a discussion with SGD?

Rob's comment:

>>>"http://www.yeastgenome.org/cgi-bin/reference/reference.pl?dbid=S000129301
Yassour M, Kaplan T, Fraser HB, Levin JZ, Pfiffner J, Adiconis X, Schroth G, 
Luo S, Khrebtukova I, Gnirke A, Nusbaum C, Thompson DA, Friedman N, Regev A  
(2009) Ab initio construction of a eukaryotic transcriptome by massively 
parallel mRNA sequencing.
Proc Natl Acad Sci U S A 106(9):3264-9
PMID: 19208812

This data was generated using RNA-seq and an algorithm to construct a 
transcript catalog.

To look at the data go to GBrowse:

http://browse.yeastgenome.org/fgb2/gbrowse/scgenome/

and go to the 'Select Tracks' tab and under 'Gene Structure' click the 'All on' 
box next to 'UTRs', then click on the 'Back to Browser' button at the bottom to 
browse the data.

Normally, you would be able to download the sequence from within GBrowse by 
selecting the tracks, clicking the floppy disk icon and then selecting FASTA 
output, but currently that feature isn't working. In the interim there is a 
possible solution to this problem that a bioinformatics analyst within our 
group proposed you try. Here are the steps she suggested you follow to get the 
sequence:

1. Download the appropriate tracks from our download site in BED format 
(http://www.yeastgenome.org/download-data/published-datasets-directory)
2. Grep or otherwise filter for those entries corresponding to the 3'UTRs.
3. Go to UCSC and click on the table browser option 
(http://genome.ucsc.edu/cgi-bin/hgTables?org=human)
4. Select clade "other", genome "S. cerevisiae" and assembly "April 2011"
5. Upload the BED files individually by clicking "Add custom tracks"
6. Set the output format to "sequence" and click on "get output".

I hope this allows you to get the UTR sequences of interest."<<<
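
For step 2 above, a minimal Python sketch of the filtering might look like the
following. It is an illustration only, not SGD's procedure. It assumes that
3' UTR records can be recognized by a "3UTR"-style label in the BED name column
(4th field). The default pattern and the function names are hypothetical, so
please check them against the actual file before relying on the output.

```python
import re


def filter_utr_entries(bed_lines, name_pattern=r"(?<![0-9A-Za-z])3.?UTR"):
    """Return BED records whose name column (4th field) matches name_pattern.

    The default pattern is an assumption about how 3' UTR features are
    labeled (e.g. "3UTR", "3'UTR", "3_UTR"); adjust it to the real file.
    """
    matcher = re.compile(name_pattern, re.IGNORECASE)
    kept = []
    for line in bed_lines:
        line = line.rstrip("\n")
        # Skip blank lines and BED header/comment lines.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # no name column to test
        if matcher.search(fields[3]):
            kept.append(line)
    return kept


def utr_length(bed_line):
    """Length in bp of one BED record (BED starts are 0-based, ends exclusive)."""
    fields = bed_line.split("\t")
    return int(fields[2]) - int(fields[1])


# Made-up records, for illustration only.
example = [
    "track name=example",
    "chrI\t100\t250\tGENE_A_5UTR\t0\t+",
    "chrI\t900\t1004\tGENE_A_3UTR\t0\t+",
]
for record in filter_utr_entries(example):
    print(record, utr_length(record))
```

Passing an open file object instead of the example list should work the same
way, and the kept lines can be written to a new BED file for upload in step 5.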

Thanks,
Claudio

________________________________________
From: Arek Kasprzyk [[email protected]]
Sent: Tuesday, February 07, 2012 6:48 AM
To: Claudio Joazeiro
Cc: Arnaud Kerhornou; [email protected]; Paul Kersey
Subject: Re: [BioMart Users] Yeast 3'UTRs

Dear Claudio,

We at BioMart do not host any data ourselves. We rely on the instances of 
BioMart set up by third parties. It would probably be best to ask the SGD folks 
if they plan to make their data available through the BioMart interface in the 
future so those annotations could become available to the BioMart community. 
Failing that, perhaps Ensembl Genomes is planning to have those annotations? (I 
am cc'ing Paul Kersey, who may want to comment on that.)

a

On Thu, Feb 2, 2012 at 10:02 AM, Claudio Joazeiro 
<[email protected]> wrote:

Dear Arnaud,

Thank you for the prompt response. Is there interest on BioMart's part in having 
yeast UTR information to provide through its portal? If so, I am certain SGD 
can help since that annotation is available. I can help mediate an introduction 
if you would like.

Regarding the length of flanking sequences, I realize that I can select any 
number. The problem is that yeast 3' UTRs have variable lengths so the output 
for any given specified number would not be accurate for every gene.

Regards,
Claudio
________________________________________
From: Arnaud Kerhornou [[email protected]]
Sent: Thursday, February 02, 2012 3:03 AM
To: Claudio Joazeiro
Cc: [email protected]
Subject: Re: [BioMart Users] Yeast 3'UTRs

Hi Claudio,

There is no UTR information held in Ensembl for S. cerevisiae.
Our data come from SGD GFF3 flat files, and I don't think they contain
UTR information.

Re. the length of the flanking sequences, you can specify any length you
wish in the filter page.

Regards,
Arnaud

On 02/02/2012 05:32, Claudio Joazeiro wrote:
> To whom it may concern:
>
> We are having a problem with the 3' UTR setting of the BioMart Central Portal 
> interface. It feels like a bug in the underlying database, so we are hoping 
> you would be able to diagnose/fix it.
>
> We need to retrieve the 3' UTRs of yeast genes. I have tried to do this in a 
> couple of ways:
>
> TRIAL 1:
>
> DATABASE: Ensembl
> Datasets: S cerevisiae genes
> Sequences: 3' UTR
> Upstream Flank: (blank)
> Downstream Flank: (blank)
> Filters: (default)
> Header: Ensembl Gene ID and Associated Gene Name
>
> The result I got was: “Sequence unavailable” for all genes
>
> Then I attempted TRIAL 2:
>
> DATABASE: Ensembl
> Datasets: S cerevisiae genes
> Sequences: Flank-coding region (not ideal, though, as this is expected to 
> yield longer sequences than those we're looking for)
> Upstream Flank: (blank)
> Downstream Flank: 100 (this is also a problem because although the median 
> length of yeast 3' UTRs is 104 bp, they can be as long as ~1,000 bp, so we 
> would be missing sequences)
> Filters: (default)
> Header: Ensembl Gene ID and Associated Gene Name
>
> This works, but with the above caveats.
>
> Thanks in advance for your help.
>
> Sincerely,
>
> Claudio Joazeiro, Ph.D.
> Assistant Professor
> Department of Cell Biology
> The Scripps Research Institute
> CB-163
> 10550 N Torrey Pines Rd
> La Jolla, CA  92037
>
> Phone: (858) 784-7570
> Fax: (858) 784-9779
>
>
>
>
>
> _______________________________________________
> Users mailing list
> [email protected]
> https://lists.biomart.org/mailman/listinfo/users
>

--

Arek Kasprzyk, MD, MSc, PhD
BioMart Project Lead
