Dear Claudio,

Unless somebody else (Ensembl Genomes? I am cc'ing Paul Kersey here) is planning to take advantage of this data, the only way that BioMart can use it is if the SGD guys set up their own BioMart server. We, of course, will be happy to help with any technical issues should they decide to go ahead with that, but it should be their initiative.

a

On Wed, Feb 8, 2012 at 12:23 AM, Claudio Joazeiro <[email protected]> wrote:

> Dear Arek and Arnaud,
>
> I am cc'ing Rob Nash from SGD who has provided the information below regarding how to retrieve information on yeast 3' UTRs. Is this information sufficient or would you still need Rob and his colleagues to make their data available through the BioMart interface? If the latter is the case, would you mind initiating a discussion with SGD?
>
> Rob's comment:
>
> >>>"
> http://www.yeastgenome.org/cgi-bin/reference/reference.pl?dbid=S000129301
> Yassour M, Kaplan T, Fraser HB, Levin JZ, Pfiffner J, Adiconis X, Schroth G, Luo S, Khrebtukova I, Gnirke A, Nusbaum C, Thompson DA, Friedman N, Regev A (2009) Ab initio construction of a eukaryotic transcriptome by massively parallel mRNA sequencing.
> Proc Natl Acad Sci U S A 106(9):3264-9
> PMID: 19208812
>
> This data was generated using RNA-seq and an algorithm to construct a transcript catalog.
>
> To look at the data go to GBrowse:
>
> http://browse.yeastgenome.org/fgb2/gbrowse/scgenome/
>
> and go to the 'Select Tracks' tab and under 'Gene Structure' click the 'All on' box next to 'UTRs', then click on the 'Back to Browser' button at the bottom to browse the data.
>
> Normally, you would be able to download the sequence from within GBrowse by selecting the tracks, clicking the floppy disk icon and then selecting FASTA output, but currently that feature isn't working. In the interim there is a possible solution to this problem that a bioinformatics analyst within our group proposed you try. Here are the steps she suggested you follow to get the sequence:
>
> 1. Download the appropriate tracks from our download site in BED format (http://www.yeastgenome.org/download-data/published-datasets-directory)
> 2. Grep or otherwise filter for those entries corresponding to the 3'UTRs.
> 3. Go to UCSC and click on the table browser option (http://genome.ucsc.edu/cgi-bin/hgTables?org=human)
> 4. Select clade "other", genome "S. cerevisiae" and assembly "April 2011"
> 5. Upload the BED files individually by clicking "Add custom tracks"
> 6. Select the output format to sequence and click on get output.
>
> I hope this allows you to get the UTR sequences of interest."<<<
>
> Thanks,
> Claudio
>
> ________________________________________
> From: Arek Kasprzyk [[email protected]]
> Sent: Tuesday, February 07, 2012 6:48 AM
> To: Claudio Joazeiro
> Cc: Arnaud Kerhornou; [email protected]; Paul Kersey
> Subject: Re: [BioMart Users] Yeast 3'UTRs
>
> Dear Claudio,
>
> We at BioMart do not host any data ourselves. We rely on the instances of BioMart set up by third parties. It would probably be best to ask SGD folks if they plan to make their data available through the BioMart interface in the future so those annotations could become available to the BioMart community. Failing that, perhaps Ensembl Genomes is planning to have those annotations? (I am cc'ing Paul Kersey who may want to comment on that)
>
> a
>
> On Thu, Feb 2, 2012 at 10:02 AM, Claudio Joazeiro <[email protected]> wrote:
>
> Dear Arnaud,
>
> Thank you for the prompt response. Is there interest on BioMart's part to have yeast UTR information to provide through its portal? If so, I am certain SGD can help since that annotation is available. I can help mediate an introduction if you would like.
>
> Regarding the length of flanking sequences, I realize that I can select any number. The problem is that yeast 3' UTRs have variable lengths so the output for any given specified number would not be accurate for every gene.
>
> Regards,
> Claudio
> ________________________________________
> From: Arnaud Kerhornou [[email protected]]
> Sent: Thursday, February 02, 2012 3:03 AM
> To: Claudio Joazeiro
> Cc: [email protected]
> Subject: Re: [BioMart Users] Yeast 3'UTRs
>
> Hi Claudio,
>
> There is no UTR information held in Ensembl for S. cerevisiae.
> Our data come from SGD GFF3 flat files, and I don't think they contain UTR information.
>
> Re. the length of the flanking sequences, you can specify any length you wish in the filter page.
>
> Regards,
> Arnaud
>
> On 02/02/2012 05:32, Claudio Joazeiro wrote:
> > To whom it may concern:
> >
> > We are having a problem with the 3' UTR setting of the BioMart Central Portal interface. It feels like a bug in the underlying database, so we are hoping you would be able to diagnose/fix it.
> >
> > We need to retrieve the 3' UTRs of yeast genes. I have tried to do this in a couple of ways:
> >
> > TRIAL 1:
> >
> > DATABASE: Ensembl
> > Datasets: S cerevisiae genes
> > Sequences: 3' UTR
> > Upstream Flank: (blank)
> > Downstream Flank: (blank)
> > Filters: (default)
> > Header: Ensembl Gene ID and Associated Gene Name
> >
> > The result I got was: “Sequence unavailable” for all genes
> >
> > Then I attempted TRIAL 2:
> >
> > DATABASE: Ensembl
> > Datasets: S cerevisiae genes
> > Sequences: Flank-coding region (not ideal, though, as this is expected to yield longer sequences than those we're looking for)
> > Upstream Flank: (blank)
> > Downstream Flank: 100 (this is also a problem because although the median length of yeast 3' UTRs is 104 bp, they can be as long as ~1,000 bp, so we would be missing sequences)
> > Filters: (default)
> > Header: Ensembl Gene ID and Associated Gene Name
> >
> > This works, but with the above caveats.
> >
> > Thanks in advance for your help.
> >
> > Sincerely,
> >
> > Claudio Joazeiro, Ph.D.
> > Assistant Professor
> > Department of Cell Biology
> > The Scripps Research Institute
> > CB-163
> > 10550 N Torrey Pines Rd
> > La Jolla, CA 92037
> >
> > Phone: (858) 784-7570
> > Fax: (858) 784-9779
> >
> > _______________________________________________
> > Users mailing list
> > [email protected]
> > https://lists.biomart.org/mailman/listinfo/users
>
> _______________________________________________
> Users mailing list
> [email protected]
> https://lists.biomart.org/mailman/listinfo/users
>
> --
>
> Arek Kasprzyk, MD, MSc, PhD
> BioMart Project Lead

--
Arek Kasprzyk, MD, MSc, PhD
BioMart Project Lead
_______________________________________________
Users mailing list
[email protected]
https://lists.biomart.org/mailman/listinfo/users
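
Step 2 of Rob's instructions quoted above (filtering the downloaded BED file for 3' UTR entries) could be sketched roughly as follows in Python. This is only a sketch under assumptions: it supposes that 3' UTR records can be recognised by a substring in the BED name column (the label `3UTR` is a guess, not something the SGD files are confirmed to use), and the sample records and function names are made up for illustration. It also counts how many UTRs are longer than a fixed downstream flank, which is the caveat raised for TRIAL 2.

```python
import io

# Hypothetical sample records; real SGD BED files may label entries differently.
SAMPLE_BED = (
    "track name=UTRs\n"
    "chrI\t100\t250\tYAL001C_3UTR\t0\t-\n"
    "chrI\t300\t340\tYAL001C_5UTR\t0\t-\n"
    "chrII\t1000\t1080\tYBL001C_3UTR\t0\t+\n"
)


def filter_three_prime_utrs(lines, label="3UTR"):
    """Return BED records whose name column (4th field) contains `label`.

    `label` is an assumed naming convention; adjust it to the actual file.
    """
    records = []
    for line in lines:
        line = line.rstrip("\r\n")
        # Skip blank lines and BED header/comment lines.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        fields = line.split("\t")
        if len(fields) >= 4 and label.lower() in fields[3].lower():
            records.append(fields)
    return records


def utr_length(fields):
    """BED coordinates are 0-based with an exclusive end, so length = end - start."""
    return int(fields[2]) - int(fields[1])


def count_longer_than(records, flank=100):
    """Count UTRs that a fixed downstream flank of `flank` bp would truncate."""
    return sum(1 for rec in records if utr_length(rec) > flank)


if __name__ == "__main__":
    utrs = filter_three_prime_utrs(io.StringIO(SAMPLE_BED))
    for rec in utrs:
        print("\t".join(rec))
    print(count_longer_than(utrs), "of", len(utrs), "UTRs exceed 100 bp")
```

The filtered records can then be written back out as a BED file and uploaded to the UCSC Table Browser as described in steps 3 to 6.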
