Hi Alexandre
On Mon, 2008-07-28 at 09:39 +0200, Alexandre Gattiker wrote:
> Hello Syed,
>
> Thank you very much -- indeed, replacing MartURLLocation with
> MartDBLocation did solve the problem completely. So it seems indeed that
> the warnings are not propagated over the web service. However, it does
> not explain why, when I use the web service and request a flank
> sequence, specifying a correct upstream_flank value, I still get a blank
> screen, although the same request yields sequences when using the DB
> service.
>
> Do you recommend using the BioMart web or database service for a
> production server? I had favored the web approach to beat firewalls, and
> performance is not too much of an issue.

I would certainly recommend BioMart web service access over the direct DB
(MartDBLocation) connection; however, in this particular case there is
some sort of bug that is causing trouble. Otherwise, the warnings do get
propagated just the same as with a direct DB (MartDBLocation) connection.

>
> As for the question about TSS +/- 2kb: I thought that the TSS, the
> transcript start position, and the first exon start position were all
> the same value (trans-splicing making that picture more complicated). So
> my question was whether you might consider developing the option to
> specify an upstream and downstream "flank" relative to the transcript
> start position.

Apologies, I took it too far. Sounds like a reasonable request; it is on
the requirements list now. For now, you may retrieve the upstream
sequence + Unspliced Transcript, then read the number of downstream bases
you are interested in from the transcript start position.

cheers
syed

> Currently, if I choose to retrieve the flank, I cannot
> have that flank extend into the transcript.
>
> Best,
> Alexandre
>
> Syed Haider wrote:
> > Hi Alexandre,
> >
> > On Fri, 2008-07-25 at 09:05 +0200, Alexandre Gattiker wrote:
> >
> >> Thank you very much, that (mostly) solved it!
> >>
> >> I can now retrieve transcript/gene/etc. sequences. But if I select one
> >> of the four Flank options, I get a blank result and a warning in the log:
> >> Use of uninitialized value in concatenation (.) or string at
> >> biomart-perl/lib/BioMart/Web.pm line 2446
> >>
> >
> > Ok, this is probably because the warning message isnt getting propagated
> > correctly over the webservice. Can you configure your apache using
> > <MartDBLocation >.... </MartDBLocation> type connection params in your
> > registry instead of <MartURLLocation>. I sent you these in my first
> > email yesterday. That will connect you to ensembl public databases
> > directly instead of going via www.biomart.org. I am hoping this would
> > resolve the flanks problem.
> >
> >
> >
> >> Running the same query at biomart.org however yields a screen message
> >> and alert box:
> >> Validation Error: Requests for flank sequence must be accompanied by an
> >> upstream_flank or downstream_flank request
> >>
> >> <http://www.biomart.org/biomart/martview?VIRTUALSCHEMANAME=default&ATTRIBUTES=mmusculus_gene_ensembl.default.sequences.gene_stable_id|mmusculus_gene_ensembl.default.sequences.str_chrom_name|mmusculus_gene_ensembl.default.sequences.struct_biotype|mmusculus_gene_ensembl.default.sequences.coding_gene_flank&FILTERS=mmusculus_gene_ensembl.default.filters.ensembl_gene_id."ENSMUSG00000055866"&VISIBLEPANEL=resultspanel>
> >>
> >>
> >> Then, if I do select an upstream_flank, I still get a blank page, while
> >> the query works at biomart.org:
> >> <http://www.biomart.org/biomart/martview?VIRTUALSCHEMANAME=default&ATTRIBUTES=mmusculus_gene_ensembl.default.sequences.gene_stable_id|mmusculus_gene_ensembl.default.sequences.str_chrom_name|mmusculus_gene_ensembl.default.sequences.struct_biotype|mmusculus_gene_ensembl.default.sequences.coding_gene_flank|mmusculus_gene_ensembl.default.sequences.upstream_flank."10"&FILTERS=mmusculus_gene_ensembl.default.filters.ensembl_gene_id."ENSMUSG00000055866"&VISIBLEPANEL=resultspanel>
> >>
> >> NB I'm using biomart embedded into Ensembl.
> >>
> >>
> >> Another issue:
> >> I'm trying to fetch the sequence around the transcription start site
> >> (e.g. -2 kb to +2 kb) for promoter analysis. Is there a way to do that?
> >>
> >
> > You can only retrieve the +/- seqs w.r.t the start of first exon. TSS's
> > coordinates I guess are not available in ensembl database. cc'ed Glenn
> > to confirm.
> >
> > cheers
> > syed
> >
> >
> >
> >> Best regards
> >> Alexandre
> >>
> >>
> >> Syed Haider wrote:
> >>
> >>> an even better solution, just add this to your existing registry:
> >>>
> >>> <MartURLLocation
> >>>     name = "sequence"
> >>>     displayName = "Sequence (release 49)"
> >>>     host = "www.biomart.org"
> >>>     port = "80"
> >>>     visible = ""
> >>>     default = ""
> >>>     includeDatasets = "mmusculus_genomic_sequence"
> >>>     martUser = ""
> >>> />
> >>>
> >>>
> >>> On Thu, 2008-07-24 at 18:22 +0200, Alexandre Gattiker wrote:
> >>>
> >>>
> >>>> Hello,
> >>>>
> >>>> Kudos for this great piece of software. I managed to whip up a very
> >>>> functional biomart by mashing up some lab data with the Ensembl biomart,
> >>>> almost accidentally, as I didn't even expect that to be possible! It's
> >>>> rare enough that software works even better than advertised and in such
> >>>> a modular way.
> >>>>
> >>>> I have a small issue, however. When I go to the Attributes -> Sequences
> >>>> page, the SEQUENCES section has:
> >>>>
> >>>> No visible attributes in collection seq_scope_type
> >>>> No visible attributes in collection upstream
> >>>> No visible attributes in collection downstream
> >>>>
> >>>> I assume that's linked to warnings I get running configure.pl:
> >>>>
> >>>> Setting possible links between datasets
> >>>> ....(scanning) 33% WARNING: Pointer attributes from
> >>>> mmusculus_genomic_sequence will not be available as
> >>>> mmusculus_genomic_sequence not in registry
> >>>> WARNING: Pointer attributes from mmusculus_genomic_sequence will
> >>>> not be available as mmusculus_genomic_sequence not in registry
> >>>> WARNING: Pointer attributes from mmusculus_genomic_sequence will
> >>>> not be available as mmusculus_genomic_sequence not in registry
> >>>>
> >>>> My config is as follows. I have biomart 0.7.
> >>>>
> >>>> <MartURLLocation
> >>>>     name = "ensembl"
> >>>>     displayName = "Ensembl Genes (release 49)"
> >>>>     host = "www.biomart.org"
> >>>>     port = "80"
> >>>>     visible = "1"
> >>>>     default = ""
> >>>>     includeDatasets = "mmusculus_gene_ensembl"
> >>>>     martUser = ""
> >>>> />
> >>>>
> >>>> I tried
> >>>> includeDatasets = "mmusculus_gene_ensembl,mmusculus_genomic_sequence"
> >>>> but that didn't solve the problem. I also tried to leave includeDatasets
> >>>> empty but I still get the warning (now for all species).
> >>>>
> >>>> Best,
> >>>> Alexandre
> >>>>
> >>>>
> >>>>
> >>> --
> >>> ======================================
> >>> Syed Haider.
> >>> EMBL-European Bioinformatics Institute
> >>> Wellcome Trust Genome Campus, Hinxton,
> >>> Cambridge CB10 1SD, UK.
> >>> ======================================
> >>>
> >>>
> >>>
> >>>
> >> Richard Holland <[EMAIL PROTECTED]>
> >>
> >>
> >>
> > --
> > ======================================
> > Syed Haider.
> > EMBL-European Bioinformatics Institute
> > Wellcome Trust Genome Campus, Hinxton,
> > Cambridge CB10 1SD, UK.
> > ======================================
> >
> >
> >
--
======================================
Syed Haider.
EMBL-European Bioinformatics Institute
Wellcome Trust Genome Campus, Hinxton,
Cambridge CB10 1SD, UK.
======================================
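
PS: The interim workaround mentioned in the reply (upstream sequence + Unspliced Transcript, then counting downstream bases from the transcript start) could be post-processed roughly as below. This is only a minimal sketch. It assumes the retrieved sequence is a single string with the upstream flank immediately followed by the unspliced transcript, both read 5'->3' on the transcript strand. The function name tss_window and its parameters are hypothetical and not part of BioMart.

```python
def tss_window(flank_plus_transcript: str, flank_len: int,
               upstream: int, downstream: int) -> str:
    """Cut a window around the transcript start out of one sequence.

    Assumption (not confirmed by BioMart docs here): the input is the
    upstream flank of length `flank_len` followed directly by the
    unspliced transcript, so the transcript start sits at index
    `flank_len` (0-based).
    """
    if flank_len < 0 or upstream < 0 or downstream < 0:
        raise ValueError("lengths must be non-negative")
    if flank_len > len(flank_plus_transcript):
        raise ValueError("flank_len is longer than the sequence")
    if upstream > flank_len:
        raise ValueError("requested more upstream bases than the flank holds")

    start = flank_len - upstream
    # If the transcript is shorter than `downstream`, return what exists.
    end = min(flank_len + downstream, len(flank_plus_transcript))
    return flank_plus_transcript[start:end]


if __name__ == "__main__":
    # Toy data: 5 bases of flank ("AAAAA") followed by the transcript.
    seq = "AAAAA" + "CCGGTT"
    print(tss_window(seq, flank_len=5, upstream=2, downstream=3))
```

For a -2 kb to +2 kb promoter window one would request an upstream flank of 2000 and call the function with flank_len=2000, upstream=2000, downstream=2000.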
