Hello Syed,

Thank you very much. Replacing MartURLLocation with MartDBLocation did indeed solve the problem completely, so it does seem that the warnings are not propagated over the web service. That does not explain, however, why a flank sequence request made through the web service with a valid upstream_flank value still gives a blank screen, while the same request returns sequences through the DB service.

Do you recommend the BioMart web service or the database service for a production server? I had favored the web approach because it gets through firewalls, and performance is not much of an issue for us.

As for the question about TSS +/- 2kb: I thought that the TSS, the transcript start position, and the first exon start position were all the same value (trans-splicing making that picture more complicated). So my question was whether you might consider developing the option to specify an upstream and downstream "flank" relative to the transcript start position. Currently, if I choose to retrieve the flank, I cannot have that flank extend into the transcript.
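
To make the request concrete, here is a minimal sketch of the window in question. It assumes Ensembl-style coordinates (1-based, start <= end whatever the strand, strand given as 1 or -1) and takes the TSS to be the 5' end of the transcript. The function name and its parameters are hypothetical and are not part of BioMart.

```python
def tss_window(transcript_start, transcript_end, strand,
               upstream=2000, downstream=2000):
    """Return (start, end) genomic coordinates of a window around the TSS.

    Assumes start <= end regardless of strand and strand in {1, -1}.
    The TSS is taken as the 5' end of the transcript (an assumption).
    """
    if strand not in (1, -1):
        raise ValueError("strand must be 1 or -1")
    if strand == 1:
        # Forward strand: the TSS is the lowest coordinate of the transcript.
        tss = transcript_start
        start, end = tss - upstream, tss + downstream
    else:
        # Reverse strand: the TSS is the highest coordinate, and
        # "upstream" means towards larger genomic coordinates.
        tss = transcript_end
        start, end = tss - downstream, tss + upstream
    # Clamp at the start of the chromosome.
    return max(1, start), end


print(tss_window(10000, 15000, 1))
print(tss_window(10000, 15000, -1))
```

On the reverse strand the window is anchored on the transcript end coordinate, which is why a flank that cannot extend into the transcript is not enough for this use case.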

Best,
Alexandre

Syed Haider wrote:
Hi Alexandre,

On Fri, 2008-07-25 at 09:05 +0200, Alexandre Gattiker wrote:
Thank you very much, that (mostly) solved it!

I can now retrieve transcript/gene/etc. sequences. But if I select one
of the four Flank options, I get a blank result and a warning in the log:
Use of uninitialized value in concatenation (.) or string at
biomart-perl/lib/BioMart/Web.pm line 2446

Ok, this is probably because the warning message isn't getting propagated
correctly over the web service. Can you configure your Apache using
<MartDBLocation> ... </MartDBLocation> type connection params in your
registry instead of <MartURLLocation>? I sent you these in my first
email yesterday. That will connect you to the Ensembl public databases
directly instead of going via www.biomart.org. I am hoping this will
resolve the flanks problem.


Running the same query at biomart.org however yields a screen message
and alert box:
Validation Error: Requests for flank sequence must be accompanied by an
upstream_flank or downstream_flank request

<http://www.biomart.org/biomart/martview?VIRTUALSCHEMANAME=default&ATTRIBUTES=mmusculus_gene_ensembl.default.sequences.gene_stable_id|mmusculus_gene_ensembl.default.sequences.str_chrom_name|mmusculus_gene_ensembl.default.sequences.struct_biotype|mmusculus_gene_ensembl.default.sequences.coding_gene_flank&FILTERS=mmusculus_gene_ensembl.default.filters.ensembl_gene_id."ENSMUSG00000055866"&VISIBLEPANEL=resultspanel>


Then, if I do select an upstream_flank, I still get a blank page, while the 
query works at biomart.org:
<http://www.biomart.org/biomart/martview?VIRTUALSCHEMANAME=default&ATTRIBUTES=mmusculus_gene_ensembl.default.sequences.gene_stable_id|mmusculus_gene_ensembl.default.sequences.str_chrom_name|mmusculus_gene_ensembl.default.sequences.struct_biotype|mmusculus_gene_ensembl.default.sequences.coding_gene_flank|mmusculus_gene_ensembl.default.sequences.upstream_flank."10"&FILTERS=mmusculus_gene_ensembl.default.filters.ensembl_gene_id."ENSMUSG00000055866"&VISIBLEPANEL=resultspanel>

NB I'm using biomart embedded into Ensembl.


Another issue:
I'm trying to fetch the sequence around the transcription start site
(e.g. -2 kb to +2 kb) for promoter analysis. Is there a way to do that?

You can only retrieve the +/- sequences relative to the start of the
first exon. As far as I know, TSS coordinates are not available in the
Ensembl database. I have cc'ed Glenn to confirm.

cheers
syed


Best regards
Alexandre


Syed Haider wrote:
An even better solution: just add this to your existing registry:

        <MartURLLocation
            name         = "sequence"
            displayName  = "Sequence (release 49)"
            host         = "www.biomart.org"
            port         = "80"
            visible      = ""
            default      = ""
            includeDatasets = "mmusculus_genomic_sequence"
            martUser     = ""
        />


On Thu, 2008-07-24 at 18:22 +0200, Alexandre Gattiker wrote:

Hello,

Kudos for this great piece of software. I managed to whip up a very
functional biomart by mashing up some lab data with the Ensembl biomart,
almost accidentally, as I didn't even expect that to be possible! It's
rare enough that software works even better than advertised and in such
a modular way.

I have a small issue, however. When I go to the Attributes -> Sequences
page, the SEQUENCES section has:

No visible attributes in collection seq_scope_type
No visible attributes in collection upstream
No visible attributes in collection downstream

I assume that's linked to warnings I get running configure.pl:

Setting possible links between datasets
....(scanning) 33%      WARNING:  Pointer attributes from
mmusculus_genomic_sequence will not be available as
mmusculus_genomic_sequence not in registry
      WARNING:  Pointer attributes from mmusculus_genomic_sequence will
not be available as mmusculus_genomic_sequence not in registry
      WARNING:  Pointer attributes from mmusculus_genomic_sequence will
not be available as mmusculus_genomic_sequence not in registry

My config is as follows. I have biomart 0.7.

        <MartURLLocation
            name         = "ensembl"
            displayName  = "Ensembl Genes (release 49)"
            host         = "www.biomart.org"
            port         = "80"
            visible      = "1"
            default      = ""
            includeDatasets = "mmusculus_gene_ensembl"
            martUser     = ""
        />

I tried
includeDatasets = "mmusculus_gene_ensembl,mmusculus_genomic_sequence"
but that didn't solve the problem. I also tried to leave includeDatasets
empty but I still get the warning (now for all species).

Best,
Alexandre


--
======================================
Syed Haider.
EMBL-European Bioinformatics Institute
Wellcome Trust Genome Campus, Hinxton,
Cambridge CB10 1SD, UK.
======================================



--
--------------------------------------------------------
Alexandre Gattiker   Bioinformatics & Biostatistics Core Facility
EPFL School of Life Sciences / Faculté des Sciences de la vie FSV
http://people.epfl.ch/Alexandre.Gattiker
