Thanks a lot, Steffen, for the details. Hopefully things will begin to fall into place in the next few months.

Best,
Syed


Steffen Durinck wrote:
Hi All,

Since December, biomaRt no longer parses the XML; it gets attribute
and filter names through requests to the web service.  This update has
been available to our users as a developmental package. Bioconductor
publishes a new release every 6 months, and unfortunately there has
not been a new release since last October.  By the end of this month,
however, a new release of Bioconductor comes out which includes the
updated version of biomaRt.

There is thus no need to change attribute/filter names.  Ask users
to report their biomaRt version number by running the R command
sessionInfo()
They'll see the following:

sessionInfo()
R version 2.8.0 (2008-10-20)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] biomaRt_1.99.9

loaded via a namespace (and not attached):
[1] RCurl_0.91-0 XML_1.98-1


If the biomaRt version is not higher than 1.99.0, they need to update
to the developmental version of biomaRt, which can be downloaded
here:

http://bioconductor.org/packages/2.4/bioc/html/biomaRt.html

By the end of this month this will be the release version of biomaRt
and it will be installed by default, so hopefully there will be less
confusion.
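
The version check described above can be sketched in R. This is only an illustration: the helper name needs_biomart_update is hypothetical (it is not part of biomaRt), the threshold 1.99.0 is the one quoted above, and "1.16.0" is a made-up older version used purely as an example.

```r
# Hypothetical helper, not part of biomaRt: returns TRUE when the given
# version string is not higher than the threshold, i.e. an update is needed.
needs_biomart_update <- function(installed, threshold = "1.99.0") {
  # compareVersion() returns -1, 0 or 1 for lower, equal or higher
  utils::compareVersion(installed, threshold) <= 0
}

needs_biomart_update("1.99.9")  # version shown in the sessionInfo() output above
needs_biomart_update("1.16.0")  # made-up older version, for illustration only

# With biomaRt installed, the version string could be read with:
# installed <- utils::packageDescription("biomaRt")$Version
```

The version is passed in as a string so that the check can be tried without biomaRt being installed.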

Cheers,
Steffen

On Wed, Apr 15, 2009 at 3:50 AM, Syed Haider <[email protected]> wrote:
Hi All,

here is the cause of the trouble. It seems 'biomaRt' is parsing the XMLs
directly to retrieve attribute and filter names. The equivalent would be
to retrieve the names from the appropriate web service requests. When
www.biomart.org or www.ensembl.org parse the XMLs, they make all
internalNames lower case, so users interacting with the web server see
lower case internalNames. For this problematic attribute, the internalName
has an 'F' at the end in the XML only; it is resolved to lower case upon
configuration. biomaRt, however, does not do this. I would suggest:

a- the short term solution is to try changing the name to lower case, as
Rhoda suggested, but a one-liner makeInternalNameToLowerCase() in biomaRt
would be a *bullet proof solution*.

b- the long term solution is to avoid direct interrogation of the XMLs and
use web service requests to retrieve attribute and filter names. If some
features are missing from the web service requests, please do bring this
forward and we will add them to the web service response.
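
A minimal sketch of what option (a) could look like in R, assuming the attribute table is a data frame with a 'name' column as returned by listAttributes(). The function name lowercase_internal_names is hypothetical and not part of biomaRt; the small data frame is copied from the listAttributes() output quoted further down in this thread and stands in for a live call.

```r
# Hypothetical helper, not part of biomaRt: lower-cases the internal names
# in an attribute (or filter) table and leaves the other columns untouched.
lowercase_internal_names <- function(attributes) {
  attributes$name <- tolower(attributes$name)
  attributes
}

# Stand-in for listAttributes(ensembl), using the rows quoted in this thread
atr <- data.frame(
  name = c("affy_zebrafish", "agilent_g2518a",
           "agilent_g2519F", "agilent_probe"),
  description = c("Affy zebrafish", "Agilent g2518a",
                  "Agilent g2519f", "Agilent Probe"),
  stringsAsFactors = FALSE
)

fixed <- lowercase_internal_names(atr)
fixed$name[3]
```

This only normalises the names on the client side; whether the lower-cased name is accepted by the server still depends on the mart configuration.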

Thanks to all for your patience with this,

Best,
Syed


Rhoda Kinsella wrote:
Hi Syed,
That's exactly my point. What I am unable to understand is where the
'F' comes from in the first place. It's not present in Ensembl.
It is present in the Ensembl configuration files as an internal name,
which is where biomaRt gets the information from. They don't have
agilent_g2519F hardcoded anywhere in biomaRt. It is pulled out of our meta
tables as far as I understand.
Hope that clarifies things a bit better,
Regards,
Rhoda


If it's a typo introduced while hard-coding the attribute names in
biomaRt, then Ensembl should not really be changing anything.



Rhoda Kinsella wrote:
Hi Syed,
If you do listAttributes(ensembl) in biomaRt you will see the 'F'
(attribute number 3):
listAttributes(ensembl)
            name    description
1 affy_zebrafish Affy zebrafish
2 agilent_g2518a Agilent g2518a
3 agilent_g2519F Agilent g2519f
4  agilent_probe  Agilent Probe
etc...
So the capital F is in the internal name and the lowercase 'f' in the
display name which we see on the mart interface. If I do the same query
using agilent_g2518a (which has lowercase a in filter and attribute) I can
pull out results. I agree it is not ideal to change these internal names,
but in this case I think it will be necessary to fix the problem.
Regards,
Rhoda
On 15 Apr 2009, at 10:19, Syed Haider wrote:
Whichever is the case, the name, even with a small 'f', is still not
the same as with_*. What I see right now is that both the attribute and
the filter have a small 'f'. I wonder where the capital 'F' is coming from?

We also need to be careful about changing names, since other clients
have saved queries which might break.

Syed


Damian Smedley wrote:
On Wed, Apr 15, 2009 at 9:31 AM, Syed Haider <[email protected]> wrote:
 Hi Ruben,
 Have you tried executing your query getBM(...) with
 'agilent_g2519f'? biomaRt throws an exception in both cases
 (agilent_g2519f and agilent_g2519F). The real attribute is
 'agilent_g2519f', which works fine from www.biomart.org.
 I am not sure how the new release is going to fix this bug; maybe
 I am missing something here. From trying it in R, I feel that it's
 a problem with the R API of biomaRt. I am cc'ing Steffen, who
 would know how to debug this.
It's a few years since I was involved with this, but I have a feeling
BioConductor hard-codes the names of attributes and filters you can use, and
if these don't match the names in the EnsemblMart config then things break.
At least that used to be the case. So it seems either Rhoda has to change
the name back in the next Ensembl release or the hard-coded name in
BioConductor will need changing. Steffen will know more.
Cheers
Damian
  Best,
 Syed
 Ruben wrote:
     Hi Syed,
     in biomaRt from bioconductor the attribute name is
     'agilent_g2519F'. If I execute this code
      > ensembl = useMart("ensembl");
      > ensembl = useDataset("drerio_gene_ensembl", mart=ensembl);
      > atr <- listAttributes(ensembl)
      > atr$name[3]
     I get this:
      > [1] "agilent_g2519F"
     It seems that there is a bug, but the new release will fix it.
     Thanks again,
     Rubén.
     Syed Haider wrote:
         Ruben,
         the attribute name is: 'agilent_g2519f' not 'agilent_g2519F'
         hope this works.
         Best,
         Syed
         Rhoda Kinsella wrote:
             Hi Ruben,
             I suspect that there is inconsistency between the
             spelling of the internal name of the filter and the
             attribute. I will look into it and try to fix it for
             release 54 (approx end of April). Many apologies for any
             inconvenience caused.
             Regards,
             Rhoda
             On 14 Apr 2009, at 12:27, Ruben wrote:
                 Hi to all,
                 I am trying to invoke the following R code,
                 ensembl = useMart("ensembl");
                 ensembl = useDataset("drerio_gene_ensembl",
                 mart=ensembl);
                 ids <- getBM(attributes =
                 c("ensembl_gene_id","agilent_g2519F"), filters =
                 "with_agilent_g2519f", values =TRUE,mart=ensembl);
                 but the result is always the same:
                 1 Query ERROR: caught BioMart::Exception::Usage:
                 Attribute agilent_g2519F NOT FOUND
                 Error in getBM(attributes = c("ensembl_gene_id",
                 "agilent_g2519F"), filters =
                 c("with_agilent_g2519f"),  :
                 Number of columns in the query result doesn't equal
                 number of attributes in query.  This is probably an
                 internal error, please report.
                 However, if I retrieve the list of available
                 attributes I can see "agilent_g2519F" in the list.
                 Can you help me? Is there a mistake in my code, or
                 is there something wrong in biomart?
                 Thanks in advance,
                 Rubén.
             Rhoda Kinsella Ph.D.
             Ensembl Bioinformatician,
             European Bioinformatics Institute (EMBL-EBI),
             Wellcome Trust Genome Campus,
             Hinxton
             Cambridge CB10 1SD,
             UK.