Hi Syed,

This order is not always preserved, though, when you run a query across two
datasets.
We had issues with that in the past; I'm not sure if this still happens.
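
For reference, here is a minimal sketch in base R of the positional matching Syed describes, next to the description lookup that breaks. rename_by_position is a hypothetical helper, not a biomaRt function, and the result data frame is an invented placeholder rather than real server output. It assumes the server returns columns in the same order as the attributes in the query, which, as said above, may not hold for queries across two datasets.

```r
# Hypothetical helper, not part of biomaRt: label result columns by position.
rename_by_position <- function(result, attributes) {
  # Positional matching is only meaningful if the column count agrees.
  if (ncol(result) != length(attributes)) {
    stop("expected ", length(attributes), " columns, got ", ncol(result))
  }
  colnames(result) <- attributes
  result
}

# Two attribute names that share one description, as reported in this thread.
lookup <- data.frame(
  name = c("mmusculus_homolog_perc_id", "hsapiens_paralog_perc_id"),
  description = rep("% Identity with respect to query gene", 2),
  stringsAsFactors = FALSE
)

# Looking up by description is ambiguous: both names match the same header.
header <- "% Identity with respect to query gene"
ambiguous <- lookup$name[lookup$description == header]

# Placeholder result carrying description headers; the values are invented.
attributes <- c("ensembl_gene_id", "hsapiens_paralog_perc_id")
result <- data.frame(a = "ENSG00000001561", b = 50, stringsAsFactors = FALSE)
colnames(result) <- c("Ensembl Gene ID", header)

# Positional renaming sidesteps the ambiguous lookup entirely.
renamed <- rename_by_position(result, attributes)
print(colnames(renamed))
```

The column-count check is the only safeguard here; it cannot detect columns that come back reordered, so this is no fix for the two-dataset case.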

Cheers,
Steffen


On Thu, May 23, 2013 at 2:17 AM, Syed Haider <[email protected]> wrote:

> Hi Steffen, the order of header attributes should be the same as the order
> of attributes in the query you send. So, in theory, you can match these
> back!
>
> Syed
>
>
> On 22 May 2013 23:53, Steffen Durinck <[email protected]> wrote:
>
>> Hi Thomas,
>>
>> I figured out what goes wrong on the R side for the following query:
>>
>> <?xml version='1.0' encoding='UTF-8'?><!DOCTYPE Query><Query
>>  virtualSchemaName = 'default' uniqueRows = '1' count = '0'
>> datasetConfigVersion = '0.6' header='1' requestid= 'biomaRt'> <Dataset name
>> = 'hsapiens_gene_ensembl'><Attribute name = 'ensembl_gene_id'/><Attribute
>> name = 'hsapiens_paralog_ensembl_gene'/><Attribute name =
>> 'hsapiens_paralog_perc_id'/><Attribute name =
>> 'hsapiens_paralog_perc_id_r1'/><Filter name = 'ensembl_gene_id' value =
>> 'ENSG00000001561' /></Dataset></Query>
>>
>> When a result comes back from the BioMart server, I map the header names
>> (e.g. "Ensembl Gene ID") back to the attribute name that was used in the
>> query (ensembl_gene_id), and I return a matrix with the attribute names
>> instead of the attribute descriptions back to the user.
>>
>> However, the hsapiens_paralog_perc_id attributes come back with the
>> header "% Identity with respect to query gene" in the results from the
>> BioMart server.  As there are many attribute names with this same
>> description, I cannot map these back to the original attribute name and
>> the R query crashes.
>>
>> Is there a way to write my XML query so that I get the attribute name
>> back instead of the attribute description in the header, so I don't have to
>> map things back?
>>
>> Cheers,
>> Steffen
>>
>>
>>
>> On Wed, May 22, 2013 at 2:55 PM, Steffen Durinck <[email protected]> wrote:
>>
>>> Hi Thomas, Benjamin,
>>>
>>> The problem is on the R side: the "%" symbol in the attribute names (%
>>> Identity with respect to query gene) is causing trouble. I am working
>>> on a fix.  Until then you can add bmHeader=FALSE to your getBM query and
>>> things should work (see below):
>>>
>>> > human = useMart("ensembl", dataset = "hsapiens_gene_ensembl")
>>> > attributes = c("ensembl_gene_id", "mmusculus_homolog_ensembl_gene",
>>> +                "mmusculus_homolog_perc_id_r1")
>>> > attributes = c(attributes, "mmusculus_homolog_orthology_type",
>>> +                "mmusculus_homolog_subtype", "mmusculus_homolog_perc_id")
>>> > orth.mouse = getBM(attributes, filters = "with_homolog_mmus", values = TRUE,
>>> +                    mart = human, bmHeader = FALSE)
>>> > dim(orth.mouse)
>>> [1] 22886     6
>>>
>>> Best,
>>> Steffen
>>>
>>>
>>>
>>> On Wed, May 22, 2013 at 7:05 AM, Thomas Maurel <[email protected]> wrote:
>>>
>>>> Dear Benjamin,
>>>>
>>>> I can't see what's wrong with your query, but it looks like the issue
>>>> is coming from the following attributes:
>>>>  "mmusculus_homolog_orthology_type"
>>>> "mmusculus_homolog_subtype"
>>>> "mmusculus_homolog_perc_id"
>>>>
>>>> If you do the following query you will get the same error back:
>>>>
>>>> > library(biomaRt)
>>>> > human = useMart("ensembl", dataset = "hsapiens_gene_ensembl")
>>>> > attributes =
>>>> c("ensembl_gene_id","mmusculus_homolog_ensembl_gene","mmusculus_homolog_orthology_type")
>>>> > orth.mouse = getBM(attributes,
>>>>  filters="with_homolog_mmus",values=TRUE, mart = human, uniqueRows=TRUE)
>>>> Error in `[.data.frame`(result, , attributes) :
>>>>   undefined columns selected
>>>>
>>>> It's when you start adding one of the previous attributes that the query
>>>> fails.
>>>> I have also tried your query with the host pointing to ensembl.org and I
>>>> am getting the same error.
>>>> Since the query is working fine on the biomart interface, this might be
>>>> a bug coming from the biomaRt package. I would advise you to email the
>>>> bioconductor list: [email protected]
>>>>
>>>> Hope this helps,
>>>> Regards,
>>>> Thomas
>>>> On 22 May 2013, at 09:47, Benjamin Dubreuil wrote:
>>>>
>>>> Hi folks,
>>>>
>>>> I'm having a problem with the getBM function.
>>>> I would like to retrieve the orthologous genes between Human and Mouse,
>>>> with their percentage of identity to one another, their orthology
>>>> relationship and their common ancestor.
>>>>
>>>> I did this, and it works fine :
>>>>
>>>> >library(biomaRt)
>>>> >human = useMart("ensembl", dataset = "hsapiens_gene_ensembl")
>>>> >attributes =
>>>> c("ensembl_gene_id","mmusculus_homolog_ensembl_gene","mmusculus_homolog_perc_id_r1")
>>>> > orth.mouse = getBM(attributes,
>>>> filters="with_homolog_mmus",values=TRUE, mart = human, uniqueRows=TRUE)
>>>> > dim(orth.mouse)
>>>> [1] 22886     3
>>>> >
>>>>
>>>> But when I'm adding some attributes, I get an error :
>>>>
>>>> >attributes=c(attributes,"mmusculus_homolog_orthology_type",
>>>> "mmusculus_homolog_subtype", "mmusculus_homolog_perc_id")
>>>> > orth.mouse = getBM( attributes,filters="with_homolog_mmus",values
>>>> =TRUE, mart = human)
>>>> Error in `[.data.frame`(result, , attributes) :
>>>>   undefined columns selected
>>>>
>>>>
>>>> I've checked the attribute names, and I didn't make any typos:
>>>> > listAttributes(human)[c(1,567,573:576),]
>>>>     name                              description
>>>> 1   ensembl_gene_id                   Ensembl Gene ID
>>>> 567 mmusculus_homolog_ensembl_gene    Mouse Ensembl Gene ID
>>>> 573 mmusculus_homolog_orthology_type  Homology Type
>>>> 574 mmusculus_homolog_subtype         Ancestor
>>>> 575 mmusculus_homolog_perc_id         % Identity with respect to query gene
>>>> 576 mmusculus_homolog_perc_id_r1      % Identity with respect to Mouse gene
>>>>
>>>> Can anyone see what I'm doing wrong ?
>>>>
>>>>
>>>> Best.
>>>>
>>>> Dubreuil Benjamin
>>>> E. Levy Group (The Cell architecture Lab)
>>>> Weizmann Institute of Science, ISRAEL
>>>> Kimmelman Building, 4th floor, room 410
>>>> _______________________________________________
>>>> Users mailing list
>>>> [email protected]
>>>> https://lists.biomart.org/mailman/listinfo/users
>>>>
>>>>
>>>> --
>>>> Thomas Maurel
>>>> Bioinformatician - Ensembl Production Team
>>>> European Bioinformatics Institute (EMBL-EBI)
>>>> Wellcome Trust Genome Campus, Hinxton
>>>> Cambridge - CB10 1SD - UK
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>