The error is thrown on the BioMart side; I don't think it likes quotes in
gene symbols.
In addition, if you want a mapping between all gene symbols and Ensembl gene
IDs, you can get it with a single query in R:
library(biomaRt)
mart.ens <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
map <- getBM(attributes=c("ensembl_gene_id", "hgnc_symbol"),
             filters="with_hgnc", values=TRUE, mart=mart.ens)
head(map)
  ensembl_gene_id hgnc_symbol
1 ENSG00000249567       MIMT1
2 ENSG00000246493       SNHG8
3 ENSG00000187667     WHAMML1
4 ENSG00000248334     WHAMML2
5 ENSG00000225273    UBE2Q2P2
6 ENSG00000186615    C14orf33
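
Once you have that table, the lookup itself can be done locally, so a
symbol such as "2'-PDE" never has to be sent to the server. Here is a
minimal sketch of that idea. It assumes a small stand-in for the table
built from the six rows shown above (a real session would use the full
result of getBM()); the names my.symbols and lookup are made up for
illustration.

```r
# Hypothetical stand-in for the result of getBM(): the six rows shown above.
# In a real session, use the full `map` data frame instead.
map <- data.frame(
  ensembl_gene_id = c("ENSG00000249567", "ENSG00000246493", "ENSG00000187667",
                      "ENSG00000248334", "ENSG00000225273", "ENSG00000186615"),
  hgnc_symbol     = c("MIMT1", "SNHG8", "WHAMML1",
                      "WHAMML2", "UBE2Q2P2", "C14orf33"),
  stringsAsFactors = FALSE
)

# Symbols to look up. "2'-PDE" stays inside the R session, so its
# apostrophe is never placed in a query sent to BioMart.
my.symbols <- c("SNHG8", "2'-PDE", "C14orf33")

# match() returns NA for symbols that are absent from the table
# (for example aliases), so those rows get NA as their Ensembl ID.
idx <- match(my.symbols, map$hgnc_symbol)
lookup <- data.frame(
  hgnc_symbol     = my.symbols,
  ensembl_gene_id = map$ensembl_gene_id[idx],
  stringsAsFactors = FALSE
)
print(lookup)
```

Note that match() keeps only the first hit per symbol. If a symbol can map
to several Ensembl IDs, merge() on hgnc_symbol would keep all of them.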
Steffen
On Fri, Jul 15, 2011 at 10:45 AM, Marie Wong-Erasmus <
[email protected]> wrote:
> hi Tim,
>
> You might want to post this to the bioconductor mailing list to get them to
> handle single quotes in the value field.
>
> Either way, aliases should just return an empty set.
> Only HGNC symbols that are not synonyms have an associated ENSG ID.
> So if you use IMP8, which is an alias for IPO8, you get an empty set,
> which is also what should be returned for 2'-PDE.
>
> Marie
>
> From: Timothée Flutre <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Fri, 15 Jul 2011 12:13:41 -0400
> To: "[email protected]" <[email protected]>
> Subject: [BioMart Users] bug when gene ID contains an apostrophe
>
> Hello,
>
> I am using the R package "biomaRt" to find Ensembl IDs from a list of HGNC
> gene IDs:
>
> source("http://bioconductor.org/biocLite.R")
> biocLite("biomaRt",lib="~/src/Rlibs/")
> library(biomaRt)
> mart.ens <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
> getBM(attributes=c("ensembl_gene_id", "hgnc_symbol"),
>       filters="hgnc_symbol", values="IPO8", mart=mart.ens)
>   ensembl_gene_id hgnc_symbol
> 1 ENSG00000133704        IPO8
>
> It works well until an HGNC ID contains an apostrophe (here "2'-PDE" is
> an alias for the gene "PDE12", see
> http://www.genenames.org/data/hgnc_data.php?hgnc_id=25386):
>
> getBM(attributes=c("ensembl_gene_id", "hgnc_symbol"),
> filters="hgnc_symbol", values="2'-PDE", mart=mart.ens)
> Error in getBM(attributes = c("ensembl_gene_id", "hgnc_symbol"), filters =
> "hgnc_symbol", :
> Query ERROR: caught BioMart::Exception: non-BioMart die():
> not well-formed (invalid token) at line 1, column 322, byte 322 at
> /usr/lib/perl5/XML/Parser.pm line 187
>
> I know how to work around this for my own case, but would it be possible to
> fix this for a future release?
>
> Best regards,
> Tim
>
>
> _______________________________________________
> Users mailing list
> [email protected]
> https://lists.biomart.org/mailman/listinfo/users
>
>