Hi Tim,

You might want to post this to the Bioconductor mailing list so that they can
handle single quotes in the value field.

Either way, aliases should just return an empty set.
Only HGNC symbols that are not synonyms have an associated ENSG ID.
So if you used IMP8, which is an alias for IPO8, you would get an empty set,
which is also what should be returned for 2'-PDE.
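
In the meantime, here is a minimal client-side sketch, assuming it is acceptable
to skip such symbols: set aside any value containing a single quote before
calling getBM(). The helper name split_quoted_symbols is hypothetical and is not
part of biomaRt.

```r
# Hypothetical helper (not part of biomaRt): separate symbols that contain a
# single quote from those that can be passed to getBM() as they are.
split_quoted_symbols <- function(symbols) {
  has_quote <- grepl("'", symbols, fixed = TRUE)
  list(safe = symbols[!has_quote], skipped = symbols[has_quote])
}

symbols <- c("IPO8", "2'-PDE", "PDE12")
parts <- split_quoted_symbols(symbols)

# Only the quote-free symbols would then be queried, for example:
# getBM(attributes = c("ensembl_gene_id", "hgnc_symbol"),
#       filters = "hgnc_symbol", values = parts$safe, mart = mart.ens)
```

The skipped symbols can be reported separately, so that nothing is silently
dropped from the input list.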

Marie

From: Timothée Flutre <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Fri, 15 Jul 2011 12:13:41 -0400
To: "[email protected]" <[email protected]>
Subject: [BioMart Users] bug when gene ID contains an apostrophe

Hello,

I am using the R package "biomaRt" to find Ensembl IDs from a list of HGNC gene
IDs:

source("http://bioconductor.org/biocLite.R")
biocLite("biomaRt",lib="~/src/Rlibs/")
library(biomaRt)
mart.ens <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
getBM(attributes=c("ensembl_gene_id", "hgnc_symbol"), filters="hgnc_symbol", 
values="IPO8", mart=mart.ens)
  ensembl_gene_id hgnc_symbol
1 ENSG00000133704        IPO8

This works well until an HGNC ID contains an apostrophe (here "2'-PDE" is an
alias for the gene "PDE12"; see
<http://www.genenames.org/data/hgnc_data.php?hgnc_id=25386>):

getBM(attributes=c("ensembl_gene_id", "hgnc_symbol"), filters="hgnc_symbol", 
values="2'-PDE", mart=mart.ens)
Error in getBM(attributes = c("ensembl_gene_id", "hgnc_symbol"), filters = 
"hgnc_symbol",  :
  Query ERROR: caught BioMart::Exception: non-BioMart die():
not well-formed (invalid token) at line 1, column 322, byte 322 at 
/usr/lib/perl5/XML/Parser.pm line 187

I know how to work around this for my own case, but would it be possible to fix 
this for a future release?

Best regards,
Tim

_______________________________________________
Users mailing list
[email protected]
https://lists.biomart.org/mailman/listinfo/users
