Hi,

I am getting incomplete data sets back. If I map a large set of IDs 
(6500), for example SwissProt AC to EntrezGene ID via Ensembl Human, 
I do not get the complete list back. If I instead query for all known 
ID mappings and perform the merge in R afterwards, I get a much longer 
list (both lists contain unique IDs only).

1. PROCEDURE (via web interface):

$query->setDataset("hsapiens_gene_ensembl");
$query->addFilter("uniprot_swissprot_accession", [sample-SwissProtAC.txt]);
$query->addAttribute("entrezgene");
$query->formatter("TSV");
$query_runner->uniqueRowsOnly(1);

-> sample-RetrievedEG.txt (3420 IDs)

2. PROCEDURE (via web interface):

$query->setDataset("hsapiens_gene_ensembl");
$query->addAttribute("entrezgene");
$query->addAttribute("uniprot_swissprot_accession");
$query->formatter("TSV");
$query_runner->uniqueRowsOnly(1);

-> save as sp2eg
-> merge via R (sample-sp2eg.R)
-> sample-MapViaALL.txt (4382 IDs)
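
For reference, the merge in procedure 2 amounts to roughly the following. 
This is only a sketch in Python, not the R script itself. The function 
names are made up for illustration, and the column order (EntrezGene ID 
first, SwissProt AC second, following the attribute order above) and the 
absence of a header row are assumptions.

```python
import csv
import io


def load_mapping(tsv_text):
    """Parse a two-column TSV dump into a dict: SwissProt AC -> set of EntrezGene IDs.

    Assumed column order: EntrezGene ID, then SwissProt AC.
    Rows with an empty value in either column are skipped.
    """
    mapping = {}
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if len(row) < 2:
            continue
        entrez, swissprot = row[0].strip(), row[1].strip()
        if entrez and swissprot:
            mapping.setdefault(swissprot, set()).add(entrez)
    return mapping


def map_ids(accessions, mapping):
    """Return the unique EntrezGene IDs for the given SwissProt accessions."""
    found = set()
    for ac in accessions:
        found |= mapping.get(ac.strip(), set())
    return found


def missing_ids(merged, filtered):
    """IDs found via the local merge but absent from the filtered query result."""
    return sorted(set(merged) - set(filtered))


if __name__ == "__main__":
    # Toy data only; these are not real accessions or gene IDs.
    dump = "1\tACC1\n2\tACC1\n3\tACC2\n"
    merged = map_ids(["ACC1", "ACC2"], load_mapping(dump))
    print(missing_ids(merged, {"1"}))
```

Running missing_ids on the contents of sample-MapViaALL.txt and 
sample-RetrievedEG.txt would list exactly the IDs that the filtered 
query in procedure 1 did not return.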

Unfortunately, my earlier e-mail (9.10.) did not come through with its 
attachments, so here is another try without them.

Thanks & best regards,
Thomas

P.S.: I have seen a similar problem with a mart of my own before.


-- 
Thomas Burkard
CeMM - Research Centre for Molecular Medicine of the Austrian Academy of Science
Lazarettgasse 19/3. floor, A-1090 Vienna, Austria
Tel.: +43/1/40160 70 021
Mobile: +43/699/126 05 000
Fax.: +43/1/40160 970 030
Email: [EMAIL PROTECTED]
URL: http://www.cemm.at 




