Hi,
I am getting incomplete result sets back. When I map a large set of
IDs (6500), for example Swiss-Prot AC to Entrez Gene ID via Ensembl
Human, I do not get the complete list back. If I instead query for all
known ID mappings and perform the merge in R afterwards, I get a much
longer list back (both lists contain unique IDs only).
1. PROCEDURE (via web interface):
$query->setDataset("hsapiens_gene_ensembl");
$query->addFilter("uniprot_swissprot_accession", [sample-SwissProtAC.txt]);
$query->addAttribute("entrezgene");
$query->formatter("TSV");
$query_runner->uniqueRowsOnly(1);
-> sample-RetrievedEG.txt (3420 IDs)
2. PROCEDURE (via web interface):
$query->setDataset("hsapiens_gene_ensembl");
$query->addAttribute("entrezgene");
$query->addAttribute("uniprot_swissprot_accession");
$query->formatter("TSV");
$query_runner->uniqueRowsOnly(1);
-> save as sp2eg
-> merge via R (sample-sp2eg.R)
-> sample-MapViaALL.txt (4382 IDs)
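
The merge step of procedure 2, and the comparison against the result of
procedure 1, can be sketched roughly as follows. This is a Python
sketch, not the actual sample-sp2eg.R script. The function names and
the assumed column order of the export (Entrez Gene ID first,
Swiss-Prot AC second, following the attribute order above) are
assumptions made for illustration.

```python
import csv
import io


def map_via_full_table(table_tsv, query_acs):
    """Return the unique Entrez Gene IDs whose Swiss-Prot AC is in query_acs.

    table_tsv is the text of the full export, one row per line, assumed
    to be "entrezgene<TAB>uniprot_swissprot_accession".
    """
    wanted = {ac.strip() for ac in query_acs if ac.strip()}
    gene_ids = set()
    for row in csv.reader(io.StringIO(table_tsv), delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        gene_id, ac = row[0].strip(), row[1].strip()
        # A row with an empty Entrez Gene ID carries no usable mapping.
        if gene_id and ac in wanted:
            gene_ids.add(gene_id)
    return gene_ids


def missing_from_filtered(full_ids, filtered_ids):
    """IDs found via the full table but absent from the filtered query."""
    return sorted(set(full_ids) - set(filtered_ids))
```

Passing the contents of the sp2eg export and the list of query
accessions to map_via_full_table() should correspond to the list from
procedure 2. missing_from_filtered() then lists the IDs that the
filtered query in procedure 1 did not return, which may help to narrow
down where they get lost.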
Unfortunately, my earlier e-mail (9.10.) with attachments did not come
through. Here is another try without them.
Thanks & best regards,
Thomas
P.S.: I experienced a similar problem with a mart of my own before.
--
Thomas Burkard
CeMM - Research Centre for Molecular Medicine of the Austrian Academy of Science
Lazarettgasse 19/3. floor, A-1090 Vienna, Austria
Tel.: +43/1/40160 70 021
Mobile: +43/699/126 05 000
Fax.: +43/1/40160 970 030
Email: [EMAIL PROTECTED]
URL: http://www.cemm.at