On 17 Jul 2006, at 08:50, Alexandre Gattiker wrote:
Hello,
I am considering setting up a biomart to allow users to query genes
that result from a specific expression assay, and fetch various
attributes.
I have a little worry: If I query the Ensembl mart for a set of genes,
then select for instance Affy probeset and GO ID output attributes and
table output, the resulting table contains N*M lines for each gene,
where N is the number of Affy probesets and M is the number of GO IDs
-- like the result of a SQL join.
ENSG00000198763 ENST00000361453 GO:0008137 1553538_s_at
ENSG00000198763 ENST00000361453 GO:0006120 1553538_s_at
ENSG00000198763 ENST00000361453 GO:0042773 1553538_s_at
ENSG00000198763 ENST00000361453 GO:0005739 1553538_s_at
ENSG00000198763 ENST00000361453 GO:0016491 1553538_s_at
ENSG00000198763 ENST00000361453 GO:0005747 1553538_s_at
ENSG00000198763 ENST00000361453 GO:0016020 1553538_s_at
ENSG00000198763 ENST00000361453 GO:0016021 1553538_s_at
ENSG00000198763 ENST00000361453 GO:0008137 1553551_s_at
ENSG00000198763 ENST00000361453 GO:0006120 1553551_s_at
ENSG00000198763 ENST00000361453 GO:0042773 1553551_s_at
ENSG00000198763 ENST00000361453 GO:0005739 1553551_s_at
ENSG00000198763 ENST00000361453 GO:0016491 1553551_s_at
ENSG00000198763 ENST00000361453 GO:0005747 1553551_s_at
ENSG00000198763 ENST00000361453 GO:0016020 1553551_s_at
ENSG00000198763 ENST00000361453 GO:0016021 1553551_s_at
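As a rough illustration of why the table above has N*M lines, the sketch below pairs every probeset with every GO ID, much as a join across two one-to-many tables would. The function name joined_rows and the tuple layout are hypothetical and are not part of BioMart; the values are copied from the output above.

```python
from itertools import product

# Values copied from the sample output for ENSG00000198763.
go_ids = ["GO:0008137", "GO:0006120", "GO:0042773", "GO:0005739",
          "GO:0016491", "GO:0005747", "GO:0016020", "GO:0016021"]
probesets = ["1553538_s_at", "1553551_s_at"]


def joined_rows(gene, transcript, go_ids, probesets):
    """Hypothetical helper: one row per (probeset, GO ID) pair."""
    return [(gene, transcript, go_id, probeset)
            for probeset, go_id in product(probesets, go_ids)]


rows = joined_rows("ENSG00000198763", "ENST00000361453", go_ids, probesets)
print(len(rows))  # 16, i.e. N=2 probesets times M=8 GO IDs
```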
That is not exactly what I would like -- since the tool is aimed at
biologists, I would like to have one output line per gene, with
multiple attribute values joined with a semicolon.
Is it possible to produce such an output?
I presume a way to do it would be to store the semicolon-joined string
as a column in the main table and make that an exportable attribute.
Yes, exactly :-) BioMart behaves just like a relational database and
retrieves the data exactly as it is laid out in the database. If you can
create a column where you store the data as a semicolon-joined string,
you will be able to retrieve it as a single attribute on a single line
of output.
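A minimal sketch of how such semicolon-joined values could be prepared before loading them into the main table, assuming the joined rows are available as (gene, transcript, GO ID, probeset) tuples. The function name collapse_rows and the tuple layout are hypothetical; this is not BioMart code, only one possible preprocessing step.

```python
def collapse_rows(rows, sep=";"):
    """Hypothetical helper: one output row per (gene, transcript),
    with distinct GO IDs and probesets joined by `sep`."""
    grouped = {}  # plain dicts keep insertion order in Python 3.7+
    for gene, transcript, go_id, probeset in rows:
        gos, probes = grouped.setdefault((gene, transcript), ([], []))
        if go_id not in gos:
            gos.append(go_id)
        if probeset not in probes:
            probes.append(probeset)
    return [(gene, transcript, sep.join(gos), sep.join(probes))
            for (gene, transcript), (gos, probes) in grouped.items()]


# A subset of the rows from the question (2 GO IDs x 2 probesets).
g, t = "ENSG00000198763", "ENST00000361453"
rows = [
    (g, t, "GO:0008137", "1553538_s_at"),
    (g, t, "GO:0006120", "1553538_s_at"),
    (g, t, "GO:0008137", "1553551_s_at"),
    (g, t, "GO:0006120", "1553551_s_at"),
]

for row in collapse_rows(rows):
    print(" ".join(row))
# ENSG00000198763 ENST00000361453 GO:0008137;GO:0006120 1553538_s_at;1553551_s_at
```

Each joined string would then be stored in its own column of the main table and exposed as a single attribute.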
a.
Thanks in advance for your advice.
Alexandre
--
Alexandre Gattiker
Swiss Institute of Bioinformatics, Genome Bioinformatics Group
Biozentrum Tel. +41 61 267 1579
Klingelbergstrasse 50 Fax +41 61 267 1585
4056 Basel [EMAIL PROTECTED]
Switzerland http://www.biozentrum.unibas.ch/primig
------------------------------------------------------------------------
Arek Kasprzyk
EMBL-European Bioinformatics Institute.
Wellcome Trust Genome Campus, Hinxton,
Cambridge CB10 1SD, UK.
Tel: +44-(0)1223-494606
Fax: +44-(0)1223-494468
------------------------------------------------------------------------