On 17 Jul 2006, at 08:50, Alexandre Gattiker wrote:

Hello,

I am considering setting up a biomart to allow users to query genes that result from a specific expression assay, and fetch various attributes.

I have one concern: if I query the Ensembl mart for a set of genes and then select, for instance, the Affy probeset and GO ID output attributes with table output, the resulting table contains N*M lines for each gene, where N is the number of Affy probesets and M is the number of GO IDs -- like the result of an SQL join.

ENSG00000198763 ENST00000361453 GO:0008137      1553538_s_at    
ENSG00000198763 ENST00000361453 GO:0006120      1553538_s_at    
ENSG00000198763 ENST00000361453 GO:0042773      1553538_s_at    
ENSG00000198763 ENST00000361453 GO:0005739      1553538_s_at    
ENSG00000198763 ENST00000361453 GO:0016491      1553538_s_at    
ENSG00000198763 ENST00000361453 GO:0005747      1553538_s_at    
ENSG00000198763 ENST00000361453 GO:0016020      1553538_s_at    
ENSG00000198763 ENST00000361453 GO:0016021      1553538_s_at    
ENSG00000198763 ENST00000361453 GO:0008137      1553551_s_at    
ENSG00000198763 ENST00000361453 GO:0006120      1553551_s_at    
ENSG00000198763 ENST00000361453 GO:0042773      1553551_s_at    
ENSG00000198763 ENST00000361453 GO:0005739      1553551_s_at    
ENSG00000198763 ENST00000361453 GO:0016491      1553551_s_at    
ENSG00000198763 ENST00000361453 GO:0005747      1553551_s_at    
ENSG00000198763 ENST00000361453 GO:0016020      1553551_s_at    
ENSG00000198763 ENST00000361453 GO:0016021      1553551_s_at    

That is not exactly what I would like -- since the tool is aimed at biologists, I would like to have one output line per gene, with multiple attribute values joined with a semicolon.

Is it possible to produce such an output?

I presume a way to do it would be to store the semicolon-joined string as a column in the main table and make that an exportable attribute.



Yes, exactly :-) BioMart behaves just like a relational database and retrieves the data exactly as it is laid out in the database. If you create a column that stores the data as a semicolon-joined string, you will be able to retrieve it as a single attribute on a single line of output.
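
As a rough illustration of what such a precomputed column could contain, here is a minimal Python sketch that collapses rows like the ones quoted above into one line per gene. The function name collapse_by_gene and the tuple layout are hypothetical choices made for this example, not part of BioMart, and the step of loading the resulting strings into a main-table column is not shown.

```python
# Sample rows in the shape of the table output quoted above:
# (gene ID, transcript ID, GO ID, Affy probeset).
rows = [
    ("ENSG00000198763", "ENST00000361453", "GO:0008137", "1553538_s_at"),
    ("ENSG00000198763", "ENST00000361453", "GO:0006120", "1553538_s_at"),
    ("ENSG00000198763", "ENST00000361453", "GO:0008137", "1553551_s_at"),
    ("ENSG00000198763", "ENST00000361453", "GO:0006120", "1553551_s_at"),
]


def collapse_by_gene(rows, sep=";"):
    """Return one tuple per gene, with each attribute de-duplicated and joined by sep.

    Hypothetical helper for illustration only; values keep first-seen order.
    """
    genes = {}
    for gene, transcript, go_id, probeset in rows:
        # One insertion-ordered dict per attribute column acts as an ordered set.
        columns = genes.setdefault(gene, ({}, {}, {}))
        for seen, value in zip(columns, (transcript, go_id, probeset)):
            seen.setdefault(value, None)
    return [
        (gene,) + tuple(sep.join(seen) for seen in columns)
        for gene, columns in genes.items()
    ]


for line in collapse_by_gene(rows):
    # Tab-separated output, one line per gene.
    print("\t".join(line))
```

Each joined string would then be stored once per gene, so that the N*M expansion never appears in the output.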

a.




Thanks in advance for your advice.
Alexandre

-- Alexandre Gattiker
Swiss Institute of Bioinformatics, Genome Bioinformatics Group
Biozentrum                                Tel. +41 61 267 1579
Klingelbergstrasse 50                     Fax  +41 61 267 1585
4056 Basel                                 [EMAIL PROTECTED]
Switzerland             http://www.biozentrum.unibas.ch/primig



-------------------------------------------------------------------------------
Arek Kasprzyk
EMBL-European Bioinformatics Institute.
Wellcome Trust Genome Campus, Hinxton,
Cambridge CB10 1SD, UK.
Tel: +44-(0)1223-494606
Fax: +44-(0)1223-494468
-------------------------------------------------------------------------------


