Hi Bob and Tom,
The bug reported by Tom is a well-known issue in BioMart and has been reported to the developers. Arek may be able to advise as to whether this has been addressed in the new code. Tom, this is essentially the SQL generated for your query:

SELECT main.stable_id_1023,
       main.stable_id_1066,
       main.stable_id_1070,
       ensembl_mart_66.hsapiens_gene_ensembl__interpro__dm.interpro_ac_1026,
       ensembl_mart_66.hsapiens_gene_ensembl__interpro__dm.display_label_1074,
       ensembl_mart_66.hsapiens_gene_ensembl__interpro__dm.description_1074
FROM ensembl_mart_66.hsapiens_gene_ensembl__interpro__dm,
     ensembl_mart_66.hsapiens_gene_ensembl__translation__main main
WHERE (main.stable_id_1066 = 'ENST00000507080')
  AND main.translation_id_1068_key = ensembl_mart_66.hsapiens_gene_ensembl__interpro__dm.translation_id_1068_key
LIMIT 200

The transcript used as a filter here is not protein coding and therefore has no entry in the translation main table. When you select attributes that are stored on translation_id (e.g. InterPro, GO), they effectively act as a filter, because the query joins translation_id in the translation main table to translation_id in the InterPro dimension table. As there is no translation, and therefore no translation_id, the join finds no match and the gene ID and transcript ID are no longer returned in the result set. This is why the result file contains fewer rows.
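To illustrate the join behaviour described above, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names and values (main, interpro_dm, translation_key, T_CODING, T_NONCODING, IPR_EXAMPLE) are invented for illustration and are not the real mart schema. It also assumes the non-coding transcript is represented by a row with a NULL translation key, which may not match how the mart stores it. The LEFT JOIN is shown only for contrast. It is not what BioMart generates, and I am not suggesting it is the fix the developers would choose.

```python
import sqlite3

# Toy in-memory database; names and values are hypothetical, not the mart schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE main (transcript_id TEXT, translation_key INTEGER);
CREATE TABLE interpro_dm (translation_key INTEGER, interpro_ac TEXT);
-- One coding transcript with a translation, one non-coding without (assumed NULL key).
INSERT INTO main VALUES ('T_CODING', 1), ('T_NONCODING', NULL);
INSERT INTO interpro_dm VALUES (1, 'IPR_EXAMPLE');
""")

# Same shape as the generated query: comma join plus equality in WHERE (an inner join).
inner_rows = conn.execute("""
SELECT main.transcript_id, interpro_dm.interpro_ac
FROM interpro_dm, main
WHERE main.translation_key = interpro_dm.translation_key
ORDER BY main.transcript_id
""").fetchall()

# For contrast only: a LEFT JOIN keeps the transcript and returns NULL for InterPro.
outer_rows = conn.execute("""
SELECT main.transcript_id, interpro_dm.interpro_ac
FROM main LEFT JOIN interpro_dm
  ON main.translation_key = interpro_dm.translation_key
ORDER BY main.transcript_id
""").fetchall()

print("inner join:", inner_rows)
print("left join: ", outer_rows)
```

In this toy case the inner join returns only the coding transcript, while the LEFT JOIN also returns the non-coding one with an empty InterPro column. That mirrors the drop in row count seen when a translation-based attribute is added.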


Bob, your issue is something different. I have spoken to the Ensembl Genomes and VectorBase teams and they are looking into it now. This issue does not appear to affect the BioMart in the Ensembl project.

I hope that helps, but please don't hesitate to get in touch if you have further questions.
Regards
Rhoda




On 2 Apr 2012, at 13:14, Bob MacCallum wrote:

By strange coincidence a colleague has just reported a similar problem
which affects VectorBase and Ensembl Genomes' 0.7 marts for Anopheles
gambiae.

The gene count is 13k but when you add GO attributes it still says 13k
but the output contains only 8k genes (roughly the same number as when
setting the "has GO term" filter, but note the behaviour happens with
NO FILTERS).  The same happens with Interpro but it only drops to 12k.

Our biomart people will be having a look into the problem, I just
wanted to share the "me too".

On Mon, Apr 2, 2012 at 8:56 AM, Tom A <[email protected]> wrote:
0507080 and select only gene name and description, I get the results. When I select anything more, like InterPro Description, all the fields disappear and I get null results in each column. Even if the RNA was not coding (but it is
and has extensive annotation on Ensembl) I stil
_______________________________________________
Users mailing list
[email protected]
https://lists.biomart.org/mailman/listinfo/users

Rhoda Kinsella Ph.D.
Ensembl Production Project Leader,
European Bioinformatics Institute (EMBL-EBI),
Wellcome Trust Genome Campus,
Hinxton
Cambridge CB10 1SD,
UK.
