Hi Bob and Tom,
The bug reported by Tom is a well-known issue in BioMart and has been
reported to the developers. Arek may be able to advise as to whether
this has been addressed in the new code. Tom, basically this is the
SQL for your query:
SELECT main.stable_id_1023, main.stable_id_1066, main.stable_id_1070,
  ensembl_mart_66.hsapiens_gene_ensembl__interpro__dm.interpro_ac_1026,
  ensembl_mart_66.hsapiens_gene_ensembl__interpro__dm.display_label_1074,
  ensembl_mart_66.hsapiens_gene_ensembl__interpro__dm.description_1074
FROM ensembl_mart_66.hsapiens_gene_ensembl__interpro__dm,
  ensembl_mart_66.hsapiens_gene_ensembl__translation__main main
WHERE (main.stable_id_1066 = 'ENST00000507080')
  AND main.translation_id_1068_key =
      ensembl_mart_66.hsapiens_gene_ensembl__interpro__dm.translation_id_1068_key
LIMIT 200
The transcript used as a filter here is not protein coding and
therefore has no entry in the translation main table. When you select
attributes that are stored on translation_id (e.g. InterPro, GO),
they act much like a filter, because the query joins translation_id
in the translation main table to translation_id in the InterPro
dimension table. As there is no translation, and therefore no
translation_id, the join finds no match and the gene id and
transcript id are no longer returned in the result set. This is why
the result file contains fewer rows.
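To illustrate the effect, here is a minimal sketch using Python's built-in
sqlite3 module. The table names, column names and identifiers are made up for
the example and are much simpler than the real mart schema. I have also assumed
that the non-coding transcript is stored with a NULL translation_id; if the row
were missing altogether, the first query would give the same empty result. The
second query, with a LEFT JOIN, is included only for contrast and is not meant
to describe what BioMart does.

```python
import sqlite3


def run_demo():
    """Return (inner_rows, left_rows) for a transcript with no translation."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Hypothetical, simplified stand-ins for the main and dimension tables.
    cur.execute(
        "CREATE TABLE translation_main "
        "(gene_id TEXT, transcript_id TEXT, translation_id INTEGER)"
    )
    cur.execute(
        "CREATE TABLE interpro_dm (translation_id INTEGER, interpro_ac TEXT)"
    )
    cur.executemany(
        "INSERT INTO translation_main VALUES (?, ?, ?)",
        [
            ("GENE_A", "TRANSCRIPT_CODING", 1),
            # Assumption: a non-coding transcript has no translation_id.
            ("GENE_B", "TRANSCRIPT_NONCODING", None),
        ],
    )
    cur.execute("INSERT INTO interpro_dm VALUES (1, 'IPR_EXAMPLE')")

    # Same shape as the query above: the join condition sits in the WHERE
    # clause, so a transcript without a translation_id matches nothing.
    inner_rows = cur.execute(
        "SELECT m.gene_id, m.transcript_id, d.interpro_ac "
        "FROM interpro_dm d, translation_main m "
        "WHERE m.transcript_id = ? "
        "AND m.translation_id = d.translation_id",
        ("TRANSCRIPT_NONCODING",),
    ).fetchall()

    # For contrast only: a LEFT JOIN keeps the main-table row and returns
    # NULL for the InterPro column.
    left_rows = cur.execute(
        "SELECT m.gene_id, m.transcript_id, d.interpro_ac "
        "FROM translation_main m "
        "LEFT JOIN interpro_dm d ON m.translation_id = d.translation_id "
        "WHERE m.transcript_id = ?",
        ("TRANSCRIPT_NONCODING",),
    ).fetchall()

    conn.close()
    return inner_rows, left_rows


if __name__ == "__main__":
    inner_rows, left_rows = run_demo()
    print("join in WHERE clause:", inner_rows)
    print("LEFT JOIN:", left_rows)
```

In the first query the gene and transcript ids disappear along with the
InterPro columns, which matches the reduced row count described above.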
Bob, your issue is something different. I have spoken to the Ensembl
Genomes and Vectorbase teams and they are looking into it now. This
issue does not appear to affect the BioMart in the Ensembl project.
I hope that helps, but please don't hesitate to get in touch if you
have further questions.
Regards
Rhoda
On 2 Apr 2012, at 13:14, Bob MacCallum wrote:
By strange coincidence a colleague has just reported a similar problem
which affects VectorBase and Ensembl Genomes' 0.7 marts for Anopheles
gambiae.
The gene count is 13k but when you add GO attributes it still says 13k
but the output contains only 8k genes (roughly the same number as when
setting the "has GO term" filter, but note the behaviour happens with
NO FILTERS). The same happens with Interpro but it only drops to 12k.
Our biomart people will be having a look into the problem, I just
wanted to share the "me too".
On Mon, Apr 2, 2012 at 8:56 AM, Tom A <[email protected]> wrote:
0507080 and select only gene name and description I get the results. When I
select anything more, like InterPro Description, all the fields disappear and
I get null results in each column. Even if the RNA was not coding (but it is,
and has extensive annotation on Ensembl) I stil
_______________________________________________
Users mailing list
[email protected]
https://lists.biomart.org/mailman/listinfo/users
Rhoda Kinsella Ph.D.
Ensembl Production Project Leader,
European Bioinformatics Institute (EMBL-EBI),
Wellcome Trust Genome Campus,
Hinxton
Cambridge CB10 1SD,
UK.