Hi guys, yes, Rhoda is absolutely right. This issue still needs to be addressed.
a

On Mon, Apr 2, 2012 at 9:49 AM, Rhoda Kinsella <[email protected]> wrote:

> Hi Bob and Tom,
> The bug reported by Tom is a well known issue in BioMart and has been
> reported to the developers. Arek may be able to advise as to whether this
> has been addressed in the new code. Tom, basically this is the SQL for your
> query:
>
> SELECT main.stable_id_1023, main.stable_id_1066, main.stable_id_1070,
> ensembl_mart_66.hsapiens_gene_ensembl__interpro__dm.interpro_ac_1026,
> ensembl_mart_66.hsapiens_gene_ensembl__interpro__dm.display_label_1074,
> ensembl_mart_66.hsapiens_gene_ensembl__interpro__dm.description_1074
> FROM ensembl_mart_66.hsapiens_gene_ensembl__interpro__dm,
> ensembl_mart_66.hsapiens_gene_ensembl__translation__main main WHERE
> (main.stable_id_1066 = 'ENST00000507080') AND main.translation_id_1068_key=
> ensembl_mart_66.hsapiens_gene_ensembl__interpro__dm.translation_id_1068_key
> LIMIT 200
>
> The transcript used as a filter here is not protein coding and therefore
> has no entry in the translation main table. When you click on attributes
> that are stored on translation_id (e.g. InterPro, GO), these act something
> like a filter as they try to join on translation_id from the translation
> main table to the translation_id on the InterPro dimension table. As there
> is no translation and therefore no translation_id, the user no longer gets
> the gene id or transcript id back in the result set. This results in
> reduced numbers of rows in the result file.
>
> Bob, your issue is something different. I have spoken to the Ensembl
> Genomes and Vectorbase teams and they are looking into it now. This issue
> does not appear to affect the BioMart in the Ensembl project.
>
> I hope that helps, but please don't hesitate to get in touch if you have
> further questions.
> Regards
> Rhoda
>
> On 2 Apr 2012, at 13:14, Bob MacCallum wrote:
>
>> By strange coincidence a colleague has just reported a similar problem
>> which affects VectorBase and Ensembl Genomes' 0.7 marts for Anopheles
>> gambiae.
>>
>> The gene count is 13k but when you add GO attributes it still says 13k
>> but the output contains only 8k genes (roughly the same number as when
>> setting the "has GO term" filter, but note the behaviour happens with
>> NO FILTERS). The same happens with Interpro but it only drops to 12k.
>>
>> Our biomart people will be having a look into the problem, I just
>> wanted to share the "me too".
>>
>> On Mon, Apr 2, 2012 at 8:56 AM, Tom A <[email protected]> wrote:
>>
>>> 0507080 and select only gene name and description I get the results.
>>> When I select anything more like Interpro Description all the fields
>>> disappear and I got null results in each column. Even if the RNA was
>>> not coding (but it is and have extensive annotation on Ensembl) I stil
>>
>> _______________________________________________
>> Users mailing list
>> [email protected]
>> https://lists.biomart.org/mailman/listinfo/users
>
> Rhoda Kinsella Ph.D.
> Ensembl Production Project Leader,
> European Bioinformatics Institute (EMBL-EBI),
> Wellcome Trust Genome Campus,
> Hinxton
> Cambridge CB10 1SD,
> UK.
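
For anyone who wants to see the row loss Rhoda describes outside of BioMart, here is a minimal sketch using Python's built-in sqlite3 module. Everything in it is hypothetical: the table names (translation_main, interpro_dm), the column names and the identifiers are simplified stand-ins, not the real ensembl_mart_66 schema. It also assumes the non-coding transcript is present as a row with a NULL translation key, which may not match the actual mart layout. The LEFT JOIN is included only for contrast; it is not a description of what BioMart does or of how the issue will be fixed.

```python
import sqlite3


def run_demo():
    """Compare an implicit inner join with a LEFT JOIN for a transcript
    that has no translation. All names here are hypothetical."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Simplified stand-ins for a main table and a dimension table.
    cur.execute(
        "CREATE TABLE translation_main ("
        "gene_id TEXT, transcript_id TEXT, translation_id_key INTEGER)"
    )
    cur.execute(
        "CREATE TABLE interpro_dm (translation_id_key INTEGER, interpro_ac TEXT)"
    )

    # One coding transcript (has a translation key) and one non-coding
    # transcript (assumed here to be stored with a NULL key).
    cur.executemany(
        "INSERT INTO translation_main VALUES (?, ?, ?)",
        [
            ("GENE_A", "TRANSCRIPT_CODING", 1),
            ("GENE_B", "TRANSCRIPT_NONCODING", None),
        ],
    )
    cur.execute("INSERT INTO interpro_dm VALUES (1, 'IPR_EXAMPLE')")

    # Same shape as the query in the thread: comma join plus an equality
    # condition in WHERE, which behaves as an inner join.
    inner_rows = cur.execute(
        "SELECT main.gene_id, main.transcript_id, dm.interpro_ac "
        "FROM interpro_dm dm, translation_main main "
        "WHERE main.transcript_id = ? "
        "AND main.translation_id_key = dm.translation_id_key",
        ("TRANSCRIPT_NONCODING",),
    ).fetchall()

    # For contrast: a LEFT JOIN keeps the main-table row and fills the
    # dimension column with NULL.
    left_rows = cur.execute(
        "SELECT main.gene_id, main.transcript_id, dm.interpro_ac "
        "FROM translation_main main "
        "LEFT JOIN interpro_dm dm "
        "ON main.translation_id_key = dm.translation_id_key "
        "WHERE main.transcript_id = ?",
        ("TRANSCRIPT_NONCODING",),
    ).fetchall()

    conn.close()
    return inner_rows, left_rows


if __name__ == "__main__":
    inner_rows, left_rows = run_demo()
    print("inner join rows:", inner_rows)
    print("left join rows: ", left_rows)
```

In this sketch the inner join returns no rows for the non-coding transcript, so the gene and transcript ids vanish along with the InterPro column, while the LEFT JOIN returns the ids with an empty InterPro value.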
_______________________________________________
Users mailing list
[email protected]
https://lists.biomart.org/mailman/listinfo/users
