Dear Junjun, I've looked into the request log file and the query is the following:
2011-07-21 10:02:18,842 INFO [4219289@qtp-914691-6:Log.java:164]: Incoming XML query: <!DOCTYPE Query><Query client="biomartclient" processor="TSVX" limit="1000" header="1"><Dataset name="hsapiens_gene_ensembl" config="hsapiens_gene_ensembl_config"><Attribute name="CDKdb_proto__DiffExpression__main__gene_affy_id_101"/></Dataset></Query>
Source: SELECT ckd.CDKdb_proto__DiffExpression__main.gene_affy_id, ckd.CDKdb_proto__DiffExpression__main.gene_affy_id FROM ckd.CDKdb_proto__DiffExpression__main
2011-07-21 10:02:19,260 INFO [pool-1-thread-1:Log.java:164]: using linkindices
2011-07-21 10:02:19,261 INFO [pool-1-thread-1:Log.java:164]: looking for index file in fileSystem (only first time trip): hsapiens_gene_ensembl_hsapiens_gene_ensembl__CDKdb_proto__DiffExpression__main_ckd.txt
2011-07-21 10:02:19,261 INFO [pool-1-thread-1:Log.java:164]: GETWD: /root/Desktop/biomartrc6/dist
2011-07-21 10:02:19,261 INFO [pool-1-thread-1:Log.java:164]: reading indices from fileSystem: hsapiens_gene_ensembl_hsapiens_gene_ensembl__CDKdb_proto__DiffExpression__main_ckd.txt
2011-07-21 10:02:19,262 ERROR [pool-1-thread-1:Log.java:208]: index file not found under: /registry/linkindices/ Index NOT used!
2011-07-21 10:02:32,062 INFO [4219289@qtp-914691-6:Log.java:164]: Total query time is 13080 ms

The first thing that I find strange is that the SELECT contains the ckd.CDKdb_proto__DiffExpression__main.gene_affy_id attribute twice, but the output of the query only shows it once.

As you can see, there is no Ensembl attribute involved in the query, but I don't get all the affy ids stored in the local database (there are 816 affy ids and I get 574). Do you think that the "index file not found" issue could be causing this?

Thanks again!
Isaac

2011/7/21 Junjun Zhang <[email protected]>

> Dear Isaac,
>
> Your use case is well taken here. We have users who are trying to do similar
> things. It's a matter of inner join vs. left join.
> The current behaviour in BioMart is that the joins are inner joins, i.e.,
> only the intersection is returned in the result. It is currently not
> possible to alter this behaviour. We are thinking of introducing a flag in
> the configuration to let the deployer control the join behaviour.
>
> You mentioned that even though you did not include any Ensembl attribute in
> the query, you still get only the intersection. This is strange; is there
> any Ensembl filter used in the query? As you would imagine, if a query only
> involves attributes/filters from one dataset, there shouldn't be any join at
> all. To further diagnose the problem, you may want to look into the log and
> find out how the query is executed.
>
> Let me know what you find.
>
> Cheers,
> Junjun
>
>
> From: Isaac cano <[email protected]>
> Date: Wed, 20 Jul 2011 11:33:04 -0400
> To: BioMart Users <[email protected]>
> Subject: [BioMart Users] Importing from sources
>
> Dear BioMart users,
>
> I'm using BioMart to annotate a local database by importing several
> attributes from the Ensembl mart (homo sapiens). I've now created a
> configuration for the Ensembl source and imported the local attributes into
> it.
>
> The general point is that we would like to retrieve all the data stored in
> the local database plus other attributes from different sources like Ensembl
> where they exist. Where they do not, we would still like to retrieve the
> local data and get "no data" in those Ensembl attributes that do not exist
> for the selected local attributes. Is this possible?
>
> The issue is that when I query only the local attributes (in this case the
> "affy id" attribute), the results only contain those "affy ids" that are
> also present in Ensembl. I would agree with this result if I had included
> attributes from the Ensembl mart in the query, but this is not the case. So
> the question is the following: Is BioMart by default making the intersection
> between the two sources when creating a link between them?
> Is there a way to get the union instead of the intersection?
>
> Thanks in advance,
>
> --
> Isaac Cano
> Bioinformatics
> Linkcare Health Services SL
> C/Villarroel 170
> 08036 - Barcelona
> Tel.: (+34)932 275 400, ext. 4182\4523
> Mobile: (+34) 666 186 748
> Fax: (+34) 932 275 455
> [email protected]
>
>

--
Isaac Cano
Bioinformatics
Linkcare Health Services SL
C/Villarroel 170
08036 - Barcelona
Tel.: (+34)932 275 400, ext. 4182\4523
Mobile: (+34) 666 186 748
Fax: (+34) 932 275 455
[email protected]
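
For anyone following the inner join versus left join point raised in the quoted reply, here is a minimal sketch of the difference using Python's built-in sqlite3 module. The table names (local_affy, ensembl_affy), the ids, and the gene names are all invented for illustration; this does not show how BioMart builds its queries internally, only why an inner join returns the intersection while a left join keeps every local row.

```python
import sqlite3


def compare_joins():
    """Return (inner_rows, left_rows) for two tiny hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Hypothetical stand-ins for the local table and the Ensembl-side table.
    cur.execute("CREATE TABLE local_affy (affy_id TEXT)")
    cur.execute("CREATE TABLE ensembl_affy (affy_id TEXT, gene_name TEXT)")
    cur.executemany(
        "INSERT INTO local_affy VALUES (?)",
        [("A1",), ("A2",), ("A3",)],
    )
    # "A2" is deliberately missing on the Ensembl side.
    cur.executemany(
        "INSERT INTO ensembl_affy VALUES (?, ?)",
        [("A1", "GENE1"), ("A3", "GENE3")],
    )

    # Inner join: only ids present in BOTH tables survive.
    inner_rows = cur.execute(
        "SELECT l.affy_id, e.gene_name "
        "FROM local_affy l INNER JOIN ensembl_affy e ON l.affy_id = e.affy_id "
        "ORDER BY l.affy_id"
    ).fetchall()

    # Left join: every local id is kept; missing matches come back as NULL.
    left_rows = cur.execute(
        "SELECT l.affy_id, e.gene_name "
        "FROM local_affy l LEFT JOIN ensembl_affy e ON l.affy_id = e.affy_id "
        "ORDER BY l.affy_id"
    ).fetchall()

    conn.close()
    return inner_rows, left_rows


if __name__ == "__main__":
    inner_rows, left_rows = compare_joins()
    print("inner join:", inner_rows)
    print("left join: ", left_rows)
```

In this toy data the inner join drops "A2" because it has no Ensembl-side match, while the left join keeps it with an empty gene name, which corresponds to the "no data" behaviour asked about in the original message.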
_______________________________________________
Users mailing list
[email protected]
https://lists.biomart.org/mailman/listinfo/users
