Dear Junjun,

I've looked into the request log file and the query is the following:

2011-07-21 10:02:18,842 INFO  [4219289@qtp-914691-6:Log.java:164]: Incoming
XML query: <!DOCTYPE Query><Query client="biomartclient" processor="TSVX"
limit="1000" header="1"><Dataset name="hsapiens_gene_ensembl"
config="hsapiens_gene_ensembl_config"><Attribute
name="CDKdb_proto__DiffExpression__main__gene_affy_id_101"/></Dataset></Query>
Source: SELECT *ckd.CDKdb_proto__DiffExpression__main.gene_affy_id*, *
ckd.CDKdb_proto__DiffExpression__main.gene_affy_id* FROM
ckd.CDKdb_proto__DiffExpression__main
2011-07-21 10:02:19,260 INFO  [pool-1-thread-1:Log.java:164]: using
linkindices
2011-07-21 10:02:19,261 INFO  [pool-1-thread-1:Log.java:164]: looking for
index file in fileSystem (only first time trip):
hsapiens_gene_ensembl_hsapiens_gene_ensembl__CDKdb_proto__DiffExpression__main_ckd.txt
2011-07-21 10:02:19,261 INFO  [pool-1-thread-1:Log.java:164]: GETWD:
/root/Desktop/biomartrc6/dist
2011-07-21 10:02:19,261 INFO  [pool-1-thread-1:Log.java:164]: reading
indices from fileSystem:
hsapiens_gene_ensembl_hsapiens_gene_ensembl__CDKdb_proto__DiffExpression__main_ckd.txt
2011-07-21 10:02:19,262 ERROR [pool-1-thread-1:Log.java:208]:* index file
not found* under: /registry/linkindices/
Index NOT used!
2011-07-21 10:02:32,062 INFO  [4219289@qtp-914691-6:Log.java:164]: Total
query time is 13080 ms

The first thing that I find strange is that the SELECT contains the
*ckd.CDKdb_proto__DiffExpression__main.gene_affy_id* attribute twice, but
the output of the query only shows it once. As you can see, there is no
Ensembl attribute involved in the query, yet I don't get all the affy ids
stored in the local database (there are 816 affy ids and I get 574). Do you
think that the "index file not found" error could be causing this?
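
To make concrete what I suspect is happening, here is a minimal sketch in
Python with sqlite3. It uses made-up toy data, and the table and column
names (local_main, ensembl_main, affy_id, ensembl_gene_id) are hypothetical,
not the actual BioMart schema or the SQL that BioMart generates. It only
illustrates how an inner join against a second source would drop local ids
that have no match, whereas a left join would keep them with an empty value:

```python
import sqlite3

# Toy data only: hypothetical table/column names, not the BioMart schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE local_main (gene_affy_id TEXT);
    CREATE TABLE ensembl_main (affy_id TEXT, ensembl_gene_id TEXT);
    INSERT INTO local_main VALUES ('A1'), ('A2'), ('A3');
    INSERT INTO ensembl_main VALUES ('A1', 'G1'), ('A3', 'G3');
""")

# Inner join: only ids present in BOTH tables survive (the intersection).
inner_rows = conn.execute("""
    SELECT l.gene_affy_id, e.ensembl_gene_id
    FROM local_main l
    INNER JOIN ensembl_main e ON e.affy_id = l.gene_affy_id
    ORDER BY l.gene_affy_id
""").fetchall()

# Left join: every local id is kept; unmatched ones get NULL (None in Python).
left_rows = conn.execute("""
    SELECT l.gene_affy_id, e.ensembl_gene_id
    FROM local_main l
    LEFT JOIN ensembl_main e ON e.affy_id = l.gene_affy_id
    ORDER BY l.gene_affy_id
""").fetchall()

print("inner:", inner_rows)
print("left: ", left_rows)
```

In this toy case the inner join returns two of the three local ids, while
the left join returns all three. If something similar is applied to my
query even though no Ensembl attribute is selected, it could account for
the missing ids, but that is only a guess on my side.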

Thanks again!

Isaac

2011/7/21 Junjun Zhang <[email protected]>

> Dear Isaac,
>
> Your use case is well taken here. We have other users trying to do similar
> things. It's a matter of inner join vs. left join. The current behaviour in
> BioMart is that the joins are inner joins, i.e., only the intersection is
> returned in the result. It is currently not possible to alter this
> behaviour. We are thinking of introducing a flag in the configuration to let
> the deployer control the join behaviour.
>
> You mentioned that even though you did not include any Ensembl attribute in
> the query, you still get only the intersection. This is strange. Is there
> any Ensembl filter used in the query? As you would imagine, if a query only
> involves attributes/filters from one dataset, there shouldn't be any join at
> all. To further diagnose the problem, you may want to look into the log and
> find out how the query is executed.
>
> Let me know what you find.
>
> Cheers,
> Junjun
>
>
> From: Isaac cano <[email protected]>
> Date: Wed, 20 Jul 2011 11:33:04 -0400
> To: BioMart Users <[email protected]>
> Subject: [BioMart Users] Importing from sources
>
> Dear BioMart users,
>
> I'm dealing with BioMart to annotate a local database by importing several
> attributes from the Ensembl mart (homo sapiens). Now I've created a
> configuration for the Ensembl source and imported the local attributes to
> it.
>
> The general point is that we would like to retrieve all the data stored in
> the local database plus other attributes from different sources like
> Ensembl where they exist. Where they do not, we would still like to retrieve
> the local data and get "no data" in those Ensembl attributes that do not
> exist for the selected local attributes. Is this possible?
>
> The issue is that when I query only the local attributes (in this case the
> "affy id" attribute), the results only contain those "affy ids" that are
> also present in Ensembl. I would agree with this result if I had included
> attributes from the Ensembl mart in the query, but this is not the case. So
> the question is the following: does BioMart by default take the intersection
> between the two sources when creating a link between them? Is there a way to
> get the union instead of the intersection?
>
> Thanks in advance,
>
> --
> Isaac Cano
> Bioinformatics
> Linkcare Health Services SL
> C/Villarroel 170
> 08036 - Barcelona
> Tel.: (+34)932 275 400, ext. 4182\4523
> Mobile: (+34) 666 186 748
> Fax: (+34) 932 275 455
> [email protected]
>
>


-- 
Isaac Cano
Bioinformatics
Linkcare Health Services SL
C/Villarroel 170
08036 - Barcelona
Tel.: (+34)932 275 400, ext. 4182\4523
Mobile: (+34) 666 186 748
Fax: (+34) 932 275 455
[email protected]