Hi Daz, The inconsistent results might be caused by batching. You mentioned escell_clone are unique in your database, query through batching on certain RDBMS platform might result in rows being skipped or being duplicated.
Find out possible duplicates or missing rows in the output files should help figuring out the problem. sort result1.txt |uniq -d ### this will give you possible duplicates You can use sort and diff to find out possible missing items. BTW, which RDBMS platform are you using? Hope this helps, Junjun On 11-02-02 4:10 AM, "Darren Oakley" <[email protected]> wrote: Hi, Sorry for the slow response on this one - the help is much appreciated. :) Here's the output from the logs for the three separate queries. It all looks sensible to me... Cheers, Daz ---- KOMP-CSD Only: 2011/02/02 09:07:21 martservice:1016:WARN> RECEIVED QUERY:<?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE Query> <Query formatter="TSV" header="0" virtualSchemaName="default" count="" uniqueRows="1" datasetConfigVersion="0.6"> <Dataset name="idcc_targ_rep" interface="default"> <Filter value="KOMP-CSD" name="pipeline"/> <Attribute name="escell_clone"/> </Dataset> </Query> 2011/02/02 09:07:21 BioMart.QueryRunner:163:WARN> ATTRIBUTE: idcc_targ_rep escell_clone main 2011/02/02 09:07:21 BioMart.QueryRunner:174:WARN> FILTER TABLE: idcc_targ_rep pipeline main 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 200 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 200,200 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 400,400 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 800,800 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 1600,1600 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 3200,3200 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 6400,6400 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 12800,12800 2011/02/02 09:07:22 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 25600,25600 2011/02/02 09:07:23 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 51200,50000 ---- EUCOMM Only: 2011/02/02 09:07:29 martservice:1016:WARN> RECEIVED QUERY:<?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE Query> <Query formatter="TSV" header="0" virtualSchemaName="default" count="" uniqueRows="1" datasetConfigVersion="0.6"> <Dataset name="idcc_targ_rep" interface="default"> <Filter value="EUCOMM" name="pipeline"/> <Attribute name="escell_clone"/> </Dataset> </Query> 2011/02/02 09:07:29 BioMart.QueryRunner:163:WARN> ATTRIBUTE: idcc_targ_rep escell_clone main 2011/02/02 09:07:29 BioMart.QueryRunner:174:WARN> FILTER TABLE: idcc_targ_rep pipeline main 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 200 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 200,200 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 400,400 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 800,800 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 1600,1600 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 3200,3200 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 6400,6400 2011/02/02 09:07:30 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 12800,12800 2011/02/02 09:07:30 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 25600,25600 2011/02/02 09:07:31 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 51200,50000 2011/02/02 09:07:33 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 101200,50000 2011/02/02 09:07:34 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 117710,50000 ---- Both projects together: 2011/02/02 08:59:14 martservice:1016:WARN> RECEIVED QUERY:<?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE Query> <Query formatter="TSV" header="0" virtualSchemaName="default" count="" uniqueRows="1" datasetConfigVersion="0.6"> <Dataset name="idcc_targ_rep" interface="default"> <Filter value="KOMP-CSD,EUCOMM" name="pipeline"/> <Attribute name="escell_clone"/> </Dataset> </Query> 2011/02/02 08:59:14 BioMart.QueryRunner:163:WARN> ATTRIBUTE: idcc_targ_rep escell_clone main 2011/02/02 08:59:14 BioMart.QueryRunner:174:WARN> FILTER TABLE: idcc_targ_rep pipeline main 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 200 2011/02/02 08:59:14 BioMart.Configuration.URLLocation:100:WARN> RESPONSE: 0.6 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 200,200 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 400,400 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 800,800 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 1600,1600 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 3200,3200 2011/02/02 08:59:15 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 6400,6400 2011/02/02 08:59:15 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 12800,12800 2011/02/02 08:59:16 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 25600,25600 2011/02/02 08:59:17 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 51200,50000 2011/02/02 08:59:20 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 101200,50000 2011/02/02 08:59:24 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 151200,50000 2011/02/02 08:59:27 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 201200,50000 2011/02/02 08:59:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 218452,50000 ----- On 2 Feb 2011, at 00:23, Syed Haider wrote: > Hi Daz, > > Please turn on the logging in log file: > > conf/log4perl.conf > > by setting: > > log4perl.logger = DEBUG,Screen > > Please check if the SQL compilation is as per your expectation. That will > give us more pointers as in how to debug it further. > > Best, > Syed > > On 31/01/2011 11:19, Darren Oakley wrote: >> Hi All, >> >> Hope you can help me with this one... >> >> I have a filter on one of our marts >> (www.knockoutmouse.org<http://www.knockoutmouse.org/>, idcc_targ_rep >> dataset) configured as a multi-select like so: >> >> --- >> <FilterCollection displayName="Pipeline" internalName="pipeline"> >> <FilterDescription displayName="Pipeline" displayType="list" >> field="pipeline" internalName="pipeline" key="id_1022_key" >> legal_qualifiers="=" multipleValues="1" qualifier="=" style="menu" >> tableConstraint="main" type="list"> >> <Option displayName="KOMP-CSD" internalName="KOMP-CSD" >> isSelectable="true" value="KOMP-CSD"/> >> <Option displayName="KOMP-Regeneron" internalName="KOMP-Regeneron" >> isSelectable="true" value="KOMP-Regeneron"/> >> <Option displayName="EUCOMM" internalName="EUCOMM" >> isSelectable="true" value="EUCOMM"/> >> <Option displayName="NorCOMM" internalName="NorCOMM" >> isSelectable="true" value="NorCOMM"/> >> <Option displayName="mirKO" internalName="mirKO" isSelectable="true" >> value="mirKO"/> >> </FilterDescription> >> </FilterCollection> >> --- >> >> But, it's producing unexpected results when you query for either of the >> pipelines separately, or together... >> >> i.e. the following xml: >> >> --- >> <?xml version="1.0" encoding="UTF-8"?> >> <!DOCTYPE Query> >> <Query virtualSchemaName = "default" formatter = "CSV" header = "0" >> uniqueRows = "1" count = "" datasetConfigVersion = "0.6"> >> >> <Dataset name = "idcc_targ_rep" interface = "default"> >> <Filter name = "pipeline" value = "KOMP-CSD,EUCOMM"/> >> <Attribute name = "escell_clone" /> >> </Dataset> >> </Query> >> --- >> >> retrieves 129816 rows of data, but when you do the different pipelines >> separately, they do not add up to this number... >> >> KOMP-CSD on its own is 59440 >> EUCOMM on its own is 76076 >> >> So, combined they give 135516 rows of results - 5700 more than the first >> query! >> >> The attribute i'm selecting ('escell_clone') is unique for every row in the >> database, so it's not that i'm getting duplicates for the separate queries. >> >> Any ideas as to why i'm getting this shortfall in the first query? Is there >> something wrong with the way I set up the filter? >> >> Any help is much appreciated. :) >> >> Thanks, >> >> Daz >>
_______________________________________________ Users mailing list [email protected] https://lists.biomart.org/mailman/listinfo/users
