Hi Junjun, Okey doke, will run through the files as suggested. Back with some results soon. :)
It's running on MySQL 5.1.39. Cheers, Daz On 2 Feb 2011, at 14:30, Junjun Zhang wrote: > Hi Daz, > > The inconsistent results might be caused by batching. You mentioned > escell_clone are unique in your database, query through batching on certain > RDBMS platform might result in rows being skipped or being duplicated. > > Find out possible duplicates or missing rows in the output files should help > figuring out the problem. > > sort result1.txt |uniq -d ### this will give you possible duplicates > > You can use sort and diff to find out possible missing items. > > BTW, which RDBMS platform are you using? > > Hope this helps, > > Junjun > > > On 11-02-02 4:10 AM, "Darren Oakley" <[email protected]> wrote: > > Hi, > > Sorry for the slow response on this one - the help is much appreciated. :) > > Here's the output from the logs for the three separate queries. It all looks > sensible to me... > > Cheers, > > Daz > > ---- > > KOMP-CSD Only: > > 2011/02/02 09:07:21 martservice:1016:WARN> RECEIVED QUERY:<?xml version="1.0" > encoding="UTF-8"?> > <!DOCTYPE Query> > <Query formatter="TSV" header="0" virtualSchemaName="default" count="" > uniqueRows="1" datasetConfigVersion="0.6"> > <Dataset name="idcc_targ_rep" interface="default"> > <Filter value="KOMP-CSD" name="pipeline"/> > <Attribute name="escell_clone"/> > </Dataset> > </Query> > > 2011/02/02 09:07:21 BioMart.QueryRunner:163:WARN> ATTRIBUTE: idcc_targ_rep > escell_clone main > 2011/02/02 09:07:21 BioMart.QueryRunner:174:WARN> FILTER TABLE: idcc_targ_rep > pipeline main > 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline = 'KOMP-CSD') LIMIT 200 > 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline = 'KOMP-CSD') LIMIT 200,200 > 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline = 'KOMP-CSD') LIMIT 400,400 > 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline = 'KOMP-CSD') LIMIT 800,800 > 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline = 'KOMP-CSD') LIMIT 1600,1600 > 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline = 'KOMP-CSD') LIMIT 3200,3200 > 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline = 'KOMP-CSD') LIMIT 6400,6400 > 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline = 'KOMP-CSD') LIMIT 12800,12800 > 2011/02/02 09:07:22 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline = 'KOMP-CSD') LIMIT 25600,25600 > 2011/02/02 09:07:23 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline = 'KOMP-CSD') LIMIT 51200,50000 > > ---- > > EUCOMM Only: > > 2011/02/02 09:07:29 martservice:1016:WARN> RECEIVED QUERY:<?xml version="1.0" > encoding="UTF-8"?> > <!DOCTYPE Query> > <Query formatter="TSV" header="0" virtualSchemaName="default" count="" > uniqueRows="1" datasetConfigVersion="0.6"> > <Dataset name="idcc_targ_rep" interface="default"> > <Filter value="EUCOMM" name="pipeline"/> > <Attribute name="escell_clone"/> > </Dataset> > </Query> > > 2011/02/02 09:07:29 BioMart.QueryRunner:163:WARN> ATTRIBUTE: idcc_targ_rep > escell_clone main > 2011/02/02 09:07:29 BioMart.QueryRunner:174:WARN> FILTER TABLE: idcc_targ_rep > pipeline main > 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline = 'EUCOMM') LIMIT 200 > 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline = 'EUCOMM') LIMIT 200,200 > 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline = 'EUCOMM') LIMIT 400,400 > 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline = 'EUCOMM') LIMIT 800,800 > 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline = 'EUCOMM') LIMIT 1600,1600 > 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline = 'EUCOMM') LIMIT 3200,3200 > 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline = 'EUCOMM') LIMIT 6400,6400 > 2011/02/02 09:07:30 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline = 'EUCOMM') LIMIT 12800,12800 > 2011/02/02 09:07:30 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline = 'EUCOMM') LIMIT 25600,25600 > 2011/02/02 09:07:31 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline = 'EUCOMM') LIMIT 51200,50000 > 2011/02/02 09:07:33 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline = 'EUCOMM') LIMIT 101200,50000 > 2011/02/02 09:07:34 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline = 'EUCOMM') LIMIT 117710,50000 > > ---- > > Both projects together: > > 2011/02/02 08:59:14 martservice:1016:WARN> RECEIVED QUERY:<?xml version="1.0" > encoding="UTF-8"?> > <!DOCTYPE Query> > <Query formatter="TSV" header="0" virtualSchemaName="default" count="" > uniqueRows="1" datasetConfigVersion="0.6"> > <Dataset name="idcc_targ_rep" interface="default"> > <Filter value="KOMP-CSD,EUCOMM" name="pipeline"/> > <Attribute name="escell_clone"/> > </Dataset> > </Query> > > 2011/02/02 08:59:14 BioMart.QueryRunner:163:WARN> ATTRIBUTE: idcc_targ_rep > escell_clone main > 2011/02/02 08:59:14 BioMart.QueryRunner:174:WARN> FILTER TABLE: idcc_targ_rep > pipeline main > 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 200 > 2011/02/02 08:59:14 BioMart.Configuration.URLLocation:100:WARN> RESPONSE: 0.6 > 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 200,200 > 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 400,400 > 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 800,800 > 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 1600,1600 > 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 3200,3200 > 2011/02/02 08:59:15 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 6400,6400 > 2011/02/02 08:59:15 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 12800,12800 > 2011/02/02 08:59:16 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 25600,25600 > 2011/02/02 08:59:17 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 51200,50000 > 2011/02/02 08:59:20 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 101200,50000 > 2011/02/02 08:59:24 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 151200,50000 > 2011/02/02 08:59:27 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 201200,50000 > 2011/02/02 08:59:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT > main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main > WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 218452,50000 > > ----- > > On 2 Feb 2011, at 00:23, Syed Haider wrote: > > > Hi Daz, > > > > Please turn on the logging in log file: > > > > conf/log4perl.conf > > > > by setting: > > > > log4perl.logger = DEBUG,Screen > > > > Please check if the SQL compilation is as per your expectation. That will > > give us more pointers as in how to debug it further. > > > > Best, > > Syed > > > > On 31/01/2011 11:19, Darren Oakley wrote: > >> Hi All, > >> > >> Hope you can help me with this one... > >> > >> I have a filter on one of our marts > >> (www.knockoutmouse.org<http://www.knockoutmouse.org/>, idcc_targ_rep > >> dataset) configured as a multi-select like so: > >> > >> --- > >> <FilterCollection displayName="Pipeline" internalName="pipeline"> > >> <FilterDescription displayName="Pipeline" displayType="list" > >> field="pipeline" internalName="pipeline" key="id_1022_key" > >> legal_qualifiers="=" multipleValues="1" qualifier="=" style="menu" > >> tableConstraint="main" type="list"> > >> <Option displayName="KOMP-CSD" internalName="KOMP-CSD" > >> isSelectable="true" value="KOMP-CSD"/> > >> <Option displayName="KOMP-Regeneron" internalName="KOMP-Regeneron" > >> isSelectable="true" value="KOMP-Regeneron"/> > >> <Option displayName="EUCOMM" internalName="EUCOMM" > >> isSelectable="true" value="EUCOMM"/> > >> <Option displayName="NorCOMM" internalName="NorCOMM" > >> isSelectable="true" value="NorCOMM"/> > >> <Option displayName="mirKO" internalName="mirKO" > >> isSelectable="true" value="mirKO"/> > >> </FilterDescription> > >> </FilterCollection> > >> --- > >> > >> But, it's producing unexpected results when you query for either of the > >> pipelines separately, or together... > >> > >> i.e. the following xml: > >> > >> --- > >> <?xml version="1.0" encoding="UTF-8"?> > >> <!DOCTYPE Query> > >> <Query virtualSchemaName = "default" formatter = "CSV" header = "0" > >> uniqueRows = "1" count = "" datasetConfigVersion = "0.6"> > >> > >> <Dataset name = "idcc_targ_rep" interface = "default"> > >> <Filter name = "pipeline" value = "KOMP-CSD,EUCOMM"/> > >> <Attribute name = "escell_clone" /> > >> </Dataset> > >> </Query> > >> --- > >> > >> retrieves 129816 rows of data, but when you do the different pipelines > >> separately, they do not add up to this number... > >> > >> KOMP-CSD on its own is 59440 > >> EUCOMM on its own is 76076 > >> > >> So, combined they give 135516 rows of results - 5700 more than the first > >> query! > >> > >> The attribute i'm selecting ('escell_clone') is unique for every row in > >> the database, so it's not that i'm getting duplicates for the separate > >> queries. > >> > >> Any ideas as to why i'm getting this shortfall in the first query? Is > >> there something wrong with the way I set up the filter? > >> > >> Any help is much appreciated. :) > >> > >> Thanks, > >> > >> Daz > >> > >
smime.p7s
Description: S/MIME cryptographic signature
_______________________________________________ Users mailing list [email protected] https://lists.biomart.org/mailman/listinfo/users
