Hi Junjun,

Okey doke, will run through the files as suggested.  Back with some results 
soon. :)

It's running on MySQL 5.1.39.

Cheers,

Daz

On 2 Feb 2011, at 14:30, Junjun Zhang wrote:

> Hi Daz,
> 
> The inconsistent results might be caused by batching. You mentioned 
> escell_clone are unique in your database, query through batching on certain 
> RDBMS platform might result in rows being skipped or being duplicated.
> 
> Find out possible duplicates or missing rows in the output files should help 
> figuring out the problem.
> 
> sort result1.txt |uniq -d  ### this will give you possible duplicates
> 
> You can use sort and diff to find out possible missing items.
> 
> BTW, which RDBMS platform are you using?
> 
> Hope this helps,
> 
> Junjun
> 
> 
> On 11-02-02 4:10 AM, "Darren Oakley" <[email protected]> wrote:
> 
> Hi,
> 
> Sorry for the slow response on this one - the help is much appreciated. :)
> 
> Here's the output from the logs for the three separate queries.  It all looks 
> sensible to me...
> 
> Cheers,
> 
> Daz
> 
> ----
> 
> KOMP-CSD Only:
> 
> 2011/02/02 09:07:21 martservice:1016:WARN> RECEIVED QUERY:<?xml version="1.0" 
> encoding="UTF-8"?>
> <!DOCTYPE Query>
> <Query formatter="TSV" header="0" virtualSchemaName="default" count="" 
> uniqueRows="1" datasetConfigVersion="0.6">
>   <Dataset name="idcc_targ_rep" interface="default">
>     <Filter value="KOMP-CSD" name="pipeline"/>
>     <Attribute name="escell_clone"/>
>   </Dataset>
> </Query>
> 
> 2011/02/02 09:07:21 BioMart.QueryRunner:163:WARN> ATTRIBUTE: idcc_targ_rep    
>   escell_clone    main
> 2011/02/02 09:07:21 BioMart.QueryRunner:174:WARN> FILTER TABLE: idcc_targ_rep 
>   pipeline        main
> 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline = 'KOMP-CSD') LIMIT 200
> 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline = 'KOMP-CSD') LIMIT 200,200
> 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline = 'KOMP-CSD') LIMIT 400,400
> 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline = 'KOMP-CSD') LIMIT 800,800
> 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline = 'KOMP-CSD') LIMIT 1600,1600
> 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline = 'KOMP-CSD') LIMIT 3200,3200
> 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline = 'KOMP-CSD') LIMIT 6400,6400
> 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline = 'KOMP-CSD') LIMIT 12800,12800
> 2011/02/02 09:07:22 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline = 'KOMP-CSD') LIMIT 25600,25600
> 2011/02/02 09:07:23 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline = 'KOMP-CSD') LIMIT 51200,50000
> 
> ----
> 
> EUCOMM Only:
> 
> 2011/02/02 09:07:29 martservice:1016:WARN> RECEIVED QUERY:<?xml version="1.0" 
> encoding="UTF-8"?>
> <!DOCTYPE Query>
> <Query formatter="TSV" header="0" virtualSchemaName="default" count="" 
> uniqueRows="1" datasetConfigVersion="0.6">
>   <Dataset name="idcc_targ_rep" interface="default">
>     <Filter value="EUCOMM" name="pipeline"/>
>     <Attribute name="escell_clone"/>
>   </Dataset>
> </Query>
> 
> 2011/02/02 09:07:29 BioMart.QueryRunner:163:WARN> ATTRIBUTE: idcc_targ_rep    
>   escell_clone    main
> 2011/02/02 09:07:29 BioMart.QueryRunner:174:WARN> FILTER TABLE: idcc_targ_rep 
>   pipeline        main
> 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline = 'EUCOMM') LIMIT 200
> 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline = 'EUCOMM') LIMIT 200,200
> 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline = 'EUCOMM') LIMIT 400,400
> 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline = 'EUCOMM') LIMIT 800,800
> 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline = 'EUCOMM') LIMIT 1600,1600
> 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline = 'EUCOMM') LIMIT 3200,3200
> 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline = 'EUCOMM') LIMIT 6400,6400
> 2011/02/02 09:07:30 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline = 'EUCOMM') LIMIT 12800,12800
> 2011/02/02 09:07:30 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline = 'EUCOMM') LIMIT 25600,25600
> 2011/02/02 09:07:31 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline = 'EUCOMM') LIMIT 51200,50000
> 2011/02/02 09:07:33 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline = 'EUCOMM') LIMIT 101200,50000
> 2011/02/02 09:07:34 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline = 'EUCOMM') LIMIT 117710,50000
> 
> ----
> 
> Both projects together:
> 
> 2011/02/02 08:59:14 martservice:1016:WARN> RECEIVED QUERY:<?xml version="1.0" 
> encoding="UTF-8"?>
> <!DOCTYPE Query>
> <Query formatter="TSV" header="0" virtualSchemaName="default" count="" 
> uniqueRows="1" datasetConfigVersion="0.6">
>   <Dataset name="idcc_targ_rep" interface="default">
>     <Filter value="KOMP-CSD,EUCOMM" name="pipeline"/>
>     <Attribute name="escell_clone"/>
>   </Dataset>
> </Query>
> 
> 2011/02/02 08:59:14 BioMart.QueryRunner:163:WARN> ATTRIBUTE: idcc_targ_rep    
>   escell_clone    main
> 2011/02/02 08:59:14 BioMart.QueryRunner:174:WARN> FILTER TABLE: idcc_targ_rep 
>   pipeline        main
> 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 200
> 2011/02/02 08:59:14 BioMart.Configuration.URLLocation:100:WARN> RESPONSE:  0.6
> 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 200,200
> 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 400,400
> 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 800,800
> 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 1600,1600
> 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 3200,3200
> 2011/02/02 08:59:15 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 6400,6400
> 2011/02/02 08:59:15 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 12800,12800
> 2011/02/02 08:59:16 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 25600,25600
> 2011/02/02 08:59:17 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 51200,50000
> 2011/02/02 08:59:20 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 101200,50000
> 2011/02/02 08:59:24 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 151200,50000
> 2011/02/02 08:59:27 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 201200,50000
> 2011/02/02 08:59:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT 
> main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main 
> WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 218452,50000
> 
> -----
> 
> On 2 Feb 2011, at 00:23, Syed Haider wrote:
> 
> > Hi Daz,
> >
> > Please turn on the logging in log file:
> >
> > conf/log4perl.conf
> >
> > by setting:
> >
> > log4perl.logger = DEBUG,Screen
> >
> > Please check if the SQL compilation is as per your expectation. That will 
> > give us more pointers as in how to debug it further.
> >
> > Best,
> > Syed
> >
> > On 31/01/2011 11:19, Darren Oakley wrote:
> >> Hi All,
> >>
> >> Hope you can help me with this one...
> >>
> >> I have a filter on one of our marts 
> >> (www.knockoutmouse.org<http://www.knockoutmouse.org/>, idcc_targ_rep 
> >> dataset) configured as a multi-select like so:
> >>
> >> ---
> >>     <FilterCollection displayName="Pipeline" internalName="pipeline">
> >>       <FilterDescription displayName="Pipeline" displayType="list" 
> >> field="pipeline" internalName="pipeline" key="id_1022_key" 
> >> legal_qualifiers="=" multipleValues="1" qualifier="=" style="menu" 
> >> tableConstraint="main" type="list">
> >>         <Option displayName="KOMP-CSD" internalName="KOMP-CSD" 
> >> isSelectable="true" value="KOMP-CSD"/>
> >>         <Option displayName="KOMP-Regeneron" internalName="KOMP-Regeneron" 
> >> isSelectable="true" value="KOMP-Regeneron"/>
> >>         <Option displayName="EUCOMM" internalName="EUCOMM" 
> >> isSelectable="true" value="EUCOMM"/>
> >>         <Option displayName="NorCOMM" internalName="NorCOMM" 
> >> isSelectable="true" value="NorCOMM"/>
> >>         <Option displayName="mirKO" internalName="mirKO" 
> >> isSelectable="true" value="mirKO"/>
> >>       </FilterDescription>
> >>     </FilterCollection>
> >> ---
> >>
> >> But, it's producing unexpected results when you query for either of the 
> >> pipelines separately, or together...
> >>
> >> i.e. the following xml:
> >>
> >> ---
> >> <?xml version="1.0" encoding="UTF-8"?>
> >> <!DOCTYPE Query>
> >> <Query  virtualSchemaName = "default" formatter = "CSV" header = "0" 
> >> uniqueRows = "1" count = "" datasetConfigVersion = "0.6">
> >>
> >>   <Dataset name = "idcc_targ_rep" interface = "default">
> >>       <Filter name = "pipeline" value = "KOMP-CSD,EUCOMM"/>
> >>       <Attribute name = "escell_clone" />
> >>   </Dataset>
> >> </Query>
> >> ---
> >>
> >> retrieves 129816 rows of data, but when you do the different pipelines 
> >> separately, they do not add up to this number...
> >>
> >> KOMP-CSD on its own is 59440
> >> EUCOMM on its own is 76076
> >>
> >> So, combined they give 135516 rows of results - 5700 more than the first 
> >> query!
> >>
> >> The attribute i'm selecting ('escell_clone') is unique for every row in 
> >> the database, so it's not that i'm getting duplicates for the separate 
> >> queries.
> >>
> >> Any ideas as to why i'm getting this shortfall in the first query?  Is 
> >> there something wrong with the way I set up the filter?
> >>
> >> Any help is much appreciated. :)
> >>
> >> Thanks,
> >>
> >> Daz
> >>
> 
> 

Attachment: smime.p7s
Description: S/MIME cryptographic signature

_______________________________________________
Users mailing list
[email protected]
https://lists.biomart.org/mailman/listinfo/users

Reply via email to