I have two tables as follows: CREATE TABLE ROT_DIM_CONTRACT_LINE (ContractNo Integer,ContractLine Integer,StartDate Integer,EndDate Integer,NCRAmt Float,Item Integer,BusinessArea Text,ProductGroup Text,ProductStyle Text,Site Text,Customer Integer,PrincipalContractNo Text,SASContract Text,ContractMonths Float,StartMonths Float,FullMonths Float,EndMonths Float,MonthlyAmortAmt Float,FirstAmortAmt Float,LastAmort Float,BalancingAmortAmt Float,RolloutToContractDiff Float, PRIMARY KEY (ContractNo ASC,ContractLine ASC)) -CREATE UNIQUE INDEX IDX_ROT_DIM_CONTRACT_LINE_1 ON ROT_DIM_CONTRACT_LINE (ContractNo ASC,ContractLine ASC)
CREATE TABLE ROT_FACT_REV_ROLLOUT (Period_ID Integer,ContractNo Integer,ContractLine Integer,Revenue_Amount Float, PRIMARY KEY (Period_ID ASC,ContractNo ASC,ContractLine ASC)) CREATE UNIQUE INDEX IDX_ROT_FACT_REV_ROLLOUT_1 ON ROT_FACT_REV_ROLLOUT (Period_ID ASC,ContractNo ASC,ContractLine ASC) ROT_DIM_CONTRACT_LINE has 131,747 records ROT_FACT_REV_ROLLOUT has 3,971,369 records The process I am doing is two fold: 1) Joining the data for complete list 2) Splitting the data by date (Period_ID) into two catagories based on a threshold date. so if my threshold is Sep 2008 (200809) I want all records after Sep 2008 to be displayed AND I want a total column for all records prior to Sep 2008. No case statements in SQLite so two subqueries does the job: > SEP 2008 select * from (SELECT cl.Customer as Customer, cl.PrincipalContractNo as PrincipalContractNo, SASContract as SASContractNo, cl.BusinessArea, cl.ProductGroup, cl.Site, rr.period_id as period, sum(cl.NCRAmt) as NCRAmt, sum(rr.revenue_amount) as RevenueAmt FROM ROT_DIM_CONTRACT_LINE cl join ROT_FACT_REV_ROLLOUT rr on (cl.ContractNo = rr.ContractNo and cl.ContractLine = rr.ContractLine) where rr.period_id > 200809 group by cl.Customer, PrincipalContractNo, SASContract, cl.BusinessArea, cl.ProductGroup, cl.Site, rr.period_id <= SEP 2008 SELECT cl.Customer as Customer, cl.PrincipalContractNo as PrincipalContractNo, SASContract as SASContractNo, cl.BusinessArea, cl.ProductGroup, cl.Site, sum(cl.NCRAmt) as NCRAmt FROM ROT_DIM_CONTRACT_LINE cl join ROT_FACT_REV_ROLLOUT rr on (cl.ContractNo = rr.ContractNo and cl.ContractLine = rr.ContractLine) where rr.period_id <= 200809 group by cl.Customer, PrincipalContractNo, SASContract, cl.BusinessArea, cl.ProductGroup, cl.Site) ncritd on (fut.Customer = ncritd.Customer and fut.PrincipalContractNo = ncritd.PrincipalContractNo and fut.SASContractNo = ncritd.SASContractNo and fut.BusinessArea = ncritd.BusinessArea and fut.ProductGroup = ncritd.ProductGroup and fut.Site = ncritd.Site) order by fut.Customer, fut.PrincipalContractNo, fut.SASContractNo, fut.BusinessArea, fut.ProductGroup, fut.Site, fut.Period Ignoring joining for the moment. Each of these queries on there own takes around 1 minute. If however I create an index to satisfy the group by, say : create index IDX_ROT_DIM_CONTRACT_LINE_2 on ROT_DIM_CONTRACT_LINE (Customer ASC, PrincipalContractNo ASC, SASContract ASC, BusinessArea ASC, ProductGroup ASC, Site ASC) create index IDX_ROT_DIM_CONTRACT_LINE_3 on ROT_DIM_CONTRACT_LINE (ContractNo ASC, ContractLine ASC, Customer ASC, PrincipalContractNo ASC, SASContract ASC, BusinessArea ASC, ProductGroup ASC, Site ASC) analyze The query will run for 13 muntes + (killed after 13 minutes last time). Explain Query plan confirms the use of IDX_ROT_DIM_CONTRACT_LINE_2 Now back to joining, without any extra indices and an empty sqlite_Stat1 table the following query performs terribly (in my opinion). Its simply a join between the above two subqueries: select * from (SELECT cl.Customer as Customer, cl.PrincipalContractNo as PrincipalContractNo, SASContract as SASContractNo, cl.BusinessArea, cl.ProductGroup, cl.Site, rr.period_id as period, sum(cl.NCRAmt) as NCRAmt, sum(rr.revenue_amount) as RevenueAmt FROM ROT_DIM_CONTRACT_LINE cl join ROT_FACT_REV_ROLLOUT rr on (cl.ContractNo = rr.ContractNo and cl.ContractLine = rr.ContractLine) where rr.period_id > 200809 group by cl.Customer, PrincipalContractNo, SASContract, cl.BusinessArea, cl.ProductGroup, cl.Site, rr.period_id) fut join (SELECT cl.Customer as Customer, cl.PrincipalContractNo as PrincipalContractNo, SASContract as SASContractNo, cl.BusinessArea, cl.ProductGroup, cl.Site, sum(cl.NCRAmt) as NCRAmt FROM ROT_DIM_CONTRACT_LINE cl join ROT_FACT_REV_ROLLOUT rr on (cl.ContractNo = rr.ContractNo and cl.ContractLine = rr.ContractLine) where rr.period_id <= 200809 group by cl.Customer, PrincipalContractNo, SASContract, cl.BusinessArea, cl.ProductGroup, cl.Site) ncritd on (fut.Customer = ncritd.Customer and fut.PrincipalContractNo = ncritd.PrincipalContractNo and fut.SASContractNo = ncritd.SASContractNo and fut.BusinessArea = ncritd.BusinessArea and fut.ProductGroup = ncritd.ProductGroup and fut.Site = ncritd.Site) order by fut.Customer, fut.PrincipalContractNo, fut.SASContractNo, fut.BusinessArea, fut.ProductGroup, fut.Site, fut.Period However if I change the first query to be a temp table: drop table fut create temp table fut as SELECT cl.Customer as Customer, cl.PrincipalContractNo as PrincipalContractNo, SASContract as SASContractNo, cl.BusinessArea as BusinessArea, cl.ProductGroup as ProductGroup, cl.Site as Site, rr.period_id as period, sum(cl.NCRAmt) as NCRAmt, sum(rr.revenue_amount) as RevenueAmt FROM ROT_DIM_CONTRACT_LINE cl join ROT_FACT_REV_ROLLOUT rr on (cl.ContractNo = rr.ContractNo and cl.ContractLine = rr.ContractLine) where rr.period_id > 200809 group by cl.Customer, PrincipalContractNo, SASContract, cl.BusinessArea, cl.ProductGroup, Site, rr.period_id; I know get around 2 minutes for the temp table combined with the second subquery: select fut.Customer as Customer, fut.PrincipalContractNo as PrincipalContractNo, fut.SASContractNo as SASContractNo, fut.BusinessArea as BusinessArea, fut.ProductGroup as ProductGroup, fut.Site as Site, ncritd.NCRAmt as MCR_ITD, fut.Period as Period, fut.RevenueAmt as RevenueAmt from fut fut join (SELECT cl.Customer as Customer, cl.PrincipalContractNo as PrincipalContractNo, SASContract as SASContractNo, cl.BusinessArea, cl.ProductGroup, cl.Site, sum(cl.NCRAmt) as NCRAmt FROM ROT_DIM_CONTRACT_LINE cl join ROT_FACT_REV_ROLLOUT rr on (cl.ContractNo = rr.ContractNo and cl.ContractLine = rr.ContractLine) where rr.period_id <= 200809 group by cl.Customer, PrincipalContractNo, SASContract, BusinessArea, ProductGroup, Site) ncritd on (fut.Customer = ncritd.Customer and fut.PrincipalContractNo = ncritd.PrincipalContractNo and fut.SASContractNo = ncritd.SASContractNo and fut.BusinessArea = ncritd.BusinessArea and fut.ProductGroup = ncritd.ProductGroup and fut.Site = ncritd.Site) order by fut.Customer, fut.PrincipalContractNo, fut.SASContractNo, fut.BusinessArea, fut.ProductGroup, fut.Site, fut.Period ; So subqueries appear far more inefficient than temp tables. Thanks, On Thu, Oct 23, 2008 at 1:05 PM, Igor Tandetnik <[EMAIL PROTECTED]> wrote: > "Da Martian" <[EMAIL PROTECTED]> wrote > in message > news:[EMAIL PROTECTED]<[EMAIL PROTECTED]> > > Does anyone have any ideas on how to optimise this type of process in > > SQLite? > > What type of process? Show your tables and your query. > > Igor Tandetnik > > > > _______________________________________________ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users