Awesome, brilliant, and decisive!

New times:

No index on Delta File:
3 seconds.

Index on SN:
4 seconds.

Index on MasterList (write_out_ok, MFGID, TypeID, SN);
4 seconds.

The speedup of the one query is greater than this because the above time
figures include
1) A query to see if there are any records in deltas with write_out_ok=0 (if
so, don't execute other queries)
2) A query to update write_out_ok =1 in delta where the record in main
exists and is newer.
3) A query to update write_out_ok =1 in main where the record came from
delta;

1,2, & 3 were negligible compared to the un-optimized
insert or replace into TargetD select * from sourceD sa where
sa.write_out_ok=0 and sa.sn not in (select ta.sn from TargetD ta where
ta.record_updatetime > sa.record_updatetime)   ;

Now, it appears that the time is comparable, so the actual time is in the
order of 2 seconds faster than listed above.  Dropping the sequence time
from 7 min 22s down to 0 minutes 4 seconds is tremendous.

thank you.

Adam


On Tue, Jul 14, 2009 at 2:04 PM, Pavel Ivanov <paiva...@gmail.com> wrote:

> I believe your choice of query is not good enough. Try this one:
>
> insert or replace into main.masterlist
>   select d.*
>     from delta.masterlist d left outer join main.masterlist M on d.sn =
> M.sn
>    where d.write_out_ok=0
>       and d.record_updatetime >= ifnull(M.record_updatetime, '0000')
>
>
> Pavel
>
> On Tue, Jul 14, 2009 at 12:54 PM, Adam DeVita<adev...@verifeye.com> wrote:
> > Good day,
> >
> > Could someone explain where I'm going wrong with this?
> >
> > I've identified the following query as a bottle neck in a utility I've
> > written.
> >
> > insert or replace into main.masterlist select * from delta.masterlist d
> > where d.write_out_ok=0 and
> > d.sn not in(select M.sn from main.masterlist M where M.record_updatetime
> >
> > d.record_updatetime)
> >
> > The purpose is to import a data from a  remotely created change file,
> with
> > only new/newer records.  (Due to the fact that the subject of the data is
> > shipping / receiving product serial numbers and that data moves faster
> than
> > product there is no way independent nodes can create a change to a record
> at
> > the same time.  Also, deleting is not allowed.)
> >
> > The change file is attached as 'delta'
> >
> > The structure of masterlist in the main database is:
> > sqlite> .schema masterlist
> > CREATE TABLE MasterList (SN int primary key not null, TypeID int default
> 0
> > references Product_type_dictionary(TypeID)  , ConstructionDate text,
> MFGID
> > int default 0 references MFG_dictionary (MFGID), Init_cust_ship_date
> text,
> > record_updatetime text default "2000.00.00.00", write_out_ok int default
> 0);
> >
> > CREATE INDEX IDX_MasterList on MasterList (write_out_ok, MFGID, TypeID,
> SN);
> >
> > main.masterlist has 36,000 records
> > deltas.masterlist has 9,000 records
> >
> > Notes about fields:
> > write_out_ok is a flag indicating that the record has been imported.
> States
> > are 1 or 0.
> > MFGID is a manufacturer, about 4 different ints can be used.
> > TypeID is a product Type, about 7 different types,
> >
> > The index is ordered by cardinality, and all int.
> > record_updatetime  is the modified date & time GMT (UTC),
> > yyyy.mm.dd.hh.MM.ss
> >
> >
> > ------------
> > Experimenting with indexes on the delta file with
> > No indexes:
> > 7 min 22s
> >
> > CREATE INDEX IDX_MasterList on MasterList ( SN);
> > 14min 52s
> >
> > CREATE INDEX IDX_MasterList on MasterList (write_out_ok, MFGID, TypeID,
> SN);
> > 20 min, 07s
> >
> > -----------
> > Dropped indexes on both main and delta.
> > ~20 min.
> > ---------------------
> >
> > Is the real problem a poor choice of index in main?
> >
> >
> > regards,
> > Adam
> > _______________________________________________
> > sqlite-users mailing list
> > sqlite-users@sqlite.org
> > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> >
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
VerifEye Technologies Inc.
905-948-0015x245
7100 Warden Ave, Unit 3
Markham ON, L3R 8B5
Canada
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to