Re: [sqlite] Help with performance...
Joe, each of the tables involved also had a parent table. that was also being copied. It turned out that the parent table copy (insert .. select) was taking over 50% of the time. So I flattened the tables including the neccessary fields into the children tables. This doubled the throughput in 2 places, actually creating the original and for the copy component. Thanks for your patience and allowing me to bounce ideas. Ken Joe Wilson <[EMAIL PROTECTED]> wrote: Forget about the alternate insert statements I suggested. Assuming "id" is declared INTEGER PRIMARY KEY in all tables, you can't get better performance than this in a single insert statement: insert into x select x1.* from a.x x1, y where x1.id = y.id; Be a better Heartthrob. Get better relationship answers from someone who knows. Yahoo! Answers - Check it out. http://answers.yahoo.com/dir/?link=list=396545433 - To unsubscribe, send email to [EMAIL PROTECTED] -
Re: [sqlite] Help with performance...
Forget about the alternate insert statements I suggested. Assuming "id" is declared INTEGER PRIMARY KEY in all tables, you can't get better performance than this in a single insert statement: insert into x select x1.* from a.x x1, y where x1.id = y.id; Be a better Heartthrob. Get better relationship answers from someone who knows. Yahoo! Answers - Check it out. http://answers.yahoo.com/dir/?link=list=396545433 - To unsubscribe, send email to [EMAIL PROTECTED] -
Re: [sqlite] Help with performance...
--- Joe Wilson <[EMAIL PROTECTED]> wrote: > This should be faster: > > insert into x > select * from a.x x1 > where exists (select 1 from Y where x1.id = y.id); > > See if adding an order by statement will make it faster by speeding > up the inserts: > > insert into x > select * from a.x x1 > where exists (select 1 from Y where x1.id = y.id) > order by x1.id; The last statement could be a bit faster if pragma temp_store = memory - assuming the intermediate select result set can fit into memory. Shape Yahoo! in your own image. Join our Network Research Panel today! http://surveylink.yahoo.com/gmrs/yahoo_panel_invite.asp?a=7 - To unsubscribe, send email to [EMAIL PROTECTED] -
Re: [sqlite] Help with performance...
This should be faster: insert into x select * from a.x x1 where exists (select 1 from Y where x1.id = y.id); See if adding an order by statement will make it faster by speeding up the inserts: insert into x select * from a.x x1 where exists (select 1 from Y where x1.id = y.id) order by x1.id; I don't think 1000 records selected and inserted per second is that bad, but your idea below separating the select from the inserts may very well speed it up. For extra speed sort the records in memory by "id" prior to inserting them. malloc()ing is not likely as important to disk seeking in this case. Time the query on its own without the insert. Also time everything from a :memory: database just for kicks to see how much overhead the hard disk introduces. --- Ken <[EMAIL PROTECTED]> wrote: > The id field is also the Primary Key and no other indices exist. I've run > the explan query > plan and they seem optimal, reading table X, and using an index access into > Z. > > I just had a thought. It seems to me that doing the insert into X select > from a.x might be the > problem. Given that the Master DB is on the same disk as the attached DB.. So > effectively > causing all sorts of seek operations. > > Question does sqlite when doing an insert into X select * from Attached.X, > y where x.id =y.id > operate internally as follows??? > for rows: >read 1 row from A.X >Insert row into X > end row > > If so then I suspect performance could be greatly improved as follows: > >create a large memory buffer say 10,000 rows.. >Select and populate buffer. >Begin; >for each row in buffer: > insert into x >commit; > > Could the overhead of memory allocation acount for the variances? Is there > any way to pin the > sqlite allocated pages into the Master DB's cache? Sort of a sub cached > allocation so that > attached DB's do not require malloc/free for memory pages? > > Ken Park yourself in front of a world of choices in alternative vehicles. Visit the Yahoo! Auto Green Center. http://autos.yahoo.com/green_center/ - To unsubscribe, send email to [EMAIL PROTECTED] -
Re: [sqlite] Help with performance...
Joe, Thanks for the reply. The id field is also the Primary Key and no other indices exist. I've run the explan query plan and they seem optimal, reading table X, and using an index access into Z. I just had a thought. It seems to me that doing the insert into X select from a.x might be the problem. Given that the Master DB is on the same disk as the attached DB.. So effectively causing all sorts of seek operations. Question does sqlite when doing an insert into X select * from Attached.X, y where x.id =y.id operate internally as follows??? for rows: read 1 row from A.X Insert row into X end row If so then I suspect performance could be greatly improved as follows: create a large memory buffer say 10,000 rows.. Select and populate buffer. Begin; for each row in buffer: insert into x commit; Could the overhead of memory allocation acount for the variances? Is there any way to pin the sqlite allocated pages into the Master DB's cache? Sort of a sub cached allocation so that attached DB's do not require malloc/free for memory pages? Ken Joe Wilson <[EMAIL PROTECTED]> wrote: Not much you can do. You could examine the output of EXPLAIN QUERY PLAN for those statements. Hard to know without knowing the schema, but try making the "id" an INTEGER PRIMARY KEY, assuming it's appropriate for your data. Try to have as few indexes as possible on the table being inserted into. --- Ken wrote: > I'm looking for your help solving a performance issue: > > Master db and an attached db. called A > > Table x and table Y are identical in both the master and Attached database. > table Z is built to determine the set of rows to copy and has an index on the > id field. > > The goal is to move data from many attached db's to the master. > > SQL statements: > Begin; > insert into x > select x1.* from a.x x1 , Y > where x1.id = y.id ; > commit; > takes 4.08 seconds to move 3904 rows. > > begin; > insert into y > select y1.* from a.x y1 , Y > where y1.id = y.id ; > commit; > Takes 2.19 seconds to move 2676 rows. > > So X is moved at 922 r/s and y is moved at 1221 r/s. > Table X has 16 columns and no blobs > Table Y has 22 columns one of which is a blob. > > All columns are integers except for the blob on table y. > > So why is the X copy performance 25% slower than Y ? > > Page size is 4k. > Cache size is 4000. > locking_mode = exclusive. > Synchronous off > > Is there any way to disable journaling? > I can always re-create the DB should anything fail. > > Any other ideas on how to make this run quicker? > > Thanks, > Ken Building a website is a piece of cake. Yahoo! Small Business gives you all the tools to get online. http://smallbusiness.yahoo.com/webhosting - To unsubscribe, send email to [EMAIL PROTECTED] -
Re: [sqlite] Help with performance...
Not much you can do. You could examine the output of EXPLAIN QUERY PLAN for those statements. Hard to know without knowing the schema, but try making the "id" an INTEGER PRIMARY KEY, assuming it's appropriate for your data. Try to have as few indexes as possible on the table being inserted into. --- Ken <[EMAIL PROTECTED]> wrote: > I'm looking for your help solving a performance issue: > > Master db and an attached db. called A > > Table x and table Y are identical in both the master and Attached database. > table Z is built to determine the set of rows to copy and has an index on the > id field. > > The goal is to move data from many attached db's to the master. > > SQL statements: > Begin; >insert into x > select x1.* from a.x x1 , Y >where x1.id = y.id ; > commit; > takes 4.08 seconds to move 3904 rows. > > begin; >insert into y > select y1.* from a.x y1 , Y > where y1.id = y.id ; > commit; > Takes 2.19 seconds to move 2676 rows. > > So X is moved at 922 r/s and y is moved at 1221 r/s. > Table X has 16 columns and no blobs > Table Y has 22 columns one of which is a blob. > > All columns are integers except for the blob on table y. > > So why is the X copy performance 25% slower than Y ? > > Page size is 4k. > Cache size is 4000. > locking_mode = exclusive. > Synchronous off > > Is there any way to disable journaling? > I can always re-create the DB should anything fail. > > Any other ideas on how to make this run quicker? > > Thanks, > Ken Building a website is a piece of cake. Yahoo! Small Business gives you all the tools to get online. http://smallbusiness.yahoo.com/webhosting - To unsubscribe, send email to [EMAIL PROTECTED] -
[sqlite] Help with performance...
I'm looking for your help solving a performance issue: Master db and an attached db. called A Table x and table Y are identical in both the master and Attached database. table Z is built to determine the set of rows to copy and has an index on the id field. The goal is to move data from many attached db's to the master. SQL statements: Begin; insert into x select x1.* from a.x x1 , Y where x1.id = y.id ; commit; takes 4.08 seconds to move 3904 rows. begin; insert into y select y1.* from a.x y1 , Y where y1.id = y.id ; commit; Takes 2.19 seconds to move 2676 rows. So X is moved at 922 r/s and y is moved at 1221 r/s. Table X has 16 columns and no blobs Table Y has 22 columns one of which is a blob. All columns are integers except for the blob on table y. So why is the X copy performance 25% slower than Y ? Page size is 4k. Cache size is 4000. locking_mode = exclusive. Synchronous off Is there any way to disable journaling? I can always re-create the DB should anything fail. Any other ideas on how to make this run quicker? Thanks, Ken