Re: [sqlite] Help with performance...

2007-08-15 Thread Ken
Joe,

each of the tables involved also had a parent table. that was also being 
copied. 

It turned out that the parent table copy (insert .. select) was taking over 50% 
of the time. So I flattened the tables including the neccessary fields into the 
children tables. 

This doubled the throughput in 2 places, actually creating the original and for 
the copy component.

Thanks for your patience and allowing me to bounce ideas.
Ken


Joe Wilson <[EMAIL PROTECTED]> wrote: Forget about the alternate insert 
statements I suggested.

Assuming "id" is declared INTEGER PRIMARY KEY in all tables,
you can't get better performance than this in a single
insert statement:

 insert into x select x1.* from a.x x1, y where x1.id = y.id;



   

Be a better Heartthrob. Get better relationship answers from someone who knows. 
Yahoo! Answers - Check it out. 
http://answers.yahoo.com/dir/?link=list=396545433

-
To unsubscribe, send email to [EMAIL PROTECTED]
-




Re: [sqlite] Help with performance...

2007-08-12 Thread Joe Wilson
Forget about the alternate insert statements I suggested.

Assuming "id" is declared INTEGER PRIMARY KEY in all tables,
you can't get better performance than this in a single
insert statement:

 insert into x select x1.* from a.x x1, y where x1.id = y.id;



   

Be a better Heartthrob. Get better relationship answers from someone who knows. 
Yahoo! Answers - Check it out. 
http://answers.yahoo.com/dir/?link=list=396545433

-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Help with performance...

2007-08-12 Thread Joe Wilson
--- Joe Wilson <[EMAIL PROTECTED]> wrote:
> This should be faster:
> 
>  insert into x 
>   select * from a.x x1
>   where exists (select 1 from Y where x1.id = y.id);
> 
> See if adding an order by statement will make it faster by speeding
> up the inserts:
> 
>  insert into x 
>   select * from a.x x1
>   where exists (select 1 from Y where x1.id = y.id)
>   order by x1.id;

The last statement could be a bit faster if pragma temp_store = memory
- assuming the intermediate select result set can fit into memory.



  

Shape Yahoo! in your own image.  Join our Network Research Panel today!   
http://surveylink.yahoo.com/gmrs/yahoo_panel_invite.asp?a=7 



-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Help with performance...

2007-08-12 Thread Joe Wilson
This should be faster:

 insert into x 
  select * from a.x x1
  where exists (select 1 from Y where x1.id = y.id);

See if adding an order by statement will make it faster by speeding
up the inserts:

 insert into x 
  select * from a.x x1
  where exists (select 1 from Y where x1.id = y.id)
  order by x1.id;

I don't think 1000 records selected and inserted per second is that bad,
but your idea below separating the select from the inserts may very well 
speed it up. For extra speed sort the records in memory by "id" prior 
to inserting them.

malloc()ing is not likely as important to disk seeking in this case.

Time the query on its own without the insert.
Also time everything from a :memory: database just for kicks to see
how much overhead the hard disk introduces.

--- Ken <[EMAIL PROTECTED]> wrote:
>   The id field is also the Primary Key and no other indices exist. I've run 
> the explan query
> plan and they seem optimal,  reading table X, and using an index access into 
> Z. 
>
>   I just had a thought. It seems to me that doing the insert into X select 
> from a.x might be the
> problem. Given that the Master DB is on the same disk as the attached DB.. So 
> effectively
> causing all sorts of seek operations.
>
>   Question does sqlite when doing an insert into X select * from Attached.X, 
> y where x.id =y.id
> operate internally as follows???
> for rows:
>read 1 row from A.X
>Insert row into X
> end row
>
>   If so then I suspect performance could be greatly improved as follows:
>  
>create a large memory buffer say 10,000 rows..
>Select and populate buffer.
>Begin;
>for each row in buffer:
> insert into x
>commit;
>
>   Could the overhead of  memory allocation acount for the variances? Is there 
> any way to pin the
> sqlite allocated pages into the Master DB's cache? Sort of a sub cached 
> allocation so that
> attached DB's do not require malloc/free for memory pages? 
>
>   Ken



  

Park yourself in front of a world of choices in alternative vehicles. Visit the 
Yahoo! Auto Green Center.
http://autos.yahoo.com/green_center/ 

-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Help with performance...

2007-08-12 Thread Ken
Joe, 
   
  Thanks for the reply.
   
  The id field is also the Primary Key and no other indices exist. I've run the 
explan query plan and they seem optimal,  reading table X, and using an index 
access into Z. 
   
  I just had a thought. It seems to me that doing the insert into X select from 
a.x might be the problem. Given that the Master DB is on the same disk as the 
attached DB.. So effectively causing all sorts of seek operations.
   
  Question does sqlite when doing an insert into X select * from Attached.X, y 
where x.id =y.id operate internally as follows???
for rows:
   read 1 row from A.X
   Insert row into X
end row
   
  If so then I suspect performance could be greatly improved as follows:
 
   create a large memory buffer say 10,000 rows..
   Select and populate buffer.
   Begin;
   for each row in buffer:
insert into x
   commit;
   
  Could the overhead of  memory allocation acount for the variances? Is there 
any way to pin the sqlite allocated pages into the Master DB's cache? Sort of a 
sub cached allocation so that attached DB's do not require malloc/free for 
memory pages? 
   
  Ken
  

Joe Wilson <[EMAIL PROTECTED]> wrote:
  Not much you can do.
You could examine the output of EXPLAIN QUERY PLAN for those statements.

Hard to know without knowing the schema, but try making the "id" an 
INTEGER PRIMARY KEY, assuming it's appropriate for your data.

Try to have as few indexes as possible on the table being inserted into.

--- Ken wrote:
> I'm looking for your help solving a performance issue:
> 
> Master db and an attached db. called A
> 
> Table x and table Y are identical in both the master and Attached database.
> table Z is built to determine the set of rows to copy and has an index on the 
> id field.
> 
> The goal is to move data from many attached db's to the master.
> 
> SQL statements:
> Begin;
> insert into x 
> select x1.* from a.x x1 , Y
> where x1.id = y.id ;
> commit;
> takes 4.08 seconds to move 3904 rows.
> 
> begin;
> insert into y 
> select y1.* from a.x y1 , Y
> where y1.id = y.id ;
> commit;
> Takes 2.19 seconds to move 2676 rows.
> 
> So X is moved at 922 r/s and y is moved at 1221 r/s.
> Table X has 16 columns and no blobs
> Table Y has 22 columns one of which is a blob.
> 
> All columns are integers except for the blob on table y.
> 
> So why is the X copy performance 25% slower than Y ?
> 
> Page size is 4k.
> Cache size is 4000.
> locking_mode = exclusive.
> Synchronous off
> 
> Is there any way to disable journaling? 
> I can always re-create the DB should anything fail. 
> 
> Any other ideas on how to make this run quicker?
> 
> Thanks,
> Ken





Building a website is a piece of cake. Yahoo! Small Business gives you all the 
tools to get online.
http://smallbusiness.yahoo.com/webhosting 

-
To unsubscribe, send email to [EMAIL PROTECTED]
-




Re: [sqlite] Help with performance...

2007-08-11 Thread Joe Wilson
Not much you can do.
You could examine the output of EXPLAIN QUERY PLAN for those statements.

Hard to know without knowing the schema, but try making the "id" an 
INTEGER PRIMARY KEY, assuming it's appropriate for your data.

Try to have as few indexes as possible on the table being inserted into.

--- Ken <[EMAIL PROTECTED]> wrote:
> I'm looking for your help solving a performance issue:
> 
> Master db and an attached db. called A
> 
> Table x and table Y are identical in both the master and Attached database.
> table Z is built to determine the set of rows to copy and has an index on the 
> id field.
> 
> The goal is to move data from many attached db's to the master.
> 
> SQL statements:
> Begin;
>insert into x 
> select  x1.* from a.x x1 , Y
>where x1.id = y.id ;
> commit;
> takes 4.08 seconds to move 3904 rows.
> 
> begin;
>insert into y 
>  select  y1.* from a.x y1 , Y
> where y1.id = y.id ;
>  commit;
>   Takes 2.19 seconds to move 2676 rows.
> 
> So X is moved at 922 r/s  and y is moved at 1221 r/s.
> Table X has  16 columns and no blobs
> Table Y has 22 columns one of which is a blob.
> 
> All columns are integers except for the blob on table y.
> 
> So why is the X copy performance 25% slower than Y ?
> 
> Page size is 4k.
> Cache size is 4000.
> locking_mode = exclusive.
> Synchronous off
> 
> Is there any way to disable journaling? 
> I can always re-create the DB should anything fail. 
> 
> Any other ideas on how to make this run quicker?
> 
> Thanks,
> Ken



   

Building a website is a piece of cake. Yahoo! Small Business gives you all the 
tools to get online.
http://smallbusiness.yahoo.com/webhosting 

-
To unsubscribe, send email to [EMAIL PROTECTED]
-



[sqlite] Help with performance...

2007-08-10 Thread Ken

I'm looking for your help solving a performance issue:

Master db and an attached db. called A

Table x and table Y are identical in both the master and Attached database.
table Z is built to determine the set of rows to copy and has an index on the 
id field.

The goal is to move data from many attached db's to the master.

SQL statements:
Begin;
   insert into x 
select  x1.* from a.x x1 , Y
   where x1.id = y.id ;
commit;
takes 4.08 seconds to move 3904 rows.

begin;
   insert into y 
 select  y1.* from a.x y1 , Y
where y1.id = y.id ;
 commit;
  Takes 2.19 seconds to move 2676 rows.

So X is moved at 922 r/s  and y is moved at 1221 r/s.
Table X has  16 columns and no blobs
Table Y has 22 columns one of which is a blob.

All columns are integers except for the blob on table y.

So why is the X copy performance 25% slower than Y ?

Page size is 4k.
Cache size is 4000.
locking_mode = exclusive.
Synchronous off

Is there any way to disable journaling? 
I can always re-create the DB should anything fail. 

Any other ideas on how to make this run quicker?

Thanks,
Ken