Re: [PERFORM] Insert performance vs Table size

2005-06-28 Thread Praveen Raja
I assume you took size to mean the row size? What I really meant was
does the number of rows a table has affect the performance of new
inserts into the table (just INSERTs) all other things remaining
constant. Sorry for the confusion.

I know that having indexes on the table adds an overhead but again does
this overhead increase (for an INSERT operation) with the number of rows
the table contains?

My instinct says no to both. If I'm wrong can someone explain why the
number of rows in a table affects INSERT performance?

Thanks again

-Original Message-
From: Jacques Caron [mailto:[EMAIL PROTECTED] 
Sent: 27 June 2005 14:05
To: Praveen Raja
Cc: pgsql-performance@postgresql.org
Subject: RE: [PERFORM] Insert performance vs Table size

Hi,

At 13:50 27/06/2005, Praveen Raja wrote:
Just to clear things up a bit, the scenario that I'm interested in is a
table with a large number of indexes on it (maybe 7-8).

If you're after performance you'll want to carefully consider which
indexes 
are really useful and/or redesign your schema so that you can have less 
indexes on that table. 7 or 8 indexes is quite a lot, and that really
has a 
cost.

  In this scenario
other than the overhead of having to maintain the indexes (which I'm
guessing is the same regardless of the size of the table)

Definitely not: indexes grow with the size of the table. Depending on
what 
columns you index (and their types), the indexes may be a fraction of
the 
size of the table, or they may be very close in size (in extreme cases
they 
may even be larger). With 7 or 8 indexes, that can be quite a large
volume 
of data to manipulate, especially if the values of the columns inserted
can 
span the whole range of the index (rather than being solely id- or 
time-based, for instance, in which case index updates are concentrated
in a 
small area of each of the indexes), as this means you'll need to have a 
majority of the indexes in RAM if you want to maintain decent
performance.

does the size of the table play a role in determining insert
performance 
(and I mean
only insert performance)?

In this case, it's really the indexes that'll cause you trouble, though 
heavily fragmented tables (due to lots of deletes or updates) will also 
incur a penalty just for the data part of the inserts.

Also, don't forget the usual hints if you are going to do lots of
inserts:
- batch them in large transactions, don't do them one at a time
- better yet, use COPY rather than INSERT
- in some situations, you might be better of dropping the indexes, doing

large batch inserts, then re-creating the indexes. YMMV depending on the

existing/new ratio, whether you need to maintain indexed access to the 
tables, etc.
- pay attention to foreign keys

Jacques.



---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Insert performance vs Table size

2005-06-28 Thread Jacques Caron

Hi,

At 11:50 28/06/2005, Praveen Raja wrote:

I assume you took size to mean the row size?


Nope, the size of the table.


 What I really meant was
does the number of rows a table has affect the performance of new
inserts into the table (just INSERTs) all other things remaining
constant. Sorry for the confusion.


As I said previously, in most cases it does. One of the few cases where it 
doesn't would be an append-only table, no holes, no indexes, no foreign keys...



I know that having indexes on the table adds an overhead but again does
this overhead increase (for an INSERT operation) with the number of rows
the table contains?


It depends on what you are indexing. If the index key is something that 
grows monotonically (e.g. a unique ID or a timestamp), then the size of the 
table (and hence of the indexes) should have a very limited influence on 
the INSERTs. If the index key is anything else (and that must definitely be 
the case if you have 7 or 8 indexes!), then that means updates will happen 
all over the indexes, which means a lot of read and write activity, and 
once the total size of your indexes exceeds what can be cached in RAM, 
performance will decrease quite a bit. Of course if your keys are 
concentrated in a few limited areas of the key ranges it might help.



My instinct says no to both. If I'm wrong can someone explain why the
number of rows in a table affects INSERT performance?


As described above, maintaining indexes when you hit anywhere in said 
indexes is very costly. The larger the table, the larger the indexes, the 
higher the number of levels in the trees, etc. As long as it fits in RAM, 
it shouldn't be a problem. Once you exceed that threshold, you start 
getting a lot of random I/O, and that's expensive.


Again, it depends a lot on your exact schema, the nature of the data, the 
spread of the different values, etc, but I would believe it's more often 
the case than not.


Jacques.



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Insert performance vs Table size

2005-06-28 Thread Tom Lane
Praveen Raja [EMAIL PROTECTED] writes:
 I know that having indexes on the table adds an overhead but again does
 this overhead increase (for an INSERT operation) with the number of rows
 the table contains?

Typical index implementations (such as b-tree) have roughly O(log N)
cost to insert or lookup a key in an N-entry index.  So yes, it grows,
though slowly.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Insert performance vs Table size

2005-06-27 Thread 李江华
Praveen Raja:

  I think the size of a table don't affect the speed of inserts 
into it.Because PostgreSQL just doing something like append on the data files.
  But the index do speed-down the inserts. Because PostgreSQL should 
maintain the index when doing inserts.

  hope this is useful for your question.



=== 2005-06-27 19:24:06 you wrote:===

Hi all

I'm wondering if and how the size of a table affects speed of inserts
into it? What if the table has indexes, does that alter the answer?

Thanks



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


= = = = = = = = = = = = = = = = = = = =


   Best regards!
  

   李江华
   Seamus Dean
   Alibaba.com
   TEL:0571-85022088-2287
[EMAIL PROTECTED]
2005-06-27

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Insert performance vs Table size

2005-06-27 Thread Jacques Caron

Hi,

At 13:24 27/06/2005, Praveen Raja wrote:

I'm wondering if and how the size of a table affects speed of inserts
into it? What if the table has indexes, does that alter the answer?


Many parameters will affect the result:
- whether there are any indexes (including the primary key, unique 
constraints...) to update or not

- whether there are any foreign keys from or to that table
- the size of the rows
- whether the table (or at least the bits being updated) fit in RAM or not
- whether the table has holes (due to former updates/deletes and vacuum) 
and how they are placed

- and probably a bunch of other things...

Obviously, if you have an append-only (no updates, no deletes) table with 
no indexes and no foreign keys, the size of the table should not matter 
much. As soon as one of those conditions is not met table size will have an 
impact, probably small as long as whatever is needed can be held in RAM, a 
lot bigger once it's not the case.


Hope that helps,

Jacques.



---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Insert performance vs Table size

2005-06-27 Thread Praveen Raja
Just to clear things up a bit, the scenario that I'm interested in is a
table with a large number of indexes on it (maybe 7-8). In this scenario
other than the overhead of having to maintain the indexes (which I'm
guessing is the same regardless of the size of the table), does the size
of the table play a role in determining insert performance (and I mean
only insert performance)?

-Original Message-
From: Jacques Caron [mailto:[EMAIL PROTECTED] 
Sent: 27 June 2005 13:40
To: Praveen Raja
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Insert performance vs Table size

Hi,

At 13:24 27/06/2005, Praveen Raja wrote:
I'm wondering if and how the size of a table affects speed of inserts
into it? What if the table has indexes, does that alter the answer?

Many parameters will affect the result:
- whether there are any indexes (including the primary key, unique 
constraints...) to update or not
- whether there are any foreign keys from or to that table
- the size of the rows
- whether the table (or at least the bits being updated) fit in RAM or
not
- whether the table has holes (due to former updates/deletes and
vacuum) 
and how they are placed
- and probably a bunch of other things...

Obviously, if you have an append-only (no updates, no deletes) table
with 
no indexes and no foreign keys, the size of the table should not matter 
much. As soon as one of those conditions is not met table size will have
an 
impact, probably small as long as whatever is needed can be held in RAM,
a 
lot bigger once it's not the case.

Hope that helps,

Jacques.



---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Insert performance vs Table size

2005-06-27 Thread Jacques Caron

Hi,

At 13:50 27/06/2005, Praveen Raja wrote:

Just to clear things up a bit, the scenario that I'm interested in is a
table with a large number of indexes on it (maybe 7-8).


If you're after performance you'll want to carefully consider which indexes 
are really useful and/or redesign your schema so that you can have less 
indexes on that table. 7 or 8 indexes is quite a lot, and that really has a 
cost.



 In this scenario
other than the overhead of having to maintain the indexes (which I'm
guessing is the same regardless of the size of the table)


Definitely not: indexes grow with the size of the table. Depending on what 
columns you index (and their types), the indexes may be a fraction of the 
size of the table, or they may be very close in size (in extreme cases they 
may even be larger). With 7 or 8 indexes, that can be quite a large volume 
of data to manipulate, especially if the values of the columns inserted can 
span the whole range of the index (rather than being solely id- or 
time-based, for instance, in which case index updates are concentrated in a 
small area of each of the indexes), as this means you'll need to have a 
majority of the indexes in RAM if you want to maintain decent performance.


does the size of the table play a role in determining insert performance 
(and I mean

only insert performance)?


In this case, it's really the indexes that'll cause you trouble, though 
heavily fragmented tables (due to lots of deletes or updates) will also 
incur a penalty just for the data part of the inserts.


Also, don't forget the usual hints if you are going to do lots of inserts:
- batch them in large transactions, don't do them one at a time
- better yet, use COPY rather than INSERT
- in some situations, you might be better of dropping the indexes, doing 
large batch inserts, then re-creating the indexes. YMMV depending on the 
existing/new ratio, whether you need to maintain indexed access to the 
tables, etc.

- pay attention to foreign keys

Jacques.



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match