Re: [PERFORM] Building multiple indexes on one table.

2014-07-23 Thread Marc Mamin
Von: pgsql-performance-ow...@postgresql.org 
[pgsql-performance-ow...@postgresql.org]quot; im Auftrag von quot;Claudio 
Freire [klaussfre...@gmail.com]
Gesendet: Freitag, 18. Juli 2014 01:21
An: Chris Ruprecht
Cc: pgsql-performance@postgresql.org
Betreff: Re: [PERFORM] Building multiple indexes on one table.

On Thu, Jul 17, 2014 at 7:47 PM, Chris Ruprecht ch...@cdrbill.com wrote:
 Is there any way that I can build multiple indexes on one table without 
 having to scan the table multiple times? For small tables, that's probably 
 not an issue, but if I have a 500 GB table that I need to create 6 indexes 
 on, I don't want to read that table 6 times.
 Nothing I could find in the manual other than reindex, but that's not 
 helping, since it only rebuilds indexes that are already there and I don't 
 know if that reads the table once or multiple times. If I could create 
 indexes inactive and then run reindex, which then reads the table once, I 
 would have a solution. But that doesn't seem to exist either.

Just build them with separate but concurrent connections, and the
scans will be synchronized so it will be only one.

Btw, reindex rebuilds one index at a time, so what I do is issue
separate reindex for each index in parallel, to avoid the repeated
scans as well.

Just make sure you've got the I/O and CPU capacity for it (you'll be
writing many indexes at once, so there is a lot of I/O).

Index creation on large tables are mostly CPU bound as long as no swap occurs.
I/O may be an issue when all your indexes are similar; e.g. all on single int4 
columns.
in other cases the writes will not all take place concurrently.
To reduce I/O due to swap, you can consider increasing maintenance_work_mem on 
the connextions/sessionns
that build the indexes.

regards,

Marc Mamin


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Building multiple indexes on one table.

2014-07-23 Thread Claudio Freire
On Wed, Jul 23, 2014 at 4:40 PM, Marc Mamin m.ma...@intershop.de wrote:
On Thu, Jul 17, 2014 at 7:47 PM, Chris Ruprecht ch...@cdrbill.com wrote:
 Is there any way that I can build multiple indexes on one table without 
 having to scan the table multiple times? For small tables, that's probably 
 not an issue, but if I have a 500 GB table that I need to create 6 indexes 
 on, I don't want to read that table 6 times.
 Nothing I could find in the manual other than reindex, but that's not 
 helping, since it only rebuilds indexes that are already there and I don't 
 know if that reads the table once or multiple times. If I could create 
 indexes inactive and then run reindex, which then reads the table once, I 
 would have a solution. But that doesn't seem to exist either.

Just build them with separate but concurrent connections, and the
scans will be synchronized so it will be only one.

Btw, reindex rebuilds one index at a time, so what I do is issue
separate reindex for each index in parallel, to avoid the repeated
scans as well.

Just make sure you've got the I/O and CPU capacity for it (you'll be
writing many indexes at once, so there is a lot of I/O).

 Index creation on large tables are mostly CPU bound as long as no swap occurs.
 I/O may be an issue when all your indexes are similar; e.g. all on single 
 int4 columns.
 in other cases the writes will not all take place concurrently.
 To reduce I/O due to swap, you can consider increasing maintenance_work_mem 
 on the connextions/sessionns
 that build the indexes.

Usually there will always be swap, unless you've got toy indexes.

But swap I/O is all sequential I/O, with a good readahead setting
there should be no problem.

It's the final writing step that can be a bottleneck if you have a
lame I/O system and try to push 5 or 6 indexes at once.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Building multiple indexes on one table.

2014-07-17 Thread Chris Ruprecht
Is there any way that I can build multiple indexes on one table without having 
to scan the table multiple times? For small tables, that's probably not an 
issue, but if I have a 500 GB table that I need to create 6 indexes on, I don't 
want to read that table 6 times.
Nothing I could find in the manual other than reindex, but that's not helping, 
since it only rebuilds indexes that are already there and I don't know if that 
reads the table once or multiple times. If I could create indexes inactive and 
then run reindex, which then reads the table once, I would have a solution. But 
that doesn't seem to exist either.

best regards,
chris
-- 
chris ruprecht
database grunt and bit pusher extraordinaíre



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Building multiple indexes on one table.

2014-07-17 Thread Claudio Freire
On Thu, Jul 17, 2014 at 7:47 PM, Chris Ruprecht ch...@cdrbill.com wrote:
 Is there any way that I can build multiple indexes on one table without 
 having to scan the table multiple times? For small tables, that's probably 
 not an issue, but if I have a 500 GB table that I need to create 6 indexes 
 on, I don't want to read that table 6 times.
 Nothing I could find in the manual other than reindex, but that's not 
 helping, since it only rebuilds indexes that are already there and I don't 
 know if that reads the table once or multiple times. If I could create 
 indexes inactive and then run reindex, which then reads the table once, I 
 would have a solution. But that doesn't seem to exist either.

Just build them with separate but concurrent connections, and the
scans will be synchronized so it will be only one.

Btw, reindex rebuilds one index at a time, so what I do is issue
separate reindex for each index in parallel, to avoid the repeated
scans as well.

Just make sure you've got the I/O and CPU capacity for it (you'll be
writing many indexes at once, so there is a lot of I/O).


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance