Re: [PERFORM] VACUUMs take twice as long across all nodes

2006-11-02 Thread Vivek Khera


On Oct 27, 2006, at 2:07 PM, Tom Lane wrote:


8.2, but in existing releases I can't see much you can do about it
except REINDEX when things get slow.


This will be so nice for me.  I have one huge table with a massive  
amount of churn and bulk deletes.  I have to reindex it once every  
other month.  It takes about 60 to 75 minutes per index (times two  
indexes) else I'd do it monthly.


It shaves nearly 1/3 of the relpages off of the index size.
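For anyone doing something similar, a minimal sketch of that maintenance job (database and index names here are placeholders, not the real ones):

# each REINDEX blocks writes to the table (and reads that would use the
# index being rebuilt) for the duration, so run it in a quiet window
psql -d mydb -c 'REINDEX INDEX big_table_idx1;'
psql -d mydb -c 'REINDEX INDEX big_table_idx2;'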





Re: [PERFORM] VACUUMs take twice as long across all nodes

2006-10-29 Thread Gavin Hamill

Hi :)

[pg_dump from a Slony replica]

 That's not the problem.  The problem is that when you restore the
 dump of the slave, you'll have garbage.  Slony fools with the
 catalogs on the replicas.  

  (And you might as well throw away the
 dumpfiles from the replicas that you have.  They won't work when you
 restore them.)

This is interesting, but I don't understand... We've done a full restore
from one of these pg_dump backups before now and it worked just great.

Sure, I had to DROP SCHEMA _replication CASCADE to clear out all the
Slony-specific triggers etc., but the new master ran fine, as did
firing up new replication to the other nodes :)
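For illustration, that cleanup boils down to something like this (database name is a placeholder; the schema name comes from the Slony cluster name):

# CASCADE takes the replica-side trigger functions, and the triggers that
# depend on them, along with the schema
psql -d restored_db -c 'DROP SCHEMA _replication CASCADE;'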

Was I just lucky?

Cheers,
Gavin.



Re: [PERFORM] VACUUMs take twice as long across all nodes

2006-10-29 Thread Andrew Sullivan
On Sun, Oct 29, 2006 at 03:08:26PM +, Gavin Hamill wrote:
 
 This is interesting, but I don't understand.. We've done a full restore
 from one of these pg_dump backups before now and it worked just great.
 
 Sure I had to DROP SCHEMA _replication CASCADE to clear out all the
 slony-specific triggers etc., but the new-master ran fine, as did
 firing up new replication to the other nodes :)
 
 Was I just lucky?

Yes.  Slony alters data in the system catalog for a number of
database objects on the replicas.  It does this in order to prevent,
for example, triggers from firing both on the origin and the replica. 
(That is the one that usually bites people hardest, but IIRC it's not
the only such hack in there.)  This was a bit of a dirty hack that
was supposed to be cleaned up, but that hasn't been yet.  In general,
you can't rely on a pg_dump of a replica giving you a dump that, when
restored, actually works.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
Everything that happens in the world happens at some place.
--Jane Jacobs 



Re: [PERFORM] VACUUMs take twice as long across all nodes

2006-10-29 Thread Andreas Kostyrka
On Sunday, 2006-10-29, at 10:34 -0500, Andrew Sullivan wrote:
 On Sun, Oct 29, 2006 at 03:08:26PM +, Gavin Hamill wrote:
  
  This is interesting, but I don't understand.. We've done a full restore
  from one of these pg_dump backups before now and it worked just great.
  
  Sure I had to DROP SCHEMA _replication CASCADE to clear out all the
  slony-specific triggers etc., but the new-master ran fine, as did
  firing up new replication to the other nodes :)
  
  Was I just lucky?
 
 Yes.  Slony alters data in the system catalog for a number of
 database objects on the replicas.  It does this in order to prevent,
 for example, triggers from firing both on the origin and the replica. 
 (That is the one that usually bites people hardest, but IIRC it's not
 the only such hack in there.)  This was a bit of a dirty hack that
 was supposed to be cleaned up, but that hasn't been yet.  In general,
 you can't rely on a pg_dump of a replica giving you a dump that, when
 restored, actually works.

Actually, you need to get the schema from the master node, and can take
the data from a slave. In mixing dumps like that, you must realize that
there are two separate parts in the schema dump: table definitions and
constraints. To get a restorable backup you need to put the table
definitions stuff before your data, and the constraints after the data
copy.
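A rough sketch of that approach (host and database names are placeholders; with 8.1-era pg_dump the schema file has to be split by hand into its table-definition and constraint parts):

# schema only, from the origin node: CREATE TABLEs first, constraints and
# indexes towards the end of the file
pg_dump -s -h master-host mydb > schema.sql

# data only, from the replica
pg_dump -a -h slave-host mydb > data.sql

# restore order: table definitions from schema.sql, then data.sql, then the
# constraint/index statements from the tail of schema.sql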

Andreas

 
 A
 




Re: [PERFORM] VACUUMs take twice as long across all nodes

2006-10-29 Thread Andrew Sullivan
On Sun, Oct 29, 2006 at 05:24:33PM +0100, Andreas Kostyrka wrote:
 Actually, you need to get the schema from the master node, and can take
 the data from a slave. In mixing dumps like that, you must realize that
 there are two separate parts in the schema dump: table definitions and
 constraints. To get a restorable backup you need to put the table
 definitions stuff before your data, and the constraints after the data
 copy.

This will work, yes, but you don't get a real point-in-time dump this
way.  (In any case, we're off the -performance charter now, so if
anyone wants to pursue this, I urge you to take it to the Slony
list.)

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
Windows is a platform without soap, where rats run around 
in open sewers.
--Daniel Eran



Re: [PERFORM] VACUUMs take twice as long across all nodes

2006-10-29 Thread Andreas Kostyrka
On Sunday, 2006-10-29, at 11:43 -0500, Andrew Sullivan wrote:
 On Sun, Oct 29, 2006 at 05:24:33PM +0100, Andreas Kostyrka wrote:
  Actually, you need to get the schema from the master node, and can take
  the data from a slave. In mixing dumps like that, you must realize that
  there are two separate parts in the schema dump: table definitions and
  constraints. To get a restorable backup you need to put the table
  definitions stuff before your data, and the constraints after the data
  copy.
 
 This will work, yes, but you don't get a real point-in-time dump this
 way.

But one does, because one can dump all the data in a single pg_dump call. And
with Slony enabled, schema changes won't happen by mistake; they tend to
be a matter for the Slony High Priest, not for mere developers ;)

Andreas





Re: [PERFORM] VACUUMs take twice as long across all nodes

2006-10-27 Thread Gavin Hamill
On Thu, 26 Oct 2006 18:09:37 -0400
Andrew Sullivan [EMAIL PROTECTED] wrote:

 On Thu, Oct 26, 2006 at 09:35:56PM +0100, Gavin Hamill wrote:
  
  I'm absolutely certain. The backups run from only one slave, given
  that it is a full copy of node 1. Our overnight traffic has not
  increased any, and the nightly backups show that the overall size
  of the DB has not increased more than usual growth.
 
 A couple things from your posts:
 
 1.Don't do VACUUM FULL, please.  It takes longer, and blocks
 other things while it's going on, which might mean you're having
 table bloat in various slony-related tables.

I know it takes longer, I know it blocks. It's never been a problem.

 2.Are your slony logs showing increased time too?  Are your
 targets getting further behind?

Nope, the slaves are keeping up just great - once the vacuums are
finished, all machines are running at about 50%-75% of full load.
 
 3.Your backups from the slave aren't done with pg_dump,
 right?

Em, they are indeed. I assumed that MVCC would ensure I got a
consistent snapshot from the instant when pg_dump began. Am I wrong?

 But I suspect Slony has a role here, too.  I'd look carefully at the
 slony tables -- especially the sl_log and pg_listen things, which
 both are implicated.

Slony is an easy target to point the finger at, so I tried a
little test. I took one of the 'light' slaves (only 10 tables...),
stopped its slon daemon, removed it from the load-balancer, and
restarted postgres so there were no active connections.

With the removal of both replication overhead and normal queries from
clients, the machine should be completely clear to run at full tilt.

Then I launched a 'vacuum verbose' and I was able to see exactly the
same poor speeds as before, even with vacuum_cost_delay = 0 as it was
previously...

2006-10-27 08:37:12 UTC INFO:  vacuuming public.Allocation
2006-10-27 08:37:21 UTC INFO:  Allocation: found 56449 removable, 4989360 nonremovable row versions in 47158 pages
2006-10-27 08:37:21 UTC DETAIL:  0 dead row versions cannot be removed yet.
Nonremovable row versions range from 64 to 72 bytes long.
There were 1 unused item pointers.
Total free space (including removable row versions) is 5960056 bytes.
13 pages are or will become empty, including 0 at the end of the table.
5258 pages containing 4282736 free bytes are potential move destinations.
CPU 0.16s/0.07u sec elapsed 9.55 sec.
2006-10-27 08:44:25 UTC INFO:  index allocation_pkey now contains 4989360 row versions in 102198 pages
2006-10-27 08:44:25 UTC DETAIL:  56449 index row versions were removed.
1371 index pages have been deleted, 1371 are currently reusable.
CPU 1.02s/0.38u sec elapsed 423.22 sec.

If I've read this correctly, then on an otherwise idle system, it has taken 
seven minutes to perform 1.4 seconds-worth of actual work. Surely that's 
nonsense? 

That would suggest that the issue is poor IO; 'vmstat 5' output during this run
didn't show ripping performance - maybe averaging 3MB/sec in and out.

I know the peak IO on this machine is rather better than that:

joltpg2:/root# dd if=/dev/zero of=/tmp/t bs=1024k count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 8.02106 seconds, 131 MB/s

The test system is one CPU's-worth (two cores) of a 4 x Opteron 880 machine 
split up by Xen, and I can confirm the IO on the other Xen partitions was 
minimal.
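For reference, the I/O was watched with plain vmstat; iostat is an extra suggestion here (it needs the sysstat package), not part of the original test:

vmstat 5        # overall block in/out every 5 seconds - where the ~3MB/sec figure came from
iostat -dx 5    # per-device utilisation and average wait times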

I appreciate the time, help and advice people are offering; however, I really
don't think Slony is the culprit here.

Cheers,
Gavin.



Re: [PERFORM] VACUUMs take twice as long across all nodes

2006-10-27 Thread Tom Lane
Gavin Hamill [EMAIL PROTECTED] writes:
 2006-10-27 08:37:12 UTC INFO:  vacuuming public.Allocation
 2006-10-27 08:37:21 UTC INFO:  Allocation: found 56449 removable, 4989360 nonremovable row versions in 47158 pages
 2006-10-27 08:37:21 UTC DETAIL:  0 dead row versions cannot be removed yet.
 Nonremovable row versions range from 64 to 72 bytes long.
 There were 1 unused item pointers.
 Total free space (including removable row versions) is 5960056 bytes.
 13 pages are or will become empty, including 0 at the end of the table.
 5258 pages containing 4282736 free bytes are potential move destinations.
 CPU 0.16s/0.07u sec elapsed 9.55 sec.
 2006-10-27 08:44:25 UTC INFO:  index allocation_pkey now contains 4989360 row versions in 102198 pages
 2006-10-27 08:44:25 UTC DETAIL:  56449 index row versions were removed.
 1371 index pages have been deleted, 1371 are currently reusable.
 CPU 1.02s/0.38u sec elapsed 423.22 sec.

So the time is all in index vacuuming, eh?  I think what's happening is
that the physical order of the index is degrading over time, and so the
vacuum scan takes longer due to more seeking.  Can you afford to do a
REINDEX?  If this theory is correct that should drive the time back
down.

8.2 has rewritten btree index vacuuming code that scans the index in
physical not logical order, so this problem should largely go away in
8.2, but in existing releases I can't see much you can do about it
except REINDEX when things get slow.
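A minimal sketch of that workaround on 8.1, using the index named in the log above (database name is a placeholder):

# index size in pages, as recorded at the last VACUUM/ANALYZE
psql -d mydb -c "SELECT relname, relpages FROM pg_class WHERE relname = 'allocation_pkey';"

# rebuild the index; this blocks writes to the table while it runs
psql -d mydb -c 'REINDEX INDEX allocation_pkey;'

# then time the next vacuum pass for comparison
psql -d mydb -c 'VACUUM VERBOSE "Allocation";'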

regards, tom lane



Re: [PERFORM] VACUUMs take twice as long across all nodes

2006-10-27 Thread Gavin Hamill
On Fri, 27 Oct 2006 14:07:43 -0400
Tom Lane [EMAIL PROTECTED] wrote:

 So the time is all in index vacuuming, eh?  I think what's happening
 is that the physical order of the index is degrading over time, and
 so the vacuum scan takes longer due to more seeking.  Can you afford
 to do a REINDEX?  If this theory is correct that should drive the
 time back down.

Tom,

You wonderful, wonderful man.

I tried a test reindex on Allocation, and noticed a vacuum had
turbo-charged... then reindexed the whole db, did a vacuum, and lo! The
whole db had turbo-charged :)

When I say 'turbo-charged', I mean it. The vacuum times have dropped to
20% of what we were seeing even before it 'got much slower a
couple of days ago.'

It sucks that the new index-vacuuming code is only in 8.2, but now that I
know this is an issue in 8.1 I can plan for it.
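For the record, the whole-db reindex can be scripted table by table, roughly like this (the database name and the second table name are placeholders; each REINDEX blocks writes to its table while it runs):

for t in '"Allocation"' '"AnotherTable"'; do
    psql -d mydb -c "REINDEX TABLE $t;"
done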

Thanks so much :)

Cheers,
Gavin.



[PERFORM] VACUUMs take twice as long across all nodes

2006-10-26 Thread Gavin Hamill
Hullo, here's one of those dreadful touchy-feely hand-waving problems.
Our 5-node 8.1.3 Slony system has just started taking /much/ longer to
VACUUM ANALYZE...

The data set has not increased more than usual (nightly backups stand
at 1.3GB, growing by 10MB per day), and no configuration has changed on
the machines.

Nodes 2 and 3 take only the tables necessary to run our search (10 out
of the full 130) and are much lighter (only 7GB on disk cf. 30GB for
the full master), yet the nightly VACUUM FULL has jumped from 2 hours
to 4 in the space of one day!

Like I say, no config changes, no reboots / postmaster restarts, no extra 
processes, and every machine has a comfortable overhead of free page slots + 
relations.
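For context, a quick way to confirm those settings on each node (database name is a placeholder):

psql -d mydb -c 'SHOW vacuum_cost_delay;'
psql -d mydb -c 'SHOW max_fsm_pages;'
psql -d mydb -c 'SHOW max_fsm_relations;'
# a database-wide VACUUM VERBOSE also ends with a free space map summary
# measured against the last two limits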

From a few days ago:
2006-10-20 03:04:29 UTC INFO:  Allocation: found 786856 removable, 4933448 nonremovable row versions in 53461 pages
2006-10-20 03:04:29 UTC DETAIL:  0 dead row versions cannot be removed yet.
2006-10-20 03:07:32 UTC INFO:  index allocation_pkey now contains 4933448 row versions in 93918 pages
2006-10-20 03:07:32 UTC DETAIL:  786856 index row versions were removed.
2006-10-20 03:14:21 UTC INFO:  index ix_date now contains 4933448 row versions in 74455 pages
2006-10-20 03:14:21 UTC DETAIL:  786856 index row versions were removed.
2006-10-20 03:22:32 UTC INFO:  index ix_dateprice now contains 4933448 row versions in 81313 pages
2006-10-20 03:22:32 UTC DETAIL:  786856 index row versions were removed.
2006-10-20 03:24:41 UTC INFO:  index ix_dateroom now contains 4933448 row versions in 44610 pages
2006-10-20 03:24:41 UTC DETAIL:  786856 index row versions were removed.
2006-10-20 03:27:52 UTC INFO:  index ix_room now contains 4933448 row versions in 35415 pages
2006-10-20 03:27:52 UTC DETAIL:  786856 index row versions were removed.
2006-10-20 03:31:43 UTC INFO:  Allocation: moved 348324 row versions, truncated 53461 to 46107 pages
2006-10-20 03:31:43 UTC DETAIL:  CPU 4.72s/17.63u sec elapsed 230.81 sec.

From last night:
2006-10-26 01:00:30 UTC INFO:  vacuuming public.Allocation
2006-10-26 01:00:36 UTC INFO:  Allocation: found 774057 removable, 4979938 nonremovable row versions in 53777 pages
2006-10-26 01:00:36 UTC DETAIL:  0 dead row versions cannot be removed yet.
2006-10-26 01:06:18 UTC INFO:  index allocation_pkey now contains 4979938 row versions in 100800 pages
2006-10-26 01:06:18 UTC DETAIL:  774057 index row versions were removed.
2006-10-26 01:19:22 UTC INFO:  index ix_date now contains 4979938 row versions in 81630 pages
2006-10-26 01:19:22 UTC DETAIL:  774057 index row versions were removed.
2006-10-26 01:35:17 UTC INFO:  index ix_dateprice now contains 4979938 row versions in 87750 pages
2006-10-26 01:35:17 UTC DETAIL:  774057 index row versions were removed.
2006-10-26 01:41:27 UTC INFO:  index ix_dateroom now contains 4979938 row versions in 46320 pages
2006-10-26 01:41:27 UTC DETAIL:  774057 index row versions were removed.
2006-10-26 01:48:18 UTC INFO:  index ix_room now contains 4979938 row versions in 36513 pages
2006-10-26 01:48:18 UTC DETAIL:  774057 index row versions were removed.
2006-10-26 01:56:35 UTC INFO:  Allocation: moved 322744 row versions, truncated 53777 to 46542 pages
2006-10-26 01:56:35 UTC DETAIL:  CPU 4.21s/15.90u sec elapsed 496.30 sec.

As you can see, the amount of system + user CPU time for these runs is
comparable, but the elapsed (real) time has more than doubled.

This isn't even a case of the cost-based vacuum delay being too aggressive,
because I already have vacuum_cost_delay = 0 on all machines to make the
vacuum run as quickly as possible.

Any ideas warmly received! :)

Cheers,
Gavin.




Re: [PERFORM] VACUUMs take twice as long across all nodes

2006-10-26 Thread Tom Lane
Gavin Hamill [EMAIL PROTECTED] writes:
 Nodes 2 and 3 take only the tables necessary to run our search (10 out
 of the full 130) and are much lighter (only 7GB on disk cf. 30GB for
 the full master), yet the nightly VACUUM FULL has jumped from 2 hours
 to 4 in the space of one day!

I guess the most useful question to ask is why are you doing VACUUM FULL?
Plain VACUUM should be considerably faster, and for the level of row
turnover shown by your log, there doesn't seem to be a reason to use FULL.
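For comparison, a sketch of the two forms (database name is a placeholder):

# plain VACUUM: marks dead rows reusable via the free space map, no exclusive table lock
psql -d mydb -c 'VACUUM ANALYZE;'
# VACUUM FULL: compacts each table but holds an exclusive lock on it for the duration
psql -d mydb -c 'VACUUM FULL ANALYZE;'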

regards, tom lane



Re: [PERFORM] VACUUMs take twice as long across all nodes

2006-10-26 Thread Gavin Hamill
On Thu, 26 Oct 2006 10:47:21 -0400
Tom Lane [EMAIL PROTECTED] wrote:

 Gavin Hamill [EMAIL PROTECTED] writes:
  Nodes 2 and 3 take only the tables necessary to run our search (10
  out of the full 130) and are much lighter (only 7GB on disk cf.
  30GB for the full master), yet the nightly VACUUM FULL has jumped
  from 2 hours to 4 in the space of one day!
 
 I guess the most useful question to ask is why are you doing VACUUM
 FULL? Plain VACUUM should be considerably faster, and for the level
 of row turnover shown by your log, there doesn't seem to be a reason
 to use FULL.

I do FULL on the 'light' clients simply because 'I can'. The example
posted was a poor choice - the other tables have a larger churn.

Anyway, once it starts, the load balancer takes it out of rotation so
no love is lost.

The same behaviour is shown on the 'heavy' clients (master + 2 slaves)
which take all tables - although I cannot afford to VACUUM FULL on
there, the usual VACUUM ANALYZE has begun to take vastly more time
since yesterday than in the many previous months we've been using pg.

Cheers,
Gavin.



Re: [PERFORM] VACUUMs take twice as long across all nodes

2006-10-26 Thread Jim C. Nasby
On Thu, Oct 26, 2006 at 04:06:09PM +0100, Gavin Hamill wrote:
 On Thu, 26 Oct 2006 10:47:21 -0400
 Tom Lane [EMAIL PROTECTED] wrote:
 
  Gavin Hamill [EMAIL PROTECTED] writes:
   Nodes 2 and 3 take only the tables necessary to run our search (10
   out of the full 130) and are much lighter (only 7GB on disk cf.
   30GB for the full master), yet the nightly VACUUM FULL has jumped
   from 2 hours to 4 in the space of one day!
  
  I guess the most useful question to ask is why are you doing VACUUM
  FULL? Plain VACUUM should be considerably faster, and for the level
  of row turnover shown by your log, there doesn't seem to be a reason
  to use FULL.
 
 I do FULL on the 'light' clients simply because 'I can'. The example
 posted was a poor choice - the other tables have a larger churn.
 
 Anyway, once it starts, the load balancer takes it out of rotation so
 no love is lost.
 
 The same behaviour is shown on the 'heavy' clients (master + 2 slaves)
 which take all tables - although I cannot afford to VACUUM FULL on
 there, the usual VACUUM ANALYZE has begun to take vastly more time
 since yesterday than in the many previous months we've been using pg.

Are you sure that there's nothing else happening on the machine that
could affect the vacuum times? Like, say a backup? Or perhaps updates
coming in from Slony that didn't used to be there?
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



Re: [PERFORM] VACUUMs take twice as long across all nodes

2006-10-26 Thread Gavin Hamill
On Thu, 26 Oct 2006 14:17:29 -0500
Jim C. Nasby [EMAIL PROTECTED] wrote:

 Are you sure that there's nothing else happening on the machine that
 could affect the vacuum times? Like, say a backup? Or perhaps updates
 coming in from Slony that didn't used to be there?

I'm absolutely certain. The backups run from only one slave, given that
it is a full copy of node 1. Our overnight traffic has not increased
any, and the nightly backups show that the overall size of the DB has
not increased more than usual growth.

Plus, I have fairly verbose logging, and it's not showing anything out
of the ordinary. 

Like I said, it's one of those awful hypothesis/hand-waving problems :)

Cheers,
Gavin.



Re: [PERFORM] VACUUMs take twice as long across all nodes

2006-10-26 Thread Jim C. Nasby
On Thu, Oct 26, 2006 at 09:35:56PM +0100, Gavin Hamill wrote:
 On Thu, 26 Oct 2006 14:17:29 -0500
 Jim C. Nasby [EMAIL PROTECTED] wrote:
 
  Are you sure that there's nothing else happening on the machine that
  could affect the vacuum times? Like, say a backup? Or perhaps updates
  coming in from Slony that didn't used to be there?
 
 I'm absolutely certain. The backups run from only one slave, given that
 it is a full copy of node 1. Our overnight traffic has not increased
 any, and the nightly backups show that the overall size of the DB has
 not increased more than usual growth.
 
 Plus, I have fairly verbose logging, and it's not showing anything out
 of the ordinary. 
 
 Like I said, it's one of those awful hypothesis/hand-waving problems :)

Well, the fact that it's happening on all your nodes leads me to think
Slony is somehow involved. Perhaps it suddenly decided to change how
often it's issuing syncs? I know it issues vacuums as well, so maybe
that's got something to do with it... (though I'm guessing you've
already looked in pg_stat_activity/logs to see if anything
correlates...) Still, it might be worth asking about this on the slony
list...
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



Re: [PERFORM] VACUUMs take twice as long across all nodes

2006-10-26 Thread Andrew Sullivan
On Thu, Oct 26, 2006 at 09:35:56PM +0100, Gavin Hamill wrote:
 
 I'm absolutely certain. The backups run from only one slave, given that
 it is a full copy of node 1. Our overnight traffic has not increased
 any, and the nightly backups show that the overall size of the DB has
 not increased more than usual growth.

A couple things from your posts:

1.  Don't do VACUUM FULL, please.  It takes longer, and blocks
other things while it's going on, which might mean you're having
table bloat in various slony-related tables.

2.  Are your slony logs showing increased time too?  Are your
targets getting further behind?

3.  Your backups from the slave aren't done with pg_dump,
right?

But I suspect Slony has a role here, too.  I'd look carefully at the
slony tables -- especially the sl_log and pg_listen things, which
both are implicated.
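A sketch of the kind of check being suggested (database name is a placeholder; sl_log_1 is the usual Slony-I 1.x log table name and pg_listener is the LISTEN/NOTIFY catalog in this era of PostgreSQL - adjust to your install):

# size in pages and rough row counts, as of the last VACUUM/ANALYZE
psql -d mydb -c "SELECT relname, relpages, reltuples FROM pg_class WHERE relname IN ('sl_log_1', 'pg_listener');"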

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
The year's penultimate month is not in truth a good way of saying
November.
--H.W. Fowler
