[PERFORM] partitioned table and ORDER BY indexed_field DESC LIMIT 1

2007-08-24 Thread Anton
Hi.

I just created a partitioned table, n_traf, sliced by month
(n_traf_y2007m01, n_traf_y2007m02, and so on; see below). The partitions
are indexed on the 'date_time' column.
Then I populated it (the latest values have dates in 2007-08) and ran
VACUUM ANALYZE on all of the n_traf_y2007... partitions.

Now I try to select the latest value (ORDER BY date_time DESC LIMIT 1),
but Postgres produces this ugly plan:

=# explain SELECT * FROM n_traf ORDER BY date_time DESC LIMIT 1;
                                               QUERY PLAN
----------------------------------------------------------------------------------------------------------
 Limit  (cost=824637.69..824637.69 rows=1 width=32)
   ->  Sort  (cost=824637.69..838746.44 rows=5643499 width=32)
         Sort Key: public.n_traf.date_time
         ->  Result  (cost=0.00..100877.99 rows=5643499 width=32)
               ->  Append  (cost=0.00..100877.99 rows=5643499 width=32)
                     ->  Seq Scan on n_traf  (cost=0.00..22.30 rows=1230 width=32)
                     ->  Seq Scan on n_traf_y2007m01 n_traf  (cost=0.00..22.30 rows=1230 width=32)
                     ->  Seq Scan on n_traf_y2007m02 n_traf  (cost=0.00..22.30 rows=1230 width=32)
                     ->  Seq Scan on n_traf_y2007m03 n_traf  (cost=0.00..22.30 rows=1230 width=32)
                     ->  Seq Scan on n_traf_y2007m04 n_traf  (cost=0.00..1.01 rows=1 width=32)
                     ->  Seq Scan on n_traf_y2007m05 n_traf  (cost=0.00..9110.89 rows=509689 width=32)
                     ->  Seq Scan on n_traf_y2007m06 n_traf  (cost=0.00..32003.89 rows=1790489 width=32)
                     ->  Seq Scan on n_traf_y2007m07 n_traf  (cost=0.00..33881.10 rows=1895510 width=32)
                     ->  Seq Scan on n_traf_y2007m08 n_traf  (cost=0.00..25702.70 rows=1437970 width=32)
                     ->  Seq Scan on n_traf_y2007m09 n_traf  (cost=0.00..22.30 rows=1230 width=32)
                     ->  Seq Scan on n_traf_y2007m10 n_traf  (cost=0.00..22.30 rows=1230 width=32)
                     ->  Seq Scan on n_traf_y2007m11 n_traf  (cost=0.00..22.30 rows=1230 width=32)
                     ->  Seq Scan on n_traf_y2007m12 n_traf  (cost=0.00..22.30 rows=1230 width=32)
(18 rows)


Why doesn't it use the indexes at all?
---

A simpler query against a single partition is fast and uses the index:
=# explain analyze SELECT * FROM n_traf_y2007m08 ORDER BY date_time
DESC LIMIT 1;

                                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..0.03 rows=1 width=32) (actual time=0.156..0.158 rows=1 loops=1)
   ->  Index Scan Backward using n_traf_y2007m08_date_time_login_id on n_traf_y2007m08  (cost=0.00..39489.48 rows=1437970 width=32) (actual time=0.150..0.150 rows=1 loops=1)
 Total runtime: 0.241 ms
(3 rows)

Table n_traf looks like this:
=# \d n_traf
 Table "public.n_traf"
   Column|Type | Modifiers
-+-+
 login_id| integer | not null
 traftype_id | integer | not null
 date_time   | timestamp without time zone | not null
 bytes_in| bigint  | not null default 0
 bytes_out   | bigint  | not null default 0
Indexes:
"n_traf_login_id_key" UNIQUE, btree (login_id, traftype_id, date_time)
"n_traf_date_time_login_id" btree (date_time, login_id)
Foreign-key constraints:
"n_traf_login_id_fkey" FOREIGN KEY (login_id) REFERENCES
n_logins(login_id) ON UPDATE CASCADE ON DELETE CASCADE
"n_traf_traftype_id_fkey" FOREIGN KEY (traftype_id) REFERENCES
n_traftypes(traftype_id) ON UPDATE CASCADE
Rules:
n_traf_insert_y2007m01 AS
ON INSERT TO n_traf
   WHERE new.date_time >= '2007-01-01'::date AND new.date_time <
'2007-02-01 00:00:00'::timestamp without time zone DO INSTEAD
  INSERT INTO n_traf_y2007m01 (login_id, traftype_id, date_time,
bytes_in, bytes_out)
  VALUES (new.login_id, new.traftype_id, new.date_time, new.bytes_in,
new.bytes_out)
n_traf_insert_y2007m02 AS
ON INSERT TO n_traf
   WHERE new.date_time >= '2007-02-01'::date AND new.date_time <
'2007-03-01 00:00:00'::timestamp without time zone DO INSTEAD
  INSERT INTO n_traf_y2007m02 (login_id, traftype_id, date_time,
bytes_in, bytes_out)
  VALUES (new.login_id, new.traftype_id, new.date_time, new.bytes_in,
new.bytes_out)
n_traf_insert_y2007m03 AS
ON INSERT TO n_traf
   WHERE new.date_time >= '2007-03-01'::date AND new.date_time <
'2007-04-01 00:00:00'::timestamp without time zone DO INSTEAD
  INSERT INTO n_traf_y2007m03 (login_id, traftype_id, date_time,
bytes_in, bytes_out)
  VALUES (new.login_id, new.traftype_id, new.date_time, new.bytes_in,
new.bytes_out)
n_traf_insert_y2007m04 AS
ON INSERT TO n_traf
   WHERE new.date_time >= '2007-04-01'::date AND new.date_time <
'2007-05-01 00:00:00'::timestamp without time z
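
The \d output above is truncated. For reference, a monthly child table in a
setup like this is typically created along the following lines. This is only a
sketch: the actual partition DDL (and whether CHECK constraints were added)
was not posted, but the index name matches the one in the EXPLAIN ANALYZE
output above.

-- Hypothetical DDL for one monthly partition (not from the original post).
CREATE TABLE n_traf_y2007m08 (
    CHECK (date_time >= DATE '2007-08-01' AND date_time < DATE '2007-09-01')
) INHERITS (n_traf);

CREATE INDEX n_traf_y2007m08_date_time_login_id
    ON n_traf_y2007m08 (date_time, login_id);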

Re: [PERFORM] partitioned table and ORDER BY indexed_field DESC LIMIT 1

2007-08-24 Thread Mikko Partio
On 8/24/07, Anton <[EMAIL PROTECTED]> wrote:
>
> Hi.
>
> I just created partitioned table, n_traf, sliced by month
> (n_traf_y2007m01, n_traf_y2007m02... and so on, see below). They are
> indexed by 'date_time' column.
> Then I populate it (last value have date 2007-08-...) and do VACUUM
> ANALYZE ON n_traf_y2007... all of it.
>
> Now I try to select latest value (ORDER BY date_time LIMIT 1), but
> Postgres produced the ugly plan:
>
> =# explain SELECT * FROM n_traf ORDER BY date_time DESC LIMIT 1;
>QUERY PLAN
>
> -
> Limit  (cost=824637.69..824637.69 rows=1 width=32)
>->  Sort  (cost=824637.69..838746.44 rows=5643499 width=32)
>  Sort Key: public.n_traf.date_time
>  ->  Result  (cost=0.00..100877.99 rows=5643499 width=32)
>->  Append  (cost=0.00..100877.99 rows=5643499 width=32)
>  ->  Seq Scan on n_traf  (cost=0.00..22.30
> rows=1230 width=32)
>  ->  Seq Scan on n_traf_y2007m01 n_traf
> (cost=0.00..22.30 rows=1230 width=32)
>  ->  Seq Scan on n_traf_y2007m02 n_traf
> (cost=0.00..22.30 rows=1230 width=32)
>  ->  Seq Scan on n_traf_y2007m03 n_traf
> (cost=0.00..22.30 rows=1230 width=32)
>  ->  Seq Scan on n_traf_y2007m04 n_traf
> (cost=0.00..1.01 rows=1 width=32)
>  ->  Seq Scan on n_traf_y2007m05 n_traf
> (cost=0.00..9110.89 rows=509689 width=32)
>  ->  Seq Scan on n_traf_y2007m06 n_traf
> (cost=0.00..32003.89 rows=1790489 width=32)
>  ->  Seq Scan on n_traf_y2007m07 n_traf
> (cost=0.00..33881.10 rows=1895510 width=32)
>  ->  Seq Scan on n_traf_y2007m08 n_traf
> (cost=0.00..25702.70 rows=1437970 width=32)
>  ->  Seq Scan on n_traf_y2007m09 n_traf
> (cost=0.00..22.30 rows=1230 width=32)
>  ->  Seq Scan on n_traf_y2007m10 n_traf
> (cost=0.00..22.30 rows=1230 width=32)
>  ->  Seq Scan on n_traf_y2007m11 n_traf
> (cost=0.00..22.30 rows=1230 width=32)
>  ->  Seq Scan on n_traf_y2007m12 n_traf
> (cost=0.00..22.30 rows=1230 width=32)
> (18 rows)
>
>
> Why it no uses indexes at all?
> ---




I'm no expert, but I'd guess that the planner doesn't know which
partition holds the latest time, so it has to read them all.
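
For what it's worth, constraint exclusion only helps when a WHERE clause can
be matched against CHECK constraints on the child tables; a bare
ORDER BY ... LIMIT gives the planner nothing to prune on. A sketch, assuming
the monthly partitions carry matching CHECK constraints (the original post
does not show whether they do):

-- A qualifying WHERE clause lets the planner skip partitions (8.1+);
-- ORDER BY date_time DESC LIMIT 1 alone does not.
SET constraint_exclusion = on;
EXPLAIN SELECT * FROM n_traf
WHERE date_time >= DATE '2007-08-01'
ORDER BY date_time DESC LIMIT 1;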

Regards

MP


Re: [PERFORM] partitioned table and ORDER BY indexed_field DESC LIMIT 1

2007-08-24 Thread Anton
> > =# explain SELECT * FROM n_traf ORDER BY date_time DESC LIMIT 1;
> >QUERY PLAN
> -
> > Limit  (cost=824637.69..824637.69 rows=1 width=32)
> >->  Sort  (cost=824637.69..838746.44 rows=5643499 width=32)
> >  Sort Key: public.n_traf.date_time
> >  ->  Result  (cost=0.00..100877.99 rows=5643499 width=32)
> >->  Append  (cost= 0.00..100877.99 rows=5643499 width=32)
> >  ->  Seq Scan on n_traf  (cost=0.00..22.30
> > rows=1230 width=32)
> >  ->  Seq Scan on n_traf_y2007m01 n_traf
> > (cost=0.00..22.30 rows=1230 width=32)
...
> >  ->  Seq Scan on n_traf_y2007m12 n_traf
> > (cost=0.00..22.30 rows=1230 width=32)
> > (18 rows)
> >
> > Why it no uses indexes at all?
> > ---
> I'm no expert but I'd guess that the the planner doesn't know which
> partition holds the latest time so it has to read them all.

Agreed. But why doesn't it use the indexes when it reads them?

-- 
engineer



Re: [PERFORM] partitioned table and ORDER BY indexed_field DESC LIMIT 1

2007-08-24 Thread Heikki Linnakangas
Anton wrote:
>>> =# explain SELECT * FROM n_traf ORDER BY date_time DESC LIMIT 1;
>>>QUERY PLAN
>> -
>>> Limit  (cost=824637.69..824637.69 rows=1 width=32)
>>>->  Sort  (cost=824637.69..838746.44 rows=5643499 width=32)
>>>  Sort Key: public.n_traf.date_time
>>>  ->  Result  (cost=0.00..100877.99 rows=5643499 width=32)
>>>->  Append  (cost= 0.00..100877.99 rows=5643499 width=32)
>>>  ->  Seq Scan on n_traf  (cost=0.00..22.30
>>> rows=1230 width=32)
>>>  ->  Seq Scan on n_traf_y2007m01 n_traf
>>> (cost=0.00..22.30 rows=1230 width=32)
> ...
>>>  ->  Seq Scan on n_traf_y2007m12 n_traf
>>> (cost=0.00..22.30 rows=1230 width=32)
>>> (18 rows)
>>>
>>> Why it no uses indexes at all?
>>> ---
>> I'm no expert but I'd guess that the the planner doesn't know which
>> partition holds the latest time so it has to read them all.
> 
> Agree. But why it not uses indexes when it reading them?

The planner isn't smart enough to push the "ORDER BY ... LIMIT ..."
below the append node. Therefore it needs to fetch all rows from all the
tables, and the fastest way to do that is a seq scan.
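
Until the planner can do that, one workaround (a sketch, not something
proposed in this thread) is to take the newest row from each partition
yourself, so that every child can use its own index, and then pick the
overall winner:

-- Each parenthesized branch can use its partition's date_time index;
-- the outer query keeps only the newest of the per-partition results.
SELECT *
FROM (
    (SELECT * FROM n_traf_y2007m07 ORDER BY date_time DESC LIMIT 1)
    UNION ALL
    (SELECT * FROM n_traf_y2007m08 ORDER BY date_time DESC LIMIT 1)
    -- ... one branch per partition ...
) AS latest_per_partition
ORDER BY date_time DESC
LIMIT 1;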

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [PERFORM] When/if to Reindex

2007-08-24 Thread Steven Flatt
On 8/24/07, Mark Kirkwood <[EMAIL PROTECTED]> wrote:
>
> Tom Lane wrote:
> >
> > The fly in the ointment is that after collecting the pg_index definition
> > of the index, plancat.c also wants to know how big it is --- it calls
> > RelationGetNumberOfBlocks.  And that absolutely does look at the
> > physical storage, which means it absolutely is unsafe to do in parallel
> > with a REINDEX that will be dropping the old physical storage at some
> > point.
>
> A suggestion that seems a bit like a leap backwards in time - maybe just
> use the pg_class.relpages entry for the index size?


Just throwing this out there (looking from a higher level)...

Why do we even need to consider calling RelationGetNumberOfBlocks or looking
at the pg_class.relpages entry?  My understanding of the expected behaviour
is that while a reindex is happening, all queries run against the parent
table are planned as though the index isn't there (i.e. it's unusable).
This may/will result in sub-optimal query plans, but the point is that
reindex never blocks readers.  I'm not sure whether, from an implementation
standpoint, it's easy to mark an index as "being reindexed", in which case
the planner would simply skip it.

Steve


Re: [PERFORM] When/if to Reindex

2007-08-24 Thread Tom Lane
"Steven Flatt" <[EMAIL PROTECTED]> writes:
>> The fly in the ointment is that after collecting the pg_index definition
>> of the index, plancat.c also wants to know how big it is --- it calls
>> RelationGetNumberOfBlocks.

> Why do we even need to consider calling RelationGetNumberOfBlocks or looking
> at the pg_class.relpages entry?  My understanding of the expected behaviour
> is that while a reindex is happening, all queries run against the parent
> table are planned as though the index isn't there (i.e. it's unusable).

Where in the world did you get that idea?

If we had a REINDEX CONCURRENTLY it might work that way.  A normal
REINDEX cannot "mark" anything because it runs within a single
transaction; there is no way that it can emit any catalog changes
that will be visible before it's over.

regards, tom lane



Re: [PERFORM] partitioned table and ORDER BY indexed_field DESC LIMIT 1

2007-08-24 Thread Luke Lonergan
We just fixed this - I'll post a patch, but I don't have time to verify
against HEAD.

- Luke


On 8/24/07 3:38 AM, "Heikki Linnakangas" <[EMAIL PROTECTED]> wrote:

> Anton wrote:
 =# explain SELECT * FROM n_traf ORDER BY date_time DESC LIMIT 1;
QUERY PLAN
>>> 
>>> -
 Limit  (cost=824637.69..824637.69 rows=1 width=32)
->  Sort  (cost=824637.69..838746.44 rows=5643499 width=32)
  Sort Key: public.n_traf.date_time
  ->  Result  (cost=0.00..100877.99 rows=5643499 width=32)
->  Append  (cost= 0.00..100877.99 rows=5643499 width=32)
  ->  Seq Scan on n_traf  (cost=0.00..22.30
 rows=1230 width=32)
  ->  Seq Scan on n_traf_y2007m01 n_traf
 (cost=0.00..22.30 rows=1230 width=32)
>> ...
  ->  Seq Scan on n_traf_y2007m12 n_traf
 (cost=0.00..22.30 rows=1230 width=32)
 (18 rows)
 
 Why it no uses indexes at all?
 ---
>>> I'm no expert but I'd guess that the the planner doesn't know which
>>> partition holds the latest time so it has to read them all.
>> 
>> Agree. But why it not uses indexes when it reading them?
> 
> The planner isn't smart enough to push the "ORDER BY ... LIMIT ..."
> below the append node. Therefore it needs to fetch all rows from all the
> tables, and the fastest way to do that is a seq scan.





Re: [PERFORM] partitioned table and ORDER BY indexed_field DESC LIMIT 1

2007-08-24 Thread Luke Lonergan
Below is a patch against 8.2.4 (more or less); Heikki, can you take a look at
it?

This enables the use of an index scan on a child table by recognizing the sort
order of the append node.  Kurt Harriman did the work.

- Luke

Index: cdb-pg/src/backend/optimizer/path/indxpath.c
===
RCS file: 
/data/FISHEYE_REPOSITORIES/greenplum/cvsroot/cdb2/cdb-pg/src/backend/optimiz
er/path/indxpath.c,v
diff -u -N -r1.22 -r1.22.2.1
--- cdb-pg/src/backend/optimizer/path/indxpath.c25 Apr 2007 22:07:21
-1.22
+++ cdb-pg/src/backend/optimizer/path/indxpath.c10 Aug 2007 03:41:15
-1.22.2.1
@@ -379,8 +379,51 @@
 index_pathkeys = build_index_pathkeys(root, index,
   ForwardScanDirection,
   true);
-useful_pathkeys = truncate_useless_pathkeys(root, rel,
-index_pathkeys);
+/*
+ * CDB: For appendrel child, pathkeys contain Var nodes in
terms 
+ * of the child's baserel.  Transform the pathkey list to refer
to 
+ * columns of the appendrel.
+ */
+if (rel->reloptkind == RELOPT_OTHER_MEMBER_REL)
+{
+AppendRelInfo  *appinfo = NULL;
+RelOptInfo *appendrel = NULL;
+ListCell   *appcell;
+CdbPathLocus    notalocus;
+
+/* Find the appendrel of which this baserel is a child. */
+foreach(appcell, root->append_rel_list)
+{
+appinfo = (AppendRelInfo *)lfirst(appcell);
+if (appinfo->child_relid == rel->relid)
+break;
+}
+Assert(appinfo);
+appendrel = find_base_rel(root, appinfo->parent_relid);
+
+/*
+ * The pathkey list happens to have the same format as the
+ * partitioning key of a Hashed locus, so by disguising it
+ * we can use cdbpathlocus_pull_above_projection() to do
the 
+ * transformation.
+ */
+CdbPathLocus_MakeHashed(&notalocus, index_pathkeys);
+notalocus =
+cdbpathlocus_pull_above_projection(root,
+   notalocus,
+   rel->relids,
+   rel->reltargetlist,
+  
appendrel->reltargetlist,
+   appendrel->relid);
+if (CdbPathLocus_IsHashed(notalocus))
+index_pathkeys = truncate_useless_pathkeys(root,
appendrel,
+  
notalocus.partkey);
+else
+index_pathkeys = NULL;
+}
+
+useful_pathkeys = truncate_useless_pathkeys(root, rel,
+index_pathkeys);
 }
 else
 useful_pathkeys = NIL;
Index: cdb-pg/src/backend/optimizer/path/pathkeys.c
===
RCS file: 
/data/FISHEYE_REPOSITORIES/greenplum/cvsroot/cdb2/cdb-pg/src/backend/optimiz
er/path/pathkeys.c,v
diff -u -N -r1.18 -r1.18.2.1
--- cdb-pg/src/backend/optimizer/path/pathkeys.c30 Apr 2007 05:44:07
-1.18
+++ cdb-pg/src/backend/optimizer/path/pathkeys.c10 Aug 2007 03:41:15
-1.18.2.1
@@ -1403,55 +1403,53 @@
 {
 PathKeyItem*item;
 Expr   *newexpr;
+AttrNumber  targetindex;
 
 Assert(pathkey);
 
-/* Use constant expr if available.  Will be at head of list. */
-if (CdbPathkeyEqualsConstant(pathkey))
+/* Find an expr that we can rewrite to use the projected columns. */
+item = cdbpullup_findPathKeyItemInTargetList(pathkey,
+ relids,
+ targetlist,
+ &targetindex); // OUT
+
+/* If not found, see if the equiv class contains a constant expr. */
+if (!item &&
+CdbPathkeyEqualsConstant(pathkey))
 {
 item = (PathKeyItem *)linitial(pathkey);
 newexpr = (Expr *)copyObject(item->key);
 }
 
-/* New vars for old! */
-else
-{
-AttrNumber  targetindex;
+/* Fail if no usable expr. */
+else if (!item)
+return NULL;
 
-/* Find an expr that we can rewrite to use the projected columns.
*/
-item = cdbpullup_findPathKeyItemInTargetList(pathkey,
- relids,
- targetlist,
- &targetindex); // OUT
-if 

Re: [PERFORM] When/if to Reindex

2007-08-24 Thread Steven Flatt
On 8/24/07, Tom Lane <[EMAIL PROTECTED]> wrote:
>
> "Steven Flatt" <[EMAIL PROTECTED]> writes:
> > Why do we even need to consider calling RelationGetNumberOfBlocks or
> looking
> > at the pg_class.relpages entry?  My understanding of the expected
> behaviour
> > is that while a reindex is happening, all queries run against the parent
> > table are planned as though the index isn't there (i.e. it's unusable).
>
> Where in the world did you get that idea?


Maybe that's what I was *hoping* the behaviour would be. :)

From the docs:
"REINDEX locks out writes but not reads of the index's parent table."
"It also takes an exclusive lock on the specific index being processed..."

I believe those two statements imply that reads of the parent table don't
take any lock whatsoever on the index being processed, i.e. they ignore it.

> If we had a REINDEX CONCURRENTLY it might work that way.  A normal
> REINDEX cannot "mark" anything because it runs within a single
> transaction; there is no way that it can emit any catalog changes
> that will be visible before it's over.
>
... but I understand this difficulty.

So, can we simply trust what's in pg_class.relpages and ignore looking
directly at the index?  This is a fairly serious concern for us, that
reindex is blocking all readers of the parent table.

Thanks,
Steve


Re: [PERFORM] When/if to Reindex

2007-08-24 Thread Tom Lane
"Steven Flatt" <[EMAIL PROTECTED]> writes:
> So, can we simply trust what's in pg_class.relpages and ignore looking
> directly at the index?

No, we can't.  In the light of morning I remember more about the reason
for the aforesaid patch: it's actually unsafe to read the pg_class row
at all if you have not got lock on the index.  We are reading with
SnapshotNow in order to be sure we see up-to-date info, and that means
that a concurrent update of the row (eg, for REINDEX to report the new
relfilenode) can have the following behavior:

1. REINDEX inserts the new modified version of the index's pg_class row.

2. Would-be reader process visits the new version of the pg_class row.
   It's not committed yet, so we ignore it and continue scanning.

3. REINDEX commits.

4. Reader process visits the old version of the pg_class row.  It's
   now committed dead, so we ignore it and continue scanning.

5. Reader process bombs out with a complaint about no pg_class row for
   the index.

So we really have to have the lock.

> This is a fairly serious concern for us, that
> reindex is blocking all readers of the parent table.

I'm afraid you're kinda stuck: I don't see any fix that would be
practical to put into 8.2, or even 8.3 considering that it's way too
late to be thinking of implementing REINDEX CONCURRENTLY for 8.3.

You might be able to work around it for now by faking such a reindex
"by hand"; that is, create a duplicate new index under a different
name using CREATE INDEX CONCURRENTLY, then exclusive-lock the table
for just long enough to drop the old index and rename the new one
to match.
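
A sketch of that sequence (index, table, and column names here are
placeholders; if the old index backs a unique or primary key constraint,
more care is needed):

-- Build the replacement without blocking readers or writers (8.2).
CREATE INDEX CONCURRENTLY my_index_new ON my_table (my_column);

-- Swap it in; the exclusive lock is held only for the drop and rename.
BEGIN;
LOCK TABLE my_table IN ACCESS EXCLUSIVE MODE;
DROP INDEX my_index;
ALTER INDEX my_index_new RENAME TO my_index;
COMMIT;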

It's probably worth asking also how badly you really need routine
reindexing.  Are you certain your app still needs that with 8.2,
or is it a hangover from a few releases back?  Could more aggressive
(auto)vacuuming provide a better solution?

regards, tom lane



Re: [PERFORM] When/if to Reindex

2007-08-24 Thread Steven Flatt
On 8/24/07, Tom Lane <[EMAIL PROTECTED]> wrote:

> You might be able to work around it for now by faking such a reindex
> "by hand"; that is, create a duplicate new index under a different
> name using CREATE INDEX CONCURRENTLY, then exclusive-lock the table
> for just long enough to drop the old index and rename the new one
> to match.


This is a good suggestion, one that we had thought of earlier.  Looks like
it might be time to try it out and observe system impact.




> It's probably worth asking also how badly you really need routine
> reindexing.  Are you certain your app still needs that with 8.2,
> or is it a hangover from a few releases back?  Could more aggressive
> (auto)vacuuming provide a better solution?


Routine reindexing was added (recently, since moving to 8.2) as more of an
optimization than a necessity.  If the idea above doesn't work for us or
causes locking issues, then we could always do away with the periodic
reindexing.  That would be unfortunate, because reindexing has proven to be
quite a nice optimization for us.  We've observed up to 40% space savings
(after setting the fillfactor to 100, then reindexing) along with general
improvement in read performance (although hard to quantify).

As mentioned earlier in this thread, we're only reindexing insert-only
partitioned tables, once they're fully loaded.
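
For an insert-only partition that is fully loaded, that might look roughly
like this (a sketch; the index name is a placeholder, and
ALTER INDEX ... SET (fillfactor) requires 8.2):

-- Pack the index pages completely, then rebuild so the setting takes effect.
ALTER INDEX part_2007m08_date_time_idx SET (FILLFACTOR = 100);
REINDEX INDEX part_2007m08_date_time_idx;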

Thanks for your help.

Steve


[PERFORM] significant vacuum issues - looking for suggestions

2007-08-24 Thread Kevin Kempter
Hi List;

I've just started working with a client that has been running Postgres (with
no DBA) for a few years. They're running version 8.1.4 on 4-way Dell boxes
with 4 GB of memory on each box, attached to RAID-10 disk arrays.

Some of their key config settings are here:
shared_buffers = 20480
work_mem = 16384
maintenance_work_mem = 32758
wal_buffers = 24
checkpoint_segments = 32
checkpoint_timeout = 300
checkpoint_warning = 30
effective_cache_size = 524288
autovacuum = on
autovacuum_naptime = 60
autovacuum_vacuum_threshold = 500
autovacuum_analyze_threshold = 250
autovacuum_vacuum_scale_factor = 0.2
autovacuum_analyze_scale_factor = 0.1
autovacuum_vacuum_cost_delay = -1
autovacuum_vacuum_cost_limit = -1


Currently I've found that they have approx 17 tables that show a significant
amount of bloat in the system. The worst one shows over 5 million pages worth
of dead space. One of the problems is that their system is so busy with
activity during the day and massive data load processes at night that they
often kill the pid of vacuum processes kicked off by autovacuum, because the
overall load impact prevents users from logging into the app: the login
process includes at least one db query that then seems to hang because there
are anywhere from 100 - 300 queries ahead of it at any given time. Normally a
user gets logged in with an avg wait of 5 - 10 seconds, but when a
long-running vacuum is going (sometimes prolonged by a long-running update
process that's trying to sort/update > 40 million rows), the system gets to a
state where the login queries never get executed until the vacuum process is
killed.

As a result of this I believe that the biggest table (the one with > 5million 
pages worth of dead space) has never been vacuumed to completion. I suspect 
this is the case for a few of the other top dead space tables as well but I 
can't be sure. 

My first priority was to get this vacuum scenario cleaned up. First off I 
added the biggest table into pg_autovacuum and set the enabled column to 
false ('f'). Then I set vacuum_cost_delay to 10 and in the same session 
ran "vacuum analyze verbose big_table".  This ran for 7.5 hours before we had 
to kill it due to system load - and to make matters worse the high system 
load was forcing many of the nightly batch queries that load, update, etc the 
data to stack up to a point where the system was at less than 2% idle (CPU) 
for the next 4 hours and barely responding to the command line.

To make matters worse, I found out this morning that the db has used 85% of
its available transaction IDs - again because a vacuum of the entire db has
never been completed.
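
(For reference, the distance to transaction ID wraparound can be checked per
database with something like the following; a generic sketch, not a query
from the original report:)

-- age() approaching ~2 billion means a database-wide VACUUM is urgent (works on 8.1).
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY age(datfrozenxid) DESC;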

As far as I can tell, the overall db size is currently 199G of which approx 
104G seems to be valid data.

Here's my thoughts per how to proceed:

=
1) Fix the big table ASAP (probably over the weekend), since it's not only the
biggest table but also the most active, like this:

   a) run a pg_dump of this table

   b)  restore this dump into a new table (i.e. new_big_table)

   c) lock the original big_table, sync any changes, inserts, deletes since we 
did the dump from big_table into new_big_table

   d) drop big_table

   e) re-name new_big_table to big_table (see the SQL sketch after this list)

* I may run through this for a few of the other large, highly active tables 
that have minimal page density as well.
=
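
In SQL, the cut-over in steps c) through e) might look roughly like this
(a sketch; how the "sync any changes" step is done depends on how changes
since the dump are tracked, and any indexes, constraints and permissions
must already exist on new_big_table):

BEGIN;
LOCK TABLE big_table IN ACCESS EXCLUSIVE MODE;
-- (apply any inserts/updates/deletes made since the dump to new_big_table here)
DROP TABLE big_table;
ALTER TABLE new_big_table RENAME TO big_table;
COMMIT;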


The development folks that have been here awhile tell me that it seems like 
when they have a query (not limited to vacuum processes) that has been 
running for a long time (i.e. > 5 or 6 hours) that the query sort of "goes 
crazy" and the entire system gets pegged until they kill that process. - I've 
not heard of this but I suspect upgrading to 8.2.4 is probably a good plan at 
this point as well, so for step 2, I'll do this:

=
2) (obviously I'll do this in dev first, then in QA and finally in prod)
   a) install verson 8.2.4 from source, leaving 8.1.4 in place

   b) create the new 8.2.4 cluster on a new port

   c) setup WAL archiving on the 8.1.4 cluster

   d) do a full dump of the 8.1.4 cluster and restore it to the new 8.2.4 
cluster

   e) stop the 8.2.4 cluster and bring it up in recovery mode, pointing it to 
the directory where we're archiving the 8.1.4 cluster's WAL segments.

   f) once caught up, bring both clusters down

   g) copy any final files from the 8.1.4 cluster's pg_xlog directory into the 
new 8.2.4 pg_xlog dir (is this ok, since I'm moving 8.1.4 version tx logs 
into an 8.2.4 xlog dir?)

   h) Change the port on the 8.2.4 cluster to what the original 8.1.4 cluster 
port was

   i) bring up the new 8.2.4 system, and actively manage the vacuum needs 
moving fwd via a combination of autovacuum, cron processes for specififed 
table vac's (daily, hourly, 15min, 5min, etc), and as needed interactive 
session vacuums
=


The src based install will allow me to setup a rob

Re: [PERFORM] When/if to Reindex

2007-08-24 Thread Gregory Stark

"Tom Lane" <[EMAIL PROTECTED]> writes:

> "Steven Flatt" <[EMAIL PROTECTED]> writes:
>> So, can we simply trust what's in pg_class.relpages and ignore looking
>> directly at the index?
>
> No, we can't.  In the light of morning I remember more about the reason
> for the aforesaid patch: it's actually unsafe to read the pg_class row
> at all if you have not got lock on the index.  We are reading with
> SnapshotNow in order to be sure we see up-to-date info, and that means
> that a concurrent update of the row (eg, for REINDEX to report the new
> relfilenode) can have the following behavior:

Should reindex be doing an in-place update? Don't we have to do in-place
updates for other system catalogs which are read with SnapshotNow for precisely
the same reasons?

Alternatively, why does the planner need access to the pg_class entry and not
just the pg_index record?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [PERFORM] significant vacuum issues - looking for suggestions

2007-08-24 Thread Bill Moran
In response to Kevin Kempter <[EMAIL PROTECTED]>:

> Hi List;
> 
> I've just started working with a client that has been running Postgres (with 
> no DBA) for a few years. They're running  version 8.1.4 on 4-way dell boxes 
> with 4Gig of memory on each box attached to RAID-10 disk arrays. 
> 
> Some of their key config settings are here:
> shared_buffers = 20480
> work_mem = 16384
> maintenance_work_mem = 32758

Before you do any of those other things, bump shared_buffers to about
12 and maintenance_work_mem to 25 or so -- unless this box
has other applications on it using significant amounts of those 4G of
RAM.  You may find that these changes alone are enough to get vacuum
to complete.  You'll need to restart the server for the shared_buffers
setting to take effect.

Can you do a pg_relation_size() on the tables in question?
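
For example, something along these lines (a generic sketch; these size
functions are built in as of 8.1):

-- Heap size vs. total size including indexes and TOAST, for one suspect table.
SELECT pg_size_pretty(pg_relation_size('big_table'))       AS heap_size,
       pg_size_pretty(pg_total_relation_size('big_table')) AS total_size;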

> wal_buffers = 24
> checkpoint_segments = 32
> checkpoint_timeout = 300
> checkpoint_warning = 30
> effective_cache_size = 524288
> autovacuum = on
> autovacuum_naptime = 60
> autovacuum_vacuum_threshold = 500
> autovacuum_analyze_threshold = 250
> autovacuum_vacuum_scale_factor = 0.2
> autovacuum_analyze_scale_factor = 0.1
> autovacuum_vacuum_cost_delay = -1
> autovacuum_vacuum_cost_limit = -1
> 
> 
> Currently I've found that they have approx 17 tables that show a significant 
> amount of bloat in the system. The worst one showing over 5million  pages 
> worth of dead space. One of the problems is that their system is so busy with 
> activity during the day and massive data load processes at night that they 
> often kill the pid of vacuum processes kicked off by autovacuum because the 
> overall load impact disallows users from logging into the app since the login 
> process includes at least one db query that then seems to hang because there 
> are anywhere from 100 - 300 queries ahead of it at any given time. Normally a 
> user gets logged in with an avg wait of 5 - 10 seconds but when a long 
> running vacuum (sometimes due to a long running update process that's trying 
> to sort/update > 40million rows)  is going the system gets to a state where 
> the login queries never get executed until the vacuum process is killed.
> 
> As a result of this I believe that the biggest table (the one with > 5million 
> pages worth of dead space) has never been vacuumed to completion. I suspect 
> this is the case for a few of the other top dead space tables as well but I 
> can't be sure. 
> 
> My first priority was to get this vacuum scenario cleaned up. First off I 
> added the biggest table into pg_autovacuum and set the enabled column to 
> false ('f'). Then I set vacuum_cost_delay to 10 and in the same session 
> ran "vacuum analyze verbose big_table".  This ran for 7.5 hours before we had 
> to kill it due to system load - and to make matters worse the high system 
> load was forcing many of the nightly batch queries that load, update, etc the 
> data to stack up to a point where the system was at less than 2% idle (CPU) 
> for the next 4 hours and barely responding to the command line.
> 
> To make matters worse I find out this morning that the db is at 85% per used 
> transaction ID's - again since a vacuum on the entire db has never been 
> completed. 
> 
> As far as I can tell, the overall db size is currently 199G of which approx 
> 104G seems to be valid data.
> 
> Here's my thoughts per how to proceed:
> 
> =
> 1) fix the big table ASAP (probably over the weekend) since it's not only the 
> biggest table but the most active like this:
> 
>a) run a pg_dump of this table
> 
>b)  restore this dump into a new table (i.e. new_big_table)
> 
>c) lock the original big_table, sync any changes, inserts, deletes since 
> we 
> did the dump from big_table into new_big_table
> 
>d) drop big_table
> 
>e) re-name new_big_table to big_table
> 
> * I may run through this for a few of the other large, highly active tables 
> that have minimal page density as well.
> =
> 
> 
> The development folks that have been here awhile tell me that it seems like 
> when they have a query (not limited to vacuum processes) that has been 
> running for a long time (i.e. > 5 or 6 hours) that the query sort of "goes 
> crazy" and the entire system gets pegged until they kill that process. - I've 
> not heard of this but I suspect upgrading to 8.2.4 is probably a good plan at 
> this point as well, so for step 2, I'll do this:
> 
> =
> 2) (obviously I'll do this in dev first, then in QA and finally in prod)
>a) install verson 8.2.4 from source, leaving 8.1.4 in place
> 
>b) create the new 8.2.4 cluster on a new port
> 
>c) setup WAL archiving on the 8.1.4 cluster
> 
>d) do a full dump of the 8.1.4 cluster and restore it to the new 8.2.4 
> cluster
> 
>e) stop the 8.2.4 cluster and bring it up in recovery mode, pointing it to 
> the directory where we're arch

Re: [PERFORM] significant vacuum issues - looking for suggestions

2007-08-24 Thread Kevin Grittner
>>> On Fri, Aug 24, 2007 at  2:57 PM, in message
<[EMAIL PROTECTED]>, Kevin Kempter
<[EMAIL PROTECTED]> wrote: 
>c) setup WAL archiving on the 8.1.4 cluster
> 
>d) do a full dump of the 8.1.4 cluster and restore it to the new 8.2.4 
> cluster
> 
>e) stop the 8.2.4 cluster and bring it up in recovery mode, pointing it 
> to 
> the directory where we're archiving the 8.1.4 cluster's WAL segments.

You can't use these techniques for a major version upgrade.
Use pg_dump piped to psql.  That will also eliminate all bloat.
 
-Kevin
 





Re: [PERFORM] significant vacuum issues - looking for suggestions

2007-08-24 Thread Bill Moran
In response to "Kevin Grittner" <[EMAIL PROTECTED]>:

> >>> On Fri, Aug 24, 2007 at  2:57 PM, in message
> <[EMAIL PROTECTED]>, Kevin Kempter
> <[EMAIL PROTECTED]> wrote: 
> >c) setup WAL archiving on the 8.1.4 cluster
> > 
> >d) do a full dump of the 8.1.4 cluster and restore it to the new 8.2.4 
> > cluster
> > 
> >e) stop the 8.2.4 cluster and bring it up in recovery mode, pointing it 
> > to 
> > the directory where we're archiving the 8.1.4 cluster's WAL segments.
> 
> You can't use these techniques for a major version upgrade.
> Use pg_dump piped to psql.  That will also eliminate all bloat.

If you can't afford any downtime, you may be able to use Slony to
do your upgrade.  However, slony adds overhead, and if this system
is tapped out already, it may not tolerate the additional overhead.

-- 
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

[EMAIL PROTECTED]
Phone: 412-422-3463x4023




Re: [PERFORM] significant vacuum issues - looking for suggestions

2007-08-24 Thread Tom Lane
Kevin Kempter <[EMAIL PROTECTED]> writes:
> The development folks that have been here awhile tell me that it seems like 
> when they have a query (not limited to vacuum processes) that has been 
> running for a long time (i.e. > 5 or 6 hours) that the query sort of "goes 
> crazy" and the entire system gets pegged until they kill that
> process. - I've not heard of this 

Me either, but I wonder whether their queries are tickling some memory
leak.  I could imagine that what they are seeing is the backend process
growing slowly until it starts to swap, and then continuing to grow and
needing more and more swap activity.  Once you get over the knee of that
curve, things get real bad real fast.  It might not be a bad idea to run
the postmaster under a (carefully chosen) ulimit setting to cut such
things off before the system starts swapping.  Other things to look at:

* what exactly gets "pegged" --- is it CPU or I/O bound?  Watching
"vmstat 1" is usually a good diagnostic since you can see CPU, swap,
and regular disk I/O activity at once.

* is there really not any pattern to the queries that cause the problem?
I don't think 8.1.4 has any widespread leakage problem, but they might
be tickling something isolated, in which case 8.2 is not necessarily
gonna fix it.  If you can produce a test case showing this behavior it'd
be time to call in pgsql-hackers.

Your other points seem pretty well covered by other replies.

regards, tom lane



Re: [PERFORM] When/if to Reindex

2007-08-24 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes:
> Should reindex be doing an in-place update?

Not if you'd like it to be crash-safe.

> Alternatively, why does the planner need access to the pg_class entry and not
> just the pg_index record?

For one thing, to find out how big the index is ... though if we could
get around that problem, it might indeed be possible to treat the
pg_index records as property of the parent table not the index itself,
which would give us license to read them without locking the index.

regards, tom lane



[PERFORM] Partioning tsearch2 a table into chunks and accessing via views

2007-08-24 Thread Benjamin Arai

Hi,

I have an application which loads millions of NEW documents each month
into a PostgreSQL tsearch2 table.  I have the initial version completed
and searching performance is great, but my problem is that each time a new
month rolls around I have to drop all the indexes, do a COPY, and re-index
the entire table. This is problematic considering that each month takes
longer than the previous one to rebuild the indexes, and the application is
unavailable during the rebuilding process.

In order to avoid the re-indexing I was thinking of instead creating a new
table each month (building its indexes and etc.) and accessing them all
through a view. This way I only have to index the new data each month.
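
Concretely, the idea might look something like this (a sketch with made-up
table and column names; each monthly table carries its own tsearch2 index):

-- Each month gets its own table and full-text index; a view unions them for searching.
CREATE INDEX docs_2007_08_fti_idx ON docs_2007_08 USING gist (body_tsv);

CREATE OR REPLACE VIEW docs AS
    SELECT * FROM docs_2007_07
    UNION ALL
    SELECT * FROM docs_2007_08;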

Does this work?  Does a view with N tables make it N times slower for
tsearch2 queries? Is there a better solution?

Benjamin



Re: [PERFORM] Partioning tsearch2 a table into chunks and accessing via views

2007-08-24 Thread Brandon Shalton

Benjamin,




> In order to avoid the re-indexing I was thinking of instead creating a new
> table each month (building its indexes and etc) and accessing them all
> through a view. This way I only have to index the new data each month.



Take a look at bizgres.org (based on postgres).

They have a parent-child structure.

The way I use it is that I have about 30M records a day inserted into
the database.


Each day is a "child" table to the "parent".

so example:

the parent table is called  "logfile"

each day, is a child, with the structure like  "logfile_YYMMDD"

the "child" inherits the table structure of the parent, such that you could 
query the child table name directly, or you run the query against the parent 
(ie. logfile table) and get all the data.


the indexes are done on a per-table basis, so the new data that comes in is a
smaller amount and doesn't require re-indexing everything that came before.
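
As a sketch, the layout described above looks like this (names follow the
example; the column list is hypothetical):

-- The parent holds no data of its own; each day's child inherits its structure.
CREATE TABLE logfile (
    log_time   timestamp NOT NULL,   -- hypothetical columns
    datafield  text
);

CREATE TABLE logfile_070825 () INHERITS (logfile);

-- Indexes are created per child, so only the new day's data ever needs indexing.
CREATE INDEX logfile_070825_datafield_idx ON logfile_070825 (datafield);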



example:

select * from logfile_070825 where datafield = 'foo'

if i knew i wanted to specifically go into that child, or:

select * from logfile where datafield = 'foo'

and all child tables are searched and the results merged.  You can perform any
kind of SQL query and use field structures as you normally do.


the downside is that the queries are run sequentially.

so if you had 100 child tables, each table is queried via indexes, then 
results are merged.


But this approach does allow me to dump a lot of data in without having the
re-indexing issues you are facing.


at some point, you could roll up the days, in to weekly child tables, then 
monthly tables, etc.


I believe Bizgres has a new version of their system that does parallel 
queries which would certainly speed things up.


For your documents, you can do it by the day it was checked in, or maybe you 
have some other way of logically grouping, but the parent/child table 
structure really helped to solve my problem of adding in millions of records 
each day.


The closest thing in mysql is using merge tables, which is not really 
practical when it comes time to do the joins to the tables.


-brandon

http://www.t3report.com - marketing intelligence for online marketing and 
affiliate programs













Re: [GENERAL] [PERFORM] Partioning tsearch2 a table into chunks and accessing via views

2007-08-24 Thread Joshua D. Drake

Brandon Shalton wrote:
> Benjamin,
> 
> 
>>
>> In order to avoid the re-indexing I was thinking of instead creating 
>> a new
>> table each month (building its indexes and etc) and accessing them all
>> through a view. This way I only have to index the new data each month.
>>
> 
> Take a look at bizgres.org (based on postgres).
> 
> They have a parent-child structure.
> 
> The way i use it, is I have about 30M records a day that are inserted
> into the database.
> 
> Each day is a "child" table to the "parent".
> 
> so example:
> 
> the parent table is called  "logfile"
> 
> each day, is a child, with the structure like  "logfile_YYMMDD"
> 
> the "child" inherits the table structure of the parent, such that you
> could query the child table name directly, or you run the query against
> the parent (ie. logfile table) and get all the data.
> 
> the indexes are done on a per table basis, so new data that comes in, is
> a lesser amount, and doesn't require re-indexing.


PostgreSQL can do all of this too.

Sincerely,

Joshua D. Drake

> 
> 
> example:
> 
> select * from logfile_070825 where datafield = 'foo'
> 
> if i knew i wanted to specifically go into that child, or:
> 
> select * from logfile where datafield = 'foo'
> 
> and all child tables are searched and results merged.  You can perform
> any kind of sql query and field structures are you normally do.
> 
> the downside is that the queries are run sequentially.
> 
> so if you had 100 child tables, each table is queried via indexes, then
> results are merged.
> 
> but, this approach does allow me to dump alot of data in, without having
> the re-indexing issues you are facing.
> 
> at some point, you could roll up the days, in to weekly child tables,
> then monthly tables, etc.
> 
> I believe Bizgres has a new version of their system that does parallel
> queries which would certainly speed things up.
> 
> For your documents, you can do it by the day it was checked in, or maybe
> you have some other way of logically grouping, but the parent/child
> table structure really helped to solve my problem of adding in millions
> of records each day.
> 
> The closest thing in mysql is using merge tables, which is not really
> practical when it comes time to do the joins to the tables.
> 
> -brandon
> 
> http://www.t3report.com - marketing intelligence for online marketing
> and affiliate programs
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> ---(end of broadcast)---
> TIP 6: explain analyze is your friend
> 


- --

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/




Re: [GENERAL] [PERFORM] Partioning tsearch2 a table into chunks and accessing via views

2007-08-24 Thread Benjamin Arai


This is kind of disappointing; I was hoping there was more that could be
done.

There has to be another way to do incremental indexing without
losing that much performance.


Benjamin

On Aug 24, 2007, at 6:29 PM, Joshua D. Drake wrote:


> Brandon Shalton wrote:
>> the indexes are done on a per table basis, so new data that comes in,
>> is a lesser amount, and doesn't require re-indexing.
>
> PostgreSQL can do all of this too.






Re: [PERFORM] significant vacuum issues - looking for suggestions

2007-08-24 Thread Kevin Kempter
On Friday 24 August 2007 15:39:22 Tom Lane wrote:
> Kevin Kempter <[EMAIL PROTECTED]> writes:
> > The development folks that have been here awhile tell me that it seems
> > like when they have a query (not limited to vacuum processes) that has
> > been running for a long time (i.e. > 5 or 6 hours) that the query sort of
> > "goes crazy" and the entire system gets pegged until they kill that
> > process. - I've not heard of this
>
> Me either, but I wonder whether their queries are tickling some memory
> leak.  I could imagine that what they are seeing is the backend process
> growing slowly until it starts to swap, and then continuing to grow and
> needing more and more swap activity.  Once you get over the knee of that
> curve, things get real bad real fast.  It might not be a bad idea to run
> the postmaster under a (carefully chosen) ulimit setting to cut such
> things off before the system starts swapping.  Other things to look at:
>
> * what exactly gets "pegged" --- is it CPU or I/O bound?  Watching
> "vmstat 1" is usually a good diagnostic since you can see CPU, swap,
> and regular disk I/O activity at once.
>
> * is there really not any pattern to the queries that cause the problem?
> I don't think 8.1.4 has any widespread leakage problem, but they might
> be tickling something isolated, in which case 8.2 is not necessarily
> gonna fix it.  If you can produce a test case showing this behavior it'd
> be time to call in pgsql-hackers.
>
> Your other points seem pretty well covered by other replies.
>
>   regards, tom lane

Thanks everyone for the help. I'll first up the memory settings like Bill 
suggested and then see where I'm at. Moving fwd I'll see if I have a test 
case that I can re-create, plus I may try constraining the postmaster via a 
ulimit setting, again based on what I see once the cluster is allowed to use 
the memory it should have been given up front.

/Kevin

