Re: [PERFORM][OT] Best suiting OS

2009-10-05 Thread Csaba Nagy
Hi Jean-David,

On Mon, 2009-10-05 at 15:37 +0200, Jean-David Beyer wrote:
 Robert Haas wrote (in part):
 
  Also, I'd just like to mention that vi is a much better editor than
  emacs.
  
 That is not my impression. I have used vi from when it first came out (I
 used ed before that) until about 1998 when I first installed Linux on one of
 my machines and started using emacs. I find that for some tasks involving
 global editing, that vi is a lot easier to use. But for most of the things I
 do on a regular basis, if find emacs better. So, for me, it is not which is
 the better editor, but which is the better editor for the task at hand.

You are probably absolutely right, but Robert only wanted to point out,
in his own subtle way, that this conversation is heading in the
flame-war direction...

Cheers,
Csaba.





Re: [PERFORM] TB-sized databases

2007-11-30 Thread Csaba Nagy
 Isn't that what statement_timeout is for? Since this is entirely based
 on estimates, using arbitrary fuzzy numbers for this seems fine to me;
 precision isn't really the goal.

There's an important difference from statement_timeout: this proposal
would avoid consuming any resources at all if the planner estimates the
query can't be executed in acceptable time, whereas statement_timeout
still lets a bad query run for the full statement_timeout before it gets
cancelled...
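
To illustrate the difference (the cost-threshold setting is shown under
a made-up name, since nothing like it exists today; statement_timeout
and its error message are real, big_table is a placeholder):

-- statement_timeout: the query still starts and consumes resources
-- until the timeout fires.
SET statement_timeout = '30s';
SELECT count(*) FROM big_table;
-- ERROR:  canceling statement due to statement timeout   (after 30s of work)

-- the proposal: reject the query up front based on the planner's
-- estimate, before any work is done (hypothetical parameter name).
SET error_cost_threshold = 1000000;   -- NOT a real GUC, illustration only
SELECT count(*) FROM big_table;
-- ERROR:  estimated cost exceeds threshold                (hypothetical message)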

Cheers,
Csaba.





Re: [PERFORM] TB-sized databases

2007-11-29 Thread Csaba Nagy
On Thu, 2007-11-29 at 10:45 -0500, Tom Lane wrote:
 Given that this list spends all day every day discussing cases where the
 planner is wrong, I'd have to think that that's a bet I wouldn't take.
 
 You could probably avoid this risk by setting the cutoff at something
 like 100 or 1000 times what you really want to tolerate, but how
 useful is it then?

It would still be useful in the sense that if the planner is producing
wrong estimates you must correct that somehow... raise the statistics
target, rewrite the query or do some other tweaking; in any case you
should do something. An error is sometimes better than gradually
degrading performance caused by, for example, a statistics target that
is too low. So if the error is thrown because of a wrong estimate, it is
still a valid error, raising a signal that the DBA has to do something
about it.
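
For example, the usual correction when the estimate is off for one
column (table and column names here are placeholders):

-- raise the per-column statistics target, then re-collect statistics
ALTER TABLE big_table ALTER COLUMN some_column SET STATISTICS 1000;
ANALYZE big_table;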

It's still true that if the planner estimates too low, it will raise no
error and will consume the resources. But that's just what we have now,
so it wouldn't be a regression of any kind...

Cheers,
Csaba.





Re: [PERFORM] TB-sized databases

2007-11-28 Thread Csaba Nagy
On Wed, 2007-11-28 at 08:27 -0500, Bill Moran wrote:
 Is there something wrong with:
 set enable_seqscan = off
 ?

Nothing wrong with enable_seqscan = off except that it is an
all-or-nothing kind of thing... if you want the big table to never use a
seqscan, but a medium table which is joined in should use one, then what
do you do? And setting enable_seqscan = off does not actually mean the
planner can't use a sequential scan for the query if no other
alternative exists. In any case it doesn't mean "please throw an error
if you can't do this without a sequential scan".
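
A small illustration of both points (table names are placeholders):

SET enable_seqscan = off;  -- affects every table in every query of this session
EXPLAIN SELECT *
FROM big_table b
JOIN medium_table m ON m.id = b.m_id;
-- the planner may still pick a seq scan where no index applies; the
-- scan is only given a prohibitively high cost, and no error is raised
RESET enable_seqscan;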

In fact an even more useful option would be to ask the planner to throw
an error if the expected cost exceeds a certain threshold...

Cheers,
Csaba.





Re: [PERFORM] TB-sized databases

2007-11-28 Thread Csaba Nagy
On Wed, 2007-11-28 at 08:54 -0500, Bill Moran wrote:
  Nothing wrong with enable_seqscan = off except it is all or nothing type
  of thing...
 
 If that's true, then I have a bug report to file:
[snip]
 It looks to me to be session-alterable.

I didn't mean that it can't be set per session; I meant that it is not
fine-grained enough to affect only the problem table, it affects _all_
tables in a query... and big tables are rarely alone in a query.

Cheers,
Csaba.





Re: [PERFORM] autovacuum: recommended?

2007-11-16 Thread Csaba Nagy
On Fri, 2007-11-16 at 12:13 +0100, Tobias Brox wrote:
 [snip] should i use both auto-vacuum and 
  manual-vacuum?

I would say for 8.2 that's the best strategy (which might change with
8.3 and its multiple vacuum workers).

 That being said, we have some huge tables in our database and pretty
 much traffic, and got quite some performance problems when the
 autovacuum kicked in and started vacuuming those huge tables, so we're
 currently running without.  Autovacuum can be tuned to not touch those
 tables, but we've chosen to leave it off.

We are doing that here, i.e. we set up autovacuum not to touch the big
tables, cover those with nightly vacuums if there is still some activity
on them, and run one weekly complete vacuum of the whole DB (vacuum
without other params, preferably as the postgres user so the system
tables are covered too).

In fact we also have a few very frequently updated small tables; those
are covered by very frequent crontab vacuums as well, because in 8.2
autovacuum can spend quite some time vacuuming some medium-sized tables,
and in that interval the small but frequently updated ones get bloated.
This should be better in 8.3 with multiple autovacuum workers.

To disable autovacuum for individual tables, search for pg_autovacuum in
the docs.
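
For example, something along these lines disables autovacuum for one big
table on 8.2 (a sketch from memory of the pg_autovacuum catalog; check
the docs for the exact column list of your version, and note that 8.3
replaced this catalog with per-table storage parameters):

-- -1 means "use the system default" for the threshold/cost columns
INSERT INTO pg_autovacuum
       (vacrelid, enabled,
        vac_base_thresh, vac_scale_factor,
        anl_base_thresh, anl_scale_factor,
        vac_cost_delay, vac_cost_limit,
        freeze_min_age, freeze_max_age)
VALUES ('big_table'::regclass, false, -1, -1, -1, -1, -1, -1, -1, -1);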

I would say that autovacuum + disabling autovacuum on big tables +
nightly vacuum + weekly vacuumdb + frequent crontab vacuums of the very
heavily updated small tables works well in 8.2. One thing which could
also be needed is to schedule continuous vacuuming of big tables which
are frequently updated, with big delay settings to throttle the
resources used by the vacuum. We don't need that here because we don't
update our big tables frequently...

Cheers,
Csaba.





Re: [PERFORM] Searching for the cause of a bad plan

2007-09-28 Thread Csaba Nagy
On Thu, 2007-09-27 at 11:07 -0700, Ron Mayer wrote:
 Csaba Nagy wrote:
  
  Well, my problem was actually solved by rising the statistics target,
 
 Would it do more benefit than harm if postgres increased the
 default_statistics_target?
 
 I see a fair number of people (myself included) asking questions who's
 resolution was to ALTER TABLE SET STATISTICS; and I think relatively
 fewer (if any?) people concerned about space in pg_statistic
 or people improving analyze time by reducing the statistics target.

Well, the cost of raising the statistics target is far from zero: with
all defaults the analyze time was ~10 seconds, with one column set to
100 it was ~1.5 minutes, and with one column set to 1000 it was 15
minutes for the table in question (a few 100M rows). Of course the IO
load must have been proportional to the timings... so I'm pretty sure
the current default serves most situations well.

Cheers,
Csaba.







Re: [PERFORM] Searching for the cause of a bad plan

2007-09-28 Thread Csaba Nagy
 Just an idea, but with the 8.3 concurrent scan support would it be
 possible to hang a more in depth analyze over exisiting sequential
 scans. Then it would be a lower cost to have higher resolution in
 the statistics because the I/O component would be hidden.

The biggest problem with that is that it wouldn't be deterministic...
the table in question from my original post is never scanned
sequentially in normal operation. The other way around is also possible:
when sequential scans are too frequent, you wouldn't want to analyze all
the time either. So there would be a need for logic about when to
piggyback an analyze on a sequential scan and when to do it proactively
without waiting for one... and I'm not sure it would be worth the
complexity.

I think it would be much more productive if long-running query tracking,
combined with a background planner thread, did targeted analyzes for
specific correlations/distributions/conditions based on the queries
actually running on the system.

Cheers,
Csaba.





Re: [PERFORM] Searching for the cause of a bad plan

2007-09-27 Thread Csaba Nagy
On Wed, 2007-09-26 at 11:22 -0400, Tom Lane wrote:
 ... how
 many values of a are there really, and what's the true distribution of
 counts?  

table_a has 23366 distinct values. Some statistics (using R):

 summary(table_a_histogram)
   a count
 Min.   : 7857   Min.   :  1  
 1st Qu.:73628   1st Qu.:  9  
 Median :700011044   Median : 22  
 Mean   :622429573   Mean   :  17640  
 3rd Qu.:700018020   3rd Qu.:391  
 Max.   :83349   Max.   :3347707  


I'm not sure what you want to see in terms of distribution of counts, so
I created 2 plots: a against the count for each distinct a value, and a
histogram of the log of the counts (without the log it's not really
readable). I hope they'll make it through to the list...

 Do the plan estimates get closer to reality if you set a higher
 statistics target?

The results of setting higher statistics targets are attached too. I
can't tell whether the stats are closer to reality or not, but the plan
changes in any case...

Cheers,
Csaba.


attachment: table_a_counts.png
attachment: table_a_counts_histogram.png

db=# alter table temp_table_a ALTER a set statistics 100;

db=# analyze verbose temp_table_a;
INFO:  analyzing public.temp_table_a
INFO:  temp_table_a: scanned 3 of 655299 pages, containing 1887 live
rows and 0 dead rows; 3 rows in sample, 412183071 estimated total rows

db=# select * from pg_stats where tablename = 'temp_table_a';
 schemaname | tablename | attname | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation
 (the wide output was wrapped beyond recovery in the archive)

Re: [PERFORM] Searching for the cause of a bad plan

2007-09-27 Thread Csaba Nagy
On Thu, 2007-09-27 at 10:40 -0400, Tom Lane wrote:
 And yet there's another trap here: if the parameter you passed in
 chanced to be one of the very common values, a plan that was optimized
 for a small number of matches would perform terribly.
 
 We've speculated about trying to deal with these types of situations
 by switching plans on-the-fly at runtime, but that's just blue-sky
 dreaming at the moment.  In the short run, if boosting the stats target
 doesn't result in acceptable plans, there may be no real solution other
 than to avoid parameterized queries on this column.

Well, my problem was actually solved by raising the statistics target,
thanks to Simon for suggesting it. The problem is that it's quite hard
to tell (for a non-postgres-developer) which column needs a higher
statistics target when a multi-join query doesn't work as expected...

Apropos switching plans on the fly and blue-sky dreaming... IIRC, there
were some plans to cache plans in shared mode for the whole cluster, not
just per backend.

What about allowing the user to prepare a plan offline, i.e. without
actually executing it (via some variant of PREPARE CACHED or so), and
letting the planner do more exhaustive cost estimation, possibly even
analyzing specific tables for correlations and such, on the grounds that
the whole thing is done only once and reused many times? The resulting
plan could also contain turning points for parameter values, which would
switch between different variants of the plan, so it can be more
specific about parameter values even if planned generically... and it
could set up dependencies on the relevant statistics on which it bases
its decisions, so it gets invalidated when those statistics presumably
change by more than some threshold, and possibly a background planner
thread re-plans it after the necessary analyze steps are run again.

If there is a background planner, it could also collect long-running
query statistics and automatically create cached plans for the worst
offenders, and possibly generate warnings like "missing index" or "you
should cluster this table".

The fast planner would still be needed for interactive queries which are
not yet prepared, so new interactive queries don't pay the unpredictable
cost of hard planning. If those run fast enough, they will never get
prepared, they don't need to be... otherwise they would be passed to the
background planner to be exhaustively (or at least more thoroughly)
analyzed...

One other thing I dream of would be some way to tell postgres that a
query should run in batch mode or in interactive mode, i.e. whether it
should be optimized for best throughput or for fast startup; in the
second case great care should also be taken to avoid the worst-case
scenarios. I know there's a strong feeling against query hints around
here, but this one could fly as a GUC parameter, which could be given a
default value in the config file (batch for a data warehouse,
interactive for an OLTP application) and could also be set per session.

Ok, that's about the dreaming...

Cheers,
Csaba.





Re: [PERFORM] Searching for the cause of a bad plan

2007-09-26 Thread Csaba Nagy
 Csaba, please can you copy that data into fresh tables, re-ANALYZE and
 then re-post the EXPLAINs, with stats data.

Here you go, fresh experiment attached.

Cheers,
Csaba.


db=# \d temp_table_a
     Table "public.temp_table_a"
 Column |  Type  | Modifiers 
--------+--------+-----------
 a      | bigint | not null
 b      | bigint | not null
Indexes:
    "temp_pk_table_a" PRIMARY KEY, btree (a, b)

db=# \d temp_table_b1
    Table "public.temp_table_b1"
 Column |  Type  | Modifiers 
--------+--------+-----------
 b      | bigint | not null
Indexes:
    "temp_pk_table_b1" PRIMARY KEY, btree (b)

db=# \d temp_table_b2
    Table "public.temp_table_b2"
 Column |  Type  | Modifiers 
--------+--------+-----------
 b      | bigint | not null
Indexes:
    "temp_pk_table_b2" PRIMARY KEY, btree (b)
Foreign-key constraints:
    "temp_fk_table_b2_b1" FOREIGN KEY (b) REFERENCES temp_table_b1(b)

db=# analyze verbose temp_table_a;
INFO:  analyzing public.temp_table_a
INFO:  temp_table_a: scanned 3000 of 655299 pages, containing 1887000 live 
rows and 0 dead rows; 3000 rows in sample, 412183071 estimated total rows

db=# analyze verbose temp_table_b1;
INFO:  analyzing public.temp_table_b1
INFO:  temp_table_b1: scanned 3000 of 57285 pages, containing 2232000 live 
rows and 0 dead rows; 3000 rows in sample, 42620040 estimated total rows

db=# analyze verbose temp_table_b2;
INFO:  analyzing public.temp_table_b2
INFO:  temp_table_b2: scanned 57 of 57 pages, containing 41967 live rows and 
0 dead rows; 3000 rows in sample, 41967 estimated total rows


db=# select * from pg_stats where tablename = 'temp_table_a';
-[ RECORD 1 ]-----+------------------------------------------------------------
schemaname        | public
tablename         | temp_table_a
attname           | a
null_frac         | 0
avg_width         | 8
n_distinct        | 1261
most_common_vals  | {74117,700022128,72317,79411,700023682,76025,72843,700014833,76505,78694}
most_common_freqs | {0.015,0.0116667,0.0097,0.0097,0.0097,0.009,0.0087,0.008,0.0077,0.0077}
histogram_bounds  | {70010872,70035,73086,75843,78974,700011369,700013305,700015988,700019048,700022257,83151}
correlation       | 0.850525
-[ RECORD 2 ]-----+------------------------------------------------------------
schemaname        | public
tablename         | temp_table_a
attname           | b
null_frac         | 0
avg_width         | 8
n_distinct        | -1
most_common_vals  | 
most_common_freqs | 
histogram_bounds  | {41708986,700707712,803042997,7004741432,7007455842,7009719495,7013869874,7016501748,7019139288,7025078292,7037930133}
correlation       | 0.646759


db=# select * from pg_stats where tablename = 'temp_table_b1';
-[ RECORD 1 ]-----+------------------------------------------------------------
schemaname        | public
tablename         | temp_table_b1
attname           | b
null_frac         | 0
avg_width         | 8
n_distinct        | -1
most_common_vals  | 
most_common_freqs | 
histogram_bounds  | {41719236,801608645,7003211583,7007403678,7011591097,7016707278,7021089839,7025573684,7029316772,7033888226,8002470137}
correlation       | 0.343186


db=# select * from pg_stats where tablename = 'temp_table_b2';
-[ RECORD 1 ]-----+------------------------------------------------------------
schemaname        | public
tablename         | temp_table_b2
attname           | b
null_frac         | 0
avg_width         | 8
n_distinct        | -1
most_common_vals  | 
most_common_freqs | 
histogram_bounds  | {70054,7000352893,7000357745,7000362304,7000367025,7000371629,7000376587,7000381229,7009567724,7023749432,7034300740}
correlation       | -0.216073


db=# prepare test_001(bigint) as
db-# SELECT tb.*
db-# FROM temp_table_a ta 
db-# JOIN 

Re: [PERFORM] Searching for the cause of a bad plan

2007-09-24 Thread Csaba Nagy
On Mon, 2007-09-24 at 14:27 +0100, Simon Riggs wrote:
 Csaba, please can you copy that data into fresh tables, re-ANALYZE and
 then re-post the EXPLAINs, with stats data.

Well, I can of course. I actually tried to generate some random data
with similar record counts and relations between the tables (which I'm
not sure I succeeded at), without the extra columns, but it happily
yielded the nested loop plan. So I guess I really have to copy the whole
data (several tens of GB).

But from my very limited understanding of what information is available
to the planner, I thought that the estimated record count for the join
between table_a and table_b1 on column b should be something like:

(estimated record count in table_a for value a) * (weight of the b range
covered by both table_b1 and table_a) / (weight of the b range covered
by table_a)

This is assuming the b values in table_a were not correlated at all with
the content of table_b2. In reality they are, but the planner has no
information about that.

I have no idea how the planner works though, so this might be totally
off...

I will copy the data and send the results (though I'm not promising it
will be today).

Cheers,
Csaba.





Re: [PERFORM] select count(*) performance (vacuum did not help)

2007-09-24 Thread Csaba Nagy
On Mon, 2007-09-24 at 17:14 +0200, Gábor Farkas wrote:
 will i achieve the same thing by simply dropping that table and 
 re-creating it?

If you have an index/PK on that table, the fastest and most useful way
to rebuild it is to CLUSTER it on that index. That will be a lot faster
than VACUUM FULL and it will also order your table in index order... but
it will also lock the table in exclusive mode just as VACUUM FULL would.
If your table has just a few live rows and lots of junk in it, CLUSTER
should be fast enough. With 20K entries I would expect it to be fast
enough not to be a problem...
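
A sketch, with placeholder table/index names (the CLUSTER syntax shown
is the pre-8.3 one; newer releases also accept CLUSTER table USING
index):

CLUSTER queue_table_pkey ON queue_table;  -- rewrites the table in index order
ANALYZE queue_table;                      -- refresh statistics after the rewrite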

Cheers,
Csaba.





Re: [PERFORM] Linux mis-reporting memory

2007-09-21 Thread Csaba Nagy
On Fri, 2007-09-21 at 09:03 +0100, Gregory Stark wrote:
  Mem:  32945280k total, 32871832k used,73448k free,   247432k buffers
  Swap:  1951888k total,42308k used,  1909580k free, 30294300k cached
 
 It seems to imply Linux is paging out sysV shared memory. In fact some of
 Heikki's tests here showed that Linux would do precisely that.

But then why is it not reported in the Swap "used" column? It only
reports 42308k of used swap.

I have a box where I just executed a select count(*) three times on a
table which is ~5.5 GB on disk, and the count executed in 4 seconds,
which I take to mean it is all cached (shared memory is set to 12GB; I
use the box for testing for now, otherwise I would set it far lower
because I have bad experience with setting it to more than 1/3 of the
available memory). Top reported at the end of the process:

Mem:  16510724k total, 16425252k used,85472k free,10144k buffers
Swap:  7815580k total,   157804k used,  7657776k free, 15980664k cached

I also watched it during the selects, and it was not significantly
different. So my only conclusion is that the reported "cached" value
either includes the shared memory or is simply wrong... or I just don't
get how Linux handles memory.

Cheers,
Csaba.





Re: [PERFORM] Linux mis-reporting memory

2007-09-21 Thread Csaba Nagy
On Fri, 2007-09-21 at 10:43 +0100, Gregory Stark wrote:
 The other possibility is that Postgres just hasn't even touched a large part
 of its shared buffers. 
 

But then how do you explain the example I gave, with a 5.5GB table
seq-scanned 3 times, shared buffers set to 12 GB, and top still showing
almost 100% of memory as cached and no swap used? In this case you can't
say postgres didn't touch its shared buffers... or does a sequential
scan not use the shared buffers?

Cheers,
Csaba.





[PERFORM] Searching for the cause of a bad plan

2007-09-21 Thread Csaba Nagy
Hi all,

Postgres version: 8.2.4

Tables:

table_a(a bigint, b bigint, primary key(a, b) );

table_b1(b bigint primary key, more columns...);

table_b2(b bigint primary key references table_b1(b), more columns...);

table_b1: 
 ~ 27M rows;
 ~25 more columns;
 width=309 (as reported by explain select *);

table_a:
 ~400M rows;
 - column b should reference table_b1, but it does not for performance
reasons (it is an insert only table);
 - column a distinct values: 1148
 - has (a, b) as primary key;
 - has no additional columns;

table_b2:
 ~40K rows;
 ~70 more columns;
 width=1788 (as reported by explain select *);

Statistics for the involved columns of each table are attached in files
(to preserve the spacing). They were taken after analyzing the relevant
table (except for table_b2, where I added the fiddled statistics first
and then remembered to analyze fresh, resulting in the non_fiddled
version, which gives the same result as the fiddled one).

The problem query is:

prepare test_001(bigint) as
SELECT tb.*
FROM table_a ta 
JOIN table_b2 tb ON ta.b=tb.b
WHERE ta.a = $1  
ORDER BY ta.a, ta.b
limit 10;

Explain gives Plan 1 (see attached plans.txt)

If I set enable_hashjoin=off and enable_mergejoin=off, I get Plan 2
(again, see plans.txt).

The difference is a 30x improvement in the second case...
(I actually forgot to account for cache effects at first, but I later
reran the queries multiple times and the timings stay proportional.)

Additionally, if I replace table_b2 with table_b1 in the query, I get
Plan 3 (with reasonable execution time) with both enable_hashjoin and
enable_mergejoin on. So there is something which makes table_b2
different from table_b1 for planning purposes, but I could not identify
what that is... they have differences in statistics, but fiddling with
the stats gave me no difference in the plan.

Looking at Plan 2, it looks like the limit step is estimating its cost
wrongly. I guessed that it does so because it thinks the b values
selected from table_a for a given a span a larger range than the b
values in table_b2, since the b values in table_b2 are a (relatively
small) subset of the b values in table_a. But this is not the case: the
query only gets a values for which all the b values in table_a will be
found in table_b2. Of course the planner has no way to know this, but I
don't think that is the reason either, as I tried to copy the histogram
statistics in pg_statistic for column b from the entry for table_b1
(which contains the whole span of b values) to the entry for table_b2,
with no change in the plan.

Just for the record, this query is just a part of a more complex one,
which joins in bigger tables, resulting in even worse performance, but I
tracked the problem down to the nested loop being refused here.

Is there anything I could do to convince the planner to use the nested
loop plan here?

Thanks,
Csaba.

 attname           | a
 null_frac         | 0
 avg_width         | 8
 n_distinct        | 1148
 most_common_vals  | {31826743,31855101,31855343,31854918,31856328,31861573,31855122,31855130,31855189,31856426}
 most_common_freqs | {0.005,0.0047,0.0043,0.004,0.004,0.004,0.0037,0.0037,0.0037,0.0037}
 histogram_bounds  | {31734956,31854364,31854732,31855162,31855660,31857144,31858109,31858965,31859762,31860576,31861566}
 correlation       | 0.999608

 attname           | b
 null_frac         | 0
 avg_width         | 8
 n_distinct        | -1
 most_common_vals  | 
 most_common_freqs | 
 histogram_bounds  | {63977,36878147,42247866,42548692,42812320,46992026,51444368,55977972,56607708,59496742,68530614}
 correlation       | 0.602959

 attname | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation
 (second attachment truncated in the archive)

Re: [PERFORM] Searching for the cause of a bad plan

2007-09-21 Thread Csaba Nagy
On Fri, 2007-09-21 at 11:59 +0100, Simon Riggs wrote:
 Please re-run everything on clean tables without frigging the stats. We
 need to be able to trust what is happening is normal.

I did; the plan fiddling happened after getting the plans from a fresh
analyze, and I ran the query again after a fresh analyze just before
sending the mail and the plan was the same. In fact I spent almost 2
days playing with the query which triggers this behavior, until I
tracked it down to this join. Thing is, we have many queries which rely
on this join, so it is fairly important that we understand what happens
there.

 Plan2 sees that b1 is wider, which will require more heap blocks to be
 retrieved. It also sees b1 is less correlated than b2, so again will
 require more database blocks to retrieve. Try increasing
 effective_cache_size.

effective_cache_size is set to ~2.7G, the box has 4G memory. I increased
it now to 3.5G but it makes no difference. I increased it further to 4G,
no difference again.

 Can you plans with/without LIMIT and with/without cursor, for both b1
 and b2?

The limit is unfortunately an absolutely needed part of the query; it
makes no sense to try without it. If it were acceptable to run it
without the limit, then it is entirely possible that the plan I get now
would indeed be better... but it is not acceptable.

Thanks,
Csaba.





Re: [PERFORM] Linux mis-reporting memory

2007-09-21 Thread Csaba Nagy
On Fri, 2007-09-21 at 11:34 +0100, Heikki Linnakangas wrote:
 Which version of Postgres is this? In 8.3, a scan like that really won't
 suck it all into the shared buffer cache. For seq scans on tables larger
 than shared_buffers/4, it switches to the bulk read strategy, using only
  a few buffers, and choosing the starting point with the scan
 synchronization facility.
 
This was on 8.1.9, installed via apt-get on Debian 4.1.1-21. In any case
I'm pretty sure Linux swaps shared buffers, as I always got worse
performance with shared buffers set to more than about 1/3 of the
memory. But in that case the output of top is misleading.

Cheers,
Csaba.





Re: [PERFORM] Searching for the cause of a bad plan

2007-09-21 Thread Csaba Nagy
On Fri, 2007-09-21 at 12:34 +0100, Simon Riggs wrote:
 On Fri, 2007-09-21 at 13:29 +0200, Csaba Nagy wrote:
 
   Can you plans with/without LIMIT and with/without cursor, for both b1
   and b2?
  
  The limit is unfortunately absolutely needed part of the query
 
 Understood, but not why I asked...
 
Well, the same query without limit goes:

dbdop=# explain execute test_001(31855344);
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Sort  (cost=322831.85..322831.94 rows=36 width=1804)
   Sort Key: ta.a, ta.b
   ->  Hash Join  (cost=3365.60..322830.92 rows=36 width=1804)
         Hash Cond: (ta.b = tb.b)
         ->  Index Scan using pk_table_a on table_a ta
               (cost=0.00..314541.78 rows=389648 width=16)
               Index Cond: (a = $1)
         ->  Hash  (cost=524.71..524.71 rows=41671 width=1788)
               ->  Seq Scan on table_b2 tb
                     (cost=0.00..524.71 rows=41671 width=1788)


I'm not sure what you mean by "without cursor", maybe not using prepare?
Well, we set up the JDBC driver to always prepare the queries, as this
gives us much better worst-case plans than letting postgres see the
parameter values, especially in queries with limit. So I simulate that
when explaining the behavior we see. All our limit queries are for
interactive display, so the worst case is of much higher importance to
us than the mean execution time... unfortunately postgres has a tendency
to take the path with the best mean performance rather than avoid the
worst case, and it is not easy to convince it otherwise.

Cheers,
Csaba.







Re: [PERFORM] Searching for the cause of a bad plan

2007-09-21 Thread Csaba Nagy
[snip]

Ok, I was not able to follow your explanation; it goes too deep for me
into what the planner does...

 Incidentally, the way out of this is to improve the stats by setting
 stats target = 1000 on column a of ta. That will allow the optimizer to
 have a better estimate of the tail of the distribution of a, which
 should then be more sensibly reflected in the cost of the Index Scan.
 That doesn't solve the actual problem, but should help in your case.

OK, I can confirm that. I set the statistics target for column a on
table_a to 1000, analyzed, and got the plan below. The only downside is
that analyze became quite expensive on table_a: it took 15 minutes and
touched half of the pages... I will experiment with lower settings,
maybe it will work with less than 1000 too.

db=# explain analyze execute test_001(31855344);
                                   QUERY PLAN
------------------------------------------------------------------------------
 Limit  (cost=0.00..4499.10 rows=10 width=1804)
        (actual time=103.566..120.363 rows=2 loops=1)
   ->  Nested Loop  (cost=0.00..344630.97 rows=766 width=1804)
                    (actual time=103.563..120.359 rows=2 loops=1)
         ->  Index Scan using pk_table_a on table_a ta
               (cost=0.00..67097.97 rows=78772 width=16)
               (actual time=71.965..77.284 rows=2 loops=1)
               Index Cond: (a = $1)
         ->  Index Scan using pk_table_b2 on table_b2 tb
               (cost=0.00..3.51 rows=1 width=1788)
               (actual time=21.526..21.528 rows=1 loops=2)
               Index Cond: (ta.b = tb.b)
 Total runtime: 120.584 ms

Thanks,
Csaba.





Re: [PERFORM] Searching for the cause of a bad plan

2007-09-21 Thread Csaba Nagy
 OK, I can confirm that. I set the statistics target for column a on
 table_a to 1000, analyzed, and got the plan below. The only downside is
 that analyze became quite expensive on table_a: it took 15 minutes and
 touched half of the pages... I will experiment with lower settings,
 maybe it will work with less than 1000 too.

So, just to finish this up: setting the statistics target to 100 worked
too, and it has an acceptable impact on analyze. My original (more
complicated) query is working fine now, with visible effects on server
load...

Thanks Simon for your help!

Cheers,
Csaba.





Re: [PERFORM] How to use a trigger to write rows to a remote server

2007-07-18 Thread Csaba Nagy
On Wed, 2007-07-18 at 15:36, Michael Dengler wrote:
 Row X is inserted into TableX in DB1 on server1TableX trigger
 function fires and contacts DB2 on server2 and inserts the row into
 TableY on server2. 

This kind of problem is usually solved more robustly by inserting the
change into a local table and letting the remote server (or some
external program) poll that periodically and apply the necessary changes
to the remote server. This method does not have the problems Heikki
mentions in his reply with disconnections and transaction rollbacks, as
the external program/remote server will only see committed transactions,
and it can apply the accumulated changes after the connection is
recovered in case of failure, without blocking the activity on the
master.

This is also covered in a few past posts on the postgres lists (I guess
you should look in the general list for that); in particular you could
be interested in the possibility of notifications (LISTEN/NOTIFY) if you
want your poller to be woken up immediately when a change occurs.
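
A minimal sketch of the local-queue approach (all names are made up, and
it assumes TableX has a bigint primary key column id):

CREATE TABLE tablex_changes (
    id      bigserial PRIMARY KEY,
    row_pk  bigint    NOT NULL,
    applied boolean   NOT NULL DEFAULT false
);

CREATE OR REPLACE FUNCTION queue_tablex_change() RETURNS trigger AS $$
BEGIN
    -- record the change locally; an external poller pushes it to the
    -- remote server later and marks it applied
    INSERT INTO tablex_changes (row_pk) VALUES (NEW.id);
    NOTIFY tablex_changed;   -- optional: wakes up a LISTENing poller
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER tablex_queue_change
    AFTER INSERT ON tablex
    FOR EACH ROW EXECUTE PROCEDURE queue_tablex_change();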

Cheers,
Csaba.





Re: [PERFORM] Foreign Key Deadlocking

2007-04-19 Thread Csaba Nagy
 A frequently mentioned approach to avoid the point of contention is to
 have a totals record and have the triggers insert deltas records; to
 get the sum, add them all.  Periodically, take the deltas and apply them
 to the totals.

This is what we do here too. There is only one exception to this rule:
in one case we actually need to have the inserted records and the
updated parent in one transaction for data consistency, and in that case
the delta approach won't work... we didn't find any other solution to
that except patching postgres not to lock the parent keys at all, which
has its own problems too (occasional breakage of the foreign key
relationship when the parent is deleted and a child still slips in, but
this is rare enough in our case not to cause problems which cannot be
cleaned up with relative ease... not to mention that there could be
other problems we haven't discovered yet or that our usage patterns
happen to avoid).
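
A minimal sketch of the totals + deltas scheme described above (table
and column names are made up):

CREATE TABLE account_totals (account_id bigint PRIMARY KEY, total numeric NOT NULL);
CREATE TABLE account_deltas (
    id         bigserial PRIMARY KEY,
    account_id bigint  NOT NULL,
    delta      numeric NOT NULL
);

-- writers only insert deltas, so they never contend on the totals row
INSERT INTO account_deltas (account_id, delta) VALUES (42, 1);

-- readers add the outstanding deltas to the stored total
SELECT t.total + coalesce(sum(d.delta), 0) AS current_total
FROM account_totals t
LEFT JOIN account_deltas d USING (account_id)
WHERE t.account_id = 42
GROUP BY t.total;

-- periodic rollup: fold only the rows that existed when the run started,
-- so concurrently inserted deltas are simply picked up by the next run
BEGIN;
CREATE TEMP TABLE folded ON COMMIT DROP AS SELECT * FROM account_deltas;
UPDATE account_totals t
SET total = t.total + f.sum_delta
FROM (SELECT account_id, sum(delta) AS sum_delta
      FROM folded GROUP BY account_id) f
WHERE f.account_id = t.account_id;
DELETE FROM account_deltas WHERE id IN (SELECT id FROM folded);
COMMIT;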

Cheers,
Csaba.





Re: [PERFORM] Foreign Key Deadlocking

2007-04-18 Thread Csaba Nagy
 Can someone confirm that I've identified the right fix?

I'm pretty sure that won't help you... see:

http://archives.postgresql.org/pgsql-general/2006-12/msg00029.php

The deadlock will be there if you update/insert the child table and
update/insert the parent table in the same transaction (even if you
update some field on the parent table other than the key referenced by
the child table). If your transactions always update/insert only one of
those tables, it won't deadlock (assuming you order the inserts/updates
consistently by PK).

Cheers,
Csaba.





Re: [PERFORM] Using the 8.2 autovacuum values with 8.1

2007-02-23 Thread Csaba Nagy
On Thu, 2007-02-22 at 22:53, Mark Stosberg wrote:
 Thanks to everyone for the feedback about vacuuming. It's been very
 useful. The pointers to the pgstattuple and Pgfouine tools were also
 helpful.
 
 I'm now considering the following plan for trying Autovacuuming again
 with 8.1. I'd like any peer review you have to offer of the following:
 
 1. First, I'll move the settings to match the defaults in 8.2. The ones
 I noticed in particular were:
 
 autovacuum_vacuum_threshold changes: 1000 -> 500
 autovacuum_analyze_threshold changes: 500 -> 250
 autovacuum_vacuum_scale_factor changes: 0.4 -> 0.2
 autovacuum_analyze_scale_factor changes: 0.2 -> 0.1
 
 2. Try the vacuum cost delay feature, starting with a 20ms value:
 
 autovacuum_vacuum_cost_delay = 20
 
 3. Immediately add a row to pg_autovacuum for a huge logging table that
 would be too slow to vacuum usually. We'll still vacuum it once a week
 for good measure by cron.
 
 4. For good measure, I think I still keep the nightly cron entry that
 does a complete vacuum analyze (except for that large table...).
 
 Seem like a reasonable plan?

You likely don't need the nightly full vacuum run... we also do a
nightly vacuum here besides autovacuum, but not a full one; it covers
only the tables which are big enough that we don't want autovacuum to
touch them during peak business hours but which see enough change that
we want them vacuumed frequently. I discover them by checking the stats,
for example:

SELECT
    c.relname,
    c.reltuples::bigint as rowcnt,
    pg_stat_get_tuples_inserted(c.oid) AS inserted,
    pg_stat_get_tuples_updated(c.oid) AS updated,
    pg_stat_get_tuples_deleted(c.oid) AS deleted
FROM pg_class c
WHERE c.relkind = 'r'::char
GROUP BY c.oid, c.relname, c.reltuples
HAVING pg_stat_get_tuples_updated(c.oid) +
    pg_stat_get_tuples_deleted(c.oid) > 1000
ORDER BY pg_stat_get_tuples_updated(c.oid) +
    pg_stat_get_tuples_deleted(c.oid) DESC;


The top tables in this list, for which (deleted + updated) / rowcnt is
relatively small but still significant, need your attention for the
nightly vacuum... the rest are handled just fine by autovacuum.

On the other end of the scale, if you have tables for which the
deletion/update rate is way higher than the row count, those are likely
hot-spot tables which probably need extra vacuuming during the day.
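
That ratio can also be read off directly, using only the functions from
the query above:

SELECT c.relname,
       c.reltuples::bigint AS rowcnt,
       pg_stat_get_tuples_updated(c.oid)
         + pg_stat_get_tuples_deleted(c.oid) AS churn,
       (pg_stat_get_tuples_updated(c.oid)
         + pg_stat_get_tuples_deleted(c.oid)) / nullif(c.reltuples, 0) AS churn_per_row
FROM pg_class c
WHERE c.relkind = 'r'
ORDER BY churn DESC;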

Cheers,
Csaba.





Re: [PERFORM] How long should it take to insert 200,000 records?

2007-02-06 Thread Csaba Nagy
On Tue, 2007-02-06 at 01:35, Karen Hill wrote:
 [snip] So far I've been sitting here for about 2 million ms
 waiting for it to complete, and I'm not sure how many inserts postgres
 is doing per second.

One way is to run analyze verbose on the target table and see how many
pages it has, then do it again 1 minute later and check how many pages
it grew. Then multiply the page increase by the rows-per-page ratio you
can get from the same analyze output, and you'll get an estimated growth
rate. Of course this will only work if you didn't have lots of free
space in the table to start with... if you do have lots of free space,
you can still estimate the growth based on the analyze results, but it
will be more complicated.
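
A sketch of the same estimate using pg_class directly (the table name is
a placeholder):

ANALYZE target_table;
SELECT relpages,
       reltuples,
       reltuples / nullif(relpages, 0) AS rows_per_page
FROM pg_class
WHERE relname = 'target_table';
-- repeat a minute later; the estimated insert rate per minute is roughly
--   (relpages_now - relpages_before) * rows_per_page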


In any case, it would be very nice to have more tools to attach to
running queries and see how they are doing... starting with what exactly
they are doing (are they in RI checks maybe?), the actual execution plan
they are using, and how much of their work they have done... it would
help a lot in debugging performance problems.

Cheers,
Csaba.





Re: [PERFORM] Keeping processes open for re-use

2006-11-09 Thread Csaba Nagy
On Thu, 2006-11-09 at 13:35, Hilary Forbes wrote:
 [snip] Is there a way that we can achieve this in Postgres? We have a
 situation whereby we have lots of web based users doing short quick
 queries and obviously the start up time for a process must add to
 their perceived response time.

Yes: google for connection pooling. Note that different solutions
exist for different programming languages, so you should look for
connection pooling for the language you're using.

HTH,
Csaba.





Re: [PERFORM] Hints proposal

2006-10-16 Thread Csaba Nagy
 2d) Hints will damage the ongoing development of the optimizer by 
 reducing or eliminating test cases for its improvement.

You have no evidence for this. The mindset of the postgres community you
cite further below usually mandates that you only state things you have
evidence for... and this one could even be backwards: putting such a
tool into normal mortals' hands, so they can experiment with execution
plans and see which one works better, could give the developers more
data than they can possibly get now. This is of course speculation too,
but not at all weaker than yours.

 2e) Hints will divert developer resource away from ongoing development 
 of the optimizer.

This is undebatable, although the long-term cost/benefit is not clear.
And I would guess simple hinting would not need a genius to implement it
the way planner optimizations mostly do... so it could possibly be done
by somebody other than the core planner hackers (are there any of them
besides Tom?), and thus not distract them too much from the planner
optimization tasks.

 2f) Hints may demoralize the developer community - many of whom will 
 have been attracted to Postgres precisely because this was a realm where 
 crude solutions were discouraged.

I still don't get why you are so against hints. Hints are a crude
solution only if you design them to be like that... otherwise they are
just yet another tool to get the work done, preferably now.

 I understand that these points may seem a bit 'feel-good' and intangible 
 - especially for the DBA's moving to Pg from Oracle, but I think they 
 illustrate the mindset of the Postgres developer community, and the 
 developer community is, after all - the primary reason why Pg is such a 
 good product.

I fail to see why a hinted postgres would be an inferior product...

 Of course - if we can find a way to define 'hint like' functionality 
 that is more in keeping with the 'Postgres way' (e.g. some of the 
 relation level statistical additions as discussed), then some of 2d-2f) 
 need not apply.

I bet most of the users who wanted hints would be perfectly fine with
any variation of them, as long as it solves the problems at hand.

Cheers,
Csaba.





Re: [PERFORM] Hints proposal

2006-10-12 Thread Csaba Nagy
OK, I just have to comment...

Jim C. Nasby [EMAIL PROTECTED] writes:
  These hints would outright force the planner to do things a certain way.
  ... FROM table /* ACCESS {SEQSCAN | [[NO] BITMAP] INDEX index_name} */
 
 This proposal seems to deliberately ignore every point that has been
 made *against* doing things that way.  It doesn't separate the hints
 from the queries, it doesn't focus on fixing the statistical or cost
 misestimates that are at the heart of the issue, and it takes no account
 of the problem of hints being obsoleted by system improvements.

But whatever arguments you make about planner improvements and the like,
it will NEVER be possible to correctly estimate the statistics for a
query in all cases, even if you know perfectly WHAT statistics you need,
which is also not the case all the time.

Tom, you're the one who knows best how the planner works... can you bet
anything you care about on the planner one day never ever generating a
catastrophic plan without DBA tweaking? And how far away in time is that
point?

Until that point is reached, the above proposal is one of the simplest
for the tweaking DBA to understand, and the fastest to deploy when faced
with catastrophic plans. And I would guess it is one of the simplest to
implement and probably not very high maintenance either, although this
is just a guess.

If I could hint some of my queries, I would enable anonymous prepared
statements to take the parameter values into account, but I can't,
because that results in runaway queries every now and then, so I had to
force postgres to generate generic plans without knowing anything about
the parameter values... the effect for me is an overall slower postgres
system, because I couldn't fix the particular problems I had and had to
tweak general settings instead. And when I have a problem I can't wait
until the planner is fixed, I have to solve it immediately... the
current means to do that are suboptimal.

The argument that planner hints would hide problems from being solved is
a fallacy. Putting a hint in place needs almost the same amount of
analysis from the DBA as solving the problem does now, so users who ask
for help now will keep doing so even in the presence of hints. The ones
who wouldn't are not coming for help now either, they know their way out
of their problems... and the ones who still find a shortcoming of the
planner will report it with hints too.

I would even say it would be an added benefit, because then you could
really see how well a specific plan does without the planner being able
to generate that plan on its own... so knowledgeable users could come to
you further down the road, when they know where the planner is wrong,
saving you time.

I must say it again: this kind of query-level hinting would be the
easiest for developers to understand... there are many trial-and-error
type programmers out there; if you get a hint wrong, you fix it and move
on, it doesn't need to be perfect, it just has to be good enough. I
heavily doubt that postgres will get bad publicity because user Joe shot
himself in the foot by using bad hints... the probability of that is
low: you must actively put those hints there, and if you take the time
to do that then you're not the average Joe, and probably not so lazy
either, and if you're putting in random hints, then you would probably
mess things up some other way anyway.

And the point about missing new features is also not very well founded.
If I want to exclude a full table scan on a specific table for a
specific query, then it's all but certain that I want to do that
regardless of what new features postgres will offer in the future.
Picking one specific access method is more prone to missing new access
methods, but even then, when I upgrade the DB server to a new version I
usually have enough other compatibility problems (so far I have had some
on every upgrade) that making a round of upgrading the hints is not an
outstanding extra problem. And if the application works well enough with
suboptimal plans, why would I even take that extra effort?

I guess the angle is: I, as a practicing DBA, would like to be able to
experiment and get the most out of the imperfect tool I have, while you,
the developers, want to make the tool perfect... I don't care about
perfect tools, they just have to do the job... hints or anything else,
if I can make it work GOOD ENOUGH, it's all fine. And hints are
something I would understand and be able to use.

Thanks for your patience if you're still reading this...

Cheers,
Csaba.





Re: [HACKERS] [PERFORM] Hints proposal

2006-10-12 Thread Csaba Nagy
 Hmmm, if you already understand Visual Basic syntax, should we support 
 that too?  Or maybe we should support MySQL's use of '-00-00' as the 
 zero date because people understand that?

You completely misunderstood me... I have no idea about Oracle hints, I
have never used Oracle in fact. My company uses Oracle, but I have only
very limited contact with Oracle issues, and I have never touched a
hint.

I'm only talking about ease of use, learning curves, and complexity in
general. While I do like the idea of an all-automatic system optimizer
which takes your query portfolio, analyzes the data based on those
queries and creates all the indexes you need and all that, that's not
going to happen soon, because it's a very complex thing to implement.

The alternative is that you take your query portfolio, analyze it
yourself, figure out what statistics you need, create indexes, tweak
queries, hint the planner about correlations and such... which is a
complex task, and if you have to tell the server about some correlation
with the phase of the moon, you're screwed, because there will never be
any DB engine which understands that.

But you can always put the corresponding hint in the query when you know
the correlation is there...

The problem is that the application sometimes really knows better than
the server, when the correlations are not standard.

 We're just not going to adopt a bad design because Oracle DBAs are used 
 to it.   If we wanted to do that, we could shut down the project and 
 join a proprietary DB staff.

I really have nothing to do with Oracle. I think you guys are simply too
blinded by Oracle hate... I don't care about Oracle.

 The current discussion is:
 
 a) Planner tweaking is sometimes necessary;
 b) Oracle HINTS are a bad design for planner tweaking;

While there are plenty of arguments you made against query-level hints
(can we not call them Oracle hints?), there are plenty of postgres users
who have expressed that they would like them. I guess they were tweaking
postgres installations when they needed it, not Oracle installations. I
said clearly that for me query-level hinting would give more control and
a better understanding of what I have to do to get the desired result.
Perfect planning... forget it, I only care about good enough with a
reasonable tuning effort. If I have to tweak statistics I will NEVER be
sure postgres will not backfire on me again. On the other hand, if I say
"never do a seq scan on this table for this query", I can be sure it
won't...

 c) Can we come up with a good design for planner tweaking?

Angles again: "good enough now" is better for end users, but programmers
always go for "perfect tomorrow"... pity.

Cheers,
Csaba.





Re: [PERFORM] Hints proposal

2006-10-12 Thread Csaba Nagy
 I'm not suggesting that we do that, but it seems better then embedding
 the hints in the queries themselves.

OK, what about this: if I execute the same query from a web client, I
want the not-so-optimal-but-safe plan; if I execute it asynchronously, I
let the planner choose the
best-overall-performance-but-sometimes-may-be-slow plan.

What kind of statistics/table-level hinting will get you this?

I would say only query-level hinting will buy you query-level control.
And that's perfectly fine in some situations.

I really can't see why a query-level hinting mechanism is so evil, why
it couldn't be kept forever, and augmented with the possibility of
correlation hinting, or table-level hinting.

These really solve different problems, with some overlap...

Cheers,
Csaba.





Re: [PERFORM] Unsubscribe

2006-10-04 Thread Csaba Nagy
On Wed, 2006-10-04 at 18:02, Csaba Nagy wrote:
  If we didn't want to add it for each list we could just add a link here:
  
  http://www.postgresql.org/community/lists/subscribe

OK, now that I've had a second look at that page, it does contain
unsubscription info... but it's well hidden from a cursory look... the
caption is a big "Subscribe to Lists", so you wouldn't think at first
glance that the form is actually used to unsubscribe too, would you?

So maybe it's just that the text should be more explicit about what it
actually does...

Cheers,
Csaba.





Re: [PERFORM] slow queue-like empty table

2006-09-28 Thread Csaba Nagy
On Thu, 2006-09-28 at 09:36, Tobias Brox wrote:
 [Tobias Brox - Thu at 08:56:31AM +0200]
  It really seems like some transaction is still viewing the queue, since
  it found 38k of non-removable rows ... but how do I find the pid of the
  transaction viewing the queue?  As said, the pg_locks didn't give me any
  hints ...

The open transaction doesn't have to hold any locks on your queue table
to prevent the dead rows from being vacuumed. Its mere existence is
enough... MVCC means that a still-running transaction could still see
those dead rows, and so VACUUM can't remove them until there is no
transaction left which started before they were deleted.

So long-running transactions are your enemy when it comes to high
insert/delete rate queue tables.

So you should check for "idle in transaction" sessions, those are bad...
or any other long-running transaction.
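
On 8.x they can be spotted like this (newer releases expose this through
the state and xact_start columns of pg_stat_activity instead):

-- requires stats_command_string = on so that current_query is populated
SELECT procpid, usename, backend_start, query_start, current_query
FROM pg_stat_activity
WHERE current_query = '<IDLE> in transaction'
ORDER BY query_start;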


 Dropping the table and recreating it solved the immediate problem, but
 there must be some better solution than that? :-)

If you must have long-running transactions on your system (like
vacuuming another big table; that also qualifies as a long-running
transaction, though this is fixed in 8.2), then you could use CLUSTER
(see the docs), which is currently not MVCC-conforming and removes all
the dead space regardless of whether any other running transaction can
see it or not. This is only acceptable if your application handles the
queue table independently, not mixed into complex transactions. And the
CLUSTER command takes an exclusive lock on the table, so it won't work
during e.g. a pg_dump: it would keep the queue table locked exclusively
for the whole duration of the pg_dump (it won't be able to actually get
the lock, but it will prevent any other activity on the table, as it
looks like in-progress exclusive lock requests block any new shared lock
request).

HTH,
Csaba.





Re: [PERFORM] autovacuum on a -mostly- r/o table

2006-09-27 Thread Csaba Nagy
On Wed, 2006-09-27 at 18:08, Edoardo Ceccarelli wrote:
 How can I configure the vacuum to run after the daily batch insert/update?

Check out this:
http://www.postgresql.org/docs/8.1/static/catalog-pg-autovacuum.html

By inserting the right row you can keep autovacuum from vacuuming your
big tables, and then you can schedule a nightly vacuum for those just as
before. There's still a benefit in that you don't need to care about
vacuuming the rest of the tables, which will be done just in time.

Cheers,
Csaba.




Re: [PERFORM] Optimizing DELETE

2006-09-19 Thread Csaba Nagy
 I've just fired off a DELETE FROM table command (i.e. unfiltered 
 DELETE) on a trivially small table but with many foreign key references 
 (on similar-sized tables), and I'm waiting for it to finish. It's been 
 10 minutes now, which seems very excessive for a table of 9000 rows on a 
 3 GHz desktop machine.

If you are missing indexes on the child tables' foreign key columns,
that might be the cause of the slow delete. The cascading delete must
look up the to-be-deleted rows in all child tables, which will do
sequential scans if you don't have the proper indexes.

Try to do an explain analyze for deleting one row; that should also show
you the time spent in triggers, which might clue you in on what's taking
so long.
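
A sketch of both steps, with placeholder names:

-- index the referencing column so the cascade doesn't seq-scan the child table
CREATE INDEX child_table_parent_id_idx ON child_table (parent_id);

-- check where the time goes for a single-row delete; the output ends with
-- per-trigger timings like "Trigger for constraint ...: time=... calls=..."
BEGIN;
EXPLAIN ANALYZE DELETE FROM parent_table WHERE id = 123;
ROLLBACK;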

Cheers,
Csaba.





Re: [PERFORM] Help tuning autovacuum - seeing lots of relationbloat

2006-06-21 Thread Csaba Nagy
On Wed, 2006-06-21 at 17:27, jody brownell wrote:
 Our application is broken down quite well. We have two main writing processes 
 writing to two separate sets of tables. No crossing over, nothign to prohibit 
 the 
 vacuuming in the nature which you describe.

It really doesn't matter which tables you are touching, and it doesn't
matter if you read or write either; what matters is how long ago the
oldest still-open transaction issued its begin without a commit or
rollback. VACUUM will not touch tuples which were deleted after the
oldest not-yet-finished transaction started, regardless of whether that
transaction touched the vacuumed table in any way...

 My longest transaction on the tables in question are typically quite short 
 until 
 of course they begin to bloat.

Well, your application might be completely well behaved, and still your
DBA (or your favorite DB access tool for that matter) can leave open
transactions in an interactive session. It never hurts to check whether
you actually have "idle in transaction" sessions. It happened a few
times to us; some of those were bad coding in ad-hoc tools written by
us, others were badly behaved DB access tools opening a transaction
immediately after connect and after each successful command, effectively
leaving an open transaction while their user was out for lunch...

So it might very well be that some interactive or ad-hoc tool you're
using to manage the DB is your problem.

Cheers,
Csaba.





Re: [PERFORM] Help tuning autovacuum - seeing lots of relationbloat

2006-06-21 Thread Csaba Nagy
On Wed, 2006-06-21 at 18:21, jody brownell wrote:
 That is interesting.
 
 There is one thread keeping a transaction open it appears from ps
 
 postgres: app app xxx(42644) idle in transaction

That shouldn't be a problem in itself; "idle in transaction" happens all
the time between 2 commands in the same transaction... you only have a
problem if you see the same PID always idle, which means somebody left
an open transaction and went for lunch.

[snip]
 this was with the Idle in transaction though. 

This probably means you don't have long-running transactions currently.
However, if you happen to have just one such long transaction, the dead
space accumulates and a normal vacuum will not be able to clean it up
anymore. But I guess if you didn't find one now, then you should take a
look at Tom's suggestion and bump up the debug level to see if
autovacuum picks your table at all...

 Ah HA! Wondering, my autovacuum naptime is 60 seconds, that is also the 
 interval which I wake up and begin persistence.
 Wondering if I am simply locking autovacuum out of the tables b/c they are on 
 a similar timeline.
 
 I will try a 30 second naptime, if this is it, that should increase the 
 likely hood of falling on the right side of the TX more often.
 
 make sense?

I don't think that's your problem... vacuum wouldn't be locked out by
any activity which doesn't take an exclusive lock on the table (and I guess
you're not doing that). If your persistence finishes quickly, then that's
not the problem.

Oh, it just occurred to me... in order to use autovacuum you also need to
enable the statistics collector at row level:

stats_start_collector = on
stats_row_level = on

See also:
http://www.postgresql.org/docs/8.1/static/maintenance.html#AUTOVACUUM

These were not mentioned in the settings in your original post, so I guess
you didn't touch them, and I think row-level stats are disabled by default.

If they are disabled, you should enable them and do a pg_ctl reload; that
should fix the problem.

Cheers,
Csaba.



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Identical query on two machines, different plans....

2006-04-20 Thread Csaba Nagy
You very likely forgot to run ANALYZE on your laptop after copying the
data. Observe the different row count estimates in the 2 plans...
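
Something as simple as this, run on the laptop, should bring the estimates
(and the plan) back in line - the table name is taken from your plans:

ANALYZE ticketing_codes;
-- or simply: ANALYZE;  to refresh the statistics of the whole database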

HTH,
Csaba.


 QUERY PLAN
 ---
  Index Scan using ticketing_codes_uq_value_group_id on ticketing_codes
 (cost=0.00..6.02 rows=1 width=4) (actual time=0.104..0.107 rows=1
 ^^ 
 loops=1)
Index Cond: (((code_value)::text = 'C7ZP2U'::text) AND (code_group_id
 = 1000))
  Total runtime: 0.148 ms
 (3 rows)
 
 
 PLAN 
 --
  Bitmap Heap Scan on ticketing_codes  (cost=2.01..1102.05 rows=288

 width=4) (actual time=88.164..88.170 rows=1 loops=1)
Recheck Cond: (((code_value)::text = 'C7ZP2U'::text) AND
 (code_group_id = 1000))
-  Bitmap Index Scan on ticketing_codes_uq_value_group_id
 (cost=0.00..2.01 rows=288 width=0) (actual time=54.397..54.397 rows=1
 loops=1)
  Index Cond: (((code_value)::text = 'C7ZP2U'::text) AND
 (code_group_id = 1000))
  Total runtime: 88.256 ms
 (5 rows)
 



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Identical query on two machines, different plans....

2006-04-20 Thread Csaba Nagy
OK, I marked the wrong row counts, but the conclusion is the same.

Cheers,
Csaba.


  QUERY PLAN
  ---
   Index Scan using ticketing_codes_uq_value_group_id on ticketing_codes
  (cost=0.00..6.02 rows=1 width=4) (actual time=0.104..0.107 rows=1
^^   
  loops=1)
 Index Cond: (((code_value)::text = 'C7ZP2U'::text) AND (code_group_id
  = 1000))
   Total runtime: 0.148 ms
  (3 rows)
  
  
  PLAN 
  --
   Bitmap Heap Scan on ticketing_codes  (cost=2.01..1102.05 rows=288
 
  width=4) (actual time=88.164..88.170 rows=1 loops=1)
 Recheck Cond: (((code_value)::text = 'C7ZP2U'::text) AND
  (code_group_id = 1000))
 -  Bitmap Index Scan on ticketing_codes_uq_value_group_id
  (cost=0.00..2.01 rows=288 width=0) (actual time=54.397..54.397 rows=1
 
  loops=1)
   Index Cond: (((code_value)::text = 'C7ZP2U'::text) AND
  (code_group_id = 1000))
   Total runtime: 88.256 ms
  (5 rows)
  
 
 
 
 ---(end of broadcast)---
 TIP 2: Don't 'kill -9' the postmaster


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] WAL logging of SELECT ... INTO command

2006-03-22 Thread Csaba Nagy
On Wed, 2006-03-22 at 16:35, Tom Lane wrote:
 Jim C. Nasby [EMAIL PROTECTED] writes:
  Oh, so in other words, SELECT * INTO temp FROM table is inherently
  non-deterministic at the physical level, so the only way to be able to
  allow PITR to work is to duplicate all the physical changes. Darn.
 
 Well, lemme put it this way: I'm not prepared to require that PG be
 deterministic at the physical level.  One obvious source of
 non-determinancy is the FSM, which is likely to hand out different free
 space to different transactions depending on what else is going on at
 the same time.  There are others, such as deliberately random
 tie-breaking during btree index insertion.

While you're talking about WAL and PITR... I see from the above
discussion that PITR already demands special handling in the code
(I hope I got this one right, as the following is based on it).

What if the PITR logging were disconnected from the WAL logging
completely?

What I mean is to introduce a WAL subscription mechanism, which
basically means some incoming connections to which we stream the log
records. We don't need to write them to disk at all in the normal case;
I guess usually PITR will store the records on some other machine, so it
means network, not disk. And it doesn't need to be done synchronously:
it can lag behind the running transactions, and we can do it in batches
of WAL records.

It would also mean that the local WAL does not need to log the things
which are only needed for PITR... that would likely save some
WAL disk activity. Of course it would also mean that the local WAL and
the PITR WAL are not the same, but that is not an issue I guess.

It would also permit immediate recycling of the WAL files if the current
archiving style is not used.

The drawbacks I can see (please add yours):
1) the need for the subscription management code with the added
complexity it implies;
2) problems if the WAL stream lags too much behind;
3) problems if the subscribed client's connection is interrupted;

Nr. 2 could be solved by saving the PITR WAL separately if the lag grows
over a threshold, and issuing a warning. This could still be acceptable,
as the writing doesn't have to be synchronous and can be done in
relatively large blocks.
There could be a second, bigger lag threshold which completely cancels
the subscription. All these thresholds should be configurable, as it
depends on the application what's more important: to have the standby
available all the time, or to have the primary faster when loaded...

Nr. 3 can be solved either by canceling the subscription on connection
drop, or by allowing a certain amount of time after which the
subscription is canceled. The client can reconnect before this timeout
expires. In the meantime the primary can store the PITR WAL on disk as
mentioned above...

Cheers,
Csaba.



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Migration study, step 1: bulk write performance

2006-03-21 Thread Csaba Nagy
 Did you try mounting ext3 whith data=writeback by chance? People have
 found that makes a big difference in performance.

I'm not sure, there are other people here doing the OS stuff - I'm pretty
much ignorant about what data=writeback could mean :-D

They knew, however, that for the data partitions no FS journaling is
needed, and for the WAL partition metadata journaling is enough, so I
guess they tuned ext3 for this.

Cheers,
Csaba.



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Migration study, step 1: bulk write performance

2006-03-21 Thread Csaba Nagy
 For the record, that's the wrong way round. For the data partitioning
 metadata journaling is enough, and for the WAL partition you don't need any
 FS journaling at all.

Yes, you're right: the data partition shouldn't lose file creation,
deletion, etc., which is not important for the WAL partition, where the
WAL files are mostly recycled... right?

Cheers,
Csaba.



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Migration study, step 1: bulk write performance

2006-03-20 Thread Csaba Nagy
Mikael,

I've just recently been through such an exercise, i.e. migrating a DB of
about the same size category as yours from another vendor to postgres.

I think you got it right with fsync turned off during the migration
(just don't forget to turn it back on after finishing ;-), and using tables
without indexes/foreign keys. In our case recreating all the
indexes/foreign keys/other constraints actually took longer than the raw
data transfer itself... but it's possible that the process was not tuned
100%, we are still learning how to tune postgres...
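
Just to illustrate what I mean by the fsync part (a sketch of the
postgresql.conf toggle, nothing more):

fsync = off     # only while doing the bulk load
# ... do the load, then turn it back on (and reload/restart) before going live:
fsync = on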

What I can add from our experience: ext3 turned out lousy for our
application, and converting to XFS made quite a big improvement for our
DB load. I don't have hard figures, but I think it was some 30%
improvement in overall speed, and it was a huge improvement for heavy
load times... what I mean is that with ext3, multiple big tasks running
in parallel took longer than if we had executed them sequentially,
while with XFS that was gone and the load scaled linearly. In any
case you should test the performance of your application on different FSes
and different settings, as this can make a huge difference.

And another thing, we're still fighting with performance problems due to
the fact that our application was designed to perform well with the
other DB product... I think you'll have more work to do in this regard
than just some search/replace ;-)

Cheers,
Csaba.



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] neverending vacuum

2006-02-27 Thread Csaba Nagy
 So one very effective way of speeding this process up is giving the
 vacuum process lots of memory, because it will have to do fewer passes
 at each index.  How much do you have?

OK, this is my problem... it was left at the default (16 megabytes?). This
must be a mistake in the configuration; on other similar boxes I set this to
262144 (256 megabytes). The box has 4 GB of memory.
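
That is, something like this in postgresql.conf (on 8.x the value is in
kilobytes, so this is ~256 megabytes):

maintenance_work_mem = 262144   # used by VACUUM, CREATE INDEX, etc.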

Thanks for the explanation - you were right on the spot, it will likely
solve the problem.

Cheers,
Csaba.




---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Can this query go faster???

2005-12-06 Thread Csaba Nagy
Joost,

Why do you use an offset here? I guess you're traversing the table
somehow; in that case it would be better to remember the last zipcode +
housenumber and add a condition to fetch the next values bigger than
the last ones you've got... that would use the index on
zipcode+housenumber and be very fast. The big offset forces postgres to
traverse that many entries until it's able to pick the one row for the
result...
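
A sketch of what I mean (column/table names taken from your query; I'm
guessing at the column types, and the literals stand for the last
zipcode/housenumber you already fetched):

SELECT c.objectid
FROM prototype.customers c
JOIN prototype.addresses a ON c.contactaddress = a.objectid
WHERE a.zipcode > '6524NB'
   OR (a.zipcode = '6524NB' AND a.housenumber > 14)
ORDER BY a.zipcode ASC, a.housenumber ASC
LIMIT 1;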

On Tue, 2005-12-06 at 10:43, Joost Kraaijeveld wrote:
 Hi,
 
 Is it possible to get this query run faster than it does now, by adding
 indexes, changing the query?
 
 SELECT customers.objectid FROM prototype.customers, prototype.addresses
 WHERE
 customers.contactaddress = addresses.objectid
 ORDER BY zipCode asc, housenumber asc
 LIMIT 1 OFFSET 283745
 
 Explain:
 
 Limit  (cost=90956.71..90956.71 rows=1 width=55)
   -  Sort  (cost=90247.34..91169.63 rows=368915 width=55)
 Sort Key: addresses.zipcode, addresses.housenumber
 -  Hash Join  (cost=14598.44..56135.75 rows=368915 width=55)
   Hash Cond: (outer.contactaddress = inner.objectid)
   -  Seq Scan on customers  (cost=0.00..31392.15
 rows=368915 width=80)
   -  Hash  (cost=13675.15..13675.15 rows=369315 width=55)
 -  Seq Scan on addresses  (cost=0.00..13675.15
 rows=369315 width=55)
 
 The customers table has an index on contactaddress and objectid.
 The addresses table has an index on zipcode+housenumber and objectid.
 
 TIA
 
 --
 Groeten,
 
 Joost Kraaijeveld
 Askesis B.V.
 Molukkenstraat 14
 6524NB Nijmegen
 tel: 024-3888063 / 06-51855277
 fax: 024-3608416
 e-mail: [EMAIL PROTECTED]
 web: www.askesis.nl
 
 
 
 ---(end of broadcast)---
 TIP 4: Have you searched our list archives?
 
http://archives.postgresql.org


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Can this query go faster???

2005-12-06 Thread Csaba Nagy
On Tue, 2005-12-06 at 13:20, Joost Kraaijeveld wrote:
[snip]
 Ah, a misunderstanding: I only need to calculate an index if the user
 wants a record that is not in or adjacent to the cache (in which case I
 can do a select values  last value in the cache. So  I must always
 materialize all rows below the wanted index.

In this case the query will very likely not get faster. It must always
visit all the records up to the required offset. If a plan using the
index should be faster, then you probably need to ANALYZE (I don't
recall from your former posts if you did it recently or not); in any
case you could check EXPLAIN ANALYZE to see whether the planner is
mistaken or not - you might already know this.

Cheers,
Csaba.



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] VERY slow after many updates

2005-11-19 Thread Csaba Nagy
Alex,

I suppose the table is a kind of 'queue' table, where you
insert/get/delete continuously and the life of the records is short.
Considering that in postgres a delete will still leave the record in
the table's file and in the indexes, just marked as dead, your table's
actual size can grow quite a lot even if the number of live records
stays small (you will have a lot of dead tuples: the more tasks
processed, the more dead tuples). So I guess you should vacuum this
table very often, so that the dead tuples are reused. I'm not an expert
on this, but it might be good to vacuum after every n deletions, where n
is ~ half the average size of the queue you expect to have. From time to
time you might want to do a vacuum full on it and a reindex.

Right now I guess a vacuum full + reindex will help you. I think it's
best to do:

vacuum download_queue;
vacuum full download_queue;
reindex table download_queue;

I think the non-full vacuum, which is less obtrusive than the full one,
will do at least some of the work, and it will bring all the needed things
into the FS cache so that the full vacuum is as fast as possible (vacuum full
locks the table exclusively). At least I do it this way, with good
results, for small queue-like tables...

BTW, I wonder if the download_queue_user_index index is helping you at
all on that table? Do you expect it to grow bigger than 1000 rows?
Otherwise there is no point in indexing it.

HTH,
Csaba.

On Sat, 2005-11-19 at 08:46, Alex Wang wrote:
 I am using PostgreSQL in an embedded system which has only 32 or 64 MB RAM 
 (run on PPC 266 MHz or ARM 266MHz CPU). I have a table to keep downlaod 
 tasks. There is a daemon keep looking up the table and fork a new process to 
 download data from internet.
 
 Daemon:
 . Check the table every 5 seconds
 . Fork a download process to download if there is new task
 Downlaod process (there are 5 download process max):
 . Update the download rate and downloaded size every 3 seconds.
 
 At begining, everything just fine. The speed is good. But after 24 hours, 
 the speed to access database become very very slow. Even I stop all 
 processes, restart PostgreSQL and use psql to select data, this speed is 
 still very very slow (a SQL command takes more than 2 seconds). It is a 
 small table. There are only 8 records in the table.
 
 The only way to solve it is remove all database, run initdb, create new 
 database and insert new records. I tried to run vacummdb but still very 
 slow.
 
 Any idea to make it faster?
 
 Thanks,
 Alex
 
 --
 Here is the table schema:
 create table download_queue (
task_id SERIAL,
username varchar(128),
pid int,
url text,
filename varchar(1024),
status int,
created_time int,
started_time int,
total_size int8,
current_size int8,
current_rate int,
CONSTRAINT download_queue_pkey PRIMARY KEY(task_id)
 );
 CREATE INDEX download_queue_user_index ON download_queue USING BTREE 
 (username);
 
 


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] VERY slow after many updates

2005-11-19 Thread Csaba Nagy
Just for clarification, an update is actually equal to delete+insert in
Postgres. So if you update rows, it's the same as if you deleted the
row and inserted a new version. So the table is bloating in this
situation too.
I think there is an added problem when you update: namely, to get to a
row, postgres will traverse all dead rows matching the criteria... so
even if you have an index, getting one row which was updated many times
will access all its dead versions only to find the one which is still
alive. So in this case vacuuming should happen even more often, to
eliminate the dead rows.
And the index was probably only helping because the table was really
bloated, so if you vacuum it often enough you will be better off without
the index if the row count stays low.

Cheers,
Csaba.


On Sat, 2005-11-19 at 13:05, Alex Wang wrote:
 Hi Csaba,
 
 Thanks for your reply.
 
 Yes, it's a queue table. But I did not perform many insert/delete before 
 it becomes slow. After insert 10 records, I just do get/update continuously. 
 After 24 hour, the whole database become very slow (not only the 
 download_queue table but other tables, too).  But you are right. Full vacuum 
 fixes the problem. Thank you very much!
 
 I expect there will be less than 1000 records in the table. The index does 
 obvous improvement on SELECT task_id, username FROM download_queue WHERE 
 username  '%s' even there are only 100 records.
 
 Thanks,
 Alex
 
 - Original Message - 
 From: Csaba Nagy [EMAIL PROTECTED]
 To: Alex Wang [EMAIL PROTECTED]
 Cc: postgres performance list pgsql-performance@postgresql.org
 Sent: Saturday, November 19, 2005 7:12 PM
 Subject: Re: [PERFORM] VERY slow after many updates
 
 
  Alex,
 
  I suppose the table is a kind of 'queue' table, where you
  insert/get/delete continuously, and the life of the records is short.
  Considering that in postgres a delete will still leave you the record in
  the table's file and in the indexes, just mark it as dead, your table's
  actual size can grow quite a lot even if the number of live records will
  stay small (you will have a lot of dead tuples, the more tasks
  processed, the more dead tuples). So I guess you should vacuum this
  table very often, so that the dead tuples are reused. I'm not an expert
  on this, but it might be good to vacuum after each n deletions, where n
  is ~ half the average size of the queue you expect to have. From time to
  time you might want to do a vacuum full on it and a reindex.
 
  Right now I guess a vacuum full + reindex will help you. I think it's
  best to do:
 
  vacuum download_queue;
  vacuum full download_queue;
  reindex download_queue;
 
  I think the non-full vacuum which is less obtrusive than the full one
  will do at least some of the work and it will bring all needed things in
  FS cache, so the full vacuum to be as fast as possible (vacuum full
  locks exclusively the table). At least I do it this way with good
  results for small queue-like tables...
 
  BTW, I wonder if the download_queue_user_index index is helping you at
  all on that table ? Do you expect it to grow bigger than 1000 ?
  Otherwise it has no point to index it.
 
  HTH,
  Csaba.
 
  On Sat, 2005-11-19 at 08:46, Alex Wang wrote:
  I am using PostgreSQL in an embedded system which has only 32 or 64 MB 
  RAM
  (run on PPC 266 MHz or ARM 266MHz CPU). I have a table to keep downlaod
  tasks. There is a daemon keep looking up the table and fork a new process 
  to
  download data from internet.
 
  Daemon:
  . Check the table every 5 seconds
  . Fork a download process to download if there is new task
  Downlaod process (there are 5 download process max):
  . Update the download rate and downloaded size every 3 seconds.
 
  At begining, everything just fine. The speed is good. But after 24 hours,
  the speed to access database become very very slow. Even I stop all
  processes, restart PostgreSQL and use psql to select data, this speed is
  still very very slow (a SQL command takes more than 2 seconds). It is a
  small table. There are only 8 records in the table.
 
  The only way to solve it is remove all database, run initdb, create new
  database and insert new records. I tried to run vacummdb but still very
  slow.
 
  Any idea to make it faster?
 
  Thanks,
  Alex
 
  --
  Here is the table schema:
  create table download_queue (
 task_id SERIAL,
 username varchar(128),
 pid int,
 url text,
 filename varchar(1024),
 status int,
 created_time int,
 started_time int,
 total_size int8,
 current_size int8,
 current_rate int,
 CONSTRAINT download_queue_pkey PRIMARY KEY(task_id)
  );
  CREATE INDEX download_queue_user_index ON download_queue USING BTREE
  (username);
 
 
 
 
  -- 
  This message has been scanned for viruses and
  dangerous content by MailScanner, and is
  believed to be clean.
  
 


---(end of broadcast

Re: [PERFORM] Deleting Records

2005-10-20 Thread Csaba Nagy
Christian,

Do you have foreign keys pointing to your table with ON DELETE CASCADE?
Because in that case you're not only deleting your 22000 records, but the
whole tree of cascades. And if you don't have an index on one of those
foreign keys, then you might get a sequential scan of the child table
for each deleted row... I would check the foreign keys.
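
To see which tables reference yours through a foreign key (and therefore
need an index on the referencing column), something like this against the
system catalogs should work:

SELECT conname, conrelid::regclass AS referencing_table
FROM pg_constraint
WHERE contype = 'f'
  AND confrelid IN ('qc_session'::regclass, 'item_qc_doer'::regclass);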

HTH,
Csaba.


On Thu, 2005-10-20 at 10:43, Christian Paul B. Cosinas wrote:
 Hi!
 
 I'm experiencing a very slow deletion of records. Which I thin is not right.
 I have a Dual Xeon Server with 6gig Memory.
 I am only deleting about 22,000 records but it took me more than 1 hour to
 finish this.
 
 What could possibly I do so that I can make this fast?
 
 Here is the code inside my function:
 
   FOR temp_rec IN SELECT * FROM item_qc_doer LOOP
   DELETE FROM qc_session WHERE item_id = temp_rec.item_id;
   DELETE FROM item_qc_doer WHERE item_id = temp_rec.item_id;
   END LOOP;
 
 Item_qc_oder table contains 22,000 records.
 
 
 I choose Polesoft Lockspam to fight spam, and you?
 http://www.polesoft.com/refer.html
 
 
 ---(end of broadcast)---
 TIP 6: explain analyze is your friend


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Help tuning postgres

2005-10-18 Thread Csaba Nagy
In the light of what you've explained below about nonremovable row
versions reported by vacuum, I wonder if I should worry about the
following type of report:

INFO:  vacuuming public.some_table
INFO:  some_table: removed 29598 row versions in 452 pages
DETAIL:  CPU 0.01s/0.04u sec elapsed 18.77 sec.
INFO:  some_table: found 29598 removable, 39684 nonremovable row
versions in 851 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.02s/0.07u sec elapsed 23.16 sec.
VACUUM


Does that mean that the 39684 nonremovable row versions are actually the
live rows of the table (as it reports 0 dead)? I'm sure I don't have
any long running transaction, at least according to pg_stat_activity
(backed by the linux ps too). Or should I run a vacuum full...

This table is one which has frequently updated rows.

TIA,
Csaba.


On Thu, 2005-10-13 at 14:40, Andrew Sullivan wrote:
 On Thu, Oct 13, 2005 at 10:15:03AM +0200, Csaba Nagy wrote:
  
  OK, this sounds interesting, but I don't understand: why would an update
  chase down a lot of dead tuples ? Should I read up on some docs, cause
  I obviously don't know enough about how updates work on postgres...
 
 Right.  Here's the issue:
 
 MVCC does not replace rows when you update.  Instead, it marks the
 old row as expired, and sets the new values.  The old row is still
 there, and it's available for other transactions who need to see it. 
 As the docs say (see
 http://www.postgresql.org/docs/8.0/interactive/transaction-iso.html),
 In effect, a SELECT query sees a snapshot of the database as of the
 instant that that query begins to run.  And that can be true because
 the original data is still there, although marked as expired for
 subsequent transactions.
 
 UPDATE works the same was as SELECT in terms of searching for rows
 (so does any command that searches for data).  
 
 Now, when you select data, you actually have to traverse all the
 existing versions of the tuple in order to get the one that's live
 for you.  This is normally not a problem: VACUUM goes around and
 cleans out old, expired data that is not live for _anyone_.  It does
 this by looking for the oldest transaction that is open.  (As far as
 I understand it, this is actually the oldest transaction in the
 entire back end; but I've never understood why that should the the
 case, and I'm too incompetent/dumb to understand the code, so I may
 be wrong on this point.)  If you have very long-running transactions,
 then, you can end up with a lot of versions of dead tuples on the
 table, and so reading the few records you want can turn out actually
 to be a very expensive operation, even though it ought to be cheap.
 
 You can see this by using the VERBOSE option to VACUUM:
 
 test=# VACUUM VERBOSE eval1 ;
 INFO:  vacuuming public.eval1
 INFO:  eval1: found 0 removable, 0 nonremovable row versions in 0
 pages
 DETAIL:  0 dead row versions cannot be removed yet.
 There were 0 unused item pointers.
 0 pages are entirely empty.
 CPU 0.00s/0.00u sec elapsed 0.00 sec.
 INFO:  vacuuming pg_toast.pg_toast_18831
 INFO:  index pg_toast_18831_index now contains 0 row versions in 1
 pages
 DETAIL:  0 index pages have been deleted, 0 are currently reusable.
 CPU 0.00s/0.00u sec elapsed 0.00 sec.
 INFO:  pg_toast_18831: found 0 removable, 0 nonremovable row
 versions in 0 pages
 DETAIL:  0 dead row versions cannot be removed yet.
 There were 0 unused item pointers.
 0 pages are entirely empty.
 CPU 0.00s/0.00u sec elapsed 0.00 sec.
 VACUUM
 
 Note those removable and nonremovable row versions.  It's the
 unremovable ones that can hurt.  WARNING: doing VACUUM on a big table
 on a disk that's already pegged is going to cause you performance
 pain, because it scans the whole table.  In some cases, though, you
 have no choice: if the winds are already out of your sails, and
 you're effectively stopped, anything that might get you moving again
 is an improvement.
 
  And how would the analyze help in finding this out ? I thought it would
  only show me additionally the actual timings, not more detail in what
  was done...
 
 Yes, it shows the actual timings, and the actual number of rows.  But
 if the estimates that the planner makes are wildly different than the
 actual results, then you know your statistics are wrong, and that the
 planner is going about things the wrong way.  ANALYSE is a big help. 
 There's also a verbose option to it, but it's usually less useful in
 production situations.
 
 A


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] Help tuning postgres

2005-10-18 Thread Csaba Nagy
First of all, thanks everyone for the input.

I probably can't afford even the reindex till Christmas, when we have
about 2 weeks of company holiday... but I guess I'll have to do
something before then.

The system should at least appear to be working all the time. I can have
downtime, but only for short periods, preferably less than 1 minute. The
tables we're talking about have ~10 million rows (the smaller ones) and
~150 million rows (the bigger ones), and I guess a reindex will take quite
some time.

I wonder if I could devise a scheme like this:
 
 - create a temp table exactly like the production table, including
indexes and foreign keys;
 - create triggers on the production table which log all inserts,
deletes and updates to a log table;
 - activate these triggers;
 - copy all data from the production table to the temp table (this will
take the bulk of the time needed for the whole operation);
 - replay the log on the temp table repeatedly if necessary, until the
temp table is sufficiently close to the original;
 - rename the original table to something else, and then rename the temp
table to the original name, all this in a transaction - this would
ideally be the only visible delay for the user, and if the system is not
busy, it should be quick I guess;
 - replay the log one more time;

All this should happen at a point in time when there's little traffic to
the database.

Replaying could be as simple as a few delete triggers on the log table,
which replay the deleted log record on the temp table, with the replay
then consisting of a delete operation on the log table. This is so that
new log entries can be replayed later without replaying again what was
already replayed.
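
Just to make the idea concrete, a minimal sketch of the log table and the
capture trigger (all names invented, and only the primary key is logged
here - the replay step would read this table and apply/delete the entries):

CREATE TABLE big_table_log (
    op  char(1) NOT NULL,   -- 'I'nsert, 'U'pdate or 'D'elete
    id  bigint  NOT NULL    -- primary key of the affected production row
);

CREATE OR REPLACE FUNCTION big_table_log_fn() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'DELETE' THEN
        INSERT INTO big_table_log VALUES ('D', OLD.id);
        RETURN OLD;
    ELSE
        INSERT INTO big_table_log VALUES (substr(TG_OP, 1, 1), NEW.id);
        RETURN NEW;
    END IF;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER big_table_log_trg
AFTER INSERT OR UPDATE OR DELETE ON big_table
FOR EACH ROW EXECUTE PROCEDURE big_table_log_fn();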

The big tables I would do this procedure on have a low probability of
conflicting operations (like an insert and immediate delete of the same
row, multiple inserts of the same row, multiple conflicting updates of
the same row, etc.); this is why I think replaying the log will work
fine... of course this whole setup will be a lot more work than just a
reindex...

I wonder if somebody has tried anything like this and if it has any chance
of working?

Thanks,
Csaba.

On Tue, 2005-10-18 at 17:18, Robert Treat wrote:
 reindex should be faster, since you're not dumping/reloading the table
 contents on top of rebuilding the index, you're just rebuilding the
 index. 
 
 
 Robert Treat
 emdeon Practice Services
 Alachua, Florida
 
 On Wed, 2005-10-12 at 13:32, Steve Poe wrote:
  
  Would it not be faster to do a dump/reload of the table than reindex or
  is it about the same? 
  
  Steve Poe
  
  On Wed, 2005-10-12 at 13:21 -0400, Tom Lane wrote:
   Emil Briggs [EMAIL PROTECTED] writes:
Not yet, the db is in production use and I have to plan for a down-time
for that... or is it not impacting the activity on the table ?
   
It will cause some performance hit while you are doing it.
   
   It'll also lock out writes on the table until the index is rebuilt,
   so he does need to schedule downtime.
   
 regards, tom lane
   
   ---(end of broadcast)---
   TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly
  
  
  ---(end of broadcast)---
  TIP 5: don't forget to increase your free space map settings


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Help tuning postgres

2005-10-13 Thread Csaba Nagy
[snip]
 Yes, but it could be a disk issue because you're doing more work than
 you need to.  If your UPDATEs are chasing down a lot of dead tuples,
 for instance, you'll peg your I/O even though you ought to have I/O
 to burn.

OK, this sounds interesting, but I don't understand: why would an update
chase down a lot of dead tuples? Should I read up on some docs, because
I obviously don't know enough about how updates work in postgres...

And how would the analyze help in finding this out? I thought it would
only show me the actual timings in addition, not more detail about what
was done...

Thanks,
Csaba.



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Help tuning postgres

2005-10-13 Thread Csaba Nagy
Thanks Andrew, this explanation about the dead rows was enlightening. It
might be the reason for the slowdown I see on occasion, but not for the
case I was first observing. In that case the updated rows are
different for each update. It is possible that each row has a few dead
versions, but not too many, as each row is updated just a limited number of
times.

However, we have other updates which access the same row 1000s of times
(up to millions of times), and that could hurt if it's like you said,
i.e. if each update has to crawl over all the dead rows... I now have
autovacuum in place, and I'm sure it will kick in after ~ a few 1000s of
updates, but in the meantime it could get bad.
In any case, I suppose those disk pages should be in the OS cache
pretty soon and stay there, so I still don't understand why the disk
usage is 100% in this case (with very low CPU activity, the CPUs are
mostly waiting/idle)... the amount of actively used data is not that
big.

I'll try to vacuum, through cron jobs, the tables most exposed to this
multiple-dead-row-versions symptom, because autovacuum probably won't do
it often enough. Let's see if it helps...
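
Something along these lines, purely illustrative (table and database names
are made up):

# user crontab entry: vacuum/analyze the hot tables every 15 minutes
*/15 * * * *  psql -d mydb -c "VACUUM ANALYZE hot_table"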

Thanks,
Csaba.


On Thu, 2005-10-13 at 14:40, Andrew Sullivan wrote:
 On Thu, Oct 13, 2005 at 10:15:03AM +0200, Csaba Nagy wrote:
  
  OK, this sounds interesting, but I don't understand: why would an update
  chase down a lot of dead tuples ? Should I read up on some docs, cause
  I obviously don't know enough about how updates work on postgres...
 
 Right.  Here's the issue:
 
 MVCC does not replace rows when you update.  Instead, it marks the
 old row as expired, and sets the new values.  The old row is still
 there, and it's available for other transactions who need to see it. 
 As the docs say (see
 http://www.postgresql.org/docs/8.0/interactive/transaction-iso.html),
 In effect, a SELECT query sees a snapshot of the database as of the
 instant that that query begins to run.  And that can be true because
 the original data is still there, although marked as expired for
 subsequent transactions.
 
 UPDATE works the same was as SELECT in terms of searching for rows
 (so does any command that searches for data).  
 
 Now, when you select data, you actually have to traverse all the
 existing versions of the tuple in order to get the one that's live
 for you.  This is normally not a problem: VACUUM goes around and
 cleans out old, expired data that is not live for _anyone_.  It does
 this by looking for the oldest transaction that is open.  (As far as
 I understand it, this is actually the oldest transaction in the
 entire back end; but I've never understood why that should the the
 case, and I'm too incompetent/dumb to understand the code, so I may
 be wrong on this point.)  If you have very long-running transactions,
 then, you can end up with a lot of versions of dead tuples on the
 table, and so reading the few records you want can turn out actually
 to be a very expensive operation, even though it ought to be cheap.
 
 You can see this by using the VERBOSE option to VACUUM:
 
 test=# VACUUM VERBOSE eval1 ;
 INFO:  vacuuming public.eval1
 INFO:  eval1: found 0 removable, 0 nonremovable row versions in 0
 pages
 DETAIL:  0 dead row versions cannot be removed yet.
 There were 0 unused item pointers.
 0 pages are entirely empty.
 CPU 0.00s/0.00u sec elapsed 0.00 sec.
 INFO:  vacuuming pg_toast.pg_toast_18831
 INFO:  index pg_toast_18831_index now contains 0 row versions in 1
 pages
 DETAIL:  0 index pages have been deleted, 0 are currently reusable.
 CPU 0.00s/0.00u sec elapsed 0.00 sec.
 INFO:  pg_toast_18831: found 0 removable, 0 nonremovable row
 versions in 0 pages
 DETAIL:  0 dead row versions cannot be removed yet.
 There were 0 unused item pointers.
 0 pages are entirely empty.
 CPU 0.00s/0.00u sec elapsed 0.00 sec.
 VACUUM
 
 Note those removable and nonremovable row versions.  It's the
 unremovable ones that can hurt.  WARNING: doing VACUUM on a big table
 on a disk that's already pegged is going to cause you performance
 pain, because it scans the whole table.  In some cases, though, you
 have no choice: if the winds are already out of your sails, and
 you're effectively stopped, anything that might get you moving again
 is an improvement.
 
  And how would the analyze help in finding this out ? I thought it would
  only show me additionally the actual timings, not more detail in what
  was done...
 
 Yes, it shows the actual timings, and the actual number of rows.  But
 if the estimates that the planner makes are wildly different than the
 actual results, then you know your statistics are wrong, and that the
 planner is going about things the wrong way.  ANALYSE is a big help. 
 There's also a verbose option to it, but it's usually less useful in
 production situations.
 
 A


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining

[PERFORM] Help tuning postgres

2005-10-12 Thread Csaba Nagy
Hi all,

After a long time of reading the general list it's time to subscribe to
this one...

We have adapted our application (originally written for oracle) to
postgres, and switched part of our business to a postgres data base.

The database has around 150 million rows in the main tables; the whole
data set took ~ 30G after the initial migration. After ~ a month of
usage that bloated to ~ 100G. We installed autovacuum after ~ 2 weeks.

The main table is heavily updated during the active periods of usage,
which come in bursts.

Now Oracle on the same hardware has no problems handling the load,
but postgres comes to a crawl. Examining the pg_stat_activity view I
see that the updates on the main table are the biggest problem: they are
very slow. The table has a few indexes on it; I wonder if they are
updated too on an update, even though the indexed fields are not
changing. In any case, I can't explain why the updates are so much
slower on postgres.

Sorry for being a bit fuzzy; I spent quite some time figuring out what I
can do, and now I have to give up and ask for help.

The machine running the DB is a debian linux, details:

$ cat /proc/cpuinfo
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model   : 11
model name  : Intel(R) Pentium(R) III CPU family  1266MHz
stepping: 1
cpu MHz : 1263.122
cache size  : 512 KB
fdiv_bug: no
hlt_bug : no
f00f_bug: no
coma_bug: no
fpu : yes
fpu_exception   : yes
cpuid level : 2
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 mmx fxsr sse
bogomips: 2490.36
 
processor   : 1
vendor_id   : GenuineIntel
cpu family  : 6
model   : 11
model name  : Intel(R) Pentium(R) III CPU family  1266MHz
stepping: 1
cpu MHz : 1263.122
cache size  : 512 KB
fdiv_bug: no
hlt_bug : no
f00f_bug: no
coma_bug: no
fpu : yes
fpu_exception   : yes
cpuid level : 2
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 mmx fxsr sse
bogomips: 2514.94
 

$ uname -a
Linux *** 2.6.12.3 #1 SMP Tue Oct 11 13:13:00 CEST 2005 i686 GNU/Linux


$ cat /proc/meminfo
MemTotal:      4091012 kB
MemFree:        118072 kB
Buffers:         18464 kB
Cached:        3393436 kB
SwapCached:          0 kB
Active:         947508 kB
Inactive:      2875644 kB
HighTotal:     3211264 kB
HighFree:          868 kB
LowTotal:       879748 kB
LowFree:        117204 kB
SwapTotal:           0 kB
SwapFree:            0 kB
Dirty:           13252 kB
Writeback:           0 kB
Mapped:         829300 kB
Slab:            64632 kB
CommitLimit:   2045504 kB
Committed_AS:  1148064 kB
PageTables:      75916 kB
VmallocTotal:   114680 kB
VmallocUsed:        96 kB
VmallocChunk:   114568 kB


The disk used for the data is an external RAID array; I don't know much
about it right now except that I think it is some relatively fast IDE stuff.
In any case the operations should be cache friendly, we don't scan the
big tables over and over...

The postgres server configuration is attached.

I have looked at the postgres statistics tables; it looks like most of the
needed data is always cached, as in the most accessed tables the
load/hit ratio is mostly something like 1/100, or at least 1/30.


Is anything in the attached config very wrong for the given machine, or
what else should I investigate further? If I can't make this fly, the
obvious solution will be to move back to Oracle, cash out the license
and forget about postgres forever...

TIA,
Csaba.

# -
# PostgreSQL configuration file
# -
#
# This file consists of lines of the form:
#
#   name = value
#
# (The '=' is optional.) White space may be used. Comments are introduced
# with '#' anywhere on a line. The complete list of option names and
# allowed values can be found in the PostgreSQL documentation. The
# commented-out settings shown in this file represent the default values.
#
# Please note that re-commenting a setting is NOT sufficient to revert it
# to the default value, unless you restart the postmaster.
#
# Any option can also be given as a command line switch to the
# postmaster, e.g. 'postmaster -c log_connections=on'. Some options
# can be changed at run-time with the 'SET' SQL command.
#
# This file is read on postmaster startup and when the postmaster
# receives a SIGHUP. If you edit the file on a running system, you have 
# to SIGHUP the postmaster for the changes to take effect, or use 
# pg_ctl reload. Some settings, such as listen_address, require
# a postmaster shutdown and restart to take effect.


#---
# FILE LOCATIONS
#---

# The default values of these variables 

Re: [PERFORM] Help tuning postgres

2005-10-12 Thread Csaba Nagy
[snip]
 Have you tried reindexing your active tables?
 
Not yet, the db is in production use and I have to plan downtime
for that... or does it not impact activity on the table?

 Emil
 
 ---(end of broadcast)---
 TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] [JDBC] Performance problem using V3 protocol in jdbc driver

2005-08-18 Thread Csaba Nagy
Barry,

I have made a similar experience, moving a big Oracle data base to
Postgres 8.03 on linux.
The first impact was similar, huge performance problems.
The main problem was bad planner choices. The cause in our case: bad
parameter types in the jdbc set methods (I guess you use Java). For
oracle we used the NUMERIC type to set primary keys, but the postgres id
type used was BIGINT, and it just refused to use the index in this case.
Imagine that kicking in on a 100 million rows table... a sequential scan
started a few times a second, now that made the DB unusable.
So we fixed the code that for oracle continues to use NUMERIC and for
postgres it uses BIGINT, and that is very important on setNull calls
too.

One very useful tool was the following query:

prepare ps as
SELECT procpid, substring(current_query for 97),
to_char((now()-query_start), 'HH24:MI:SS') as t
FROM pg_stat_activity
where current_query not like '%insufficient%'
  and current_query not like '%IDLE%' order by t desc;

Then you just run EXECUTE ps; in psql, and it will show you the queries
which have already been running for a while.

Other problems were caused by complex queries where more than 2 tables
were joined. For Oracle we were giving hints in the form of special
comments to point to the right index or the right plan, but that's not an
option for postgres (yet?). So the fix in this case was to use explicit
joins, which do influence the postgres planner's choices. This fixed
another class of issues for us...
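
For illustration only (invented tables): the rewrite was from a
comma-separated FROM list to explicit JOINs; depending on your version and
settings you may also have to lower join_collapse_limit for the planner to
follow the written join order:

SET join_collapse_limit = 1;  -- optional: make the planner keep the written order
SELECT o.objectid
FROM orders o
JOIN customers c ON c.objectid = o.customer_id
JOIN addresses a ON a.objectid = c.contactaddress
WHERE a.zipcode = '12345';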

Another problem: if you want to avoid worst-case plans and can live with
a generic plan for all cases, then you might force the use of server-side
prepared statements in all cases. I had to do that; a lot of queries
were performing very badly without it. Now maybe that could be solved
by raising the statistics targets where needed, but in my case the
generic plan was always good enough, by design. We rely on the DB
picking a good generic plan in all cases. One typical example for us
would be: a LIMIT query which selects 20 rows out of 100 million, with a
WHERE clause which actually selects 1 row out of them for the last
chunk... it was going for an index scan, but on the wrong index. The
right index would have selected exactly that 1 row; the wrong one had to
cruise through a few million rows... the LIMIT fooled the planner into
thinking it would get 20 rows quickly. When I forced the use of a prepared
statement, it went for the right index and all was good.
I actually set this in our connection pool:
((PGConnection)connection).setPrepareThreshold(1);
but it is possible to set/reset it at the statement level; I just haven't
found any query I should do it for yet... the DB is steady now.
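
On the SQL level the equivalent is something like this (invented query,
only to show the mechanism of a generic, planned-once statement):

PREPARE find_chunk (bigint, int) AS
    SELECT id, payload
    FROM big_table
    WHERE account_id = $1 AND status = $2
    ORDER BY id
    LIMIT 20;

EXECUTE find_chunk(12345, 1);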

Another issue was that we had some functional indexes on Oracle
returning NULL for uninteresting rows, to keep the index small. This is
easier to implement in postgres using a partial index, which has a much
simpler syntax than the Oracle hack and is easier to handle. The
catch was that we needed to change the WHERE clause compared to Oracle
so that postgres actually picks the partial index. There are cases where
the planner can't figure out that it can use the index, especially if
you use prepared statements and one of the parameters is used in the
index condition. In that case you need to add the proper restriction
to the WHERE clause to point postgres to the partial index. Using
partial indexes speeds up the inserts and updates on those tables, and
can speed up some selects too.
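
A sketch of what I mean (invented names): the index only covers the
interesting rows, and the query has to repeat the index predicate so the
planner can match it:

CREATE INDEX orders_open_idx ON orders (customer_id) WHERE status = 'OPEN';

-- the WHERE clause must imply the index predicate,
-- otherwise the partial index is not considered
SELECT id FROM orders WHERE customer_id = 12345 AND status = 'OPEN';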

Hmmm... that's about all I recall now... besides the postgres admin
stuff, have you analyzed your data after the import? I forgot to do that at
first, and almost reverted back to Oracle... and then after a few
days it was very clear that running the autovacuum daemon is also a
must :-)
And: for big data sets it is important to tweak all the performance settings
in the config file, otherwise you get surprises. We had been running a
smaller instance of the same code on postgres for quite a while before
deciding to migrate a big one, and that was cruising along happily with
the default settings, so the first time we needed to do optimizations
was when using a data set with a lot of data in it...

HTH,
Csaba.


On Wed, 2005-08-17 at 06:42, Barry Lind wrote:
 We just moved a large production instance of ours from Oracle to
 Postgres 8.0.3 on linux.  When running on Oracle the machine hummed
 along using about 5% of the CPU easily handling the fairly constant
 load, after moving the data to Postgres the machine was pretty much
 maxed out on CPU and could no longer keep up with the transaction
 volume.  On a hunch I switched the jdbc driver to using the V2
 protocol and the load on the machine dropped down to what it was when
 using Oracle and everything was fine.
 
  
 
 Now obviously I have found a work around for the performance problem,
 but I really don't want to rely on using the V2 protocol forever, and
 don't want to have to recommend to our customers that they need