[PERFORM] VACUUMs take twice as long across all nodes

2006-10-26 Thread Gavin Hamill
Hullo, here's one of those dreadful touchy-feely hand-waving problems.
Our 5-node 8.1.3 Slony system has just started taking /much/ longer to
VACUUM ANALYZE..

The data set has not increased more than usual (nightly backups stand
at 1.3GB, growing by 10MB per day), and no configuration has changed on
the machines.

Nodes 2 and 3 take only the tables necessary to run our search (10 out
of the full 130) and are much lighter (only 7GB on disk cf. 30GB for
the full master), yet the nightly VACUUM FULL has jumped from 2 hours
to 4 in the space of one day!

Like I say, no config changes, no reboots / postmaster restarts, no extra 
processes, and every machine has a comfortable overhead of free page slots + 
relations.

From a few days ago:
2006-10-20 03:04:29 UTC INFO:  "Allocation": found 786856 removable, 4933448 
nonremovable row versions in 53461 pages
2006-10-20 03:04:29 UTC DETAIL:  0 dead row versions cannot be removed yet.
2006-10-20 03:07:32 UTC INFO:  index "allocation_pkey" now contains 4933448 row 
versions in 93918 pages
2006-10-20 03:07:32 UTC DETAIL:  786856 index row versions were removed.
2006-10-20 03:14:21 UTC INFO:  index "ix_date" now contains 4933448 row 
versions in 74455 pages
2006-10-20 03:14:21 UTC DETAIL:  786856 index row versions were removed.
2006-10-20 03:22:32 UTC INFO:  index "ix_dateprice" now contains 4933448 row 
versions in 81313 pages
2006-10-20 03:22:32 UTC DETAIL:  786856 index row versions were removed.
2006-10-20 03:24:41 UTC INFO:  index "ix_dateroom" now contains 4933448 row 
versions in 44610 pages
2006-10-20 03:24:41 UTC DETAIL:  786856 index row versions were removed.
2006-10-20 03:27:52 UTC INFO:  index "ix_room" now contains 4933448 row 
versions in 35415 pages
2006-10-20 03:27:52 UTC DETAIL:  786856 index row versions were removed.
2006-10-20 03:31:43 UTC INFO:  "Allocation": moved 348324 row versions, 
truncated 53461 to 46107 pages
2006-10-20 03:31:43 UTC DETAIL:  CPU 4.72s/17.63u sec elapsed 230.81 sec.

From last night:
2006-10-26 01:00:30 UTC INFO:  vacuuming "public.Allocation"
2006-10-26 01:00:36 UTC INFO:  "Allocation": found 774057 removable, 4979938 
nonremovable row versions in 53777 pages
2006-10-26 01:00:36 UTC DETAIL:  0 dead row versions cannot be removed yet.
2006-10-26 01:06:18 UTC INFO:  index "allocation_pkey" now contains 4979938 row 
versions in 100800 pages
2006-10-26 01:06:18 UTC DETAIL:  774057 index row versions were removed.
2006-10-26 01:19:22 UTC INFO:  index "ix_date" now contains 4979938 row 
versions in 81630 pages
2006-10-26 01:19:22 UTC DETAIL:  774057 index row versions were removed.
2006-10-26 01:35:17 UTC INFO:  index "ix_dateprice" now contains 4979938 row 
versions in 87750 pages
2006-10-26 01:35:17 UTC DETAIL:  774057 index row versions were removed.
2006-10-26 01:41:27 UTC INFO:  index "ix_dateroom" now contains 4979938 row 
versions in 46320 pages
2006-10-26 01:41:27 UTC DETAIL:  774057 index row versions were removed.
2006-10-26 01:48:18 UTC INFO:  index "ix_room" now contains 4979938 row 
versions in 36513 pages
2006-10-26 01:48:18 UTC DETAIL:  774057 index row versions were removed.
2006-10-26 01:56:35 UTC INFO:  "Allocation": moved 322744 row versions, 
truncated 53777 to 46542 pages
2006-10-26 01:56:35 UTC DETAIL:  CPU 4.21s/15.90u sec elapsed 496.30 sec.

As you can see, the system + user CPU times for these runs are comparable, 
but the elapsed (real) time has more than doubled. 

This isn't even a case for tuning the cost-based vacuum delay, because I 
already have vacuum_cost_delay = 0 on all machines, so vacuum runs as quickly 
as possible.
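
For reference, the settings in play can be double-checked on each node with a
quick pg_settings query (standard 8.1 GUC names):

  SELECT name, setting
  FROM pg_settings
  WHERE name IN ('vacuum_cost_delay', 'vacuum_cost_limit', 'autovacuum');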

Any ideas warmly received! :)

Cheers,
Gavin.




Re: [PERFORM] commit so slow program looks frozen

2006-10-26 Thread Magnus Hagander
> > > You may try to figure out what's the process doing (the backend 
> > > obviously, not the frontend (Tcl) process) by attaching to it with 
> > > strace.
> > 
> > It's so sad when us poor Windows guys get helpful hints from people 
> > who assume that we're smart enough to run *NIX... ;-)
> 
> You should try a google search on strace and NT or windows or 
> XP...  I was surprised how many various implementations of it I found.

Let me know if you find one that's stable, I've been wanting that. I've
tried one or two, but it's always been just a matter of time before the
inevitable BSOD.

> > > Maybe it's swamped by a context switch storm (but in that case, 
> > > probably the other processes would be affected as well).
> > 
> > What is a context switch storm? (and what a great name for a heavy 
> > metal rock band!)
> 
> I can just see the postgresql group getting together at the 
> next O'Reilly's conference and creating that band.  And it 
> will all be your fault.

*DO NOT LET DEVRIM SEE THIS THREAD*


> A context switch storm is when your machine spends more time 
> trying to figure out what to do than actually doing anything. 
> The CPU spends most of its time switching between programs 
> rather than running them.

I can see Windows being more susceptible to this than, say, Linux, because
switching between processes there is a lot more expensive than on Linux.

> Seeing as PostgreSQL runs one thread / process per 
> connection, it's pretty unlikely that the problem here is one 
> "hungry" thread.  Do all four CPUs show busy, or just one?  
> Do you have a way of measuring how much time is spent waiting 
> on I/O on a windows machine like top / vmstat does in unix?

There are plenty of counters in the Performance Monitor. Specifically,
look at "disk queue counters" - they indicate when the I/O subsystem is
backed up.


//Magnus



Re: [PERFORM] commit so slow program looks frozen

2006-10-26 Thread Rocco Altier
I seem to remember Oleg/Teodor recently reporting a problem with Windows
hanging on a multi-processor machine, during a heavy load operation.

In their case it seemed like a vacuum would allow it to wake up.  They
did commit a patch that did not make it into the last minor version for
lack of testing.

Perhaps you could see if that patch works for you, which would also
help answer the objection that the patch lacks testing.

-rocco



Re: [PERFORM] commit so slow program looks frozen

2006-10-26 Thread Carlo Stonebanks
> I can just see the postgresql group getting together at the next
> O'Reilly's conference and creating that band.  And it will all be your
> fault.

Finally, a chance for me to wear my black leather pants.

> A context switch storm is when your machine spends more time trying to
> figure out what to do than actually doing anything.  The CPU spends most
> of its time switching between programs rather than running them.

Is that likely on a new 4-CPU server that has no clients connected and that 
is only running four (admittedly heavy) TCL data load scripts?

> Seeing as PostgreSQL runs one thread / process per connection, it's
> pretty unlikely that the problem here is one "hungry" thread.  Do all
> four CPUs show busy, or just one?  Do you have a way of measuring how
> much time is spent waiting on I/O on a windows machine like top / vmstat
> does in unix?

Before optimising the queries, all four CPUs were pinned at max utilisation 
(that's why I only run four imports at a time). After optimisation, all four 
CPUs are still clearly busy, but usage is spiky, which looks more normal. I 
have this feeling that when an import app freezes, one CPU goes idle while 
the others stay busy - I will confirm that with the next import operation.

I suspect that the server has the Xeon processors of a generation with which 
PostgreSQL had a problem - should a postgresql process be able to distribute 
its processing load across CPUs? (i.e. when I see one CPU at 100% while all 
others are idle?)

> Note that if you have an import process that needs a big chunk of
> memory, you can set just that one connection to use a large setting and
> leave the default smaller.

Total memory usage is below the max available. Each postgresql process takes 
up 500MB; there are four running, and I have 4GB of RAM.

Carlo 





Re: [PERFORM] VACUUMs take twice as long across all nodes

2006-10-26 Thread Tom Lane
Gavin Hamill <[EMAIL PROTECTED]> writes:
> Nodes 2 and 3 take only the tables necessary to run our search (10 out
> of the full 130) and are much lighter (only 7GB on disk cf. 30GB for
> the full master) , yet the nightly VACUUM FULL has jumped from 2 hours
> to 4 in the space of one day!

I guess the most useful question to ask is "why are you doing VACUUM FULL?"
Plain VACUUM should be considerably faster, and for the level of row
turnover shown by your log, there doesn't seem to be a reason to use FULL.

regards, tom lane



Re: [PERFORM] commit so slow program looks frozen

2006-10-26 Thread Carlo Stonebanks
This is pretty interesting - where can I read more on this? Windows isn't 
actually hanging; one single command-line window is. From its behaviour, it 
looks like the TCL postgresql package is waiting for pg_exec to come back 
from the commit (I believe the commit has actually gone through).

It could even be that there's something wrong with the TCL package, but from 
my understanding it is one of the most complete interfaces out there - which 
is weird, because TCL seems to be the most unpopular language in the 
community.

Carlo


""Rocco Altier"" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
>I seem to remember Oleg/Teodor recently reporting a problem with Windows
> hanging on a multi-processor machine, during a heavy load operation.
>
> In their case it seemed like a vacuum would allow it to wake up.  They
> did commit a patch that did not make it into the last minor version for
> lack of testing.
>
> Perhaps you could see if that patch might work for you, which would also
> help ease the argument against the patches lack of testing.
>
> -rocco
>





Re: [PERFORM] commit so slow program looks frozen

2006-10-26 Thread Merlin Moncure

On 10/26/06, Carlo Stonebanks <[EMAIL PROTECTED]> wrote:

> This is pretty interesting - where can I read more on this? Windows isn't
> actually hanging, one single command line window is - from its behaviour, it
> looks like the TCL postgresql package is waiting for pg_exec to come back
> from the commit (I believe the commit has actually gone through).
>
> It could even be that there's something wrong with the TCL package, but from
> my understanding it is one of the most complete interfaces out there - which
> is weird, because TCL seems to be the most unpopular language in the
> community.


when it happens, make sure to query pg_locks and see what is going on
there. Lock issues are not supposed to manifest on a commit, which
releases locks, but you never know.  There have been reports of
inconsistent lockups on windows (especially multi-processor) which
you might be experiencing. Make sure you have the very latest version
of pg 8.1.x.  Also consider checking out 8.2 and see if you can
reproduce the behavior there...this will require compiling postgresql.
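
something along these lines (8.1 system views; note that current_query is
only populated when stats_command_string is on) will show who is holding or
waiting on what:

select l.locktype, l.relation::regclass, l.mode, l.granted,
       a.procpid, a.current_query
from pg_locks l
join pg_stat_activity a on a.procpid = l.pid;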

merlin



Re: [PERFORM] VACUUMs take twice as long across all nodes

2006-10-26 Thread Gavin Hamill
On Thu, 26 Oct 2006 10:47:21 -0400
Tom Lane <[EMAIL PROTECTED]> wrote:

> Gavin Hamill <[EMAIL PROTECTED]> writes:
> > Nodes 2 and 3 take only the tables necessary to run our search (10
> > out of the full 130) and are much lighter (only 7GB on disk cf.
> > 30GB for the full master) , yet the nightly VACUUM FULL has jumped
> > from 2 hours to 4 in the space of one day!
> 
> I guess the most useful question to ask is "why are you doing VACUUM
> FULL?" Plain VACUUM should be considerably faster, and for the level
> of row turnover shown by your log, there doesn't seem to be a reason
> to use FULL.

I do FULL on the 'light' clients simply because 'I can'. The example
posted was a poor choice - the other tables have a larger churn.

Anyway, once it starts, the load balancer takes it out of rotation so
no love is lost.

The same behaviour is shown on the 'heavy' clients (master + 2 slaves)
which take all tables - although I cannot afford to VACUUM FULL on
there, the usual VACUUM ANALYZE has begun to take vastly more time
since yesterday than in the many previous months we've been using pg.

Cheers,
Gavin.



[PERFORM] Stored procedure slower than sql?

2006-10-26 Thread Matthew Peters
(Repost - did not appear to make it to the list the first time)

I have written a stored procedure for 8.1 that wraps a single (albeit complex) 
query, and uses 2 IN parameters (BIGINT, INTEGER) in the FROM JOIN and WHERE 
clauses.  The procedure is written in SQL (as opposed to plpgsql - although 
testing in plpgsql produces the same problem).  The results are SETOF a custom 
type (SMALLINT, NUMERIC(38,2), NUMERIC(38,2)).

The central query, when tested in psql and pgAdmin III, returns in 500 ms.  As 
a stored procedure, it returns in 22000 ms!  How can a stored procedure 
containing a single query not implement the same execution plan (an assumption 
based on the dramatic performance difference) that an identical ad-hoc query 
generates?  I ran a series of tests and determined that if I replaced the 
parameters with hard-coded values, the execution time returned to 500 ms.  Can 
anyone shed some light on this for me - it seems counter-intuitive.

Here are some particulars about the underlying query and tables:

The query selects a month number from a generate_series(1,12) left outer 
joined on a subquery, which produces three fields.  The subquery is a UNION 
ALL of 5 tables.  Each of the five tables has 100 inherited partitions.  As 
you can see from the execution plan, the partitioning constraint is 
successfully restricting the query to the appropriate partition for each of 
the five tables.  The constraint on each partition is a CHAR(2) field 
"partition_key" = '00' (where '00' is a two-digit CHAR(2) value returned from 
a function call, a la table1.partition_key = partition_key($1)).
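
Each partition is the usual inheritance setup - schematically (again, very
simplified, using the names that appear in the plan below):

CREATE TABLE table1_p12 (
    CHECK (partition_key = '12')
) INHERITS (table1);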

Execution plan:

Sort  (cost=10410.15..10410.65 rows=200 width=68) (actual time=273.050..273.071 rows=12 loops=1)
  Sort Key: mm.monthnumber
  ->  HashAggregate  (cost=10398.01..10402.51 rows=200 width=68) (actual time=272.970..273.001 rows=12 loops=1)
        ->  Hash Left Join  (cost=10370.01..10390.51 rows=1000 width=68) (actual time=272.817..272.902 rows=13 loops=1)
              Hash Cond: (("outer".monthnumber)::double precision = "inner".monthnumber)
              ->  Function Scan on generate_series mm  (cost=0.00..12.50 rows=1000 width=4) (actual time=0.018..0.043 rows=12 loops=1)
              ->  Hash  (cost=10369.99..10369.99 rows=10 width=72) (actual time=272.769..272.769 rows=8 loops=1)
                    ->  Append  (cost=1392.08..10369.89 rows=10 width=47) (actual time=39.581..272.734 rows=8 loops=1)
                          ->  Subquery Scan "*SELECT* 1"  (cost=1392.08..1392.15 rows=2 width=47) (actual time=39.576..39.582 rows=1 loops=1)
                                ->  HashAggregate  (cost=1392.08..1392.13 rows=2 width=47) (actual time=39.571..39.573 rows=1 loops=1)
                                      ->  Result  (cost=0.00..1392.05 rows=2 width=47) (actual time=25.240..39.538 rows=1 loops=1)
                                            ->  Append  (cost=0.00..1392.03 rows=2 width=47) (actual time=25.224..39.518 rows=1 loops=1)
                                                  ->  Seq Scan on table1 table1  (cost=0.00..14.50 rows=1 width=47) (actual time=0.003..0.003 rows=0 loops=1)
                                                        Filter: ((partition_key = '12'::bpchar) AND (substr((indexed_field)::text, 2, 1) = '5'::text) AND (table_key = 10265512) AND (date_part('year'::text, (event_date)::timestamp without time zone) = 2005::double precision))
                                                  ->  Seq Scan on table1_p12 table1  (cost=0.00..1377.53 rows=1 width=28) (actual time=25.214..39.503 rows=1 loops=1)
                                                        Filter: ((partition_key = '12'::bpchar) AND (substr((indexed_field)::text, 2, 1) = '5'::text) AND (table_key = 10265512) AND (date_part('year'::text, (event_date)::timestamp without time zone) = 2005::double precision))
                          ->  Subquery Scan "*SELECT* 2"  (cost=2741.47..2741.56 rows=2 width=47) (actual time=78.140..78.140 rows=0 loops=1)
                                ->  HashAggregate  (cost=2741.47..2741.54 rows=2 width=47) (actual time=78.134..78.134 rows=0 loops=1)
                                      ->  Result  (cost=0.00..2741.45 rows=2 width=47) (actual time=78.128..78.128 rows=0 loops=1)
                                            ->  Append  (cost=0.00..2741.43 rows=2 width=47) (actual time=78.122..78.122 rows=0 loops=1)
                                                  ->  Seq Scan on table2 table2  (cost=0.00..12.40 rows=1 width=47) (actual time=0.004..0.004 rows=0 loops=1)
                                                        Filter: ((partition_key = '12'::bpchar) AND (substr((indexed_field)::text, 2, 1) = '5'::text) AND (table_key = 10265512) AND (date_part('year'::text, (event_date)::timestamp without time zone) = 2005::double precision))
                                                  ->  Seq Scan on table2_p12 table2  (cost=0.00..2729.03 row

Re: [PERFORM] Stored procedure slower than sql?

2006-10-26 Thread Tom Lane
"Matthew Peters" <[EMAIL PROTECTED]> writes:
> How can a stored procedure containing a single query not implement the
> same execution plan (assumption based on the dramatic performance
> difference) that an identical ad-hoc query generates?

Parameterized vs non parameterized query?

regards, tom lane



Re: [PERFORM] Stored procedure slower than sql?

2006-10-26 Thread Matthew Peters
Parameterized.

I.e. (very simplified):

CREATE OR REPLACE FUNCTION my_function(IN param1 BIGINT, IN param2
INTEGER)
RETURNS my_type
SECURITY DEFINER
AS
$$
/* my_type = (a,b,c) */
Select a,b,c
FROM my_table
WHERE indexed_column = $1
AND partition_constraint_column = $2;
$$
LANGUAGE SQL;




Matthew A. Peters
Sr. Software Engineer, Haydrian Corp.
[EMAIL PROTECTED]
(mobile) 425-941-6566
 Haydrian Corp.
-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED] 
Sent: Thursday, October 26, 2006 9:15 AM
To: Matthew Peters
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Stored procedure slower than sql? 
Importance: High

"Matthew Peters" <[EMAIL PROTECTED]> writes:
> How can a stored procedure containing a single query not implement the
> same execution plan (assumption based on the dramatic performance
> difference) that an identical ad-hoc query generates?

Parameterized vs non parameterized query?

regards, tom lane



Re: [PERFORM] commit so slow program looks frozen

2006-10-26 Thread Richard Troy

> A context switch storm is when your machine spends more time trying to
> figure out what to do than actually doing anything.  The CPU spends most
> of its time switching between programs rather than running them.

Well, we usually use the term "thrashing" as the generic for when your
machine is spending more time on overhead than doing user work - this
would include paging or context switching, along with whatever else. A
context-switch storm would be a specific form of thrashing!

Richard

-- 
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/




Re: [PERFORM] commit so slow program looks frozen

2006-10-26 Thread Richard Troy


On Thu, 26 Oct 2006, Carlo Stonebanks wrote:
>
> It could even be that there's something wrong with the TCL package, but from
> my understanding it is one of the most complete interfaces out there - which
> is weird, because TCL seems to be the most unpopular language in the
> community.
>

Not that this matters much and it's slightly off the topic of performance,
but...

...I would have to check my _ancient_ emails for the name of the guy and
the dates, but the integration was first done while I was a researcher at
Berkeley, at the tail end of the Postgres team's funding. My team used
Postgres with TCL internals to implement "the query from hell" inside the
server. That was about 1994 or '95, IIRC. At that time, most people who
knew both said that they were roughly equivalent, with PERL being _vastly_
less intelligible (to humans) and they hated it. What happened was PERL
got exposure that TCL didn't and people who didn't know better jumped on
it.

So, it was one of the most complete interfaces because it was done first,
or nearly first, by the original guys that created the original Postgres.

Richard


-- 
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/




Re: [PERFORM] commit so slow program looks frozen

2006-10-26 Thread Carlo Stonebanks
> when it happens, make sure to query pg_locks and see what is going on
> there lock issues are not supposed to manifest on a commit, which
> releases locks, but you never know.

There aren't any pending locks (assuming that pgAdmin is using pg_locks to 
display pending locks).

> There have been reports of
> insonsistent lock ups on windows (espeically multi-processor) which
> you might be experiencing. Make sure you have the very latest version
> of pg 8.1.x.  Also consider checking out 8.2 and see if you can
> reproduce the behavior there...this will require compiling postgresql.

Are these associated with any type of CPU?

Carlo 





OT: TCL vs Perl Re: [PERFORM] commit so slow program looks frozen

2006-10-26 Thread mark
Perl started out fast - TCL started out slow. Perl used syntax that,
although it would drive some people crazy, followed a linguistic curve
that Larry Wall claimed was healthy. The English language is crazy,
and yet, it has become standard worldwide as well. Designed, regular
languages like Esperanto have not received much support either.

Perl is designed to be practical. TCL was designed to be minimalistic.

Perl uses common idioms for UNIX programmers: // for regular expressions,
$VAR for variables. Many of the statements are familiar to C programmers:
++ for increment (compare 'incr abc' in TCL), $a=5 for assignment
(compare 'set abc 5' in TCL).

TCL tries to have a reduced syntax, where 'everything is a string', which
feels weird to people.  For example, newline is end-of-line, so { must be
positioned correctly. Code is a string, so in some cases you need to
escape code, otherwise not.

Perl has object oriented support built-in. It's ugly, but it works.
TCL has a questionable '[incr tcl]' package.

Perl has a wealth of modules on CPAN to do almost anything you need.
TCL has the beginnings of one (not as rich), but comes built in with things
like event loops and graphics (Tk).

I could go on and on - but I won't, because this is the PostgreSQL
mailing list. People either get Perl, or TCL, or they don't. More
people 'get' Perl, because it was marketed better, its syntax is
deceptively similar to other well-known languages, and for the
longest time, it was much faster than TCL to write (especially when
using regular expressions) and faster to run.

Did TCL get treated unfairly as a result? It's a language. Who cares! :-)

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/




Re: OT: TCL vs Perl Re: [PERFORM] commit so slow program looks frozen

2006-10-26 Thread Joshua D. Drake

> Perl has a wealth of modules on CPAN to do almost anything you need.
> TCL has the beginnings of one (not as rich), but comes built in with things
> like event loops and graphics (Tk).
> 
> I could go on and on - but I won't, because this is the PostgreSQL
> mailing list. People either get Perl, or TCL, or they don't. More
> people 'get' Perl, because it was marketed better, its syntax is
> deceptively similar to other well-known languages, and for the
> longest time, it was much faster than TCL to write (especially when
> using regular expressions) and faster to run.
> 
> Did TCL get treated unfairly as a result? It's a language. Who cares! :-)

You forgot the god of scripting languages, Python... (Yes, Perl is much
better at system-level scripting than Python.)

Sincerely,

Joshua D. Drake

> 
> Cheers,
> mark
> 


-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate




Re: [PERFORM] Configuration Issue ?

2006-10-26 Thread Jim C. Nasby
On Wed, Oct 25, 2006 at 05:31:29PM -0400, Mark Lonsdale wrote:
> Set my sort_mem to 8192

You really need to look at what your workload is before trying to tweak
sort_mem. With 8G of memory, sort_mem = 400000 (~400MB) with only 10
active connections might be a good setting. It's usually better to get a
sort to fit into memory than to spill to disk. Since you never mentioned
what kind of workload you have or how many active connections there are,
it's pretty much impossible to make a recommendation on that setting.
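
Note that sort_mem can be raised for just the one connection that needs it
rather than globally - something like this (sort_mem is the pre-8.0 name;
on 8.0 and later the live setting is work_mem):

SET sort_mem = 400000;  -- ~400MB for this session only
-- run the big sort here
RESET sort_mem;
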
-- 
Jim Nasby   [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



Re: [PERFORM] VACUUMs take twice as long across all nodes

2006-10-26 Thread Jim C. Nasby
On Thu, Oct 26, 2006 at 04:06:09PM +0100, Gavin Hamill wrote:
> On Thu, 26 Oct 2006 10:47:21 -0400
> Tom Lane <[EMAIL PROTECTED]> wrote:
> 
> > Gavin Hamill <[EMAIL PROTECTED]> writes:
> > > Nodes 2 and 3 take only the tables necessary to run our search (10
> > > out of the full 130) and are much lighter (only 7GB on disk cf.
> > > 30GB for the full master) , yet the nightly VACUUM FULL has jumped
> > > from 2 hours to 4 in the space of one day!
> > 
> > I guess the most useful question to ask is "why are you doing VACUUM
> > FULL?" Plain VACUUM should be considerably faster, and for the level
> > of row turnover shown by your log, there doesn't seem to be a reason
> > to use FULL.
> 
> I do FULL on the 'light' clients simply because 'I can'. The example
> posted was a poor choice - the other tables have a larger churn.
> 
> Anyway, once it starts, the load balancer takes it out of rotation so
> no love is lost.
> 
> The same behaviour is shown on the 'heavy' clients (master + 2 slaves)
> which take all tables - although I cannot afford to VACUUM FULL on
> there, the usual VACUUM ANALYZE has begun to take vastly more time
> since yesterday than in the many previous months we've been using pg.

Are you sure that there's nothing else happening on the machine that
could affect the vacuum times? Like, say a backup? Or perhaps updates
coming in from Slony that didn't used to be there?
-- 
Jim Nasby   [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



Re: [PERFORM] Stored procedure slower than sql?

2006-10-26 Thread Jim C. Nasby
The planner has no idea what $1 and $2 are when it plans the query, so
that could easily explain why the performance is different. You can
prepare statements in psql (at least in 8.1), which would be a good way
to verify that theory (compare EXPLAIN for prepared vs. non).
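
For example, with the names from your simplified function (the parameter
values here are only for illustration):

PREPARE p (bigint, integer) AS
  SELECT a, b, c
  FROM my_table
  WHERE indexed_column = $1
    AND partition_constraint_column = $2;

EXPLAIN ANALYZE EXECUTE p(10265512, 12);

If that plan matches the slow function rather than the fast ad-hoc query,
the parameterization theory holds.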

On Thu, Oct 26, 2006 at 09:21:37AM -0700, Matthew Peters wrote:
> Parameterized.
> 
> IE (very simplified)
> 
> CREATE OR REPLACE FUNCTION my_function(IN param1 BIGINT, IN param2
> INTEGER)
> RETURNS my_type
> SECURITY DEFINER
> AS
> $$
>   /* my_type = (a,b,c) */
>   Select a,b,c
>   FROM my_table
>   WHERE indexed_column = $1
>   AND partition_constraint_column = $2;
> $$
> LANGUAGE SQL;
> 
> 
> 
> 
> Matthew A. Peters
> Sr. Software Engineer, Haydrian Corp.
> [EMAIL PROTECTED]
> (mobile) 425-941-6566
>  Haydrian Corp.
> -Original Message-
> From: Tom Lane [mailto:[EMAIL PROTECTED] 
> Sent: Thursday, October 26, 2006 9:15 AM
> To: Matthew Peters
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] Stored procedure slower than sql? 
> Importance: High
> 
> "Matthew Peters" <[EMAIL PROTECTED]> writes:
> > How can a stored procedure containing a single query not implement the
> > same execution plan (assumption based on the dramatic performance
> > difference) that an identical ad-hoc query generates?
> 
> Parameterized vs non parameterized query?
> 
>   regards, tom lane
> 

-- 
Jim Nasby   [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



Re: [PERFORM] VACUUMs take twice as long across all nodes

2006-10-26 Thread Gavin Hamill
On Thu, 26 Oct 2006 14:17:29 -0500
"Jim C. Nasby" <[EMAIL PROTECTED]> wrote:

> Are you sure that there's nothing else happening on the machine that
> could affect the vacuum times? Like, say a backup? Or perhaps updates
> coming in from Slony that didn't used to be there?

I'm absolutely certain. The backups run from only one slave, given that
it is a full copy of node 1. Our overnight traffic has not increased
any, and the nightly backups show that the overall size of the DB has
not increased more than usual growth.

Plus, I have fairly verbose logging, and it's not showing anything out
of the ordinary. 

Like I said, it's one of those awful hypothesis/hand-waving problems :)

Cheers,
Gavin.



[PERFORM] query slows down drastically with increased number of fields

2006-10-26 Thread Tom Darci




Hello All-

We have a question about numbers of fields in the select clause of a query
and how that affects query speed. The following query simply selects the
primary key field from a table with 100,000 records:

select p.opid
FROM ott_op p

It runs in about half a second (running in PgAdmin... the query run time,
not the data retrieval time).

When we change it by adding fields to the select list, it slows down
drastically. This version takes about 3 seconds:

select p.opid, p.opid, p.opid, p.opid, p.opid, p.opid, p.opid, p.opid,
p.opid, p.opid, p.opid
FROM ott_op p

The more fields we add, the slower it gets.

My guess is that we are missing a configuration setting... any ideas?
Any help much appreciated.

Thanks,
-Tom


Re: [PERFORM] query slows down drastically with increased number of fields

2006-10-26 Thread Tom Lane
"Tom Darci" <[EMAIL PROTECTED]> writes:
>   It runs in about half a second (running in PgAdmin... the query run
> time, not the data retrieval time)

I don't have a lot of faith in PgAdmin's ability to distinguish the two.
In fact, for a query such as you have here that's just a bare seqscan,
it's arguably *all* data retrieval time --- the backend will start
emitting records almost instantly.

FWIW, in attempting to duplicate your test I get

regression=# explain analyze select f1 from foo;
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Seq Scan on foo  (cost=0.00..1541.00 rows=100000 width=4) (actual time=0.161..487.192 rows=100000 loops=1)
 Total runtime: 865.454 ms
(2 rows)

regression=# explain analyze select f1,f1,f1,f1,f1,f1,f1,f1,f1,f1,f1 from foo;
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Seq Scan on foo  (cost=0.00..1541.00 rows=100000 width=4) (actual time=0.169..603.795 rows=100000 loops=1)
 Total runtime: 984.124 ms
(2 rows)

Note that this test doesn't perform conversion of the field values to
text form, so it's an underestimate of the total time spent by the
backend for the real query.  But I think almost certainly, your speed
difference is all about having to send more values to the client.
The costs not measured by the explain-analyze scenario would scale darn
near linearly with the number of repetitions of f1.

regards, tom lane



Re: [PERFORM] VACUUMs take twice as long across all nodes

2006-10-26 Thread Jim C. Nasby
On Thu, Oct 26, 2006 at 09:35:56PM +0100, Gavin Hamill wrote:
> On Thu, 26 Oct 2006 14:17:29 -0500
> "Jim C. Nasby" <[EMAIL PROTECTED]> wrote:
> 
> > Are you sure that there's nothing else happening on the machine that
> > could affect the vacuum times? Like, say a backup? Or perhaps updates
> > coming in from Slony that didn't used to be there?
> 
> I'm absolutely certain. The backups run from only one slave, given that
> it is a full copy of node 1. Our overnight traffic has not increased
> any, and the nightly backups show that the overall size of the DB has
> not increased more than usual growth.
> 
> Plus, I have fairly verbose logging, and it's not showing anything out
> of the ordinary. 
> 
> Like I said, it's one of those awful hypothesis/hand-waving problems :)

Well, the fact that it's happening on all your nodes leads me to think
Slony is somehow involved. Perhaps it suddenly decided to change how
often it's issuing syncs? I know it issues vacuums as well, so maybe
that's got something to do with it... (though I'm guessing you've
already looked in pg_stat_activity/logs to see if anything
correlates...) Still, it might be worth asking about this on the slony
list...
-- 
Jim Nasby   [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



Re: [PERFORM] query slows down drastically with increased number of fields

2006-10-26 Thread George Pavlov
i have wondered myself. i wouldn't do it through pgAdmin (not sure what
the best test is, but i thought psql from the same machine might be
better--see below). anyway, the funny thing is that if you concatenate
them the time drops:

~% time psql -dXXX -hYYY -UZZZ -c"select consumer_id from consumer" -o /dev/null
psql -dXXX -hYYY -UZZZ -c"select consumer_id from consumer" -o  0.09s user 0.01s system 29% cpu 0.341 total

~% time psql -dXXX -hYYY -UZZZ -c"select consumer_id,consumer_id,consumer_id,consumer_id,consumer_id,consumer_id,consumer_id,consumer_id from consumer" -o /dev/null
psql -dXXX -hYYY -UZZZ -o /dev/null  0.76s user 0.06s system 45% cpu 1.796 total

~% time psql -dXXX -hYYY -UZZZ -c"select consumer_id||consumer_id||consumer_id||consumer_id||consumer_id||consumer_id||consumer_id||consumer_id from consumer" -o /dev/null
psql -dXXX -hYYY -UZZZ -o /dev/null  0.18s user 0.04s system 20% cpu 1.061 total

 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED]] On Behalf Of Tom Darci
> Sent: Wednesday, October 25, 2006 10:21 AM
> To: pgsql-performance@postgresql.org
> Subject: [PERFORM] query slows down drastically with 
> increased number of fields
> 
> Hello All-
>  
>   We have a question about numbers of fields in the select 
> clause of a query and how that affects query speed.
>   The following query simply selects the primary key field 
> from a table with 100,000 records:
>  
> 
> select p.opid
> FROM 
> ott_op p
> 
> 
>  
>   It runs in about half a second (running in PgAdmin... the 
> query run time, not the data retrieval time)
>  
>   When we change it by adding fields to the select list, it 
> slows down drastically. This version takes about 3 seconds:
>  
> 
> select p.opid, p.opid, p.opid, p.opid, p.opid, p.opid, 
> p.opid, p.opid, p.opid, p.opid, p.opid
> FROM 
> ott_op p
> 
> 
>  
>   The more fields we add, the slower it gets.
>  
>   My guess is that we are missing a configuration setting... 
> any ideas?
>   Any help much appreciated.
>  
> Thanks,
> -Tom
> 



Re: [PERFORM] VACUUMs take twice as long across all nodes

2006-10-26 Thread Andrew Sullivan
On Thu, Oct 26, 2006 at 09:35:56PM +0100, Gavin Hamill wrote:
> 
> I'm absolutely certain. The backups run from only one slave, given that
> it is a full copy of node 1. Our overnight traffic has not increased
> any, and the nightly backups show that the overall size of the DB has
> not increased more than usual growth.

A couple things from your posts:

1.  Don't do VACUUM FULL, please.  It takes longer, and blocks
other things while it's going on, which might mean you're having
table bloat in various slony-related tables.

2.  Are your slony logs showing increased time too?  Are your
targets getting further behind?

3.  Your backups "from the slave" aren't done with pg_dump,
right?

But I suspect Slony has a role here, too.  I'd look carefully at the
slony tables -- especially the sl_log and pg_listen things, which
both are implicated.
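
A quick size check on those (Slony-I 1.x table names; the sl_* tables live
in the cluster's _<clustername> schema) would be something like:

  SELECT n.nspname, c.relname, c.relpages, c.reltuples
  FROM pg_class c
  JOIN pg_namespace n ON n.oid = c.relnamespace
  WHERE c.relname IN ('sl_log_1', 'sl_log_2', 'pg_listener');

If relpages keeps climbing while reltuples stays modest, that's your bloat.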

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
"The year's penultimate month" is not in truth a good way of saying
November.
--H.W. Fowler



Re: [PERFORM] query slows down drastically with increased number of fields

2006-10-26 Thread Jim C. Nasby
On Thu, Oct 26, 2006 at 03:03:38PM -0700, George Pavlov wrote:
> i have wondered myself. i wouldn't do it through pgAdmin (not sure what
> the best test it, but i thought psql from the same machine might be
> better--see below). anyway, the funny thing is that if you concatenate
> them the time drops:

Sure. Take a look at the output and you'll see there's less data to
shove around.
-- 
Jim Nasby   [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



Re: [PERFORM] query slows down drastically with increased number of fields

2006-10-26 Thread Tom Lane
"Jim C. Nasby" <[EMAIL PROTECTED]> writes:
> On Thu, Oct 26, 2006 at 03:03:38PM -0700, George Pavlov wrote:
>> anyway, the funny thing is that if you concatenate
>> them the time drops:

> Sure. Take a look at the output and you'll see there's less data to
> shove around.

Even more to the point, psql's time to format its standard ASCII-art
output is proportional to the number of columns, because it has to
determine how wide to make each one ... if you used one of the other
display formats such as "expanded" or "unaligned" mode, there'd probably
be less difference.
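
For instance (same foo table as upthread; \a toggles aligned/unaligned,
\x toggles expanded, \timing reports elapsed time):

regression=# \timing
regression=# \o /dev/null
regression=# \a
regression=# select f1,f1,f1,f1,f1,f1,f1,f1,f1,f1,f1 from foo;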

regards, tom lane



Re: [PERFORM] query slows down drastically with increased number of fields

2006-10-26 Thread Tom Darci
Thanks for all the feedback, folks.

Running explain analyze (see below) I get results similar to Tom Lane,
where the 2 queries run at the same speed.
And running in psql (see below) we see the expected speed degradation
for multiple fields, although concatenation is not getting us any
advantage.


--
=== RUNNING EXPLAIN ANALYZE ===
--

ot6_tdarci=# explain analyze select p.opid from ott_op p;
                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Seq Scan on ott_op p  (cost=1.00..12654.44 rows=114344 width=4) (actual time=0.008..260.739 rows=114344 loops=1)
 Total runtime: 472.833 ms

Time: 473.240 ms


ot6_tdarci=# explain analyze select p.opid, p.opid, p.opid, p.opid, p.opid, p.opid, p.opid, p.opid, p.opid, p.opid from ott_op p;
                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Seq Scan on ott_op p  (cost=1.00..12654.44 rows=114344 width=4) (actual time=0.006..260.795 rows=114344 loops=1)
 Total runtime: 472.980 ms

Time: 473.439 ms

--
=== RUNNING THE QUERIES ===
--

ot6_tdarci=# \o /dev/null

ot6_tdarci=# select p.opid from ott_op p;
Time: 157.419 ms

ot6_tdarci=# select p.opid, p.opid, p.opid, p.opid, p.opid, p.opid,
p.opid, p.opid, p.opid, p.opid from ott_op p;
Time: 659.505 ms

ot6_tdarci=# select p.opid || p.opid || p.opid || p.opid || p.opid ||
p.opid || p.opid || p.opid || p.opid || p.opid from ott_op p;
Time: 672.113 ms 

-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED] 
Sent: Thursday, October 26, 2006 2:53 PM
To: Tom Darci
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] query slows down drastically with increased
number of fields

"Tom Darci" <[EMAIL PROTECTED]> writes:
>   It runs in about half a second (running in PgAdmin... the query run 
> time, not the data retrieval time)

I don't have a lot of faith in PgAdmin's ability to distinguish the two.
In fact, for a query such as you have here that's just a bare seqscan,
it's arguably *all* data retrieval time --- the backend will start
emitting records almost instantly.

FWIW, in attempting to duplicate your test I get

regression=# explain analyze select f1 from foo;
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Seq Scan on foo  (cost=0.00..1541.00 rows=100000 width=4) (actual time=0.161..487.192 rows=100000 loops=1)
 Total runtime: 865.454 ms
(2 rows)

regression=# explain analyze select f1,f1,f1,f1,f1,f1,f1,f1,f1,f1,f1 from foo;
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Seq Scan on foo  (cost=0.00..1541.00 rows=100000 width=4) (actual time=0.169..603.795 rows=100000 loops=1)
 Total runtime: 984.124 ms
(2 rows)

Note that this test doesn't perform conversion of the field values to
text form, so it's an underestimate of the total time spent by the
backend for the real query.  But I think almost certainly, your speed
difference is all about having to send more values to the client.
The costs not measured by the explain-analyze scenario would scale darn
near linearly with the number of repetitions of f1.

regards, tom lane





[PERFORM] Index ignored with "is not distinct from", 8.2 beta2

2006-10-26 Thread JEAN-PIERRE PELLETIER

Hi,

I wanted to use "exp1 is not distinct from exp2", which I thought was 
syntactic sugar for

  exp1 is not null and exp2 is not null and exp1 = exp2
  or exp1 is null and exp2 is null

but my index is ignored with "is not distinct from".

Is this the expected behavior?

create temporary table t as select * from generate_series(1,1000000) t(col);
create unique index i on t(col);
analyze t;

-- These queries don't use the index
select count(*) from t where col is not distinct from 123;
select count(*) from t where not col is distinct from 123;

-- This query use the index
select count(*) from t where col is not null and 123 is not null and col = 
123 or col is null and 123 is null;


explain analyze select count(*) from t where col is not distinct from 123;
                                                 QUERY PLAN
-----------------------------------------------------------------------------------------------------------
 Aggregate  (cost=19154.79..19154.80 rows=1 width=0) (actual time=228.200..228.202 rows=1 loops=1)
   ->  Seq Scan on t  (cost=0.00..17904.90 rows=499956 width=0) (actual time=0.042..228.133 rows=1 loops=1)
         Filter: (NOT (col IS DISTINCT FROM 123))
 Total runtime: 228.290 ms
(4 rows)

Time: 219.000 ms

explain analyze select count(*) from t where not col is distinct from 123;
                                                 QUERY PLAN
-----------------------------------------------------------------------------------------------------------
 Aggregate  (cost=19154.79..19154.80 rows=1 width=0) (actual time=235.950..235.952 rows=1 loops=1)
   ->  Seq Scan on t  (cost=0.00..17904.90 rows=499956 width=0) (actual time=0.040..235.909 rows=1 loops=1)
         Filter: (NOT (col IS DISTINCT FROM 123))
 Total runtime: 236.065 ms
(4 rows)

Time: 250.000 ms

explain analyze select count(*) from t where col is not null and 123 is not null and col = 123 or col is null and 123 is null;
                                              QUERY PLAN
-----------------------------------------------------------------------------------------------------
 Aggregate  (cost=8.13..8.14 rows=1 width=0) (actual time=0.267..0.268 rows=1 loops=1)
   ->  Index Scan using i on t  (cost=0.00..8.13 rows=1 width=0) (actual time=0.237..0.241 rows=1 loops=1)
         Index Cond: (col = 123)
 Total runtime: 0.366 ms
(4 rows)

Time: 0.000 ms

I am on Windows XP Service pack 2 with PostgreSQL 8.2 beta2

Thanks,
Jean-Pierre Pelletier
e-djuster


