[PERFORM] VACUUMs take twice as long across all nodes
Hullo, here's one of those dreadful touchy-feely hand-waving problems. Our 5-node 8.1.3 Slony system has just started taking /much/ longer to VACUUM ANALYZE. The data set has not increased more than usual (nightly backups stand at 1.3GB, growing by 10MB per day), and no configuration has changed on the machines.

Nodes 2 and 3 take only the tables necessary to run our search (10 out of the full 130) and are much lighter (only 7GB on disk cf. 30GB for the full master), yet the nightly VACUUM FULL has jumped from 2 hours to 4 in the space of one day! Like I say, no config changes, no reboots / postmaster restarts, no extra processes, and every machine has a comfortable overhead of free page slots + relations.

From a few days ago:

2006-10-20 03:04:29 UTC INFO: "Allocation": found 786856 removable, 4933448 nonremovable row versions in 53461 pages
2006-10-20 03:04:29 UTC DETAIL: 0 dead row versions cannot be removed yet.
2006-10-20 03:07:32 UTC INFO: index "allocation_pkey" now contains 4933448 row versions in 93918 pages
2006-10-20 03:07:32 UTC DETAIL: 786856 index row versions were removed.
2006-10-20 03:14:21 UTC INFO: index "ix_date" now contains 4933448 row versions in 74455 pages
2006-10-20 03:14:21 UTC DETAIL: 786856 index row versions were removed.
2006-10-20 03:22:32 UTC INFO: index "ix_dateprice" now contains 4933448 row versions in 81313 pages
2006-10-20 03:22:32 UTC DETAIL: 786856 index row versions were removed.
2006-10-20 03:24:41 UTC INFO: index "ix_dateroom" now contains 4933448 row versions in 44610 pages
2006-10-20 03:24:41 UTC DETAIL: 786856 index row versions were removed.
2006-10-20 03:27:52 UTC INFO: index "ix_room" now contains 4933448 row versions in 35415 pages
2006-10-20 03:27:52 UTC DETAIL: 786856 index row versions were removed.
2006-10-20 03:31:43 UTC INFO: "Allocation": moved 348324 row versions, truncated 53461 to 46107 pages
2006-10-20 03:31:43 UTC DETAIL: CPU 4.72s/17.63u sec elapsed 230.81 sec.

From last night:

2006-10-26 01:00:30 UTC INFO: vacuuming "public.Allocation"
2006-10-26 01:00:36 UTC INFO: "Allocation": found 774057 removable, 4979938 nonremovable row versions in 53777 pages
2006-10-26 01:00:36 UTC DETAIL: 0 dead row versions cannot be removed yet.
2006-10-26 01:06:18 UTC INFO: index "allocation_pkey" now contains 4979938 row versions in 100800 pages
2006-10-26 01:06:18 UTC DETAIL: 774057 index row versions were removed.
2006-10-26 01:19:22 UTC INFO: index "ix_date" now contains 4979938 row versions in 81630 pages
2006-10-26 01:19:22 UTC DETAIL: 774057 index row versions were removed.
2006-10-26 01:35:17 UTC INFO: index "ix_dateprice" now contains 4979938 row versions in 87750 pages
2006-10-26 01:35:17 UTC DETAIL: 774057 index row versions were removed.
2006-10-26 01:41:27 UTC INFO: index "ix_dateroom" now contains 4979938 row versions in 46320 pages
2006-10-26 01:41:27 UTC DETAIL: 774057 index row versions were removed.
2006-10-26 01:48:18 UTC INFO: index "ix_room" now contains 4979938 row versions in 36513 pages
2006-10-26 01:48:18 UTC DETAIL: 774057 index row versions were removed.
2006-10-26 01:56:35 UTC INFO: "Allocation": moved 322744 row versions, truncated 53777 to 46542 pages
2006-10-26 01:56:35 UTC DETAIL: CPU 4.21s/15.90u sec elapsed 496.30 sec.

As you can see, the amount of system + user time for these runs is comparable, but the amount of real time has more than doubled.
This isn't even a case of needing to tune the cost-based vacuum delay, because I already have vacuum_cost_delay = 0 on all machines to make the vacuum run as quickly as possible.

Any ideas warmly received! :)

Cheers,
Gavin.
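P.S. For anyone wanting to double-check the same thing on their own nodes, something like this shows all the throttling settings in one go:

    SELECT name, setting
    FROM pg_settings
    WHERE name LIKE 'vacuum_cost%';
    -- vacuum_cost_delay reads 0 on every one of our nodes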
Re: [PERFORM] commit so slow program looks frozen
> > > You may try to figure out what's the process doing (the backend
> > > obviously, not the frontend (Tcl) process) by attaching to it with
> > > strace.
> >
> > It's so sad when us poor Windows guys get helpful hints from people
> > who assume that we're smart enough to run *NIX... ;-)
>
> You should try a google search on strace and NT or windows or
> XP... I was surprised how many various implementations of it I found.

Let me know if you find one that's stable, I've been wanting that. I've tried one or two, but it's always been just a matter of time before the inevitable BSOD.

> > > Maybe it's swamped by a context switch storm (but in that case,
> > > probably the other processes would be affected as well).
> >
> > What is a context switch storm? (and what a great name for a heavy
> > metal rock band!)
>
> I can just see the postgresql group getting together at the
> next O'Reilley's conference and creating that band. And it
> will all be your fault.

*DO NOT LET DEVRIM SEE THIS THREAD*

> A context switch storm is when your machine spends more time
> trying to figure out what to do than actually doing anything.
> The CPU spends most of its time switching between programs
> rather than running them.

I can see Windows being more susceptible to this than, say, Linux, because switching between processes there is a lot more expensive than on Linux.

> Seeing as PostgreSQL runs one thread / process per
> connection, it's pretty unlikely that the problem here is one
> "hungry" thread. Do all four CPUs show busy, or just one?
> Do you have a way of measuring how much time is spent waiting
> on I/O on a windows machine like top / vmstat does in unix?

There are plenty of counters in the Performance Monitor. Specifically, look at the "disk queue" counters - they indicate when the I/O subsystem is backed up.

//Magnus
Re: [PERFORM] commit so slow program looks frozen
I seem to remember Oleg/Teodor recently reporting a problem with Windows hanging on a multi-processor machine, during a heavy load operation. In their case it seemed like a vacuum would allow it to wake up.

They did commit a patch that did not make it into the last minor version for lack of testing. Perhaps you could see if that patch might work for you, which would also help ease the argument against the patch's lack of testing.

-rocco
Re: [PERFORM] commit so slow program looks frozen
> I can just see the postgresql group getting together at the next
> O'Reilley's conference and creating that band. And it will all be your
> fault.

Finally, a chance for me to wear my black leather pants.

> A context switch storm is when your machine spends more time trying to
> figure out what to do than actually doing anything. The CPU spends most
> of its time switching between programs rather than running them.

Is that likely on a new 4 CPU server that has no clients connected and that is only running four (admittedly heavy) TCL data load scripts?

> Seeing as PostgreSQL runs one thread / process per connection, it's
> pretty unlikely that the problem here is one "hungry" thread. Do all
> four CPUs show busy, or just one? Do you have a way of measuring how
> much time is spent waiting on I/O on a windows machine like top / vmstat
> does in unix?

Before optimising the queries, all four CPUs were pinned to max performance (that's why I only run four imports at a time). After optimisation, all four CPUs are busy but usage is spikey (which looks more normal) - all are obviously busy, though. I have this feeling that when an import app freezes, one CPU goes idle while the others stay busy - I will confirm that with the next import operation.

I suspect that the server has Xeon processors of the generation that PostgreSQL had a problem with - should a postgresql process be able to distribute its processing load across CPUs? (i.e. when I see one CPU at 100% while all others are idle?)

> Note that if you have an import process that needs a big chunk of
> memory, you can set just that one connection to use a large setting and
> leave the default smaller.

Total memory usage is below the max available. Each postgresql process takes up 500MB, there are four running, and I have 4GB of RAM.

Carlo
Re: [PERFORM] VACUUMs take twice as long across all nodes
Gavin Hamill <[EMAIL PROTECTED]> writes:
> Nodes 2 and 3 take only the tables necessary to run our search (10 out
> of the full 130) and are much lighter (only 7GB on disk cf. 30GB for
> the full master), yet the nightly VACUUM FULL has jumped from 2 hours
> to 4 in the space of one day!

I guess the most useful question to ask is "why are you doing VACUUM FULL?" Plain VACUUM should be considerably faster, and for the level of row turnover shown by your log, there doesn't seem to be a reason to use FULL.

regards, tom lane
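P.S. Spelled out, using the table from your log - the two forms behave quite differently:

    -- Plain VACUUM: marks dead-row space reusable and runs alongside
    -- normal reads and writes; ANALYZE refreshes planner statistics.
    VACUUM ANALYZE "Allocation";

    -- VACUUM FULL: physically compacts the table and shrinks the file,
    -- but holds an exclusive lock for the whole run.
    VACUUM FULL ANALYZE "Allocation";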
Re: [PERFORM] commit so slow program looks frozen
This is pretty interesting - where can I read more on this? Windows isn't actually hanging, one single command line window is - from its behaviour, it looks like the TCL postgresql package is waiting for pg_exec to come back from the commit (I believe the commit has actually gone through).

It could even be that there's something wrong with the TCL package, but from my understanding it is one of the most complete interfaces out there - which is weird, because TCL seems to be the most unpopular language in the community.

Carlo

""Rocco Altier"" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
> I seem to remember Oleg/Teodor recently reporting a problem with Windows
> hanging on a multi-processor machine, during a heavy load operation.
>
> In their case it seemed like a vacuum would allow it to wake up. They
> did commit a patch that did not make it into the last minor version for
> lack of testing.
>
> Perhaps you could see if that patch might work for you, which would also
> help ease the argument against the patch's lack of testing.
>
> -rocco
Re: [PERFORM] commit so slow program looks frozen
On 10/26/06, Carlo Stonebanks <[EMAIL PROTECTED]> wrote:
> This is pretty interesting - where can I read more on this? Windows isn't
> actually hanging, one single command line window is - from its behaviour,
> it looks like the TCL postgresql package is waiting for pg_exec to come
> back from the commit (I believe the commit has actually gone through).
>
> It could even be that there's something wrong with the TCL package, but
> from my understanding it is one of the most complete interfaces out
> there - which is weird, because TCL seems to be the most unpopular
> language in the community.

When it happens, make sure to query pg_locks and see what is going on there. Lock issues are not supposed to manifest on a commit, which releases locks, but you never know.

There have been reports of inconsistent lock-ups on windows (especially multi-processor) which you might be experiencing. Make sure you have the very latest version of pg 8.1.x. Also consider checking out 8.2 and see if you can reproduce the behavior there...this will require compiling postgresql.

merlin
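P.S. Something along these lines (a rough sketch - adjust to taste) will show any ungranted locks together with the query on the backend holding them:

    SELECT l.locktype, l.relation::regclass, l.pid, l.mode, l.granted,
           a.current_query
    FROM pg_locks l
    LEFT JOIN pg_stat_activity a ON a.procpid = l.pid   -- procpid in 8.1
    WHERE NOT l.granted;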
Re: [PERFORM] VACUUMs take twice as long across all nodes
On Thu, 26 Oct 2006 10:47:21 -0400
Tom Lane <[EMAIL PROTECTED]> wrote:

> Gavin Hamill <[EMAIL PROTECTED]> writes:
> > Nodes 2 and 3 take only the tables necessary to run our search (10
> > out of the full 130) and are much lighter (only 7GB on disk cf.
> > 30GB for the full master), yet the nightly VACUUM FULL has jumped
> > from 2 hours to 4 in the space of one day!
>
> I guess the most useful question to ask is "why are you doing VACUUM
> FULL?" Plain VACUUM should be considerably faster, and for the level
> of row turnover shown by your log, there doesn't seem to be a reason
> to use FULL.

I do FULL on the 'light' clients simply because 'I can'. The example posted was a poor choice - the other tables have a larger churn. Anyway, once it starts, the load balancer takes it out of rotation so no love is lost.

The same behaviour is shown on the 'heavy' clients (master + 2 slaves) which take all tables - although I cannot afford to VACUUM FULL on there, the usual VACUUM ANALYZE has begun to take vastly more time since yesterday than in the many previous months we've been using pg.

Cheers,
Gavin.
[PERFORM] Stored procedure slower than sql?
(Repost - did not appear to make it to the list the first time)

I have written a stored procedure for 8.1 that wraps a single (albeit complex) query, and uses 2 IN parameters (BIGINT, INTEGER) in the FROM JOIN and WHERE clauses. The procedure is written in SQL (as opposed to plpgsql - although testing in plpgsql produces the same problem). The results are SETOF a custom type (SMALLINT, NUMERIC(38,2), NUMERIC(38,2)).

The central query, when tested in psql and pgAdmin III, returns in 500 ms. As a stored procedure, it returns in 22000 ms! How can a stored procedure containing a single query not implement the same execution plan (assumption based on the dramatic performance difference) that an identical ad-hoc query generates? I ran a series of tests, and determined that if I replaced the parameters with hard-coded values, the execution time returned to 500 ms. Can anyone shed some light on this for me - it seems counter-intuitive?

Here are some particulars about the underlying query and tables:

The query selects a month number from a generate_series(1,12) left outer joined on a subquery, which produces three fields. The subquery is a UNION ALL of 5 tables. Each of the five tables has 100 inherited partitions. As you can see from the execution plan, the partitioning constraint is successfully restricting the query to the appropriate partition for each of the five tables. The constraint for each partition is a CHAR(2) field "partition_key" = '00' (where '00' is a two-digit CHAR(2) value that is returned from a function call, a la table1.partition_key = partition_key($1)).

Execution Plan:

Sort (cost=10410.15..10410.65 rows=200 width=68) (actual time=273.050..273.071 rows=12 loops=1)
  Sort Key: mm.monthnumber
  -> HashAggregate (cost=10398.01..10402.51 rows=200 width=68) (actual time=272.970..273.001 rows=12 loops=1)
    -> Hash Left Join (cost=10370.01..10390.51 rows=1000 width=68) (actual time=272.817..272.902 rows=13 loops=1)
      Hash Cond: (("outer".monthnumber)::double precision = "inner".monthnumber)
      -> Function Scan on generate_series mm (cost=0.00..12.50 rows=1000 width=4) (actual time=0.018..0.043 rows=12 loops=1)
      -> Hash (cost=10369.99..10369.99 rows=10 width=72) (actual time=272.769..272.769 rows=8 loops=1)
        -> Append (cost=1392.08..10369.89 rows=10 width=47) (actual time=39.581..272.734 rows=8 loops=1)
          -> Subquery Scan "*SELECT* 1" (cost=1392.08..1392.15 rows=2 width=47) (actual time=39.576..39.582 rows=1 loops=1)
            -> HashAggregate (cost=1392.08..1392.13 rows=2 width=47) (actual time=39.571..39.573 rows=1 loops=1)
              -> Result (cost=0.00..1392.05 rows=2 width=47) (actual time=25.240..39.538 rows=1 loops=1)
                -> Append (cost=0.00..1392.03 rows=2 width=47) (actual time=25.224..39.518 rows=1 loops=1)
                  -> Seq Scan on table1 table1 (cost=0.00..14.50 rows=1 width=47) (actual time=0.003..0.003 rows=0 loops=1)
                    Filter: ((partition_key = '12'::bpchar) AND (substr((indexed_field)::text, 2, 1) = '5'::text) AND (table_key = 10265512) AND (date_part('year'::text, (event_date)::timestamp without time zone) = 2005::double precision))
                  -> Seq Scan on table1_p12 table1 (cost=0.00..1377.53 rows=1 width=28) (actual time=25.214..39.503 rows=1 loops=1)
                    Filter: ((partition_key = '12'::bpchar) AND (substr((indexed_field)::text, 2, 1) = '5'::text) AND (table_key = 10265512) AND (date_part('year'::text, (event_date)::timestamp without time zone) = 2005::double precision))
          -> Subquery Scan "*SELECT* 2" (cost=2741.47..2741.56 rows=2 width=47) (actual time=78.140..78.140 rows=0 loops=1)
            -> HashAggregate (cost=2741.47..2741.54 rows=2 width=47) (actual time=78.134..78.134 rows=0 loops=1)
              -> Result (cost=0.00..2741.45 rows=2 width=47) (actual time=78.128..78.128 rows=0 loops=1)
                -> Append (cost=0.00..2741.43 rows=2 width=47) (actual time=78.122..78.122 rows=0 loops=1)
                  -> Seq Scan on table2 table2 (cost=0.00..12.40 rows=1 width=47) (actual time=0.004..0.004 rows=0 loops=1)
                    Filter: ((partition_key = '12'::bpchar) AND (substr((indexed_field)::text, 2, 1) = '5'::text) AND (table_key = 10265512) AND (date_part('year'::text, (event_date)::timestamp without time zone) = 2005::double precision))
                  -> Seq Scan on table2_p12 table2 (cost=0.00..2729.03 row
Re: [PERFORM] Stored procedure slower than sql?
"Matthew Peters" <[EMAIL PROTECTED]> writes: > How can a stored procedure containing a single query not implement the > same execution plan (assumption based on the dramatic performance > difference) that an identical ad-hoc query generates? Parameterized vs non parameterized query? regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Stored procedure slower than sql?
Parameterized. IE (very simplified):

CREATE OR REPLACE FUNCTION my_function(IN param1 BIGINT, IN param2 INTEGER)
RETURNS my_type
SECURITY DEFINER
AS $$
    /* my_type = (a,b,c) */
    Select a, b, c
    FROM my_table
    WHERE indexed_column = $1
    AND partition_constraint_column = $2;
$$ LANGUAGE SQL;

Matthew A. Peters
Sr. Software Engineer, Haydrian Corp.
[EMAIL PROTECTED]
(mobile) 425-941-6566
Haydrian Corp.

-----Original Message-----
From: Tom Lane [mailto:[EMAIL PROTECTED]]
Sent: Thursday, October 26, 2006 9:15 AM
To: Matthew Peters
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Stored procedure slower than sql?
Importance: High

"Matthew Peters" <[EMAIL PROTECTED]> writes:
> How can a stored procedure containing a single query not implement the
> same execution plan (assumption based on the dramatic performance
> difference) that an identical ad-hoc query generates?

Parameterized vs non-parameterized query?

regards, tom lane
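P.S. One workaround I may try (a sketch only, untested): build the statement dynamically in plpgsql with EXECUTE, so it gets planned on each call with the actual values visible and the partition constraints can be recognised:

    CREATE OR REPLACE FUNCTION my_function_dyn(param1 BIGINT, param2 INTEGER)
    RETURNS SETOF my_type
    SECURITY DEFINER AS $$
    DECLARE
        r my_type;
    BEGIN
        -- EXECUTE re-plans the query with the literal values inlined
        FOR r IN EXECUTE
            'SELECT a, b, c FROM my_table'
            || ' WHERE indexed_column = ' || param1
            || ' AND partition_constraint_column = ' || param2
        LOOP
            RETURN NEXT r;
        END LOOP;
        RETURN;
    END;
    $$ LANGUAGE plpgsql;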
Re: [PERFORM] commit so slow program looks frozen
> A context switch storm is when your machine spends more time trying to
> figure out what to do than actually doing anything. The CPU spends most
> of its time switching between programs rather than running them.

Well, we usually use the term "thrashing" as the generic for when your machine is spending more time on overhead than doing user work - this would include paging or context switching, along with whatever else. A context-switch storm would be a specific form of thrashing!

Richard

--
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/
Re: [PERFORM] commit so slow program looks frozen
On Thu, 26 Oct 2006, Carlo Stonebanks wrote:
>
> It could even be that there's something wrong with the TCL package, but
> from my understanding it is one of the most complete interfaces out
> there - which is weird, because TCL seems to be the most unpopular
> language in the community.

Not that this matters much and it's slightly off the topic of performance, but...

...I would have to check my _ancient_ emails for the name of the guy and the dates, but the integration was first done while I was a researcher at Berkeley, at the tail end of the Postgres team's funding. My team used Postgres with TCL internals to implement "the query from hell" inside the server. That was about 1994 or '95, IIRC. At that time, most people who knew both said that they were roughly equivalent, with PERL being _vastly_ less intelligible (to humans), and they hated it. What happened was PERL got exposure that TCL didn't, and people who didn't know better jumped on it.

So, it was one of the most complete interfaces because it was done first, or nearly first, by the original guys that created the original Postgres.

Richard

--
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/
Re: [PERFORM] commit so slow program looks frozen
> when it happens, make sure to query pg_locks and see what is going on
> there. Lock issues are not supposed to manifest on a commit, which
> releases locks, but you never know.

There aren't any pending locks (assuming that pgAdmin is using pg_locks to display pending locks).

> There have been reports of
> inconsistent lock-ups on windows (especially multi-processor) which
> you might be experiencing. Make sure you have the very latest version
> of pg 8.1.x. Also consider checking out 8.2 and see if you can
> reproduce the behavior there...this will require compiling postgresql.

Are these associated with any type of CPU?

Carlo
OT: TCL vs Perl Re: [PERFORM] commit so slow program looks frozen
Perl started out fast - TCL started out slow. Perl used syntax that, although it would drive some people crazy, followed a linguistic curve that Larry Wall claimed was healthy. The English language is crazy, and yet it has become standard worldwide as well. Designed, regular languages like Esperanto have not received much support either.

Perl is designed to be practical. TCL was designed to be minimalistic. Perl uses common idioms for UNIX programmers: // for regular expressions, $VAR for variables. Many of the statements are familiar to C programmers: ++ for increment (compare against 'incr abc' for TCL), $a=5 for assignment (compare against 'set abc 5' in TCL). TCL tries to have a reduced syntax, where 'everything is a string', which requires weirdness for people. For example, newline is end-of-line, so { must be positioned correctly. Code is a string, so in some cases you need to escape code, otherwise not.

Perl has object oriented support built in. It's ugly, but it works. TCL has a questionable '[incr tcl]' package.

Perl has a wealth of modules on CPAN to do almost anything you need to. TCL has the beginning of one (not as rich), but comes built in with things like event loops and graphicals (Tk).

I could go on and on - but I won't, because this is the PostgreSQL mailing list. People either get Perl, or TCL, or they don't. More people 'get' Perl, because it was marketed better, its syntax is deceivingly comparable to other well known languages, and for the longest time it was much faster than TCL to write (especially when using regular expressions) and faster to run.

Did TCL get treated unfairly as a result? It's a language. Who cares! :-)

Cheers,
mark

--
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED]
Neighbourhood Coder - Ottawa, Ontario, Canada
One ring to rule them all, one ring to find them, one ring to bring them all
and in the darkness bind them...
http://mark.mielke.cc/
Re: OT: TCL vs Perl Re: [PERFORM] commit so slow program looks frozen
> Perl has a wealth of modules on CPAN to do almost anything you need to.
> TCL has the beginning of one (not as rich), but comes built in with
> things like event loops and graphicals (Tk).
>
> I could go on and on - but I won't, because this is the PostgreSQL
> mailing list. People either get Perl, or TCL, or they don't. More
> people 'get' Perl, because it was marketed better, its syntax is
> deceivingly comparable to other well known languages, and for the
> longest time it was much faster than TCL to write (especially when
> using regular expressions) and faster to run.
>
> Did TCL get treated unfairly as a result? It's a language. Who cares! :-)

You forgot the god of scripting languages, Python... (Yes, perl is much better at system level scripting than Python).

Sincerely,

Joshua D. Drake

--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
Re: [PERFORM] Configuration Issue ?
On Wed, Oct 25, 2006 at 05:31:29PM -0400, Mark Lonsdale wrote:
> Set my sort_mem to 8192

You really need to look at what your workload is before trying to tweak sort_mem. With 8G of memory, sort_mem = 400000 (~400MB) with only 10 active connections might be a good setting. It's usually better to get a sort to fit into memory than spill to disk. Since you never mentioned what kind of workload you have or how many active connections there are, it's pretty much impossible to make a recommendation on that setting.

--
Jim Nasby                [EMAIL PROTECTED]
EnterpriseDB             http://enterprisedb.com             512.569.9461 (cell)
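P.S. If only a couple of sessions (say, nightly imports) need the big value, it's safer to set it per-connection than globally - something like:

    -- sort_mem is per-sort, per-backend; the value is in KB
    SET sort_mem = 400000;   -- ~400MB for this session's big sorts
    -- ... run the import ...
    RESET sort_mem;          -- back to the server default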
Re: [PERFORM] VACUUMs take twice as long across all nodes
On Thu, Oct 26, 2006 at 04:06:09PM +0100, Gavin Hamill wrote:
> On Thu, 26 Oct 2006 10:47:21 -0400
> Tom Lane <[EMAIL PROTECTED]> wrote:
>
> > Gavin Hamill <[EMAIL PROTECTED]> writes:
> > > Nodes 2 and 3 take only the tables necessary to run our search (10
> > > out of the full 130) and are much lighter (only 7GB on disk cf.
> > > 30GB for the full master), yet the nightly VACUUM FULL has jumped
> > > from 2 hours to 4 in the space of one day!
> >
> > I guess the most useful question to ask is "why are you doing VACUUM
> > FULL?" Plain VACUUM should be considerably faster, and for the level
> > of row turnover shown by your log, there doesn't seem to be a reason
> > to use FULL.
>
> I do FULL on the 'light' clients simply because 'I can'. The example
> posted was a poor choice - the other tables have a larger churn.
>
> Anyway, once it starts, the load balancer takes it out of rotation so
> no love is lost.
>
> The same behaviour is shown on the 'heavy' clients (master + 2 slaves)
> which take all tables - although I cannot afford to VACUUM FULL on
> there, the usual VACUUM ANALYZE has begun to take vastly more time
> since yesterday than in the many previous months we've been using pg.

Are you sure that there's nothing else happening on the machine that could affect the vacuum times? Like, say a backup? Or perhaps updates coming in from Slony that didn't used to be there?

--
Jim Nasby                [EMAIL PROTECTED]
EnterpriseDB             http://enterprisedb.com             512.569.9461 (cell)
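P.S. A cheap way to check is to snapshot activity during the vacuum window - something along these lines (assuming stats_command_string is on):

    SELECT procpid, usename, query_start, current_query
    FROM pg_stat_activity
    WHERE current_query <> '<IDLE>';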
Re: [PERFORM] Stored procedure slower than sql?
The planner has no idea what $1 and $2 are when it plans the query, so that could easily explain why the performance is different. You can prepare statements in psql (at least in 8.1), which would be a good way to verify that theory (compare EXPLAIN for prepared vs. non).

On Thu, Oct 26, 2006 at 09:21:37AM -0700, Matthew Peters wrote:
> Parameterized.
>
> IE (very simplified)
>
> CREATE OR REPLACE FUNCTION my_function(IN param1 BIGINT, IN param2
> INTEGER)
> RETURNS my_type
> SECURITY DEFINER
> AS $$
>     /* my_type = (a,b,c) */
>     Select a, b, c
>     FROM my_table
>     WHERE indexed_column = $1
>     AND partition_constraint_column = $2;
> $$ LANGUAGE SQL;
>
> Matthew A. Peters
> Sr. Software Engineer, Haydrian Corp.
> [EMAIL PROTECTED]
> (mobile) 425-941-6566
> Haydrian Corp.
>
> -----Original Message-----
> From: Tom Lane [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, October 26, 2006 9:15 AM
> To: Matthew Peters
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] Stored procedure slower than sql?
> Importance: High
>
> "Matthew Peters" <[EMAIL PROTECTED]> writes:
> > How can a stored procedure containing a single query not implement the
> > same execution plan (assumption based on the dramatic performance
> > difference) that an identical ad-hoc query generates?
>
> Parameterized vs non-parameterized query?
>
> regards, tom lane

--
Jim Nasby                [EMAIL PROTECTED]
EnterpriseDB             http://enterprisedb.com             512.569.9461 (cell)
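P.S. Concretely, the comparison would look something like this (names and values borrowed from upthread, so adjust to your schema):

    PREPARE sp_test(bigint, integer) AS
        SELECT a, b, c
        FROM my_table
        WHERE indexed_column = $1
          AND partition_constraint_column = $2;

    EXPLAIN ANALYZE EXECUTE sp_test(10265512, 12);

    -- versus the same query with the values inlined:
    EXPLAIN ANALYZE
        SELECT a, b, c
        FROM my_table
        WHERE indexed_column = 10265512
          AND partition_constraint_column = 12;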
Re: [PERFORM] VACUUMs take twice as long across all nodes
On Thu, 26 Oct 2006 14:17:29 -0500
"Jim C. Nasby" <[EMAIL PROTECTED]> wrote:

> Are you sure that there's nothing else happening on the machine that
> could affect the vacuum times? Like, say a backup? Or perhaps updates
> coming in from Slony that didn't used to be there?

I'm absolutely certain. The backups run from only one slave, given that it is a full copy of node 1. Our overnight traffic has not increased any, and the nightly backups show that the overall size of the DB has not increased more than usual growth. Plus, I have fairly verbose logging, and it's not showing anything out of the ordinary.

Like I said, it's one of those awful hypothesis/hand-waving problems :)

Cheers,
Gavin.
[PERFORM] query slows down drastically with increased number of fields
Hello All-

We have a question about numbers of fields in the select clause of a query and how that affects query speed. The following query simply selects the primary key field from a table with 100,000 records:

    select p.opid
    FROM ott_op p

It runs in about half a second (running in PgAdmin... the query run time, not the data retrieval time).

When we change it by adding fields to the select list, it slows down drastically. This version takes about 3 seconds:

    select p.opid, p.opid, p.opid, p.opid, p.opid, p.opid, p.opid,
           p.opid, p.opid, p.opid, p.opid
    FROM ott_op p

The more fields we add, the slower it gets. My guess is that we are missing a configuration setting... any ideas? Any help much appreciated.

Thanks,
-Tom
Re: [PERFORM] query slows down drastically with increased number of fields
"Tom Darci" <[EMAIL PROTECTED]> writes: > It runs in about half a second (running in PgAdmin... the query run > time, not the data retrieval time) I don't have a lot of faith in PgAdmin's ability to distinguish the two. In fact, for a query such as you have here that's just a bare seqscan, it's arguably *all* data retrieval time --- the backend will start emitting records almost instantly. FWIW, in attempting to duplicate your test I get regression=# explain analyze select f1 from foo; QUERY PLAN Seq Scan on foo (cost=0.00..1541.00 rows=10 width=4) (actual time=0.161..487.192 rows=10 loops=1) Total runtime: 865.454 ms (2 rows) regression=# explain analyze select f1,f1,f1,f1,f1,f1,f1,f1,f1,f1,f1 from foo; QUERY PLAN Seq Scan on foo (cost=0.00..1541.00 rows=10 width=4) (actual time=0.169..603.795 rows=10 loops=1) Total runtime: 984.124 ms (2 rows) Note that this test doesn't perform conversion of the field values to text form, so it's an underestimate of the total time spent by the backend for the real query. But I think almost certainly, your speed difference is all about having to send more values to the client. The costs not measured by the explain-analyze scenario would scale darn near linearly with the number of repetitions of f1. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] VACUUMs take twice as long across all nodes
On Thu, Oct 26, 2006 at 09:35:56PM +0100, Gavin Hamill wrote:
> On Thu, 26 Oct 2006 14:17:29 -0500
> "Jim C. Nasby" <[EMAIL PROTECTED]> wrote:
>
> > Are you sure that there's nothing else happening on the machine that
> > could affect the vacuum times? Like, say a backup? Or perhaps updates
> > coming in from Slony that didn't used to be there?
>
> I'm absolutely certain. The backups run from only one slave, given that
> it is a full copy of node 1. Our overnight traffic has not increased
> any, and the nightly backups show that the overall size of the DB has
> not increased more than usual growth.
>
> Plus, I have fairly verbose logging, and it's not showing anything out
> of the ordinary.
>
> Like I said, it's one of those awful hypothesis/hand-waving problems :)

Well, the fact that it's happening on all your nodes leads me to think Slony is somehow involved. Perhaps it suddenly decided to change how often it's issuing syncs? I know it issues vacuums as well, so maybe that's got something to do with it... (though I'm guessing you've already looked in pg_stat_activity/logs to see if anything correlates...)

Still, it might be worth asking about this on the slony list...

--
Jim Nasby                [EMAIL PROTECTED]
EnterpriseDB             http://enterprisedb.com             512.569.9461 (cell)
Re: [PERFORM] query slows down drastically with increased number of fields
i have wondered myself. i wouldn't do it through pgAdmin (not sure what the best test is, but i thought psql from the same machine might be better - see below). anyway, the funny thing is that if you concatenate them the time drops:

~% time psql -dXXX -hYYY -UZZZ -c"select consumer_id from consumer" -o /dev/null
0.09s user 0.01s system 29% cpu 0.341 total

~% time psql -dXXX -hYYY -UZZZ -c"select consumer_id,consumer_id,consumer_id,consumer_id,consumer_id,consumer_id,consumer_id,consumer_id from consumer" -o /dev/null
0.76s user 0.06s system 45% cpu 1.796 total

~% time psql -dXXX -hYYY -UZZZ -c"select consumer_id||consumer_id||consumer_id||consumer_id||consumer_id||consumer_id||consumer_id||consumer_id from consumer" -o /dev/null
0.18s user 0.04s system 20% cpu 1.061 total

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED]] On Behalf Of Tom Darci
> Sent: Wednesday, October 25, 2006 10:21 AM
> To: pgsql-performance@postgresql.org
> Subject: [PERFORM] query slows down drastically with
> increased number of fields
>
> Hello All-
>
> We have a question about numbers of fields in the select
> clause of a query and how that affects query speed.
> The following query simply selects the primary key field
> from a table with 100,000 records:
>
>     select p.opid
>     FROM ott_op p
>
> It runs in about half a second (running in PgAdmin... the
> query run time, not the data retrieval time)
>
> When we change it by adding fields to the select list, it
> slows down drastically. This version takes about 3 seconds:
>
>     select p.opid, p.opid, p.opid, p.opid, p.opid, p.opid,
>     p.opid, p.opid, p.opid, p.opid, p.opid
>     FROM ott_op p
>
> The more fields we add, the slower it gets.
>
> My guess is that we are missing a configuration setting...
> any ideas? Any help much appreciated.
>
> Thanks,
> -Tom
Re: [PERFORM] VACUUMs take twice as long across all nodes
On Thu, Oct 26, 2006 at 09:35:56PM +0100, Gavin Hamill wrote:
>
> I'm absolutely certain. The backups run from only one slave, given that
> it is a full copy of node 1. Our overnight traffic has not increased
> any, and the nightly backups show that the overall size of the DB has
> not increased more than usual growth.

A couple things from your posts:

1. Don't do VACUUM FULL, please. It takes longer, and blocks other things while it's going on, which might mean you're having table bloat in various slony-related tables.

2. Are your slony logs showing increased time too? Are your targets getting further behind?

3. Your backups "from the slave" aren't done with pg_dump, right?

But I suspect Slony has a role here, too. I'd look carefully at the slony tables -- especially the sl_log and pg_listen things, which both are implicated.

A

--
Andrew Sullivan | [EMAIL PROTECTED]
"The year's penultimate month" is not in truth a good way of saying
November. --H.W. Fowler
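P.S. A rough way to eyeball bloat in those tables (sl_log naming varies by Slony version, and pg_listener is the catalog behind LISTEN/NOTIFY):

    SELECT relname, relpages, reltuples
    FROM pg_class
    WHERE relname LIKE 'sl_log%' OR relname = 'pg_listener';
    -- relpages staying high while reltuples drops suggests bloat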
Re: [PERFORM] query slows down drastically with increased number of fields
On Thu, Oct 26, 2006 at 03:03:38PM -0700, George Pavlov wrote:
> i have wondered myself. i wouldn't do it through pgAdmin (not sure what
> the best test is, but i thought psql from the same machine might be
> better - see below). anyway, the funny thing is that if you concatenate
> them the time drops:

Sure. Take a look at the output and you'll see there's less data to shove around.

--
Jim Nasby                [EMAIL PROTECTED]
EnterpriseDB             http://enterprisedb.com             512.569.9461 (cell)
Re: [PERFORM] query slows down drastically with increased number of fields
"Jim C. Nasby" <[EMAIL PROTECTED]> writes: > On Thu, Oct 26, 2006 at 03:03:38PM -0700, George Pavlov wrote: >> anyway, the funny thing is that if you concatenate >> them the time drops: > Sure. Take a look at the output and you'll see there's less data to > shove around. Even more to the point, psql's time to format its standard ASCII-art output is proportional to the number of columns, because it has to determine how wide to make each one ... if you used one of the other display formats such as "expanded" or "unaligned" mode, there's probably be less difference. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] query slows down drastically with increased number of fields
Thanks for all the feedback, folks. Running explain analyze (see below) I get results similar to Tom Lane, where the 2 queries run at the same speed. And running in psql (see below) we see the expected speed degradation for multiple fields, although concatenation is not getting us any advantage.

-- === RUNNING EXPLAIN ANALYZE === --

ot6_tdarci=# explain analyze select p.opid from ott_op p;
                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Seq Scan on ott_op p (cost=1.00..12654.44 rows=114344 width=4) (actual time=0.008..260.739 rows=114344 loops=1)
 Total runtime: 472.833 ms
Time: 473.240 ms

ot6_tdarci=# explain analyze select p.opid, p.opid, p.opid, p.opid, p.opid, p.opid, p.opid, p.opid, p.opid, p.opid from ott_op p;
                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Seq Scan on ott_op p (cost=1.00..12654.44 rows=114344 width=4) (actual time=0.006..260.795 rows=114344 loops=1)
 Total runtime: 472.980 ms
Time: 473.439 ms

-- === RUNNING THE QUERIES === --

ot6_tdarci=# \o /dev/null
ot6_tdarci=# select p.opid from ott_op p;
Time: 157.419 ms
ot6_tdarci=# select p.opid, p.opid, p.opid, p.opid, p.opid, p.opid, p.opid, p.opid, p.opid, p.opid from ott_op p;
Time: 659.505 ms
ot6_tdarci=# select p.opid || p.opid || p.opid || p.opid || p.opid || p.opid || p.opid || p.opid || p.opid || p.opid from ott_op p;
Time: 672.113 ms

-----Original Message-----
From: Tom Lane [mailto:[EMAIL PROTECTED]]
Sent: Thursday, October 26, 2006 2:53 PM
To: Tom Darci
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] query slows down drastically with increased number of fields

"Tom Darci" <[EMAIL PROTECTED]> writes:
> It runs in about half a second (running in PgAdmin... the query run
> time, not the data retrieval time)

I don't have a lot of faith in PgAdmin's ability to distinguish the two. In fact, for a query such as you have here that's just a bare seqscan, it's arguably *all* data retrieval time --- the backend will start emitting records almost instantly.

FWIW, in attempting to duplicate your test I get

regression=# explain analyze select f1 from foo;
                                                   QUERY PLAN
----------------------------------------------------------------------------------------------------------------
 Seq Scan on foo (cost=0.00..1541.00 rows=100000 width=4) (actual time=0.161..487.192 rows=100000 loops=1)
 Total runtime: 865.454 ms
(2 rows)

regression=# explain analyze select f1,f1,f1,f1,f1,f1,f1,f1,f1,f1,f1 from foo;
                                                   QUERY PLAN
----------------------------------------------------------------------------------------------------------------
 Seq Scan on foo (cost=0.00..1541.00 rows=100000 width=4) (actual time=0.169..603.795 rows=100000 loops=1)
 Total runtime: 984.124 ms
(2 rows)

Note that this test doesn't perform conversion of the field values to text form, so it's an underestimate of the total time spent by the backend for the real query. But I think almost certainly, your speed difference is all about having to send more values to the client. The costs not measured by the explain-analyze scenario would scale darn near linearly with the number of repetitions of f1.

regards, tom lane
[PERFORM] Index ignored with "is not distinct from", 8.2 beta2
Hi,

I wanted to use "exp1 is not distinct from exp2", which I thought was syntactic sugar for

    exp1 is not null and exp2 is not null and exp1 = exp2
    or exp1 is null and exp2 is null

but my index is ignored with "is not distinct from". Is this the expected behavior?

create temporary table t as select * from generate_series(1,1000000) t(col);
create unique index i on t(col);
analyze t;

-- These queries don't use the index
select count(*) from t where col is not distinct from 123;
select count(*) from t where not col is distinct from 123;

-- This query uses the index
select count(*) from t
where col is not null and 123 is not null and col = 123
   or col is null and 123 is null;

explain analyze select count(*) from t where col is not distinct from 123;
                                               QUERY PLAN
--------------------------------------------------------------------------------------------------------
 Aggregate (cost=19154.79..19154.80 rows=1 width=0) (actual time=228.200..228.202 rows=1 loops=1)
   -> Seq Scan on t (cost=0.00..17904.90 rows=499956 width=0) (actual time=0.042..228.133 rows=1 loops=1)
      Filter: (NOT (col IS DISTINCT FROM 123))
 Total runtime: 228.290 ms
(4 rows)
Time: 219.000 ms

explain analyze select count(*) from t where not col is distinct from 123;
                                               QUERY PLAN
--------------------------------------------------------------------------------------------------------
 Aggregate (cost=19154.79..19154.80 rows=1 width=0) (actual time=235.950..235.952 rows=1 loops=1)
   -> Seq Scan on t (cost=0.00..17904.90 rows=499956 width=0) (actual time=0.040..235.909 rows=1 loops=1)
      Filter: (NOT (col IS DISTINCT FROM 123))
 Total runtime: 236.065 ms
(4 rows)
Time: 250.000 ms

explain analyze select count(*) from t where col is not null and 123 is not null and col = 123 or col is null and 123 is null;
                                               QUERY PLAN
--------------------------------------------------------------------------------------------------------
 Aggregate (cost=8.13..8.14 rows=1 width=0) (actual time=0.267..0.268 rows=1 loops=1)
   -> Index Scan using i on t (cost=0.00..8.13 rows=1 width=0) (actual time=0.237..0.241 rows=1 loops=1)
      Index Cond: (col = 123)
 Total runtime: 0.366 ms
(4 rows)
Time: 0.000 ms

I am on Windows XP Service Pack 2 with PostgreSQL 8.2 beta2.

Thanks,
Jean-Pierre Pelletier
e-djuster
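P.S. The workaround I'm using for now: since the right-hand value here is a non-null constant, the null-matching arm can never fire, so plain equality is equivalent and can use the unique index:

    -- same result as "col is not distinct from 123" when the constant
    -- is known non-null, and the planner happily uses index i:
    select count(*) from t where col = 123;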