Re: [PERFORM] Tuning for warm standby
On 9/27/07, Kevin Kempter [EMAIL PROTECTED] wrote:
> Hi All; I'm preparing to fire up WAL archiving on 8 production servers. We will follow up with implementing a warm standby scenario. Does anyone have any thoughts on how to maximize performance, yet minimize the potential for data loss, assuming we were not able to retrieve the final un-archived WAL segment from the original pg_xlog dir in the case of a crash?

The standby mechanism is actually very simple and there is very little to do for efficient operation. All the hard work is done inside the WAL algorithms, and from the outside it works like a fancy rsync. Some performance tips:

* Don't use an encrypted channel (scp) to transfer WAL segments from primary to secondary.
* Make sure the link between the servers is gigabit at least. Bonded ethernet couldn't hurt if you can easily fit it into your topology.
* Do not write WAL segments directly (NFS, CIFS) to the remote folder. Whatever you use, make sure it puts files into the remote folder atomically, unless it is specifically designed to handle WAL segments.
* There's not much to do on the standby side.

I've set up a few warm standby systems with pg_standby... it works great. I find it works best using link mode (-l) and keeping at least 256 WAL files before it prunes.

'archive_timeout' is a way to guarantee your last transferred file is no older than 'x' seconds. I am not a big fan of setting this... most of the servers I work with are fairly busy and I'd prefer to let the server decide when to flip files. I would only consider setting it on a server that had very little writing going on, but where what did get written was important.

There is a new player in warm standby systems (developed by Skype!): http://pgfoundry.org/projects/skytools/ I haven't looked at it yet, but supposedly it can stream WAL data in real time. Definitely worth looking into. This would moot some of the other advice I've given here.

merlin

---(end of broadcast)---
TIP 4: Have you searched our list archives?
http://archives.postgresql.org
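Put together, the advice above might look like this; hostnames, paths, and the rsync module name are placeholders, not anything from the original post:

```ini
# primary's postgresql.conf: ship segments with unencrypted rsync
# (rsync daemon mode, not rsync-over-ssh, per the "no scp" tip above)
archive_command = 'rsync %p standby::wal_archive/%f'

# standby's recovery.conf: restore with pg_standby in link mode (-l),
# keeping 256 segments around before pruning (-k), with a trigger file
# to end recovery and bring the standby up
restore_command = 'pg_standby -l -k 256 -t /tmp/pg_standby.trigger /var/lib/pgsql/wal_archive %f %p'
```

The atomicity tip matters here too: rsync writes to a temporary name and renames into place, which is why it is safe where a plain NFS/CIFS copy is not.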
[PERFORM] Postgres 7.4.2 hanging when vacuum full is run
Hi -

This has been happening more recently. Our database hangs after a VACUUM and is unresponsive when we come in the next morning. The vacuum job runs at 03:00 am daily. The command is:

/usr/local/pgsql/bin/vacuumdb --full -d DbName

Also, what exactly does VACUUM waiting mean? Is there a reason why it is never emerging from the VACUUM job? I understand that doing a vacuumdb --full causes the tables to lock (not really sure about the workings of vacuum). Any light on this would be really appreciated.

Thanks,
Radhika

Below is what ps -ef | grep postgres shows:

 5530 ?  S   0:13 /usr/local/pgsql/bin/postmaster -i
 5534 ?  S   0:01 postgres: stats buffer process
 5535 ?  S   0:04 postgres: stats collector process
 5621 ?  S   0:53 postgres: slony myDB 10.142.20.50 idle
 5626 ?  S   0:51 postgres: slony myDB 10.142.20.50 idle
 5627 ?  S   0:34 postgres: slony myDB 10.142.20.50 idle
 5628 ?  S   5:40 postgres: slony myDB 10.142.20.50 idle
 5637 ?  S   2:09 postgres: slony myDB 10.132.20.26 idle
 5638 ?  S   1:56 postgres: slony myDB 10.132.20.26 idle
 5745 ?  S  42:08 postgres: abc myDB [local] idle
20774 ?  S   4:29 postgres: abc myDB [local] idle
20775 ?  S   0:00 postgres: abc myDB [local] idle in transaction
20776 ?  S   0:00 postgres: abc myDB [local] idle
17509 ?  S   0:06 postgres: abc myDB [local] VACUUM waiting
24656 ?  S   0:00 postgres: abc myDB [local] INSERT waiting
30489 ?  S   0:00 postgres: abc myDB [local] SELECT waiting
30637 ?  S   0:00 postgres: abc myDB [local] UPDATE waiting
30647 ?  S   0:00 postgres: abc myDB [local] UPDATE waiting
30668 ?  S   0:00 postgres: abc myDB [local] UPDATE waiting

-- It is all a matter of perspective. You choose your view by choosing where to stand. --Larry Wall
Re: [PERFORM] Postgres 7.4.2 hanging when vacuum full is run
On top of what Vivek said, you need to update your pg install. 7.4.2 had a few data-eating bugs if I remember correctly. The 7.4 branch is up to 7.4.18, and that's a lot of bug fixes (2+ years) you're missing. If one of those bugs eats your data, don't expect any sympathy.
Re: [PERFORM] sequence query performance issues
Richard Huxton [EMAIL PROTECTED] writes:
> Hmm - why is it doing that?

I'm betting that the OP's people.uid column is not an integer. Existing PG releases can't use hashed subplans for cross-data-type comparisons (8.3 will be a bit smarter).

regards, tom lane
Re: [PERFORM] Postgres 7.4.2 hanging when vacuum full is run
On Sep 28, 2007, at 10:28 AM, Radhika S wrote:
> 20775 ?  S  0:00 postgres: abc myDB [local] idle in transaction
> 20776 ?  S  0:00 postgres: abc myDB [local] idle
> 17509 ?  S  0:06 postgres: abc myDB [local] VACUUM waiting
> 24656 ?  S  0:00 postgres: abc myDB [local] INSERT waiting

Your vacuum is probably waiting for the idle in transaction session to finish, so it can clean up: it can't take its exclusive lock while that open transaction still holds locks. Your other tasks are probably waiting behind the vacuum. Don't leave your transactions open for a long time. It is bad.
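A minimal sketch of how to confirm this from the system catalogs; it assumes the pre-9.2 column names (procpid, current_query) that match the era of this thread, and that stats_command_string is enabled so current_query is populated:

```sql
-- Show every backend waiting on a lock, alongside what it is running.
-- Ungranted entries in pg_locks are the waiters; find which pid holds a
-- granted lock on the same relation to identify the blocker (here, the
-- "idle in transaction" session).
SELECT l.pid, l.relation, l.mode, l.granted, a.current_query
FROM pg_locks l
JOIN pg_stat_activity a ON a.procpid = l.pid
ORDER BY l.granted, l.pid;
```

Killing (politely, with pg_ctl kill TERM or by fixing the client application) the idle-in-transaction session should let the VACUUM proceed.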
Re: [PERFORM] Searching for the cause of a bad plan
On Thu, 2007-09-27 at 11:07 -0700, Ron Mayer wrote:
> Csaba Nagy wrote:
>> Well, my problem was actually solved by raising the statistics target,
> Would it do more benefit than harm if postgres increased the default_statistics_target? I see a fair number of people (myself included) asking questions whose resolution was to ALTER TABLE SET STATISTICS; and I think relatively fewer (if any?) people concerned about space in pg_statistic or people improving analyze time by reducing the statistics target.

Well, the cost of raising the statistics target is far from zero: with all defaults the analyze time was ~10 seconds; with one column set to 100 it was ~1.5 minutes; with one column set to 1000 it was 15 minutes for the table in question (a few hundred million rows). Of course the I/O load must have been proportional to the timings... so I'm pretty sure the current default is serving most situations well.

Cheers, Csaba.
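For readers landing here from a search, this is the knob being discussed; table and column names are illustrative:

```sql
-- Raise the sample size for one problematic column only, then re-analyze.
-- This is the targeted fix Ron and Csaba refer to, and it confines the
-- extra ANALYZE cost to the column that actually needs it.
ALTER TABLE big_table ALTER COLUMN skewed_col SET STATISTICS 100;
ANALYZE big_table;

-- The blunt alternative (per session or in postgresql.conf), whose cost
-- Csaba's timings above warn about:
SET default_statistics_target = 100;
```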
Re: [PERFORM] sequence query performance issues
Peter Koczan wrote:
> Hello, I have a weird performance issue with a query I'm testing. Basically, I'm trying to port a function that generates user uids, and since postgres offers a sequence generator function, I figured I'd take advantage of that. Basically, I generate our uid range, filter out those which are in use, and randomly pick however many I need. However, when I run it it takes forever (10 minutes and I get nothing, so I cancelled the query) and cpu usage on the server is maxed out.

I'd suspect either an unconstrained join or looping through seq-scans.

> Here's my query (I'll post the explain output later so as not to obscure my question):
> select a.uid from generate_series(1000, 32767) as a(uid) where a.uid not in (select uid from people) order by random() limit 1;

I let this run to its conclusion and it's the materialize. If you look, it's materializing the result-set once for every value it tests against (loops=31768):

QUERY PLAN
--------------------------------------------------------------
Limit (cost=62722.66..62722.67 rows=1 width=4) (actual time=189963.485..189963.485 rows=0 loops=1)
  -> Sort (cost=62722.66..62723.91 rows=500 width=4) (actual time=189961.063..189961.063 rows=0 loops=1)
        Sort Key: random()
        -> Function Scan on generate_series a (cost=184.00..62700.25 rows=500 width=4) (actual time=189960.797..189960.797 rows=0 loops=1)
              Filter: (NOT (subplan))
              SubPlan
                -> Materialize (cost=184.00..284.00 rows=1 width=2) (actual time=0.000..2.406 rows=9372 loops=31768)
                      -> Seq Scan on people (cost=0.00..174.00 rows=1 width=2) (actual time=0.055..7.181 rows=1 loops=1)
Total runtime: 189967.150 ms

Hmm - why is it doing that? It's clearly confused about something. I suspect the root of the problem is that it doesn't know what generate_series() will return. To the planner it's just another set-returning function. This means it's getting (i) the # of rows wrong (rows=500), and it also doesn't know (ii) that there will be no nulls or (iii) what the range of values returned will be.
Easy enough to test:

CREATE TEMP TABLE all_uids (uid int2);
INSERT INTO all_uids SELECT generate_series(1000, 32767);
ANALYSE all_uids;

EXPLAIN ANALYSE
SELECT a.uid FROM all_uids a
WHERE a.uid NOT IN (SELECT uid FROM people)
ORDER BY random() LIMIT 1;

QUERY PLAN
--------------------------------------------------------------
Limit (cost=1884.14..1884.14 rows=1 width=2) (actual time=39.019..39.019 rows=0 loops=1)
  -> Sort (cost=1884.14..1923.85 rows=15884 width=2) (actual time=39.014..39.014 rows=0 loops=1)
        Sort Key: random()
        -> Seq Scan on all_uids a (cost=199.00..775.81 rows=15884 width=2) (actual time=38.959..38.959 rows=0 loops=1)
              Filter: (NOT (hashed subplan))
              SubPlan
                -> Seq Scan on people (cost=0.00..174.00 rows=1 width=2) (actual time=0.046..7.282 rows=1 loops=1)
Total runtime: 39.284 ms

That's more sensible. I'd actually use a table to track unused_uids and have triggers that keep everything in step. However, if you didn't want to do that, I'd try a left join:

EXPLAIN ANALYSE
SELECT a.uid
FROM generate_series(1000, 32767) AS a(uid)
LEFT JOIN people p ON a.uid = p.uid
WHERE p.uid IS NULL
ORDER BY random() LIMIT 1;

Not ideal, but like I say, I'd use an unused_uids table. If nothing else, I'd be wary about immediately re-using a uid - your db+application might cope fine, but these values have a tendency to be referred to elsewhere.

HTH
-- Richard Huxton Archonet Ltd
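The unused_uids approach Richard mentions could be sketched roughly like this; it is untested, and the table, function, and trigger names are illustrative (this also assumes people.uid is int2, per the plans above):

```sql
-- Seed the free list once from the full range minus what's in use.
CREATE TABLE unused_uids (uid int2 PRIMARY KEY);
INSERT INTO unused_uids
    SELECT generate_series(1000, 32767)
    EXCEPT SELECT uid FROM people;

-- Keep it in step with people via a trigger.
CREATE OR REPLACE FUNCTION track_uids() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        DELETE FROM unused_uids WHERE uid = NEW.uid;
    ELSIF TG_OP = 'DELETE' THEN
        INSERT INTO unused_uids VALUES (OLD.uid);
    END IF;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER people_uid_track
    AFTER INSERT OR DELETE ON people
    FOR EACH ROW EXECUTE PROCEDURE track_uids();

-- Picking a free uid is then a cheap scan of a small table:
SELECT uid FROM unused_uids ORDER BY random() LIMIT 1;
```

Re-inserting OLD.uid on DELETE is exactly the immediate-reuse behaviour Richard warns about; delaying that re-insert (or skipping the DELETE branch) would give uids a quarantine period.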
Re: [PERFORM] sequence query performance issues
Tom Lane wrote:
> Richard Huxton [EMAIL PROTECTED] writes:
>> Hmm - why is it doing that?
> I'm betting that the OP's people.uid column is not an integer. Existing PG releases can't use hashed subplans for cross-data-type comparisons (8.3 will be a bit smarter).

Looked like an int2 to me (width=2, max value ~32k)

-- Richard Huxton Archonet Ltd
Re: [PERFORM] sequence query performance issues
>> Hmm - why is it doing that?
> I'm betting that the OP's people.uid column is not an integer. Existing PG releases can't use hashed subplans for cross-data-type comparisons (8.3 will be a bit smarter).

*light bulb* Ahhh, that's it. So, I guess the solution is either to cast the column or wait for 8.3 (which isn't a problem since the port won't be done until 8.3 is released anyway).

Thanks again.
Peter
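The cast Peter settles on could look like this; this assumes people.uid is int2 (as Richard inferred from width=2), so that generate_series()'s int4 output is the mismatched side:

```sql
-- Cast the generated values down to int2 so both sides of the NOT IN
-- are the same type; pre-8.3 planners can then use a hashed subplan
-- instead of re-materializing the subquery per row.
SELECT a.uid
FROM generate_series(1000, 32767) AS a(uid)
WHERE a.uid::int2 NOT IN (SELECT uid FROM people)
ORDER BY random() LIMIT 1;
```

The cast is safe here only because the generated range stays within int2 (max 32767); casting the other way (people.uid to int4 inside the subquery) avoids that concern entirely.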
Re: [PERFORM] Searching for the cause of a bad plan
> Just an idea, but with the 8.3 concurrent scan support, would it be possible to hang a more in-depth analyze off existing sequential scans? Then it would be a lower cost to have higher resolution in the statistics, because the I/O component would be hidden.

The biggest problem with that is that it wouldn't be deterministic... the table in question from my original post is never scanned sequentially in normal operation. The other way around is also possible: when sequential scans are too frequent, you wouldn't want to analyze all the time either. So there would be a need for logic about when to analyze with a sequential scan and when to do it proactively without waiting for one... and I'm not sure it would be worth the complexity.

I think it would be much more productive if some long-running-query tracking, combined with a background planner thread, did targeted analyzes for specific correlations/distributions/conditions based on what queries are actually running on the system.

Cheers, Csaba.
[PERFORM] OOM Errors as a result of table inheritance and a bad plan(?)
In keeping with some of the recent threads regarding the planner... I have a fair-sized data warehouse in which I am trying to perform an aggregation, but I'm getting OOM errors in Postgres (8.2.4). I believe the reason for the OOM is that Postgres is attempting to do a hash aggregation, but it has grossly underestimated the rows resulting from the aggregation. The data in the database is very uniformly distributed, so I don't believe that the table stats are the cause of the problem. This may be related to table inheritance, and can be demonstrated pretty easily:

CREATE TABLE foo(a INT);
ANALYZE foo;
CREATE TABLE foo_1() INHERITS(foo);
insert into foo_1 select generate_series(1,10);
insert into foo_1 select generate_series(1,10);
insert into foo_1 select generate_series(1,10);
ANALYZE foo_1;
CREATE TABLE foo_2() INHERITS(foo);
insert into foo_2 select generate_series(1,10);
insert into foo_2 select generate_series(1,10);
insert into foo_2 select generate_series(1,10);
ANALYZE foo_2;

-- If I query a particular partition, the plan estimate for the hash aggregate is good:

EXPLAIN ANALYZE SELECT a, COUNT(*) from foo_1 group by a;
HashAggregate (cost=5822.00..7061.01 rows=99121 width=4) (actual time=554.556..657.121 rows=10 loops=1)
  -> Seq Scan on foo_1 (cost=0.00..4322.00 rows=30 width=4) (actual time=0.014..203.290 rows=30 loops=1)
Total runtime: 712.211 ms

-- If I query the base table, the plan estimate for the hash aggregate is off by several orders of magnitude:

EXPLAIN ANALYZE SELECT a, COUNT(*) from foo group by a;
HashAggregate (cost=11686.10..11688.60 rows=200 width=4) (actual time=1724.188..1826.630 rows=10 loops=1)
  -> Append (cost=0.00..8675.40 rows=602140 width=4) (actual time=0.016..1045.134 rows=60 loops=1)
        -> Seq Scan on foo (cost=0.00..31.40 rows=2140 width=4) (actual time=0.001..0.001 rows=0 loops=1)
        -> Seq Scan on foo_1 foo (cost=0.00..4322.00 rows=30 width=4) (actual time=0.012..205.130 rows=30 loops=1)
        -> Seq Scan on foo_2 foo (cost=0.00..4322.00 rows=30 width=4) (actual time=0.011..203.542 rows=30 loops=1)
Total runtime: 1879.550 ms
(6 rows)

-- Is there something magical about the hash aggregate estimate of 200 rows? I can have 30,000 or 300,000 rows in each child partition table, and multiple partitions with different values of a, and yet it always comes up with 200. For example:

create table foo_3() inherits(foo);
insert into foo_3 select generate_series(10,30);
analyze foo_3;
EXPLAIN ANALYZE SELECT a, COUNT(*) from foo group by a;

QUERY PLAN
--------------------------------------------------------------
HashAggregate (cost=15568.12..15570.62 rows=200 width=4) (actual time=2386.813..2691.254 rows=30 loops=1)
  -> Append (cost=0.00..11557.41 rows=802141 width=4) (actual time=0.016..1403.121 rows=81 loops=1)
        -> Seq Scan on foo (cost=0.00..31.40 rows=2140 width=4) (actual time=0.002..0.002 rows=0 loops=1)
        -> Seq Scan on foo_1 foo (cost=0.00..4322.00 rows=30 width=4) (actual time=0.013..201.549 rows=30 loops=1)
        -> Seq Scan on foo_2 foo (cost=0.00..4322.00 rows=30 width=4) (actual time=0.010..211.332 rows=30 loops=1)
        -> Seq Scan on foo_3 foo (cost=0.00..2882.01 rows=21 width=4) (actual time=0.011..137.262 rows=21 loops=1)
Total runtime: 2851.990 ms

Is this a bug, or some subtlety of the Postgres query planner? In my particular case, I am doing a join out to another small table as part of the aggregation and using constraint exclusion on the partitions, but I believe the cause of my problem is the same. I am running on 64-bit FreeBSD/Postgres 8.2.4 on a machine with 8GB of memory.
An explain of the query and the resulting OOM diagnostics follow. The aggregation will result in 5,000,000 rows, not 5,000. In the stats_dtl table there are 12 observations (hourly) for each customer. There are 125 different policy_ids and 25 different policy_group_ids. Policies and policy groups are evenly distributed across all customers.

userquery Scan table_4760 (cost=2897243.60..2897418.60 rows=5000 width=152)
  -> HashAggregate (cost=2897243.60..2897368.60 rows=5000 width=40)
        -> Hash Join (cost=7.81..2241002.00 rows=37499520 width=40)
              Hash Cond: (public.customer_stats_dtl.policy_id = policy.policy_id)
              -> Append (cost=0.00..1641001.87 rows=5232 width=40)
                    -> Seq Scan on customer_stats_dtl (cost=0.00..22.45 rows=4 width=40)
                          Filter: ((period_start >= '2007-09-08 20:00:00-04'::timestamp with time zone) AND (period_start < '2007-09-09 08:00:00-04'::timestamp with time zone))
                    -> Seq Scan on customer_stats_dtl_027 customer_stats_dtl (cost=0.00..1640979.42 rows=5228 width=40)
[PERFORM] Non-blocking vacuum full
Hello,

I was wondering whether any thought has previously been given to having a non-blocking vacuum full, in the sense of space reclamation and table compaction.

The motivation is that it is useful to be able to assume that operations that span a table will *roughly* scale linearly with the size of the table. But when you have a table that over an extended period of time begins small, grows large, and grows small again (where large might be, say, 200 GB), that assumption is most definitely not correct when you're on the downward slope of that graph. Having this assumption remain true simplifies things a lot for certain workloads (= my particular workload ;)).

I have only looked very, very briefly at the PG code so I don't know how far-fetched it is, but my thought was that it should be possible to have a slow background process (similar to normal non-full vacuums now) that would, instead of registering dead tuples in the FSM, move live tuples around. Combine that slow-moving operation with a new tuple-space allocation policy that prefers earlier locations on disk, and it should in time result in a situation where the physical on-disk file contains only dead tuples after a certain percentage location. At that point the file can be truncated, giving space back to the OS as well as eliminating all that dead space from having to be covered by sequential scans on the table.

This does of course increase the total cost of all updates and deletes, but it would be very useful in some scenarios. It also has the interesting property that the scan for live tuples to move need not touch the entire table to be effective; it could by design be applied to the last n percent of the table, where n would be scaled appropriately with the frequency of the checks relative to update/insert frequency.

Other benefits:

* Never vacuum full - EVER. Not even after discovering too small max_fsm_pages or too-infrequent vacuums and needing to retroactively shrink the table.
* Increased locality in general, even if one does not care about the disk space or sequential scanning. Particularly relevant for low-update-frequency tables suffering from sudden shrinkage, where a blocking VACUUM FULL is not acceptable.
* Non-blocking CLUSTER is perhaps suddenly more trivial to implement? Or at least SORTOFCLUSTER, when you want it for reasons other than perfect order (mostly sorted).

Opinions/thoughts?

-- / Peter Schuller
PGP userID: 0xE9758B7D or 'Peter Schuller [EMAIL PROTECTED]'
Key retrieval: Send an E-Mail to [EMAIL PROTECTED]
E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org
Re: [PERFORM] Non-blocking vacuum full
Peter Schuller wrote:
> I have only looked very very briefly at the PG code so I don't know how far-fetched it is, but my thought was that it should be possible to have a slow background process (similar to normal non-full vacuums now) that would, instead of registering dead tuples in the FSM, move live tuples around.

What you've described is actually very close to VACUUM FULL. VACUUM FULL needs to take an exclusive lock to lock out concurrent scanners that might miss a tuple, or see it twice, when a live tuple is moved. That's the fundamental problem you need to solve.

I think it's doable if you take a copy of the tuple, set the ctid pointer on the old one like an UPDATE does, and wait until the old tuple is no longer visible to anyone before removing it. It does require some changes to the tuple visibility code. For example, a transaction running in serializable mode shouldn't throw a serialization error when it tries to update an old, moved row version, but should follow the ctid pointer instead.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [PERFORM] OOM Errors as a result of table inheritance and a bad plan(?)
Arctic Toucan [EMAIL PROTECTED] writes:
> Is there something magical about the hash aggregate estimate of 200 rows?

Yeah, it's the default :-(

> Is this a bug, or some subtlety of the Postgres query planner?

It's an, um, known deficiency --- the planner hasn't got any idea how to construct aggregated statistics for an inheritance tree.

regards, tom lane
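Until the planner learns to build statistics over an inheritance tree, one workaround (an assumption on my part, not something Tom prescribes) is to stop the bad estimate from sizing an in-memory hash table at all:

```sql
-- With hash aggregation disabled for just this transaction, the planner
-- falls back to a sort + GroupAggregate, which streams groups and cannot
-- blow out memory no matter how wrong the 200-row estimate is.
BEGIN;
SET LOCAL enable_hashagg = off;
SELECT a, COUNT(*) FROM foo GROUP BY a;
COMMIT;
```

The trade-off is an extra sort of the appended child tables, which is slower than a correctly-sized hash aggregate but strictly bounded in memory (by work_mem per sort, spilling to disk beyond that).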
Re: [PERFORM] Non-blocking vacuum full
Heikki Linnakangas wrote:
> Peter Schuller wrote:
>> to have a slow background process (similar to normal non-full vacuums ...
> I think it's doable, if you take a copy of the tuple, and set the ctid pointer on the old one like an UPDATE, and wait until the old tuple is no longer visible to anyone before removing it. It does require some changes to tuple visibility code.

Wouldn't just having this slow background process repeatedly alternate between "update table set anycol=anycol where ctid > [some ctid near the end]" and running normal VACUUM statements do what the original poster was asking? And with 8.3, I guess, also avoiding HOT?
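Ron's suggestion, spelled out as a sketch; table/column names and the block number are illustrative, and note that ordering operators on the tid type are not available in all releases, so on older servers treat this as pseudocode:

```sql
-- Dummy-update the tuples near the physical end of the file so their new
-- versions land in free space earlier in the table (no HOT, so the row
-- really moves)...
UPDATE mytable SET anycol = anycol
WHERE ctid >= '(100000,0)'::tid;

-- ...then let a plain (non-full) VACUUM truncate the now-empty tail and
-- hand the space back to the OS, all without VACUUM FULL's exclusive lock.
VACUUM mytable;
```

Repeating this with a progressively smaller block number shrinks the table incrementally, which is essentially the background compaction process the original poster described.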