Re: [PERFORM] Query only slow on first run

2007-11-29 Thread cluster
"You're essentially asking for a random sample of data that is not currently in memory. You're not going to get that without some I/O." No, that sounds reasonable enough. But do you agree with the statement that my query will just get slower and slower over time as the number of posts

Re: [PERFORM] TB-sized databases

2007-11-29 Thread Gregory Stark
Simon Riggs [EMAIL PROTECTED] writes: On Wed, 2007-11-28 at 14:48 +0100, Csaba Nagy wrote: "In fact an even more useful option would be to ask the planner to throw an error if the expected cost exceeds a certain threshold..." Well, I've suggested it before: statement_cost_limit on

[PERFORM] 7.4 Checkpoint Question

2007-11-29 Thread Brad Nicholson
I have a legacy system still on 7.4 (I know, I know...the upgrade is coming soon). I have a fairly big spike happening once a day, every day, at the same time. It happens during a checkpoint, no surprise there. I know the solution to the problem (upgrade to a modern version), but what I'm

Re: [PERFORM] Query only slow on first run

2007-11-29 Thread Tom Lane
cluster [EMAIL PROTECTED] writes: "You're essentially asking for a random sample of data that is not currently in memory. You're not going to get that without some I/O." No, that sounds reasonable enough. But do you agree with the statement that my query will just get slower and slower over
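
An illustrative way to observe the cache effect under discussion is to run the same statement twice and compare EXPLAIN ANALYZE timings; a minimal sketch, assuming a hypothetical posts table:

    -- First run: the sampled rows must be read from disk, so I/O dominates.
    EXPLAIN ANALYZE SELECT * FROM posts ORDER BY random() LIMIT 100;
    -- Immediate second run: the touched pages now sit in shared_buffers and
    -- the OS page cache, so the reported runtime drops sharply.
    EXPLAIN ANALYZE SELECT * FROM posts ORDER BY random() LIMIT 100;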

Re: [PERFORM] TB-sized databases

2007-11-29 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes: Simon Riggs [EMAIL PROTECTED] writes: Tom's previous concerns were along the lines of "How would you know what to set it to?", given that the planner costs are mostly arbitrary numbers. Hm, that's only kind of true. The units are not the problem. The

Re: [PERFORM] 7.4 Checkpoint Question

2007-11-29 Thread Andrew Sullivan
On Thu, Nov 29, 2007 at 10:10:54AM -0500, Brad Nicholson wrote: This is a _really_ low-volume system, less than 500 writes/hour. Normal operation sees checkpoint-related spikes of around 200-300 milliseconds. We always checkpoint at the checkpoint timeout (every 5 minutes). During this one

Re: [PERFORM] 7.4 Checkpoint Question

2007-11-29 Thread Simon Riggs
On Thu, 2007-11-29 at 10:10 -0500, Brad Nicholson wrote: I have a legacy system still on 7.4 (I know, I know...the upgrade is coming soon). I have a fairly big spike happening once a day, every day, at the same time. It happens during a checkpoint, no surprise there. I know the solution

Re: [PERFORM] TB-sized databases

2007-11-29 Thread Simon Riggs
On Thu, 2007-11-29 at 10:45 -0500, Tom Lane wrote: Gregory Stark [EMAIL PROTECTED] writes: Simon Riggs [EMAIL PROTECTED] writes: Tom's previous concerns were along the lines of "How would you know what to set it to?", given that the planner costs are mostly arbitrary numbers. Hm, that's only

Re: [PERFORM] 7.4 Checkpoint Question

2007-11-29 Thread Brad Nicholson
On Thu, 2007-11-29 at 16:14 +, Simon Riggs wrote: On Thu, 2007-11-29 at 10:10 -0500, Brad Nicholson wrote: I have a legacy system still on 7.4 (I know, I know...the upgrade is coming soon). I have a fairly big spike happening once a day, every day, at the same time. It happens
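
For reference, the checkpoint-related knobs that exist on 7.4 are limited to the interval settings below; a postgresql.conf sketch using that release's defaults (values are illustrative, not a recommendation from the thread):

    # postgresql.conf (7.4) -- checkpoint-related settings
    checkpoint_segments = 3     # WAL segments between automatic checkpoints
    checkpoint_timeout = 300    # seconds; the 5-minute interval mentioned above
    checkpoint_warning = 30     # warn if checkpoints occur closer than this (seconds)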

Re: [PERFORM] TB-sized databases

2007-11-29 Thread Stephen Frost
* Simon Riggs ([EMAIL PROTECTED]) wrote: On Thu, 2007-11-29 at 10:45 -0500, Tom Lane wrote: "Given that this list spends all day every day discussing cases where the planner is wrong, I'd have to think that that's a bet I wouldn't take." I think you have a point, but the alternative is often

Re: [PERFORM] TB-sized databases

2007-11-29 Thread Csaba Nagy
On Thu, 2007-11-29 at 10:45 -0500, Tom Lane wrote: "Given that this list spends all day every day discussing cases where the planner is wrong, I'd have to think that that's a bet I wouldn't take." You could probably avoid this risk by setting the cutoff at something like 100 or 1000 times what
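
To make the proposal concrete: statement_cost_limit is a hypothetical GUC discussed in this thread, not a setting PostgreSQL actually has. A sketch of how it might be used with Csaba's suggested 100-1000x safety margin:

    -- HYPOTHETICAL: statement_cost_limit is only proposed in this thread.
    -- If typical plans for the workload cost ~10,000 units, a 1000x margin
    -- would trip only on catastrophically misplanned statements:
    SET statement_cost_limit = 10000000;
    SELECT * FROM orders WHERE customer_id = 42;
    -- Under the proposal, this would raise an error instead of running
    -- whenever the planner's estimated total cost exceeded the limit.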

[PERFORM] Configuring a Large RAM PostgreSQL Server

2007-11-29 Thread Alex Hochberger
Does anyone have any white papers or basic guides for a large RAM server? We are consolidating two databases, which currently run on a 4 GB and a 2 GB machine, to enable better data-mining. The data issues on the 4 GB machine are numerous: things like CREATE INDEX failing and update queries

Re: [PERFORM] Configuring a Large RAM PostgreSQL Server

2007-11-29 Thread Richard Huxton
Alex Hochberger wrote: Does anyone have any white papers or basic guides for a large RAM server? We are consolidating two databases, which currently run on a 4 GB and a 2 GB machine, to enable better data-mining. The data issues on the 4 GB machine are numerous: things like CREATE INDEX failing and

Re: [PERFORM] Configuring a Large RAM PostgreSQL Server

2007-11-29 Thread Alex Hochberger
It's not on rebuilding the index, it's on CREATE INDEX. I attribute it to wrong settings, Ubuntu bizarreness, and general problems. We need new hardware; the servers are running on aging infrastructure, and we decided to get a new system that will last us the next 3-4 years all at once.

Re: [PERFORM] Configuring a Large RAM PostgreSQL Server

2007-11-29 Thread Alvaro Herrera
On Nov 29, 2007, at 2:15 PM, Richard Huxton wrote: Alex Hochberger wrote: Problem Usage: we have a 20GB table with 120m rows that we are splitting into some sub-tables. Generally, we do large data pulls from here, 1 million - 4 million records at a time, stored in a new table for
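
The "large data pull into a new table" pattern described above is commonly done in one pass with CREATE TABLE AS; a minimal sketch (table, column, and predicate are hypothetical):

    -- One-pass pull of a few million rows into a fresh work table:
    CREATE TABLE pull_batch AS
        SELECT *
        FROM   big_table
        WHERE  created_on >= '2007-11-01' AND created_on < '2007-12-01';
    -- Give the planner statistics on the new table before querying it:
    ANALYZE pull_batch;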

Re: [PERFORM] GiST indexing tuples

2007-11-29 Thread Steinar H. Gunderson
On Thu, Nov 29, 2007 at 03:23:10PM -0500, Matthew T. O'Connor wrote: "Sorry in advance if this is a stupid question, but how is this better than two indexes, one on a and one on b? I suppose there could be a space savings, but beyond that?" You could index on both columns simultaneously without

Re: [PERFORM] GiST indexing tuples

2007-11-29 Thread Matthew T. O'Connor
Matthew wrote: For instance, the normal B-tree index on (a, b) is able to answer queries like a = 5 AND b > 1 or a > 5. An R-tree would be able to index these, plus queries like a > 5 AND b < 1. Sorry in advance if this is a stupid question, but how is this better than two indexes, one on a and one
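
Matthew's example, restated as SQL with the comparison operators reconstructed (the archive stripped the < and > signs; table and index names are hypothetical):

    CREATE TABLE t (a integer, b integer);
    CREATE INDEX t_a_b ON t (a, b);          -- one composite B-tree
    -- Shapes the composite B-tree serves well:
    SELECT * FROM t WHERE a = 5 AND b > 1;   -- equality on prefix + range on b
    SELECT * FROM t WHERE a > 5;             -- range on the leading column
    -- Shape it cannot serve from the index alone (a range on both columns);
    -- this is the case where an R-tree over (a, b) would help:
    SELECT * FROM t WHERE a > 5 AND b < 1;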

Re: [PERFORM] clear pg_stats

2007-11-29 Thread Heikki Linnakangas
Campbell, Lance wrote: How can I clear the pg_stats views without restarting PostgreSQL? I thought there was a function. SELECT pg_stat_reset(); -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com

Re: [PERFORM] clear pg_stats

2007-11-29 Thread Magnus Hagander
Campbell, Lance wrote: How can I clear the pg_stats views without restarting PostgreSQL? I thought there was a function. pg_stat_reset() //Magnus

[PERFORM] clear pg_stats

2007-11-29 Thread Campbell, Lance
How can I clear the pg_stats views without restarting PostgreSQL? I thought there was a function. Thanks, Lance Campbell Project Manager/Software Architect Web Services at Public Affairs University of Illinois 217.333.0382 http://webservices.uiuc.edu
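
For completeness, the function both replies point to, plus a quick check against the standard pg_stat_user_tables view (the table name is hypothetical):

    SELECT pg_stat_reset();   -- zeroes the statistics collector's counters
                              -- for the current database; no restart needed
    -- Verify: per-table counters now accumulate again from zero.
    SELECT relname, seq_scan, idx_scan, n_tup_ins
    FROM   pg_stat_user_tables
    WHERE  relname = 'orders';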

Re: [PERFORM] TB-sized databases

2007-11-29 Thread Mark Kirkwood
Simon Riggs wrote: On Wed, 2007-11-28 at 14:48 +0100, Csaba Nagy wrote: "In fact an even more useful option would be to ask the planner to throw an error if the expected cost exceeds a certain threshold..." Well, I've suggested it before: statement_cost_limit on pgsql-hackers, 1 March

Re: [PERFORM] Configuring a Large RAM PostgreSQL Server

2007-11-29 Thread Josh Berkus
Alex, "The new machine will have 48 GB of RAM, so figuring out starting points for the Shared Buffers and Work_mem/Maintenance_work_mem is going to be a crap shoot, since the defaults still seem to be based upon 256MB of RAM or less." Why a crap shoot? Set shared_buffers to 12GB. Set
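
A postgresql.conf sketch of that starting point; only the 12GB shared_buffers figure comes from the thread, and the remaining values are illustrative assumptions for a 48 GB machine:

    # postgresql.conf -- 48 GB RAM starting point; values after
    # shared_buffers are assumptions, not from the thread
    shared_buffers = 12GB            # Josh's suggested starting point (~25% of RAM)
    work_mem = 64MB                  # per sort/hash node, per backend
    maintenance_work_mem = 1GB       # speeds CREATE INDEX and VACUUM
    effective_cache_size = 36GB      # planner hint: RAM left to the OS cache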

Re: [PERFORM] TB-sized databases

2007-11-29 Thread Russell Smith
Simon Riggs wrote: On Tue, 2007-11-27 at 18:06 -0500, Pablo Alcaraz wrote: Simon Riggs wrote: All of those responses have cooked up quite a few topics into one. Large databases might mean text warehouses, XML message stores, relational archives and fact-based business data warehouses.