Re: [PERFORM] Query only slow on first run

2007-11-28 Thread Scott Marlowe
On Nov 28, 2007 3:15 PM, cluster <[EMAIL PROTECTED]> wrote: > > The indexes don't contain visibility information, so Postgres has to look up > > the row on disk to verify it isn't dead. > > I guess this fact drastically decreases the performance. :-( > The number of rows with a random_number will j

Re: [PERFORM] Query only slow on first run

2007-11-28 Thread Tom Lane
cluster <[EMAIL PROTECTED]> writes: > I could really use any kind of suggestion on how to improve the query in > order to make it scale better for large data sets The 6-7000 ms for a > clean run is really a showstopper. Need to get it below 70 ms somehow. Buy a faster disk? You're essentially a

Re: [PERFORM] Query only slow on first run

2007-11-28 Thread Jean-David Beyer
cluster wrote: >> The indexes don't contain visibility information, so Postgres has to >> look up the row on disk to verify it isn't dead. > > I guess this fact drastically decreases the performance. :-( The number > of rows with a random_number will just grow over time while the number of > ques

Re: [PERFORM] Query only slow on first run

2007-11-28 Thread cluster
The indexes don't contain visibility information, so Postgres has to look up the row on disk to verify it isn't dead. I guess this fact drastically decreases the performance. :-( The number of rows with a random_number will just grow over time while the number of questions with status = 1 will

Re: [PERFORM] TB-sized databases

2007-11-28 Thread david
On Wed, 28 Nov 2007, Simon Riggs wrote: On Wed, 2007-11-28 at 14:48 +0100, Csaba Nagy wrote: In fact an even more useful option would be to ask the planner to throw error if the expected cost exceeds a certain threshold... Well, I've suggested it before: statement_cost_limit on pgsql-hacker

Re: [PERFORM] Query only slow on first run

2007-11-28 Thread Steinar H. Gunderson
On Wed, Nov 28, 2007 at 09:16:08PM +0100, cluster wrote: > Hmm, actually I still don't understand why it takes 6400 ms to fetch the > rows. As far as I can see the index used is "covering" so that real row > lookups shouldn't be necessary. The indexes don't contain visibility information, so Pos

Re: [PERFORM] Query only slow on first run

2007-11-28 Thread cluster
I'm wondering why --- doesn't seem like it should take 6400msec to fetch 646 rows, unless perhaps the data is just horribly misordered relative to the index. Which may in fact be the case ... Hmm, actually I still don't understand why it takes 6400 ms to fetch the rows. As far as I can see th

Re: [PERFORM] TB-sized databases

2007-11-28 Thread Matthew
On Wed, 28 Nov 2007, Simon Riggs wrote: > statement_cost_limit on pgsql-hackers, 1 March 2006 > > Would people like me to re-write and resubmit this patch for 8.4? Yes please. The more options, the better. > Tom's previous concerns were along the lines of "How would know what to > set it to?", gi

Re: [PERFORM] TB-sized databases

2007-11-28 Thread Simon Riggs
On Tue, 2007-11-27 at 18:06 -0500, Pablo Alcaraz wrote: > Simon Riggs wrote: > > All of those responses have cooked up quite a few topics into one. Large > > databases might mean text warehouses, XML message stores, relational > > archives and fact-based business data warehouses. > > > > The main t

Re: [PERFORM] TB-sized databases

2007-11-28 Thread Simon Riggs
On Wed, 2007-11-28 at 14:48 +0100, Csaba Nagy wrote: > In fact an even more useful option would be to ask the planner to throw > error if the expected cost exceeds a certain threshold... Well, I've suggested it before: statement_cost_limit on pgsql-hackers, 1 March 2006 Would people like me to

Re: [PERFORM] Query only slow on first run

2007-11-28 Thread Tom Lane
"Dave Dutcher" <[EMAIL PROTECTED]> writes: > ... According to the explain analyze > there are only 646 rows in posts which match your criteria, so it does seem > like scanning posts first might be the right thing to do. No, that's not right. What the output actually shows is that only 646 posts

Re: [PERFORM] Training Recommendations

2007-11-28 Thread Usama Munir Dar
EnterpriseDB (www.enterprisedb.com), ofcourse Campbell, Lance wrote: PostgreSQL: 8.2.4 Does anyone have any companies they would recommend using for performance tuning training of PostgreSQL for Linux? Or general DBA training? Thanks, Lance Campbell Project Manager/Software A

Re: [PERFORM] Windows XP selects are very slow

2007-11-28 Thread Usama Munir Dar
PG generally comes with very basic default settings, one *start* maybe this page for you http://www.webservices.uiuc.edu/postgresql/ Then obviously you will need to work though your query plans and iterate. Shadkam Islam wrote: Hi All, We are having a table whose data we need to bucketize an

Re: [PERFORM] GiST indexing tuples

2007-11-28 Thread Matthew
On Wed, 28 Nov 2007, Tom Lane wrote: > Have you looked at contrib/seg/ ? Yes, I had a pretty good look at that. However, I believe that in order to use seg's indexes, I would need to put my data into seg's data type, and reformat my query, as I stated in my original message. What I'm looking for i

Re: [PERFORM] Query only slow on first run

2007-11-28 Thread Dave Dutcher
> -Original Message- > From: tmp > We have primarily two tables of interest here: questions > (~100k rows) and posts (~400k rows). Each post refers to a > question, but only the "posts" rows for which the > corresponding "question.status = 1" are relevant. This > reduces the number of r

Re: [PERFORM] GiST indexing tuples

2007-11-28 Thread Tom Lane
Matthew <[EMAIL PROTECTED]> writes: >> This sounds like something an R-tree can do. > I *know* that. However, Postgres (as far as I can see) doesn't provide a > simple R-tree index that will index two integers. This means I have to > write one myself. I'm asking whether it is possible to get two v

[PERFORM] Optimizer regression 8.2.1 -> 8.2.3 on TSEARCH2 queries with ORDER BY and LIMIT

2007-11-28 Thread Brendan McMahon
Hi folks, An apparent optimizer regression between 8.2.1 & 8.2.3 ? : select pk,... from tbl where tsv @@ to_tsquery(...) order by pk limit 10 disadvantageously uses PK index scan against a 2.5 million row (vacuum analysed) table whenever limit<=16 , leading to an increase in query ti

Re: [PERFORM] TB-sized databases

2007-11-28 Thread Alvaro Herrera
Pablo Alcaraz escribió: > In my opinion there are queries that I think they ll need to be tuned for > "huge databases" (huge databases = a database which relevant > tables(indexes) are (will be) far bigger that all the ram available): > > -- example table > CREATE TABLE homes ( >id bigse

Re: [PERFORM] TB-sized databases

2007-11-28 Thread Pablo Alcaraz
Pablo Alcaraz wrote: Simon Riggs wrote: All of those responses have cooked up quite a few topics into one. Large databases might mean text warehouses, XML message stores, relational archives and fact-based business data warehouses. The main thing is that TB-sized databases are performance criti

Re: [PERFORM] Query only slow on first run

2007-11-28 Thread Craig James
tmp wrote: what exactly is that "random_number" column A random float that is initialized when the row is created and never modified afterwards. The physical row ordering will clearly not match the random_number ordering. However, other queries uses a row ordering by the primary key so I don

Re: [PERFORM] TB-sized databases

2007-11-28 Thread Matthew
On Wed, 28 Nov 2007, Gregory Stark wrote: > > Is there something wrong with: > > set enable_seqscan = off > > ? > > This does kind of the opposite of what you would actually want here. What you > want is that if you give it a query which would be best satisfied by a > sequential scan it should thro

Re: [PERFORM] TB-sized databases

2007-11-28 Thread Pablo Alcaraz
Matthew wrote: On Tue, 27 Nov 2007, Pablo Alcaraz wrote: it would be nice to do something with selects so we can recover a rowset on huge tables using a criteria with indexes without fall running a full scan. You mean: Be able to tell Postgres "Don't ever do a sequential scan of this t

Re: [PERFORM] TB-sized databases

2007-11-28 Thread Csaba Nagy
On Wed, 2007-11-28 at 08:54 -0500, Bill Moran wrote: > > Nothing wrong with enable_seqscan = off except it is all or nothing type > > of thing... > > If that's true, then I have a bug report to file: [snip] > It looks to me to be session-alterable. I didn't mean that it can't be set per session,

Re: [PERFORM] TB-sized databases

2007-11-28 Thread Bill Moran
In response to Gregory Stark <[EMAIL PROTECTED]>: > "Bill Moran" <[EMAIL PROTECTED]> writes: > > > In response to Matthew <[EMAIL PROTECTED]>: > > > >> On Tue, 27 Nov 2007, Pablo Alcaraz wrote: > >> > it would be nice to do something with selects so we can recover a rowset > >> > on huge tables u

Re: [PERFORM] TB-sized databases

2007-11-28 Thread david
On Wed, 28 Nov 2007, Csaba Nagy wrote: On Wed, 2007-11-28 at 08:27 -0500, Bill Moran wrote: Is there something wrong with: set enable_seqscan = off ? Nothing wrong with enable_seqscan = off except it is all or nothing type of thing... if you want the big table to never use seqscan, but a medi

Re: [PERFORM] TB-sized databases

2007-11-28 Thread Gregory Stark
"Bill Moran" <[EMAIL PROTECTED]> writes: > In response to Matthew <[EMAIL PROTECTED]>: > >> On Tue, 27 Nov 2007, Pablo Alcaraz wrote: >> > it would be nice to do something with selects so we can recover a rowset >> > on huge tables using a criteria with indexes without fall running a full >> > sca

Re: [PERFORM] TB-sized databases

2007-11-28 Thread Bill Moran
In response to Csaba Nagy <[EMAIL PROTECTED]>: > On Wed, 2007-11-28 at 08:27 -0500, Bill Moran wrote: > > Is there something wrong with: > > set enable_seqscan = off > > ? > > Nothing wrong with enable_seqscan = off except it is all or nothing type > of thing... If that's true, then I have a bug

Re: [PERFORM] TB-sized databases

2007-11-28 Thread Csaba Nagy
On Wed, 2007-11-28 at 08:27 -0500, Bill Moran wrote: > Is there something wrong with: > set enable_seqscan = off > ? Nothing wrong with enable_seqscan = off except it is all or nothing type of thing... if you want the big table to never use seqscan, but a medium table which is joined in should use

Re: [PERFORM] GiST indexing tuples

2007-11-28 Thread Gregory Stark
"Matthew" <[EMAIL PROTECTED]> writes: > On Tue, 27 Nov 2007, Steinar H. Gunderson wrote: >> On Tue, Nov 27, 2007 at 06:28:23PM +, Matthew wrote: >> > SELECT * FROM table WHERE a > 1 AND b < 4; >> >> This sounds like something an R-tree can do. > > I *know* that. However, Postgres (as far as I

Re: [PERFORM] TB-sized databases

2007-11-28 Thread Bill Moran
In response to Matthew <[EMAIL PROTECTED]>: > On Tue, 27 Nov 2007, Pablo Alcaraz wrote: > > it would be nice to do something with selects so we can recover a rowset > > on huge tables using a criteria with indexes without fall running a full > > scan. > > You mean: Be able to tell Postgres "Don't

Re: [PERFORM] GiST indexing tuples

2007-11-28 Thread Matthew
On Tue, 27 Nov 2007, Steinar H. Gunderson wrote: > On Tue, Nov 27, 2007 at 06:28:23PM +, Matthew wrote: > > SELECT * FROM table WHERE a > 1 AND b < 4; > > This sounds like something an R-tree can do. I *know* that. However, Postgres (as far as I can see) doesn't provide a simple R-tree index t

Re: [PERFORM] TB-sized databases

2007-11-28 Thread Matthew
On Tue, 27 Nov 2007, Pablo Alcaraz wrote: > it would be nice to do something with selects so we can recover a rowset > on huge tables using a criteria with indexes without fall running a full > scan. You mean: Be able to tell Postgres "Don't ever do a sequential scan of this table. It's silly. I w

Re: [PERFORM] Windows XP selects are very slow

2007-11-28 Thread Richard Huxton
Shadkam Islam wrote: Hi All, We are having a table whose data we need to bucketize and show. This is a continuously growing table (archival is a way to trim it to size). We are facing 2 issues here: 1. When the records in the table are in the range of 10K, it works fine for some time after star

Re: [PERFORM] Query only slow on first run

2007-11-28 Thread tmp
The query's spending nearly all its time in the scan of "posts", and I'm wondering why --- doesn't seem like it should take 6400msec to fetch 646 rows, unless perhaps the data is just horribly misordered relative to the index. Which may in fact be the case ... Yes, they probably are. I use the

[PERFORM] Windows XP selects are very slow

2007-11-28 Thread Shadkam Islam
Hi All, We are having a table whose data we need to bucketize and show. This is a continuously growing table (archival is a way to trim it to size). We are facing 2 issues here: 1. When the records in the table are in the range of 10K, it works fine for some time after starting postgres server. B