Re: [PERFORM] postgresql geqo optimization

2006-02-10 Thread Steinar H. Gunderson
On Fri, Feb 10, 2006 at 08:46:14PM -0500, uwcssa wrote: > I am wondering if anyone here has ever had complex queries for which GEQO > fails to work properly, i.e., finds a terrible query plan compared to the one > found by the DP optimizer (when forcing PostgreSQL to always use DP). This is > important to me

Re: [PERFORM] [HACKERS] What do the Windows pg hackers out there like for dev

2006-02-10 Thread Mark Kirkwood
Ron wrote: Subject line says it all. I'm going to be testing changes under both Linux and WinXP, so I'm hoping those of you who do M$ hacking will pass along your list of suggestions and/or favorite (and hated, so I know what to avoid) tools. Testing only? So you really only need to build

[PERFORM] postgresql geqo optimization

2006-02-10 Thread uwcssa
I have a question with regard to the GEQO optimizer of PostgreSQL. For complex queries with over 12 tables in a join (12 is the default value), the PostgreSQL optimizer by default will not use the dynamic-programming-style optimizer. Instead, it uses a genetic algorithm to compute a sub-optimal query
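For reference, both the genetic optimizer and its trigger threshold are ordinary runtime settings, so the exhaustive (DP) planner can be forced per session for comparison. A minimal sketch; the threshold value 20 is illustrative, not a recommendation:

    -- Disable the genetic optimizer so the dynamic-programming
    -- planner is used regardless of join count:
    SET geqo = off;

    -- Alternatively, keep GEQO but raise the join count at which
    -- it takes over (the default is 12):
    SET geqo_threshold = 20;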

Re: [PERFORM] joining two tables slow due to sequential scan

2006-02-10 Thread Dave Dutcher
OK, if I'm reading this correctly, it looks like the planner is choosing a sequential scan because it expects 48,000 rows for that patientidentifier, but it's actually only getting 3. The planner has the number of rows right for the sequential scan, so it seems like the stats are up to date. I wou
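Since the thread turns on a bad row estimate (48,000 expected vs. 3 actual), the usual first steps are a fresh ANALYZE and, if that is not enough, a larger statistics sample for the skewed column. A sketch using the table and column names that appear later in the thread:

    -- Refresh planner statistics for the table:
    ANALYZE clinicaldocuments;

    -- If the estimate is still far off, sample the column harder
    -- (the default statistics target in this era was 10; 100 is
    -- illustrative) and re-analyze:
    ALTER TABLE clinicaldocuments
        ALTER COLUMN patientidentifier SET STATISTICS 100;
    ANALYZE clinicaldocuments;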

Re: [PERFORM] joining two tables slow due to sequential scan

2006-02-10 Thread Tim Jones
OK, here is the real db. The first query I had seems to make no sense, because it is only fast if I limit the rows, since almost all rows have status = 'AC'. For the second query, both tables have about 10 million rows, and it takes a long time as you can see, but this person only has approx 160 total documents

Re: [PERFORM] joining two tables slow due to sequential scan

2006-02-10 Thread Tom Lane
"Tim Jones" <[EMAIL PROTECTED]> writes: > QUERY PLAN > 'Hash Join (cost=899.83..4384.17 rows=482 width=1350) (actual > time=0.203..0.203 rows=0 loops=1)' > ... > 'Total runtime: 0.392 ms' Hardly seems like evidence of a performance problem ... regards, tom lane -

Re: [PERFORM] joining two tables slow due to sequential scan

2006-02-10 Thread Scott Marlowe
On Fri, 2006-02-10 at 16:43, Tim Jones wrote: > oops > > QUERY PLAN > 'Hash Join (cost=899.83..4384.17 rows=482 width=1350) (actual > time=0.203..0.203 rows=0 loops=1)' > ' Hash Cond: ("outer".documentidentifier = > "inner".dssdocumentidentifier)' > ' -> Seq Scan on documentversions (cost=0.0

Re: [PERFORM] joining two tables slow due to sequential scan

2006-02-10 Thread Tom Lane
"Tim Jones" <[EMAIL PROTECTED]> writes: > QUERY PLAN > 'Hash Join (cost=899.83..4384.17 rows=482 width=1350)' > ' Hash Cond: ("outer".documentidentifier = > "inner".dssdocumentidentifier)' This is not EXPLAIN ANALYZE output. Also, the rowcount estimates seem far enough off in the other query to

Re: [PERFORM] joining two tables slow due to sequential scan

2006-02-10 Thread Tim Jones
oops QUERY PLAN 'Hash Join (cost=899.83..4384.17 rows=482 width=1350) (actual time=0.203..0.203 rows=0 loops=1)' ' Hash Cond: ("outer".documentidentifier = "inner".dssdocumentidentifier)' ' -> Seq Scan on documentversions (cost=0.00..2997.68 rows=96368 width=996) (actual time=0.007..0.007 row

Re: [PERFORM] Large Database Design Help

2006-02-10 Thread Scott Marlowe
On Fri, 2006-02-10 at 16:39, Ragnar wrote: > On Fri, 2006-02-10 at 11:24 +0100, Markus Schaber wrote: > > > For lots of non-read-only database workloads, RAID5 is a performance > killer. RAID 1/0 might be better, or having two mirrors of two disks > each, the first mirror holding system, swap, an

Re: [PERFORM] joining two tables slow due to sequential scan

2006-02-10 Thread Scott Marlowe
On Fri, 2006-02-10 at 16:37, Tim Jones wrote: > for first query > > QUERY PLAN > 'Limit (cost=4.69..88.47 rows=10 width=1350) (actual > time=32.195..32.338 rows=10 loops=1)' > ' -> Nested Loop (cost=4.69..4043.09 rows=482 width=1350) (actual > time=32.190..32.316 rows=10 loops=1)' > '-

Re: [PERFORM] Large Database Design Help

2006-02-10 Thread Ragnar
On Fri, 2006-02-10 at 11:24 +0100, Markus Schaber wrote: > For lots of non-read-only database workloads, RAID5 is a performance > killer. RAID 1/0 might be better, or having two mirrors of two disks > each, the first mirror holding system, swap, and the PostgreSQL WAL > files, the second one holding

Re: [PERFORM] joining two tables slow due to sequential scan

2006-02-10 Thread Tim Jones
for first query QUERY PLAN 'Limit (cost=4.69..88.47 rows=10 width=1350) (actual time=32.195..32.338 rows=10 loops=1)' ' -> Nested Loop (cost=4.69..4043.09 rows=482 width=1350) (actual time=32.190..32.316 rows=10 loops=1)' '-> Bitmap Heap Scan on documentversions (cost=4.69..1139.40 r

Re: [PERFORM] joining two tables slow due to sequential scan

2006-02-10 Thread Scott Marlowe
On Fri, 2006-02-10 at 16:35, Tim Jones wrote: > OK. I'm gonna make a couple of guesses here: > > 1: clinicaldocuments.patientidentifier is an int8 and you're running > 7.4 or before. > > -- nope int4 and 8.1 > > 2: There are more rows with clinicaldocuments.patientidentifier= 123 > than with

Re: [PERFORM] joining two tables slow due to sequential scan

2006-02-10 Thread Tim Jones
OK. I'm gonna make a couple of guesses here: 1: clinicaldocuments.patientidentifier is an int8 and you're running 7.4 or before. -- nope int4 and 8.1 2: There are more rows with clinicaldocuments.patientidentifier= 123 than with documentversions.documentstatus = 'AC'. -- nope generally spe

Re: [PERFORM] joining two tables slow due to sequential scan

2006-02-10 Thread Scott Marlowe
On Fri, 2006-02-10 at 16:06, Tim Jones wrote: > > I am trying to join two tables and keep getting a sequential scan in > the plan even though there is an index on the columns I am joining > on. Basically this is the deal ... I have two tables with docid in them > which is what I am using for the j
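A common diagnostic when the planner insists on a sequential scan is to disable it for the session and compare timings; if the forced index plan is not actually faster, the planner was right. A sketch, with the filter value 123 taken from later messages in the thread:

    -- Diagnosis only, not for production:
    SET enable_seqscan = off;
    EXPLAIN ANALYZE
    SELECT *
      FROM documentversions v
      JOIN clinicaldocuments c
        ON v.documentidentifier = c.dssdocumentidentifier
     WHERE c.patientidentifier = 123;
    SET enable_seqscan = on;  -- restore the default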

Re: [PERFORM] joining two tables slow due to sequential scan

2006-02-10 Thread Dave Dutcher
What version of postgres are you using? Can you post the output from EXPLAIN ANALYZE? -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tim Jones Sent: Friday, February 10, 2006 4:07 PM To: pgsql-performance@postgresql.org Subject: [PERFORM]

[PERFORM] joining two tables slow due to sequential scan

2006-02-10 Thread Tim Jones
I am trying to join two tables and keep getting a sequential scan in the plan even though there is an index on the columns I am joining on. Basically this is the deal ... I have two tables with docid in them which is what I am using for the join. ClinicalDocs ... (no primary key) thoug

Re: [PERFORM] help required in design of database

2006-02-10 Thread Steinar H. Gunderson
On Fri, Feb 10, 2006 at 12:20:34PM -0800, david drummard wrote: > 1) create a new table every time a new feed file comes in. Create table with > indexes. Use the copy command to dump the data into the table. > 2) rename the current table to some old table name and rename the new table > to current
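The pattern being discussed, sketched with hypothetical table and column names (feed, val1): load into a fresh table, index it, then swap it in with a rename so readers never see a half-loaded table:

    -- Build and bulk-load the replacement table:
    CREATE TABLE feed_new (LIKE feed);
    COPY feed_new FROM '/path/to/feedfile';  -- server-side file, default text format
    CREATE INDEX feed_new_val1_idx ON feed_new (val1);  -- repeat per searched column

    -- Swap it in atomically:
    BEGIN;
    ALTER TABLE feed RENAME TO feed_old;
    ALTER TABLE feed_new RENAME TO feed;
    COMMIT;
    DROP TABLE feed_old;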

[PERFORM] help required in design of database

2006-02-10 Thread david drummard
Hi, I have a unique requirement. I have a feed of 2.5 - 3 million rows of data which arrives every half an hour. Each row has 2 small string values (about 50 chars each) and 10 int values. I need searchability and the ability to run arbitrary queries on any of these values. This means I have to create an i

Re: [PERFORM] Large Database Design Help

2006-02-10 Thread PFC
was originally designed for Postgres 7.0 on a PIII 500MHz and some Argh. 1) The database is very large, the largest table has 40 million tuples. Are these simple types (like a few ints, text...)? How much space does it use on disk? Can it fit in RAM? 2) The datab
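A quick way to answer the "how much disk, does it fit in RAM" question from the system catalogs, which works on old releases as well; the table name is a placeholder:

    -- Approximate on-disk size of one table, assuming the default
    -- 8 kB block size:
    SELECT relname,
           relpages * 8192 AS approx_bytes,
           reltuples       AS approx_rows
      FROM pg_class
     WHERE relname = 'largest_table';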

Re: [PERFORM] Basic Database Performance

2006-02-10 Thread PFC
We are running a prototype of a system running on PHP/Postgresql on an Intel Xeon 2GHz server, 1GB RAM, 40GB hard drive, I think this is a decent server... Now, I guess you are using Apache and PHP like everyone. Know these facts: - A client connection means an apach

[PERFORM] What do the Windows pg hackers out there like for dev tools?

2006-02-10 Thread Ron
Subject line says it all. I'm going to be testing changes under both Linux and WinXP, so I'm hoping those of you who do M$ hacking will pass along your list of suggestions and/or favorite (and hated, so I know what to avoid) tools. TiA, Ron

Re: [PERFORM] 10+hrs vs 15min because of just one index

2006-02-10 Thread Aaron Turner
On 2/10/06, Matthew T. O'Connor wrote: > Aaron Turner wrote: > > So I'm trying to figure out how to optimize my PG install (8.0.3) to > > get better performance without dropping one of my indexes. > > What about something like this: > > begin; > drop slow_index_name; > update; > create index slow_

Re: [PERFORM] 10+hrs vs 15min because of just one index

2006-02-10 Thread Matthew T. O'Connor
Aaron Turner wrote: So I'm trying to figure out how to optimize my PG install (8.0.3) to get better performance without dropping one of my indexes. What about something like this: begin; drop index slow_index_name; update; create index slow_index_name; commit; vacuum; Matt
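A concrete version of Matthew's outline, with hypothetical table, column, and update statement; note that VACUUM must run outside the transaction block:

    BEGIN;
    DROP INDEX slow_index_name;
    UPDATE big_table SET data = lower(data);  -- stand-in for the real bulk update
    CREATE UNIQUE INDEX slow_index_name ON big_table (data);
    COMMIT;
    VACUUM ANALYZE big_table;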

Re: [PERFORM] 10+hrs vs 15min because of just one index

2006-02-10 Thread Aaron Turner
On 2/10/06, hubert depesz lubaczewski <[EMAIL PROTECTED]> wrote: > On 2/10/06, Aaron Turner <[EMAIL PROTECTED]> wrote: > > So I'm trying to figure out how to optimize my PG install (8.0.3) to > > get better performance without dropping one of my indexes. > > Basically, I have a table of 5M records

Re: [PERFORM] pgbench output

2006-02-10 Thread Gourish Singbal
Hi All, Here are some of the results I got after benchmarking with pgbench, comparing PostgreSQL 7.4.5 and PostgreSQL 8.1.2 with the same parameter values in the postgresql.conf file. [EMAIL PROTECTED]:/newdisk/postgres/data> /usr/local/pgsql7.4.5/bin/pgbench -c 10 -t 1 regressionstart

Re: [PERFORM] Basic Database Performance

2006-02-10 Thread Markus Schaber
Hi, James, James Dey wrote: > Apologies if this is a novice question, but I think it is a performance > one nevertheless. We are running a prototype of a system running on > PHP/Postgresql on an Intel Xeon 2GHz server, 1GB RAM, 40GB hard drive, > as a test bench. The system will be used for tens o

Re: [PERFORM] Large Database Design Help

2006-02-10 Thread Markus Schaber
Hi, Henry, Orion Henry wrote: > 1) The database is very large, the largest table has 40 million tuples. I'm afraid this doesn't qualify as '_very_ large' yet, but it definitely is large enough to have some deep thoughts about it. :-) > 1) The data is easily partitionable by client ID. In an
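Since the data partitions naturally by client ID, one option as of 8.1 is inheritance-based partitioning with constraint exclusion. A minimal sketch with hypothetical names:

    CREATE TABLE invoices (
        client_id integer NOT NULL,
        amount    numeric
    );
    CREATE TABLE invoices_client_1 (CHECK (client_id = 1)) INHERITS (invoices);
    CREATE TABLE invoices_client_2 (CHECK (client_id = 2)) INHERITS (invoices);

    -- With constraint exclusion on, a query that pins down the
    -- client ID scans only the matching child table:
    SET constraint_exclusion = on;
    SELECT sum(amount) FROM invoices WHERE client_id = 2;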

Re: [PERFORM] Large Database Design Help

2006-02-10 Thread Markus Schaber
Hi, Greg, Greg Stark wrote: >>(Aside question: if I were to find a way to use COPY and I were loading >>data on a single client_id, would dropping just the indexes for that client_id >>accelerate the load?) > Dropping indexes would accelerate the load but unless you're loading a large > numbe

Re: [PERFORM] Basic Database Performance

2006-02-10 Thread James Dey
Sorry about that. James Dey tel +27 11 704-1945 cell +27 82 785-5102 fax +27 11 388-8907 mail [EMAIL PROTECTED] -Original Message- From: Richard Huxton [mailto:[EMAIL PROTECTED] Sent: 10 February 2006 11:50 AM To: James Dey Cc: 'Postgresql Performance' Subject: Re: [PERFOR

Re: [PERFORM] Basic Database Performance

2006-02-10 Thread Richard Huxton
Don't forget to cc: the list. James Dey wrote: Hi Richard, Firstly, thanks a million for the reply. To answer your questions: 1. Are you limited by CPU, memory or disk i/o? I am not limited, but would like to get the most out of the config I have in order to be able to know what I'll get, whe

Re: [PERFORM] Basic Database Performance

2006-02-10 Thread Richard Huxton
James Dey wrote: Apologies if this is a novice question, but I think it is a performance one nevertheless. We are running a prototype of a system running on PHP/Postgresql on an Intel Xeon 2GHz server, 1GB RAM, 40GB hard drive, as a test bench. The system will be used for tens of thousands of use

Re: [PERFORM] 10+hrs vs 15min because of just one index

2006-02-10 Thread hubert depesz lubaczewski
On 2/10/06, Aaron Turner <[EMAIL PROTECTED]> wrote: > So I'm trying to figure out how to optimize my PG install (8.0.3) to > get better performance without dropping one of my indexes. > Basically, I have a table of 5M records with 3 columns: > pri_key (SERIAL) > data char(48) > groupid integer > th

[PERFORM] Basic Database Performance

2006-02-10 Thread James Dey
Hi Guys, Apologies if this is a novice question, but I think it is a performance one nevertheless. We are running a prototype of a system running on PHP/Postgresql on an Intel Xeon 2GHz server, 1GB RAM, 40GB hard drive, as a test bench. The system will be used for tens of thousands of us

[PERFORM] 10+hrs vs 15min because of just one index

2006-02-10 Thread Aaron Turner
So I'm trying to figure out how to optimize my PG install (8.0.3) to get better performance without dropping one of my indexes. Basically, I have a table of 5M records with 3 columns: pri_key (SERIAL), data char(48), groupid integer. There is an additional unique index on the data column. The prob
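The schema as described, reconstructed as a sketch (the table name is a placeholder; the unique index on data is the "one index" of the subject line):

    CREATE TABLE records (
        pri_key SERIAL PRIMARY KEY,
        data    char(48),
        groupid integer
    );
    CREATE UNIQUE INDEX records_data_key ON records (data);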