Re: [PERFORM] pgbench Comparison of 7.4.7 to 8.0.2

2005-04-25 Thread Steve Poe
There was some interesting oscillation behavior in both version of postgres that occurred with 25 clients and 1000 transactions at a scaling factor of 100. This was repeatable with the distribution version of pgbench run iteratively from the command line. I'm not sure how to explain this.

Re: [PERFORM] Bad n_distinct estimation; hacks suggested?

2005-04-25 Thread Simon Riggs
On Sat, 2005-04-23 at 16:39 -0700, Josh Berkus wrote: Greg Stark wrote I looked into this a while back when we were talking about changing the sampling method. The conclusions were discouraging. Fundamentally, using constant sized samples of data for n_distinct is bogus. Constant sized

Re: [ODBC] [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

2005-04-25 Thread Joel Fradkin
Thanks we will try that, we are working on a test suit for the way our app gets data (ODBC). we plan to include updates, inserts, and selects and all three at once with a log of the results. Then we should use a stress test tool to see how it works with multiple instances (I used Microsoft's tool

Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

2005-04-25 Thread Merlin Moncure
In practice, we have watched Windows evolve in such a fashion with respect to multiuser support, and, in effect, it has never really gotten it. Microsoft started by hacking something on top of MS-DOS, and by the time enough applications had enough dependancies on the way that worked, it has

Re: [ODBC] [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

2005-04-25 Thread Joel Fradkin
Tried changing the settings and saw no change in a test using asp. The test does several selects on views and tables. It actually seemed to take a bit longer. Joel Fradkin Wazagua, Inc. 2520 Trailmate Dr Sarasota, Florida 34243 Tel. 941-753-7111 ext 305 [EMAIL PROTECTED] www.wazagua.com

Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

2005-04-25 Thread Merlin Moncure
I am waiting to here back from Josh on using cursors and trying to flatten long running views. I am a little disappointed I have not understood enough to get my analyzer to use the proper plan, we had to set seqscan off to get the select from response_line to work fast and I had to turn off

Re: [HACKERS] [PERFORM] Bad n_distinct estimation; hacks suggested?

2005-04-25 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes: My suggested hack for PostgreSQL is to have an option to *not* sample, just to scan the whole table and find n_distinct accurately. ... What price a single scan of a table, however large, when incorrect statistics could force scans and sorts to occur when

Re: [PERFORM] pgbench Comparison of 7.4.7 to 8.0.2

2005-04-25 Thread Thomas F . O'Connell
Interesting. I should've included standard deviation in my pgbench iteration patch. Maybe I'll go back and do that. I was seeing oscillation across the majority of iterations in the 25 clients/1000 transaction runs on both database versions. I've got my box specs and configuration files

Re: [PERFORM] pgbench Comparison of 7.4.7 to 8.0.2

2005-04-25 Thread Steve Poe
Tom, Just a quick thought: after each run/sample of pgbench, I drop the database and recreate it. When I don't my results become more skewed. Steve Poe Thomas F.O'Connell wrote: Interesting. I should've included standard deviation in my pgbench iteration patch. Maybe I'll go back and do that.

Re: [HACKERS] [PERFORM] Bad n_distinct estimation; hacks suggested?

2005-04-25 Thread Simon Riggs
On Mon, 2005-04-25 at 11:23 -0400, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: My suggested hack for PostgreSQL is to have an option to *not* sample, just to scan the whole table and find n_distinct accurately. ... What price a single scan of a table, however large, when

Re: [HACKERS] [PERFORM] Bad n_distinct estimation; hacks suggested?

2005-04-25 Thread Josh Berkus
Simon, Tom: While it's not possible to get accurate estimates from a fixed size sample, I think it would be possible from a small but scalable sample: say, 0.1% of all data pages on large tables, up to the limit of maintenance_work_mem. Setting up these samples as a % of data pages, rather

Re: [HACKERS] [PERFORM] Bad n_distinct estimation; hacks suggested?

2005-04-25 Thread Josh Berkus
Guys, While it's not possible to get accurate estimates from a fixed size sample, I think it would be possible from a small but scalable sample: say, 0.1% of all data pages on large tables, up to the limit of maintenance_work_mem. BTW, when I say accurate estimates here, I'm talking about

[PERFORM] half the query time in an unnecessary(?) sort?

2005-04-25 Thread Ron Mayer
If I have a freshly CLUSTERed table and queries that want to do a merge join, it seems to me that quite a bit of time is spent unnecessarily sorting the already-sorted table. An example such query I found in my log files is shown below. If I read the EXPLAIN ANALYZE output correctly, it's saying

Re: [HACKERS] [PERFORM] Bad n_distinct estimation; hacks suggested?

2005-04-25 Thread Dave Held
-Original Message- From: Josh Berkus [mailto:[EMAIL PROTECTED] Sent: Sunday, April 24, 2005 2:08 PM To: Andrew Dunstan Cc: Tom Lane; Greg Stark; Marko Ristola; pgsql-perform; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] [PERFORM] Bad n_distinct estimation; hacks suggested?

Re: [HACKERS] [PERFORM] Bad n_distinct estimation; hacks suggested?

2005-04-25 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes: On Mon, 2005-04-25 at 11:23 -0400, Tom Lane wrote: It's not just the scan --- you also have to sort, or something like that, if you want to count distinct values. I doubt anyone is really going to consider this a feasible answer for large tables.

Re: [PERFORM] half the query time in an unnecessary(?) sort?

2005-04-25 Thread Josh Berkus
Ron, If I have a freshly CLUSTERed table and queries that want to do a merge join, it seems to me that quite a bit of time is spent unnecessarily sorting the already-sorted table. An example such query I found in my log files is shown below. If I read the EXPLAIN ANALYZE output correctly,

Re: [HACKERS] [PERFORM] Bad n_distinct estimation; hacks suggested?

2005-04-25 Thread Dave Held
-Original Message- From: Andrew Dunstan [mailto:[EMAIL PROTECTED] Sent: Monday, April 25, 2005 3:43 PM To: josh@agliodbs.com Cc: pgsql-perform; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] [PERFORM] Bad n_distinct estimation; hacks suggested? Josh Berkus wrote: Simon,

Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

2005-04-25 Thread Christopher Browne
Martha Stewart called it a Good Thing when [EMAIL PROTECTED] (Merlin Moncure) wrote: In practice, we have watched Windows evolve in such a fashion with respect to multiuser support, and, in effect, it has never really gotten it. Microsoft started by hacking something on top of MS-DOS, and by

Re: [PERFORM] half the query time in an unnecessary(?) sort?

2005-04-25 Thread Ron Mayer
Josh Berkus wrote: [quoted out of order] Ron, Looking at your analyze, though, I think it's not the sort that's taking the time as it is that the full sorted entity_id column won't fit in work_mem. Try increasing it? Yup, that indeed fixed this particular query since neither table was