There was some interesting oscillation behavior in both versions of
postgres that occurred with 25 clients and 1000 transactions at a
scaling factor of 100. This was repeatable with the distribution
version of pgbench run iteratively from the command line. I'm not sure
how to explain this.
On Sat, 2005-04-23 at 16:39 -0700, Josh Berkus wrote:
Greg Stark wrote:
I looked into this a while back when we were talking about changing the
sampling method. The conclusions were discouraging. Fundamentally, using
constant sized samples of data for n_distinct is bogus. Constant sized
Thanks, we will try that; we are working on a test suite for the way our app
gets data (ODBC).
We plan to include updates, inserts, and selects, and all three at once, with
a log of the results.
Then we should use a stress test tool to see how it works with multiple
instances (I used Microsoft's tool
In practice, we have watched Windows evolve in such a fashion with
respect to multiuser support, and, in effect, it has never really
gotten it. Microsoft started by hacking something on top of MS-DOS,
and by the time enough applications had enough dependencies on the way
that worked, it has
Tried changing the settings and saw no change in a test using ASP.
The test does several selects on views and tables.
It actually seemed to take a bit longer.
Joel Fradkin
Wazagua, Inc.
2520 Trailmate Dr
Sarasota, Florida 34243
Tel. 941-753-7111 ext 305
[EMAIL PROTECTED]
www.wazagua.com
I am waiting to hear back from Josh on using cursors and trying to flatten
long-running views.
I am a little disappointed I have not understood enough to get my analyzer
to use the proper plan; we had to set seqscan off to get the select from
response_line to work fast and I had to turn off
Simon Riggs [EMAIL PROTECTED] writes:
My suggested hack for PostgreSQL is to have an option to *not* sample,
just to scan the whole table and find n_distinct accurately.
...
What price a single scan of a table, however large, when incorrect
statistics could force scans and sorts to occur when
Interesting. I should've included standard deviation in my pgbench
iteration patch. Maybe I'll go back and do that.
I was seeing oscillation across the majority of iterations in the 25
clients/1000 transaction runs on both database versions.
I've got my box specs and configuration files
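One way to quantify that oscillation is to collect the tps figure from each pgbench run and compute the mean and standard deviation across iterations. A minimal sketch in Python, assuming the raw summary output of each run has been captured somewhere (the sample strings below are made up, not real measurements):

```python
import re
import statistics

def parse_tps(output: str) -> float:
    """Extract the tps figure from pgbench's summary output."""
    m = re.search(r"tps = ([0-9.]+)", output)
    if m is None:
        raise ValueError("no tps line found in pgbench output")
    return float(m.group(1))

# Hypothetical captured output from repeated runs of:
#   pgbench -c 25 -t 1000 <dbname>
runs = [
    "tps = 212.5 (including connections establishing)",
    "tps = 148.3 (including connections establishing)",
    "tps = 209.9 (including connections establishing)",
    "tps = 151.0 (including connections establishing)",
]

tps = [parse_tps(r) for r in runs]
print(f"mean = {statistics.mean(tps):.1f}, stdev = {statistics.stdev(tps):.1f}")
```

A large stdev relative to the mean, or the values clustering into two distinct groups as in the fabricated numbers above, is the numeric signature of the oscillation described in these runs.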
Tom,
Just a quick thought: after each run/sample of pgbench, I drop the
database and recreate it. When I don't my results become more skewed.
Steve Poe
Thomas F. O'Connell wrote:
Interesting. I should've included standard deviation in my pgbench
iteration patch. Maybe I'll go back and do that.
On Mon, 2005-04-25 at 11:23 -0400, Tom Lane wrote:
Simon Riggs [EMAIL PROTECTED] writes:
My suggested hack for PostgreSQL is to have an option to *not* sample,
just to scan the whole table and find n_distinct accurately.
...
What price a single scan of a table, however large, when
Simon, Tom:
While it's not possible to get accurate estimates from a fixed size sample, I
think it would be possible from a small but scalable sample: say, 0.1% of all
data pages on large tables, up to the limit of maintenance_work_mem.
Setting up these samples as a % of data pages, rather
Guys,
While it's not possible to get accurate estimates from a fixed size sample,
I think it would be possible from a small but scalable sample: say, 0.1% of
all data pages on large tables, up to the limit of maintenance_work_mem.
BTW, when I say accurate estimates here, I'm talking about
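The sizing rule being proposed here is easy to state concretely. A sketch of the arithmetic, assuming 8 kB pages and treating maintenance_work_mem as a hard cap (the function name and numbers are illustrative, not actual PostgreSQL code):

```python
PAGE_SIZE = 8192  # default PostgreSQL block size, in bytes

def sample_pages(total_pages: int, maintenance_work_mem_kb: int,
                 fraction: float = 0.001) -> int:
    """Pages to sample: a fixed fraction (0.1%) of the table,
    capped by how many pages fit in maintenance_work_mem."""
    budget_pages = (maintenance_work_mem_kb * 1024) // PAGE_SIZE
    want = max(1, int(total_pages * fraction))
    return min(want, budget_pages, total_pages)

# A 10-million-page (~80 GB) table with maintenance_work_mem = 64 MB:
print(sample_pages(10_000_000, 64 * 1024))
```

Once the memory cap binds, a bigger table simply gets a smaller effective sampling fraction, which is exactly the trade-off under discussion.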
If I have a freshly CLUSTERed table and queries that want to do a
merge join, it seems to me that quite a bit of time is spent
unnecessarily sorting the already-sorted table. An example such
query I found in my log files is shown below. If I read the
EXPLAIN ANALYZE output correctly, it's saying
-Original Message-
From: Josh Berkus [mailto:[EMAIL PROTECTED]
Sent: Sunday, April 24, 2005 2:08 PM
To: Andrew Dunstan
Cc: Tom Lane; Greg Stark; Marko Ristola; pgsql-perform;
pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] [PERFORM] Bad n_distinct estimation; hacks
suggested?
Simon Riggs [EMAIL PROTECTED] writes:
On Mon, 2005-04-25 at 11:23 -0400, Tom Lane wrote:
It's not just the scan --- you also have to sort, or something like
that, if you want to count distinct values. I doubt anyone is really
going to consider this a feasible answer for large tables.
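For illustration, the two standard exact approaches Tom alludes to can be sketched in Python. Both must touch every row; the hash-based one needs memory proportional to the number of distinct values, while the sort-based one must sort the full column first (function names here are hypothetical):

```python
def ndistinct_hash(rows):
    """Exact distinct count via hashing: one full scan,
    but O(n_distinct) memory for the hash table."""
    seen = set()
    for v in rows:
        seen.add(v)
    return len(seen)

def ndistinct_sort(rows):
    """Exact distinct count via sorting: no big hash table,
    but the whole column must be sorted before counting."""
    prev = object()  # sentinel that compares unequal to any row value
    count = 0
    for v in sorted(rows):
        if v != prev:
            count += 1
            prev = v
    return count

rows = [1, 3, 3, 7, 1, 9, 7]
print(ndistinct_hash(rows), ndistinct_sort(rows))  # both report 4
```

Either way the cost scales with the full table, which is why a single accuracy-driven scan is a hard sell for very large tables.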
Ron,
If I have a freshly CLUSTERed table and queries that want to do a
merge join, it seems to me that quite a bit of time is spent
unnecessarily sorting the already-sorted table. An example such
query I found in my log files is shown below. If I read the
EXPLAIN ANALYZE output correctly,
-Original Message-
From: Andrew Dunstan [mailto:[EMAIL PROTECTED]
Sent: Monday, April 25, 2005 3:43 PM
To: josh@agliodbs.com
Cc: pgsql-perform; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] [PERFORM] Bad n_distinct estimation; hacks
suggested?
Josh Berkus wrote:
Simon,
Martha Stewart called it a Good Thing when [EMAIL PROTECTED] (Merlin Moncure)
wrote:
In practice, we have watched Windows evolve in such a fashion with
respect to multiuser support, and, in effect, it has never really
gotten it. Microsoft started by hacking something on top of MS-DOS,
and by
Josh Berkus wrote: [quoted out of order]
Ron,
Looking at your analyze, though, I think it's not the sort itself that's taking
the time so much as the fact that the full sorted entity_id column won't fit in
work_mem. Try increasing it?
Yup, that indeed fixed this particular query since neither table was