Re: [PERFORM] How to force Nested Loop plan?

2003-08-30 Thread Tom Lane
Rob Nagler <[EMAIL PROTECTED]> writes: > What I'm not sure is why does it decide to switch modes so "early", > i.e., at about 5% of the table size or less? Given the default cost parameters and cost models, that's the correct place to switch. Since the estimate evidently doesn't match reality for
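
A rough sketch of how the cost parameters mentioned here can be nudged for a single session to test whether it is the estimate rather than the cost model that is off; the value and the query form are only illustrative, not taken from the thread:

    -- lower the planner's assumed penalty for non-sequential page fetches
    -- (4.0 is the default); 2 is purely an illustrative value
    SET random_page_cost = 2;
    -- then see whether the switch point moves for the problem query
    -- (hypothetical simplified form of the query in this thread)
    EXPLAIN SELECT min_date_time FROM aa_t
     WHERE server_id = 21 ORDER BY min_date_time LIMIT 1;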

Re: [PERFORM] How to force Nested Loop plan?

2003-08-30 Thread Ron Johnson
On Sat, 2003-08-30 at 15:42, Rob Nagler wrote: [snip] > We actually have been quite pleased with Postgres's performance > without paying much attention to it before this. When we first set up > Oracle, we got into all of its parameters pretty heavily. With > Postgres, we just tried it and it work

Re: [PERFORM] How to force Nested Loop plan?

2003-08-30 Thread Rob Nagler
Tom Lane writes: > That doesn't really tell me anything. What's the proportion of 21 > records out of the total table? Currently we have about 15 servers so 6% of the data is uniformly distributed with the value 21. > create index fooi on foo (min_date_time) where server_id = 21; > > This
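
The partial index under discussion, sketched together with a query it is meant to serve; "foo" stands in for the real table, and the query shape is an assumption based on the thread:

    CREATE INDEX fooi ON foo (min_date_time) WHERE server_id = 21;

    -- a query that can walk this index in min_date_time order,
    -- touching only the ~6% of rows belonging to server 21
    SELECT min_date_time
      FROM foo
     WHERE server_id = 21
     ORDER BY min_date_time
     LIMIT 1;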

Re: [PERFORM] Selecting random rows efficiently

2003-08-30 Thread Russell Garrett
Considering that we'd have to index the random field too, it'd be neater in the long term to re-number the primary key. Although, being a primary key, that's foreign-keyed from absolutely everywhere, so that'd probably take an amusingly long time. ...and no we're not from Micronesia, we're from ev

Re: [PERFORM] PostgreSQL HDD Grow capacity

2003-08-30 Thread Bruce Momjian
Eko Pranoto wrote: > How To calcute PostgreSQL HDD grow capacity for every byte data, start from > installation initialize. First, see the FAQ item about calculating row size. Second, see the chapter in the administration manual talking about computing disk space. -- Bruce Momjian
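
Alongside the FAQ arithmetic, one way to watch actual on-disk growth is the page counts kept in the system catalogs; this sketch assumes the default 8K block size and counts freshened by a recent VACUUM or ANALYZE:

    SELECT relname, relpages, relpages * 8192 AS approx_bytes
      FROM pg_class
     WHERE relkind = 'r'          -- ordinary tables only
     ORDER BY relpages DESC;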

Re: [PERFORM] How to force Nested Loop plan?

2003-08-30 Thread Ron Johnson
On Sat, 2003-08-30 at 10:47, Rob Nagler wrote: > Tom Lane writes: [snip] > enough. When I add this index, I will slow down inserts (about > 20K/day) and increase data size (this is the second largest table in [snip] Since I gather that this is a web-site, can we presume that they are clumped into

Re: [PERFORM] How to force Nested Loop plan?

2003-08-30 Thread Tom Lane
Rob Nagler <[EMAIL PROTECTED]> writes: > Tom Lane writes: >> The reason the planner does not much like this plan is that it's >> estimating that quite a lot of rows will have to be hit in min_date_time >> order before it finds enough rows with server_id = 21. > Very interesting. How does it know

Re: [PERFORM] Selecting random rows efficiently

2003-08-30 Thread Ron Johnson
On Sat, 2003-08-30 at 09:01, Rod Taylor wrote: > > i was hoping there was some trickery with sequences that would allow me to > > easily pick a random valid sequence number..? > > I would suggest renumbering the data. > > ALTER SEQUENCE ... RESTART WITH 1; > UPDATE table SET pkey = DEFAULT; > >

Re: [PERFORM] How to force Nested Loop plan?

2003-08-30 Thread Rob Nagler
Tom Lane writes: > The reason the planner does not much like this plan is that it's > estimating that quite a lot of rows will have to be hit in min_date_time > order before it finds enough rows with server_id = 21. Thus the high > cost estimate for the above step. Thanks for the speedy and usefu

Re: [PERFORM] Hardware recommendations to scale to silly load

2003-08-30 Thread Matt Clark
> Just a data point, but on my Dual Xeon 2.4Gig machine with a 10k SCSI > drive I can do 4k inserts/second if I turn fsync off. If you have a > battery-backed controller, you should be able to do the same. (You will > not need to turn fsync off --- fsync will just be fast because of the > disk dr

Re: [PERFORM] Hardware recommendations to scale to silly load

2003-08-30 Thread Matt Clark
> SELECT > IF > BEGIN > INSERT > or > UPDATE > COMMIT; > > He says his current h/w peaks at 1/10th that rate. > > My question is: is that current peak rate ("300 inserts/updates > *or* 2500 selects") based upon 1 connection, or many c
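
A sketch of the per-hit transaction shape being quoted above, using a hypothetical counters table; the "did the SELECT find a row?" decision lives in the application, since nothing like a built-in upsert is assumed here:

    -- SELECT, then IF in the application:
    SELECT hits FROM counters WHERE page = 'home';
    -- if a row came back:
    BEGIN;
    UPDATE counters SET hits = hits + 1 WHERE page = 'home';
    COMMIT;
    -- otherwise:
    BEGIN;
    INSERT INTO counters (page, hits) VALUES ('home', 1);
    COMMIT;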

Re: [PERFORM] Selecting random rows efficiently

2003-08-30 Thread Tom Lane
I said: > 3. Your query now looks like > SELECT * FROM table WHERE random_id >= random() > ORDER BY random_id LIMIT 1; Correction: the above won't give quite the right query because random() is marked as a volatile function. You can hide the random() call inside a user-defined functio
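
One way the wrapper could look, as a sketch only: a PL/pgSQL function marked STABLE, so the planner may treat its result as fixed for the duration of the scan and still use the index on random_id. The table and function names are made up, and plpgsql must already be installed in the database:

    CREATE FUNCTION pick_random() RETURNS double precision AS '
    BEGIN
        RETURN random();
    END;
    ' LANGUAGE plpgsql STABLE;

    SELECT * FROM mytable
     WHERE random_id >= pick_random()
     ORDER BY random_id
     LIMIT 1;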

Re: [PERFORM] How to force Nested Loop plan?

2003-08-30 Thread Tom Lane
Rob Nagler <[EMAIL PROTECTED]> writes: > I'm trying to understand how I can get the planner to always do the > right thing with this query: > SELECT > aa_t.min_date_time > FROM > aa_t > , bb_t > , cc_t > WHERE bb_t.bb_id = aa_t.bb_id > AND aa_t.realm_id =

Re: [PERFORM] Selecting random rows efficiently

2003-08-30 Thread Tom Lane
Richard Jones <[EMAIL PROTECTED]> writes: >>> i have a table of around 3 million rows from which i regularly (twice a >>> second at the moment) need to select a random row from > i was hoping there was some trickery with sequences that would allow me to > easily pick a random valid sequence numbe

Re: [PERFORM] bad estimates

2003-08-30 Thread Tom Lane
Ken Geis <[EMAIL PROTECTED]> writes: > From what I've learned, we want to convince the optimizer to use a > table scan; that's a good thing. I want it to use hashed aggregates, > but I can't convince it to (unless maybe I removed all of the > statistics.) You probably just need to increase so
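
An illustration of the knob in question; sort_mem is set in kilobytes, and the value, table, and column below are placeholders rather than anything from the thread:

    SET sort_mem = 65536;             -- 64MB, for this session only
    EXPLAIN ANALYZE
    SELECT some_col, count(*)
      FROM big_table
     GROUP BY some_col;               -- look for a HashAggregate node in the plan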

Re: [PERFORM] Selecting random rows efficiently

2003-08-30 Thread Ron Johnson
On Sat, 2003-08-30 at 08:09, Richard Jones wrote: > Hi, > i have a table of around 3 million rows from which i regularly (twice a second > at the moment) need to select a random row from > > currently i'm doing "order by rand() limit 1" - but i suspect this is > responsible for the large load on

Re: [PERFORM] Selecting random rows efficiently

2003-08-30 Thread Rod Taylor
> i was hoping there was some trickery with sequences that would allow me to > easily pick a random valid sequence number..? I would suggest renumbering the data. ALTER SEQUENCE ... RESTART WITH 1; UPDATE table SET pkey = DEFAULT; Of course, PostgreSQL may have trouble with that update due to e
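
Spelled out a little further as a sketch, with a hypothetical table foo whose pkey column defaults to nextval('foo_pkey_seq'); once the keys are dense, a random existing key can be computed directly instead of sorting the whole table:

    ALTER SEQUENCE foo_pkey_seq RESTART WITH 1;
    UPDATE foo SET pkey = DEFAULT;    -- hands out 1..N from the sequence;
                                      -- rewrites every row, and foreign keys
                                      -- referencing pkey complicate this

    SELECT *
      FROM foo
     WHERE pkey = (SELECT 1 + floor(random() * count(*))::int FROM foo);
    -- the count(*) could be cached by the application to avoid rescanning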

Re: [PERFORM] Hardware recommendations to scale to silly load

2003-08-30 Thread Rod Taylor
> My question is: is that current peak rate ("300 inserts/updates > *or* 2500 selects") based upon 1 connection, or many connections? > With 4 CPUs, and a 4 disk RAID10, I wouldn't be surprised if 4 con- > current connections gives the optimum speed. Optimum number of active workers is probably

Re: [PERFORM] bad estimates

2003-08-30 Thread Ken Geis
Bruno Wolff III wrote: I haven't come up with any great ideas for this one. It might be interesting to compare the explain analyze output from the distinct on query with and without seqscans enabled. After digging through planner code, I found that bumping up the sort_mem will make the planner pre

[PERFORM] How to force Nested Loop plan?

2003-08-30 Thread Rob Nagler
I'm trying to understand how I can get the planner to always do the right thing with this query: EXPLAIN ANALYZE SELECT aa_t.min_date_time FROM aa_t , bb_t , cc_t WHERE bb_t.bb_id = aa_t.bb_id AND aa_t.realm_id = cc_t.realm_id AND aa_
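
For experimentation, the join methods the planner currently prefers can be disabled for a single session, which pushes it toward a nested loop and lets the EXPLAIN ANALYZE timings of the two plans be compared; a diagnostic sketch, not something to leave enabled in production:

    SET enable_mergejoin = off;
    SET enable_hashjoin = off;
    -- re-run the problem query under EXPLAIN ANALYZE here and compare timings
    SET enable_mergejoin = on;
    SET enable_hashjoin = on;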

Re: [PERFORM] Selecting random rows efficiently

2003-08-30 Thread Richard Jones
On Saturday 30 August 2003 1:08 pm, you wrote: > On Sat, 30 Aug 2003, Richard Jones wrote: > > Hi, > > i have a table of around 3 million rows from which i regularly (twice a > > second at the moment) need to select a random row from > > > > currently i'm doing "order by rand() limit 1" - but i sus

Re: [PERFORM] Selecting random rows efficiently

2003-08-30 Thread Jeff
On Sat, 30 Aug 2003, Richard Jones wrote: > Hi, > i have a table of around 3 million rows from which i regularly (twice a second > at the moment) need to select a random row from > > currently i'm doing "order by rand() limit 1" - but i suspect this is > responsible for the large load on my db ser

[PERFORM] Selecting random rows efficiently

2003-08-30 Thread Richard Jones
Hi, i have a table of around 3 million rows from which i regularly (twice a second at the moment) need to select a random row from currently i'm doing "order by rand() limit 1" - but i suspect this is responsible for the large load on my db server - i guess that PG is doing far too much work ju

Re: [PERFORM] Hardware recommendations to scale to silly load

2003-08-30 Thread Ron Johnson
On Fri, 2003-08-29 at 21:44, Bruce Momjian wrote: > matt wrote: > > > Are you *sure* about that 3K updates/inserts per second xlates > > > to 10,800,000 per hour. That, my friend, is a WHOLE HECK OF A LOT! > > > > Yup, I know! > > Just a data point, but on my Dual Xeon 2.4Gig machine with

Re: [PERFORM] bad estimates

2003-08-30 Thread Ken Geis
Bruno Wolff III wrote: I haven't come up with any great ideas for this one. It might be interesting to compare the explain analyze output from the distinct on query with and without seqscans enabled. Can't do that comparison. Remember, with seqscan it fails. (Oh, and that nested loops solution I

Re: [PERFORM] sourcecode for newly release eRServer?

2003-08-30 Thread Christopher Kings-Lynne
> Does anyone know how and when the actual release will happen? > I would be glad to be an alpha tester and promise to contribute > back bug-reports/patches. I'll take cvs or tar.gz or paper tape or > stone tablets engraved in high Elvish... I think someone should call him on that :P Chris --

Re: [PERFORM] Tests

2003-08-30 Thread Bruce Momjian
What I thought was really interesting in this test was a dramatic difference for ext3 mounted "writeback" in the "h1" test, 1 minute vs. 9 minutes compared to the default "ordered" mount option. This was the "add new constraint" test.

Re: [PERFORM] bad estimates

2003-08-30 Thread Bruno Wolff III
I haven't come up with any great ideas for this one. It might be interesting to compare the explain analyze output from the distinct on query with and without seqscans enabled.
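
The suggested comparison, sketched with placeholder table and column names standing in for the real ones:

    EXPLAIN ANALYZE
    SELECT DISTINCT ON (stock_id) stock_id, price_date, price
      FROM prices
     ORDER BY stock_id, price_date DESC;

    SET enable_seqscan = off;   -- then repeat the same EXPLAIN ANALYZE and compare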

Re: [PERFORM] Hardware recommendations to scale to silly load

2003-08-30 Thread Bruce Momjian
matt wrote: > > Are you *sure* about that 3K updates/inserts per second xlates > > to 10,800,000 per hour. That, my friend, is a WHOLE HECK OF A LOT! > > Yup, I know! Just a data point, but on my Dual Xeon 2.4Gig machine with a 10k SCSI drive I can do 4k inserts/second if I turn fsync off

Re: [PERFORM] The results of my PostgreSQL/filesystem performance tests

2003-08-30 Thread Bruce Momjian
Balazs Wellisch wrote: > > Bill, > > Very interesting results. I'd like to commend you on your honesty. > Having started out with the intentions of proving that FreeBSD is faster > than Linux only to find that the opposite is true must not have been > rewarding for you. However, these unexpected

Re: [PERFORM] The results of my PostgreSQL/filesystem performance tests

2003-08-30 Thread Bruce Momjian
Christopher Kings-Lynne wrote: > > I'm likely going to make this the default for PostgreSQL on FreeBSD > > starting with 7.4 (just posted something to -hackers about this)f. If > > you'd like to do this in your testing, just apply the following patch. > > > > Right now PostgreSQL defaults to 8K bl

Re: [PERFORM] Indexing question

2003-08-30 Thread Rudi Starcevic
Hi, > I have some tables (which can get pretty large) in which I want to > record 'current' data as well as 'historical' data. Another solution can be using a trigger and function to record every transaction to a 'logging' table. This way you'll have one 'current' table and one 'historical' tabl
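
A minimal sketch of that trigger arrangement, with made-up table names and a two-column schema; it assumes plpgsql is installed in the database:

    CREATE TABLE current_data (
        id  integer PRIMARY KEY,
        val text
    );

    CREATE TABLE historical_data (
        id         integer,
        val        text,
        changed_at timestamp DEFAULT now()
    );

    CREATE FUNCTION log_history() RETURNS trigger AS '
    BEGIN
        INSERT INTO historical_data (id, val) VALUES (OLD.id, OLD.val);
        RETURN NULL;   -- return value is ignored for AFTER triggers
    END;
    ' LANGUAGE plpgsql;

    CREATE TRIGGER current_data_history
        AFTER UPDATE OR DELETE ON current_data
        FOR EACH ROW EXECUTE PROCEDURE log_history();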