Re: [PERFORM] Hardware recommendations to scale to silly load

2003-08-30 Thread Bruce Momjian
matt wrote: Are you *sure* about that 3K updates/inserts per second xlates to 10,800,000 per hour. That, my friend, is a WHOLE HECK OF A LOT! Yup, I know! Just a data point, but on my Dual Xeon 2.4Gig machine with a 10k SCSI drive I can do 4k inserts/second if I turn fsync off. If
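The fsync trade-off mentioned above can also be attacked without turning fsync off. A minimal sketch, with made-up table and column names: batching many inserts into one transaction amortizes the commit-time WAL flush across the whole batch, rather than paying it per row or disabling fsync and risking data loss on a crash.

```sql
-- Hypothetical sketch: amortize the per-commit flush by batching.
-- Table and column names are placeholders, not from the thread.
BEGIN;
INSERT INTO log_t (payload) VALUES ('row 1');
INSERT INTO log_t (payload) VALUES ('row 2');
-- ... many more rows ...
COMMIT;  -- one WAL flush covers the entire batch
```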

Re: [PERFORM] bad estimates

2003-08-30 Thread Bruno Wolff III
I haven't come up with any great ideas for this one. It might be interesting to compare the explain analyze output from the distinct on query with and without seqscans enabled.

Re: [PERFORM] sourcecode for newly release eRServer?

2003-08-30 Thread Christopher Kings-Lynne
Does anyone know how and when the actual release will happen? I would be glad to be an alpha tester and promise to contribute back bug-reports/patches. I'll take cvs or tar.gz or paper tape or stone tablets engraved in high Elvish... I think someone should call him on that :P Chris

Re: [PERFORM] bad estimates

2003-08-30 Thread Ken Geis
Bruno Wolff III wrote: I haven't come up with any great ideas for this one. It might be interesting to compare the explain analyze output from the distinct on query with and without seqscans enabled. Can't do that comparison. Remember, with seqscan it fails. (Oh, and that nested loops solution

[PERFORM] Selecting random rows efficiently

2003-08-30 Thread Richard Jones
Hi, i have a table of around 3 million rows from which i regularly (twice a second at the moment) need to select a random row from currently i'm doing order by rand() limit 1 - but i suspect this is responsible for the large load on my db server - i guess that PG is doing far too much work

Re: [PERFORM] Selecting random rows efficiently

2003-08-30 Thread Jeff
On Sat, 30 Aug 2003, Richard Jones wrote: Hi, i have a table of around 3 million rows from which i regularly (twice a second at the moment) need to select a random row from currently i'm doing order by rand() limit 1 - but i suspect this is responsible for the large load on my db server - i

Re: [PERFORM] Selecting random rows efficiently

2003-08-30 Thread Richard Jones
On Saturday 30 August 2003 1:08 pm, you wrote: On Sat, 30 Aug 2003, Richard Jones wrote: Hi, i have a table of around 3 million rows from which i regularly (twice a second at the moment) need to select a random row from currently i'm doing order by rand() limit 1 - but i suspect this is

Re: [PERFORM] bad estimates

2003-08-30 Thread Ken Geis
Bruno Wolff III wrote: I haven't come up with any great ideas for this one. It might be interesting to compare the explain analyze output from the distinct on query with and without seqscans enabled. After digging through planner code, I found that bumping up the sort_mem will make the planner

Re: [PERFORM] Hardware recommendations to scale to silly load

2003-08-30 Thread Rod Taylor
My question is: is that current peak rate (300 inserts/updates *or* 2500 selects) based upon 1 connection, or many connections? With 4 CPUs, and a 4 disk RAID10, I wouldn't be surprised if 4 concurrent connections gives the optimum speed. Optimum number of active workers is probably

Re: [PERFORM] Selecting random rows efficiently

2003-08-30 Thread Rod Taylor
i was hoping there was some trickery with sequences that would allow me to easily pick a random valid sequence number..? I would suggest renumbering the data. ALTER SEQUENCE ... RESTART WITH 1; UPDATE table SET pkey = DEFAULT; Of course, PostgreSQL may have trouble with that update due to
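Rod Taylor's renumbering suggestion, sketched out with placeholder names (`items`, `pkey`, and the sequence name are illustrative, not from the thread). Once the keys are dense 1..N, picking a random row becomes a single indexed lookup; as the message notes, the bulk UPDATE itself may be expensive and can conflict with foreign keys.

```sql
-- Renumber so primary keys are dense, then pick by random key.
-- All names here are placeholders.
ALTER SEQUENCE items_pkey_seq RESTART WITH 1;
UPDATE items SET pkey = DEFAULT;

-- With N rows and pkey in 1..N, a random row is one index probe:
SELECT * FROM items
WHERE pkey = 1 + floor(random() * N)::int;  -- N = current row count
```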

Re: [PERFORM] Selecting random rows efficiently

2003-08-30 Thread Ron Johnson
On Sat, 2003-08-30 at 08:09, Richard Jones wrote: Hi, i have a table of around 3 million rows from which i regularly (twice a second at the moment) need to select a random row from currently i'm doing order by rand() limit 1 - but i suspect this is responsible for the large load on my db

Re: [PERFORM] bad estimates

2003-08-30 Thread Tom Lane
Ken Geis [EMAIL PROTECTED] writes: From what I've learned, we want to convince the optimizer to use a table scan; that's a good thing. I want it to use hashed aggregates, but I can't convince it to (unless maybe I removed all of the statistics.) You probably just need to increase
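Tom Lane's advice here is to raise sort_mem so the planner will consider a hashed aggregate. A sketch of what that looks like in a session, assuming the PostgreSQL 7.x-era setting name (sort_mem is in kilobytes; table and column names are placeholders):

```sql
-- Give this session more memory for sorts/hashes, then recheck the plan.
SET sort_mem = 65536;  -- 64 MB; placeholder value
EXPLAIN ANALYZE
SELECT DISTINCT ON (stock_id) stock_id, price_date
FROM prices
ORDER BY stock_id, price_date DESC;
```

If the hash table is estimated to fit in sort_mem, the planner can switch from a sort-based plan to a hash aggregate.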

Re: [PERFORM] How to force Nested Loop plan?

2003-08-30 Thread Tom Lane
Rob Nagler [EMAIL PROTECTED] writes: I'm trying to understand how I can get the planner to always do the right thing with this query: SELECT aa_t.min_date_time FROM aa_t , bb_t , cc_t WHERE bb_t.bb_id = aa_t.bb_id AND aa_t.realm_id =

Re: [PERFORM] Selecting random rows efficiently

2003-08-30 Thread Tom Lane
I said: 3. Your query now looks like SELECT * FROM table WHERE random_id >= random() ORDER BY random_id LIMIT 1; Correction: the above won't give quite the right query because random() is marked as a volatile function. You can hide the random() call inside a user-defined function
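A sketch of the precomputed-random-key technique being described, with placeholder names (`items`, `random_id`, `pick_random` are illustrative). The wrapper function is deliberately mislabeled STABLE even though random() is volatile; that is the trick being hinted at, so the planner evaluates it once and can use the index. Exact behavior depends on the server version.

```sql
-- Each row gets a precomputed random key, indexed for fast range probes.
ALTER TABLE items ADD COLUMN random_id float8 DEFAULT random();
CREATE INDEX items_random_id_idx ON items (random_id);

-- Hypothetical wrapper: marked STABLE so the comparison value is
-- computed once and the index on random_id can be used.
CREATE FUNCTION pick_random() RETURNS float8 AS
  'SELECT random()' LANGUAGE sql STABLE;

SELECT * FROM items
WHERE random_id >= pick_random()
ORDER BY random_id
LIMIT 1;
```

Note the selection is only uniform to the extent the random_id values are evenly spread; rows following large gaps are picked more often.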

Re: [PERFORM] How to force Nested Loop plan?

2003-08-30 Thread Rob Nagler
Tom Lane writes: The reason the planner does not much like this plan is that it's estimating that quite a lot of rows will have to be hit in min_date_time order before it finds enough rows with server_id = 21. Thus the high cost estimate for the above step. Thanks for the speedy and useful

Re: [PERFORM] How to force Nested Loop plan?

2003-08-30 Thread Ron Johnson
On Sat, 2003-08-30 at 10:47, Rob Nagler wrote: Tom Lane writes: [snip] enough. When I add this index, I will slow down inserts (about 20K/day) and increase data size (this is the second largest table in [snip] Since I gather that this is a web-site, can we presume that they are clumped into

Re: [PERFORM] How to force Nested Loop plan?

2003-08-30 Thread Rob Nagler
Tom Lane writes: That doesn't really tell me anything. What's the proportion of 21 records out of the total table? Currently we have about 15 servers so 6% of the data is uniformly distributed with the value 21. create index fooi on foo (min_date_time) where server_id = 21; This
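The partial index quoted in the message, spelled out together with a query that can use it (the names foo, server_id, and min_date_time come from the thread itself; the query is an illustrative sketch). The planner will only consider the partial index when the query's WHERE clause provably implies the index predicate.

```sql
-- Partial index from the message: covers only the server_id = 21 rows,
-- so it stays small and cheap to maintain.
CREATE INDEX fooi ON foo (min_date_time) WHERE server_id = 21;

-- Usable because the WHERE clause implies the index predicate:
SELECT min(min_date_time) FROM foo WHERE server_id = 21;
```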

Re: [PERFORM] How to force Nested Loop plan?

2003-08-30 Thread Ron Johnson
On Sat, 2003-08-30 at 15:42, Rob Nagler wrote: [snip] We actually have been quite pleased with Postgres's performance without paying much attention to it before this. When we first set up Oracle, we got into all of its parameters pretty heavily. With Postgres, we just tried it and it worked.