Paul J Stevens wrote:
> Thomas Mueller wrote:
> 
>> The main problem is in db.c, db_getmailbox():
>> SELECT COUNT(message_idnr), COUNT(message_idnr) - SUM(seen_flag),
>> SUM(recent_flag) FROM dbmail_messages WHERE mailbox_idnr = '5' AND
>> status < '2'
>> That query takes up to 15 seconds (!!).
> 
> That query is entirely my 'fault' and introduced in the same patch that
> used the botched the MAX().

I'll try to answer all mails in the thread here.

Geo Carncross you said:

> CREATE INDEX dbmail_messages_8 ON dbmail_messages
> (mailbox_idnr,status,recent_flag,seen_flag);
>
> didn't have to touch any records in sqlite

No index will ever solve the problem for PostgreSQL. COUNT() can't use
any index, because indices know nothing about transactions.
That is why every record that is counted using the index has to be
checked to see if it's involved in a transaction (otherwise count()
could return too high or low numbers!) - then it's cheaper to do a full
table scan and that is what PostgreSQL does.

Tom Ivar Helbekkmo you said a speedup is planned - that is planned since
at least 7.1 (that's the first version I used). No solution was found
until now that speeds up COUNT() without slowing everything else down.
That is way Tom Lane rejected every suggestion made.

The last big thread I know is that one one pgsql-hackers:
http://archives.postgresql.org/pgsql-hackers/2005-01/msg00247.php

We (or at least I) definitely can't wait until it's fixed in PostgreSQL.
Probably every mail database grows every day so the situation becomes
even worse.

> Dbmail-1 solved this by splitting the query and doing clientside
> counting. That was too slow all over.

Yes that is no solution.

> I guess your suggestion to undo the database abstraction has some merit,
> but we need to think about this some more before we go that way:
> 
> Most code *is* shared between drivers.
> Drivers should be able to use backend optimized queries.

Hm yes. db.c contains 105 db_query() calls, I'm sure we don't want that
in every backend. So we need something like 'function overloading' I
guess? In that case pgsql.c would have to overload db_getmailbox() ? Is
that possible in C ?
Maybe the granularity has to be reduced - one SQL query one function.


Thomas
-- 
http://www.tmueller.com for pgp key (95702B3B)

Reply via email to