Mark,

On 8/28/07, Mark Martinec <[EMAIL PROTECTED]> wrote:
> Andrzej,
>
> > > Btw, versions of amavisd prior to 2.4.0 (which started suggesting
> > > the use of a foreign key) had the following example in README.sql:
> > >
> > > DELETE FROM msgs WHERE UNIX_TIMESTAMP()-time_num > 7*24*60*60;
> > > DELETE FROM msgs WHERE UNIX_TIMESTAMP()-time_num > 60*60
> > >   AND content IS NULL;
>
> > Mark, that would surely prevent index on time_num from being used, so
> > this query would be the slowest one...
>
> I was quoting the old README.sql.

Ok nothing wrong with it, just that OP wanted high performance out of
the queries, because of huge tables. I looked at it and found some
things we can optimize.

> The current release notes (2.5.2) suggests a swapped order:
>
>   README.sql-mysql:
> DELETE FROM msgs WHERE time_num < UNIX_TIMESTAMP()-14*24*60*60;
> DELETE FROM msgs WHERE time_num < UNIX_TIMESTAMP()-60*60 AND content IS NULL;

For large tables I'd suggest one improvement:
SET @min_time := UNIX_TIMESTAMP()-14*24*60*60;
DELETE FROM msgs WHERE time_num < @mintime;

That would avoid calculating UNIX_TIMESTAMP() and the subtraction for
every row in the table (thousands or millions times).

Of course there's also problem with deleting large batch of records,
they first must be written to the transaction log (and maybe to binary
log used for replication), what surely take much time; it's better to
delete them in batches, say 1000 rows in a batch, to reduce
transaction log contention. This is pretty trivial when there are no
joins and somewhat more sophisticated with joins; in both cases it
needs a stored procedure; I'll write some tomorrow.

>   README.sql-pg:
> DELETE FROM msgs WHERE time_iso < now() - INTERVAL '3 weeks';
> DELETE FROM msgs WHERE time_iso < now() - INTERVAL '1 h' AND content IS NULL;
>
> At least on PostgreSQL it does use an index on time_iso.

They, too, force SQL to calculate now() and the subtraction for every
row in the table (see EXPLAIN VERBOSE), and also exploit the
transaction log problem with large deletes, but I don't know how to
optimize these queries in pure SQL as understood by PgSQL.

Regards,
Andrzej

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/
_______________________________________________
AMaViS-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/amavis-user
AMaViS-FAQ:http://www.amavis.org/amavis-faq.php3
AMaViS-HowTos:http://www.amavis.org/howto/

Reply via email to