On Wed, 2003-07-02 at 10:40, Michael Mattox wrote:
> > I'd be tempted to bump it up to 2.0 or 2.5 since data is on a single
> > disk (sequential scans *will* be faster than an index scan), but you
> > would need to run a benchmark on your disk to see if that is right.
> 
> I just set it to 2.5.  What kind of benchmark can I run?

Good question. I don't think there is anything official for testing
this...

How about this (not scientific in the least -- but it'll do):

Take your biggest table.  Do a select on an index column:

EXPLAIN ANALYZE
         SELECT *
           FROM <table>
          WHERE <indexed column> != <value it DOES NOT contain>

Take the time (don't run twice, we don't want the data cached anymore
than normal).

Wait an hour, and:

set enable_seqscan = false;

EXPLAIN ANALYZE
         SELECT *
           FROM <table>
          WHERE <indexed column> != <value it DOES NOT contain>;

Take the time (don't run twice in a row, etc.)

The next morning (once your cache is back to looking like production),
try it again but reverse the order (force the index scan first).  Repeat
more often to get a better value.

Average the times.  Reduce the index scan time by about 1/6th (you can
find the additional overhead of the index by looking at filesize --
reading the index pulls more data off disk than just the table itself). 
Divide the index time by the sequential scan time.

This should put you within +-0.5 of the proper value for the random_read
cost for your production system for that specific table :)

If you reconfigure the box, the amount of data increases / decreases by
an order of magnitude, you change the hardware in any way this value may
no longer be the 'one true number'.

> the webapp lets users make the query for the current day.
> the reporting app makes the query for the prior day, the current week, and
> month to date.

I see.  It might be worth while simply to pre-calculate the report. 
That is, every 5 minutes, run the queries and dump the results into a
few tables (one for each result set).

This way, displaying the results will be near instant for the
interface.  Being off by 5 minutes probably doesn't matter for a report
averaging the day, week or month.

-- 
Rod Taylor <[EMAIL PROTECTED]>

PGP Key: http://www.rbt.ca/rbtpub.asc

Attachment: signature.asc
Description: This is a digitally signed message part

Reply via email to