On Mon, Aug 02, 2010 at 08:07:41PM +0200, Alban Hertroys wrote:
> On 2 Aug 2010, at 19:23, James Cloos wrote:
> 
> >>>>>> "AH" == Alban Hertroys <dal...@solfertje.student.utwente.nl> writes:
> > 
> > AH> I don't think the reindex is necessary at all. That's a corner-case
> > AH> operation that you shouldn't need in normal database operation. I
> > AH> don't see why dspam would need it, at best there'd be some
> > AH> high-volume scenarios where this turns out to be a gain, but for
> > AH> those cases you need custom solutions anyway.
> > 
> > High-volume is dependent on the size of the box.
> > 
> > At least up through pg8 index entries are not re-used.  They accumulate,
> > increasing the size of the index files to the point that they can become
> > too large for the shared_buffers.  When that occurs, performace dies.
> 
> The truth is more complicated than that.
> 
> See:
> http://www.postgresql.org/docs/8.4/interactive/routine-reindex.html
> 
> And to some extent:
> http://www.postgresql.org/docs/8.4/interactive/sql-createindex.html#SQL-CREATEINDEX-STORAGE-PARAMETERS
> 
> > For any temporal data set -- and dspam certainly qualifies -- regular
> > use of reindex is the recomendation one gets from the pg lists.  They
> > will say that vacuum full is needed in only the most extremely temporal
> > cases, but that regular reindexing is vital.
> 
> I don't think so actually.
> 
> Dspam tends to eventually delete everything older than a certain age if I 
> understand the algorithms correctly, so those pages of the index will be 
> empty after that age and they will be deleted. Not much index bloat there; 
> does that warrant putting an exclusive lock on the index while reindexing - 
> which may take some time?
> 
> Of course, for versions before PG 7.4 this doesn't hold true, but people 
> really shouldn't be using those versions anyway.
> 
> > How often, though, should vary based on the size of the tables, the size
> > of the purge and the size of the box.  And it probably should do vacuum,
> > reindex, analyze rather than the current vacuum, analyze, reindex.
> 
> 
> VACUUM ANALYSE (as one command it's only one parse of the tables) for the 
> tables that see much change should suffice. It marks rows that got deleted 
> (and an update is an insert+delete) for reclaiming and updates the table 
> statistics so that the planner can make better choices.
> 
> On very large dspam installations it could be beneficial to run REINDEX every 
> now and then, but as part of the nightly purge script is probably too 
> frequent.
> 
> Of course most of this is conjecture without hard numbers to back this up. In 
> my experience people who're experiencing performance problems often tune the 
> wrong part of the problem, without even looking at output from, for example, 
> EXPLAIN ANALYSE. That's like buying a new couch and then "fixing" the door 
> frame because it doesn't fit through.
> 
> Alban Hertroys
> 
I would have to agree with Alban and Stevan. The comment about reindexing to
clean-up index bloat does not apply since the indexes defined in the schema
do not index values with ranges that change permanently over time. The indexes
that I have from the pgsql_objects.sql are:

dspam_token_data:
UNIQUE (uid, token)

dspam_signature_data:
UNIQUE (uid, signature)

dspam_stats:
UNIQUE (uid)

dspam_preferences:
UNIQUE (uid, preference)

None of these meet the criteria for index bloat. I think the most useful result 
of
a reindex is to distribut free-space for new entries throughout the index which 
should
help the locality of reference for new entries/updates. Although adding local 
free space
to the tables is needed as well for HOT updates to work so you probably need to 
run a
CLUSTER after the database reaches a production steady state. I have not yet 
tested,
but I would think that a yearly maintenance period would suffice.

Regards,
Ken

------------------------------------------------------------------------------
The Palm PDK Hot Apps Program offers developers who use the
Plug-In Development Kit to bring their C/C++ apps to Palm for a share
of $1 Million in cash or HP Products. Visit us here for more details:
http://p.sf.net/sfu/dev2dev-palm
_______________________________________________
Dspam-user mailing list
Dspam-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspam-user

Reply via email to