On Mon, Aug 02, 2010 at 08:07:41PM +0200, Alban Hertroys wrote: > On 2 Aug 2010, at 19:23, James Cloos wrote: > > >>>>>> "AH" == Alban Hertroys <dal...@solfertje.student.utwente.nl> writes: > > > > AH> I don't think the reindex is necessary at all. That's a corner-case > > AH> operation that you shouldn't need in normal database operation. I > > AH> don't see why dspam would need it, at best there'd be some > > AH> high-volume scenarios where this turns out to be a gain, but for > > AH> those cases you need custom solutions anyway. > > > > High-volume is dependent on the size of the box. > > > > At least up through pg8 index entries are not re-used. They accumulate, > > increasing the size of the index files to the point that they can become > > too large for the shared_buffers. When that occurs, performace dies. > > The truth is more complicated than that. > > See: > http://www.postgresql.org/docs/8.4/interactive/routine-reindex.html > > And to some extent: > http://www.postgresql.org/docs/8.4/interactive/sql-createindex.html#SQL-CREATEINDEX-STORAGE-PARAMETERS > > > For any temporal data set -- and dspam certainly qualifies -- regular > > use of reindex is the recomendation one gets from the pg lists. They > > will say that vacuum full is needed in only the most extremely temporal > > cases, but that regular reindexing is vital. > > I don't think so actually. > > Dspam tends to eventually delete everything older than a certain age if I > understand the algorithms correctly, so those pages of the index will be > empty after that age and they will be deleted. Not much index bloat there; > does that warrant putting an exclusive lock on the index while reindexing - > which may take some time? > > Of course, for versions before PG 7.4 this doesn't hold true, but people > really shouldn't be using those versions anyway. > > > How often, though, should vary based on the size of the tables, the size > > of the purge and the size of the box. And it probably should do vacuum, > > reindex, analyze rather than the current vacuum, analyze, reindex. > > > VACUUM ANALYSE (as one command it's only one parse of the tables) for the > tables that see much change should suffice. It marks rows that got deleted > (and an update is an insert+delete) for reclaiming and updates the table > statistics so that the planner can make better choices. > > On very large dspam installations it could be beneficial to run REINDEX every > now and then, but as part of the nightly purge script is probably too > frequent. > > Of course most of this is conjecture without hard numbers to back this up. In > my experience people who're experiencing performance problems often tune the > wrong part of the problem, without even looking at output from, for example, > EXPLAIN ANALYSE. That's like buying a new couch and then "fixing" the door > frame because it doesn't fit through. > > Alban Hertroys > I would have to agree with Alban and Stevan. The comment about reindexing to clean-up index bloat does not apply since the indexes defined in the schema do not index values with ranges that change permanently over time. The indexes that I have from the pgsql_objects.sql are:
dspam_token_data: UNIQUE (uid, token) dspam_signature_data: UNIQUE (uid, signature) dspam_stats: UNIQUE (uid) dspam_preferences: UNIQUE (uid, preference) None of these meet the criteria for index bloat. I think the most useful result of a reindex is to distribut free-space for new entries throughout the index which should help the locality of reference for new entries/updates. Although adding local free space to the tables is needed as well for HOT updates to work so you probably need to run a CLUSTER after the database reaches a production steady state. I have not yet tested, but I would think that a yearly maintenance period would suffice. Regards, Ken ------------------------------------------------------------------------------ The Palm PDK Hot Apps Program offers developers who use the Plug-In Development Kit to bring their C/C++ apps to Palm for a share of $1 Million in cash or HP Products. Visit us here for more details: http://p.sf.net/sfu/dev2dev-palm _______________________________________________ Dspam-user mailing list Dspam-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dspam-user