Martha Stewart called it a Good Thing when [EMAIL PROTECTED] ("Casey Allen
> I posted about this a couple days ago on dspam-dev...
> I am using DSpam with PostgreSQL, and like you discovered the horrible
> performance. The reason is because the default PostgreSQL query planner
> settings determine that a sequence scan will be more efficient than an
> index scan, which is wrong. To correct this behavior, adjust the query
> planner settings for the appropriate table/column with this command:
> alter table "dspam_token_data" alter "token" set statistics 200; analyze;
> Let me know if it help you. It worked wonders for me.
That makes a great deal of sense; the number of tokens are likely to
be rather larger than 10, and are likely to be quite unevenly
distributed. That fits with the need you found to collect more
statistics on that column.
Other cases where it seems plausible that it would be worthwhile to do
alter table dspam_signature_data alter signature set statistics 200;
alter table dspam_neural_data alter node set statistics 200;
alter table dspam_neural_decisions alter signature set statistics 200;
Lionel's suggestion of having a functional index on dspam_token_data
(innocent_hits + spam_hits) also seems likely to be helpful. Along
with that, it might prove necessary to alter stats on dspam_token_data
alter table dspam_token_data alter innocent_hits set statistics 200;
alter table dspam_token_data alter spam_hits set statistics 200;
None of these changes are likely to make things materially worse; if
they do help, they'll help rather a lot.
(format nil "[EMAIL PROTECTED]" "cbbrowne" "gmail.com")
Rules of the Evil Overlord #112. "I will not rely entirely upon
"totally reliable" spells that can be neutralized by relatively
inconspicuous talismans." <http://www.eviloverlord.com/>
---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?