On Tue, 2007-11-13 at 11:30 -0500, Micheal Espinola Jr wrote:
> William L. Thomson Jr. wrote:
> > I have some questions about how the Bayesian filter chooses the
> > numbers for spam file names. The numbers it uses in a given day are all
> > over the place. That is fine, and I really don't care.
>
> It's a randomly assigned number. There is no order or logic in how it
> is chosen, other than it is a number between 0 and the MaxFiles size.
> It is intentionally done this way to keep the corpus randomly diverse.
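
To make the random numbering described above concrete, here is a minimal Python sketch of how I read it. This is not ASSP's actual code. The names random_slot, store, and simulate are made up for illustration, and I am assuming the range is 0 up to but not including MaxFiles, and that a new file simply overwrites whatever already has that number.

```python
import random


def random_slot(max_files, rng):
    """Pick a file number uniformly at random in [0, max_files).

    Assumption: the upper bound is exclusive; the thread only says
    "between 0 and the MaxFiles size".
    """
    return rng.randrange(max_files)


def store(corpus, message_id, max_files, rng):
    """Save message_id under a random number, replacing any previous holder."""
    slot = random_slot(max_files, rng)
    corpus[slot] = message_id  # the message's age plays no part in the choice
    return slot


def simulate(n_messages, max_files, seed=0):
    """Feed n_messages (ids 0..n-1, oldest first) into a corpus of max_files slots."""
    rng = random.Random(seed)
    corpus = {}
    for message_id in range(n_messages):
        store(corpus, message_id, max_files, rng)
    return corpus


if __name__ == "__main__":
    corpus = simulate(n_messages=1000, max_files=100)
    # Sorted ids of the messages that survived; lower id means older message.
    print(sorted(corpus.values()))
```

Because the slot is chosen without looking at what it currently holds, a newly stored message can replace a recent one while a much older one survives, which is what prompts my question below.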
Is the date not considered at all? It seems like it would be more effective to toss older spam in favor of keeping more current spam. Otherwise the Bayesian db could end up being rebuilt mostly from older spam instead of current patterns. Unless that's also part of the random design.

Part of what I am getting at is mail being falsely rejected as spam, and then disappearing before I have a chance to move/sort it. In the meantime, most times I have already rebuilt the Bayesian db. So I've kind of screwed myself, with no way to correct it now that the email is MIA :) I just have to hope another similar one arrives that I can catch in time to move to errors/notspam or the like.

-- 
William L. Thomson Jr.
Gentoo/Java
_______________________________________________
Assp-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/assp-user
