I'm starting a new thread here and leaving the other one open, where I
hope to get more feedback on my ideas for reducing the corpus size.

This question is about what I >>think<< are inconsistencies in the option
descriptions in the admin interface of 2.0.1 rc 0.4.13.

Again, keep in mind that I am using Bayesian filtering with subject-based
file names (UseSubjectsAsMaillogNames on) and doMove2Num off.

MaintBayesCollection says:
Set this to on, if you want ASSP to run a maintenance tasks on the bayesian
collection folders ( spamlog , notspamlog , correctedspam , correctednotspam
). ASSP will delete the oldest files until the number of files per folder
reaches MaxFiles. If you want ASSP to delete files because of there age
instead of the number of files ( MaxFiles ), setup MaxBayesFileAge to your
needs.
This option is usefull, if UseSubjectsAsMaillogNames is set to on and
doMove2Num is set to off, because in this case the number of files in every
collection folder will grow infinite.

Breaking this down:
a) Set this to on, if you want ASSP to run a maintenance tasks on the
bayesian collection folders ( spamlog , notspamlog , correctedspam ,
correctednotspam ). [FINE]

b) ASSP will delete the oldest files until the number of files per folder
reaches MaxFiles. [ Oh? So if MaxFiles is 10,000 and 10,000 spam files are
stored in a day, no file in the spam corpus will be older than 1 day? ]

c) If you want ASSP to delete files because of there age instead of the
number of files ( MaxFiles ), setup MaxBayesFileAge to your needs. [ Fine,
except that according to b), the files deleted to enforce MaxFiles are also
chosen by age, aren't they? ]

This option is usefull, if UseSubjectsAsMaillogNames is set to on and
doMove2Num is set to off, because in this case the number of files in every
collection folder will grow infinite.  [ So does this mean that with
MaintBayesCollection on, UseSubjectsAsMaillogNames on and doMove2Num off,
b) won't run? ]
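
To make my question under b) concrete, here is a minimal Python sketch of
oldest-first pruning as I read the description. It is not ASSP's code (ASSP
is written in Perl); the function name prune_by_count and the
(name, age_in_days) tuples are made up purely for illustration.

```python
def prune_by_count(files, max_files):
    """Keep only the newest max_files entries.

    files is a list of (name, age_in_days) tuples (hypothetical layout).
    """
    # Sort newest first (smallest age), then drop everything past the cap.
    newest_first = sorted(files, key=lambda f: f[1])
    return newest_first[:max_files]


# Hypothetical corpus: 10,000 messages arriving per day over 3 days.
corpus = [
    (f"msg{day}_{i}", day + i / 10000)
    for day in range(3)
    for i in range(10000)
]

kept = prune_by_count(corpus, 10000)
oldest_kept = max(age for _, age in kept)

# With a cap of 10,000 and 10,000 arrivals per day, nothing older than
# one day survives, even though no age limit was configured.
print(len(kept), oldest_kept < 1)
```

If the description is accurate, a count cap behaves like an age limit whose
length depends on mail volume, which is exactly what puzzles me about c).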

MaxBayesFileAge says:
The maximum file age in days of every file in every bayesian collection
folder ( spamlog , notspamlog ). If MaintBayesCollection is set to on and a
file is older than this number in days, the file will be deleted. Default is
0. A value of 0 disables this feature and no file will be deleted because of
its age.
Do not define this option, if you use the bayesian engine of ASSP. Deleting
files because of there age, is wrong in this case!!!!!

Again, breaking it down:

a) The maximum file age in days of every file in every bayesian collection
folder ( spamlog , notspamlog ). [ fine ]

b) If MaintBayesCollection is set to on and a file is older than this number
in days, the file will be deleted. Default is 0. A value of 0 disables this
feature and no file will be deleted because of its age. [ Except, if
MaintBayesCollection is on, won't files still be deleted based on their
age, oldest first, to keep the number below MaxFiles? ]

c) Do not define this option, if you use the bayesian engine of ASSP.
Deleting files because of there age, is wrong in this case!!!!! [ OK, then
without using my method as described in the other threads, how do those of
us using Bayesian filtering with subject file names and doMove2Num off keep
the number of files in the corpus to an acceptable number? ]
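
For comparison, this is how I read the MaxBayesFileAge rule, again as a
minimal Python sketch and not ASSP's actual code. The name prune_by_age and
the (name, age_in_days) tuples are hypothetical; the only parts taken from
the description are "older than this number in days" and "0 disables".

```python
def prune_by_age(files, max_age_days):
    """Delete files older than max_age_days; 0 disables the check.

    files is a list of (name, age_in_days) tuples (hypothetical layout).
    """
    if max_age_days == 0:
        # Per the description, 0 means no file is deleted because of age.
        return list(files)
    return [f for f in files if f[1] <= max_age_days]


# Hypothetical low-volume folder with a few old messages.
quiet_folder = [("a", 40), ("b", 90), ("c", 200)]

print(prune_by_age(quiet_folder, 0))    # nothing deleted
print(prune_by_age(quiet_folder, 100))  # only "c" is deleted
print(prune_by_age(quiet_folder, 30))   # the whole folder is emptied
```

If this reading is right, the difference from the MaxFiles rule is that an
absolute age limit can empty a quiet folder completely, while a count cap
only ever removes files that are old relative to the rest.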
_______________________________________________
Assp-test mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/assp-test
