> I'll explain a bit more:
>
> - all folders are processed "youngest files first"
> - both error folders are fully processed, up to MaxFiles
>
> As the result of processing the first two folders we get a weight
> (spam/ham). Now we know where we are: we have a current weight, a
> wanted weight, and we know how many files are in the spam and notspam
> folders. Now ASSP calculates the approximate maximum number of files
> in the spam folder that could be used, if we assume that all files in
> the notspam folder will at least be enough to reach the wanted target
> norm.
> The spam folder is processed.
> Now we know the new spam/ham weight and can calculate more exactly
> how many of the files in the notspam folder are required to reach the
> wanted target norm.
>
> I'm impressed by how exactly it worked in my case.
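To make the two-step estimate quoted above concrete, here is a rough sketch in Python. It is not ASSP's actual code. It assumes, purely for illustration, that every processed file adds exactly one unit of weight to its side and that the "norm" is the plain ratio spam weight / ham weight. The function and parameter names are hypothetical.

```python
import math

def spam_files_to_use(spam_weight, ham_weight, spam_count, ham_count, target_norm):
    """Step 1: upper bound on spam files, assuming ALL notspam files get used.

    Assumption (not from ASSP): each file contributes one unit of weight.
    """
    # Largest spam total that the full notspam folder could still balance.
    max_spam_total = target_norm * (ham_weight + ham_count)
    wanted = math.floor(max_spam_total - spam_weight)
    # Never negative, never more files than the folder holds.
    return max(0, min(spam_count, wanted))

def ham_files_to_use(spam_weight, ham_weight, ham_count, target_norm):
    """Step 2: after the spam folder is processed, refine the notspam count."""
    needed_ham_total = spam_weight / target_norm
    wanted = math.ceil(needed_ham_total - ham_weight)
    return max(0, min(ham_count, wanted))

# Weights after the two error folders: 100 spam, 50 ham; target norm 1.0.
n_spam = spam_files_to_use(100, 50, spam_count=1000, ham_count=400, target_norm=1.0)
n_ham = ham_files_to_use(100 + n_spam, 50, ham_count=400, target_norm=1.0)
```

With these made-up numbers the first step caps the spam folder at 350 files, and the second step then asks for all 400 notspam files. If the spam folder held fewer files than the cap, the second step would ask for correspondingly fewer notspam files.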
mumble (thinking out loud): our problem (if we want to call it that) is that we may have multiple spam/ham files with the same contents but different headers, or even with slightly different content. Let's leave the latter alone for the moment and think about those "similar" files (same body, different headers). In such a case we might consider some mechanism so that, whenever ASSP processes them (when storing? when rebuilding?), it extracts the headers and body and checks whether it has already "seen" that file (e.g. using a DB table containing hashes or the like). If so, ASSP could avoid processing the whole "additional file"; for example, it could process (consider) just the headers and skip the body, since it has already seen it.

I'm not sure it makes sense; again, I'm just thinking out loud here...
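A minimal sketch of that "already seen" check, written in Python only for illustration (it is not ASSP code). The in-memory set stands in for the DB table of hashes, and the names `body_digest` and `plan_processing` are hypothetical. Each body is whitespace-normalized before hashing, so this catches only identical bodies, not the "slightly different content" case.

```python
import hashlib
from email import message_from_string

def body_digest(raw_message):
    """Hash only the body parts of a message, ignoring all headers."""
    msg = message_from_string(raw_message)
    parts = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no text of their own
        # Collapse whitespace so trivial re-wrapping does not change the hash.
        parts.append(" ".join(part.get_payload().split()))
    joined = "\n".join(parts)
    return hashlib.sha256(joined.encode("utf-8", errors="replace")).hexdigest()

def plan_processing(raw_messages):
    """Return 'full' for a first-seen body, 'headers-only' for a repeat."""
    seen = set()  # stand-in for a persistent table of body hashes
    plan = []
    for raw in raw_messages:
        digest = body_digest(raw)
        if digest in seen:
            plan.append("headers-only")
        else:
            seen.add(digest)
            plan.append("full")
    return plan

corpus = [
    "From: a@example.com\nSubject: Hi\n\nBuy cheap pills now\n",
    "From: b@example.org\nSubject: Hello\n\nBuy cheap pills now\n",
    "From: c@example.net\nSubject: Lunch\n\nMeeting at noon\n",
]
plan = plan_processing(corpus)
```

Here the second message shares its body with the first, so only its headers would be considered. The third has a new body and is processed in full.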
