The 'spamdb file' is simply a 100% backup of the mysql database table.
If the mysql db goes away while assp is running, assp switches to using
the file (failover).

> whenever I rebuild the spam database a spamdb file still gets created
> in the ASSP folder

NO - this is not the case - the file is created by the dbbackup task.

- set up the 'collecting', 'Bayesian & HMM' and 'rebuild spamdb'
  sections of the config to your needs
- enable 'newReportedInterval'
- keep the content of the VERY important long-term corpus in
  'assp/errors/' 100% correct
- advise your users and admins to report falsely detected spam and ham
- don't try to manipulate the corpus manually (unless you know 100%
  what you are doing!)
- if in doubt, run a rebuildspamdb every day

Keep in mind: the spamDB and the HMMdb are simply mirrors of your corpus
after passing its content through a heuristic, mathematical, logical and
linguistic sieve. The functions of the sieve are defined in the code and
the assp configuration.

If (spam/ham) reporting is not used for any reason, switch BOTH the
Bayesian and the HMM check off, or give them a very low priority (score).
If reporting is not used, the detection accuracy of the Bayesian and HMM
checks will get worse and worse the longer assp runs!

Thomas


From: Jay <[email protected]>
To: [email protected]
Date: 26.08.2015 22:35
Subject: Re: [Assp-user] ASSP spam database question - MYSQL


Also, another question I have, and I apologize for so many: since we are
using MYSQL for the database back end, whenever I rebuild the spam
database a spamdb file still gets created in the ASSP folder. Is ASSP
using both the spamdb file and MYSQL in conjunction with each other? I
noticed that in the MYSQL database the listings are just a bunch of
numbers and make no sense to me, while the spamdb file has keywords in
it with a number afterwards. In my config, I have the following set:

Spam/HMM Bayesian Database Files (spamdb) = DB:
database driver name (DBdriver) = mysql
database name (mydb) = ASSP_DB

So my thinking, following on from my previous question, is this: rebuild
the spamdb after collecting enough spam and not spam.
I had forgotten that to import the spamdb file all you need to do is set
a file extension of .rpl or .add, depending on what you would like to
do. So I won't need to delete the MySQL database then. But I am still
hazy on the spamdb and MySQL connection. Thank you.

On 7/24/2015 2:01 AM, Thomas Eckardt wrote:
>> So if I grab a copy maybe from a previous backup and start the
>> database over again I would think I would be good right?
>
> If 'ReplaceOldSpamdb' is set, the rebuildspamdb process will overwrite
> the complete spamdb database.
> The only way to get better detection results from the HMM and Bayesian
> checks is to maintain the corpus. Advise your users to report SPAM.
> Select 'DelResendSpam' to keep assp reacting to blockreport resends.
>
>> I would think if the database was way out of whack I would not be
>> getting a confidence rating like I am.
>
> The corpus norm and confidence values shown in the rebuild report are
> mathematical values based on the COUNT of the processed HMM sequences
> and Bayesian pairs. They do NOT show the logical sense of the corpus
> from a human point of view. If your corpus is not maintained in a
> proper way, but contains enough files (words), the norm and confidence
> will be fine, but the detection rate will be bad.
>
>> I am just wondering if some of this might be poisoned information
>> from when we had an issue with a user's email account sending spam.
>
> Yes, if outgoing mails are stored in the corpus.
>
> Thomas
>
>
> From: Jay <[email protected]>
> To: [email protected]
> Date: 24.07.2015 00:09
> Subject: Re: [Assp-user] ASSP spam database question - MYSQL
>
>
> We have been running ASSP for quite a few years now, but only started
> using MYSQL as the database in the past year. So why all of a sudden
> now are we seeing this issue? I do a rebuild once a week and the
> corpus confidence comes out at 1.00000000.
>
> Here's a snippet from the last rebuild 2 days ago:
>
> Jul-21-15 11:42:31 Spam Weight: 2,333,146
> Jul-21-15 11:42:31 Not-Spam Weight: 2,333,802
>
> Jul-21-15 11:42:31 Corpus norm: 0.9997 - (very good - balanced)
> Jul-21-15 11:42:31 Corpus confidence: 1.00000000
>
> I would think if the database was way out of whack I would not be
> getting a confidence rating like I am. Looking at the row count from
> PHPMyAdmin, the spamdb has 1,237,310 rows in the database. I am just
> wondering if some of this might be poisoned information from when we
> had an issue with a user's email account sending spam. I know our
> whitelist database had to be recovered from a backup because it grew
> to 3 times the size it was before the spam issue was going on. Hence
> my question about taking care of the MySQL database now. So if I grab
> a copy maybe from a previous backup and start the database over again
> I would think I would be good right?
>
> On 7/23/2015 4:11 PM, Data Packet Networks wrote:
>> You may wish to set Bayesian filtering to "monitor" until you get it
>> under control.
>>
>> Go through your spam and not spam folders and be sure each has
>> appropriate messages for Bayesian training. If a user sent spam be
>> sure you remove those spam messages from the not spam folder and
>> rebuild the db. Whitelisting would not hurt in your case, as those
>> messages that were incorrectly identified would be placed in your not
>> spam folder for future Bayesian training.
>>
>> You could clear the Bayesian table of your MySQL database, but I
>> would just whitelist to gather more non-spam messages and then
>> rebuild.
>>
>> On 7/23/2015 3:09 PM, Jay wrote:
>>> Hey all. I have a question. Recently I have noticed that the spam
>>> filter has been blocking emails that otherwise would seem legitimate.
>>> So for instance I see a lot of weird behavior from the blocked
>>> report:
>>>
>>> spam reason: (Bayesian) [I need a quote for laminate flooring install in Lakeland FL]
>>> spam reason: (Bayesian) [Inv 5895]
>>> spam reason: (Bayesian) [Invoice 10735]
>>> spam reason: (Bayesian) [Aflac Policyholder Services Email Address Verification Required]
>>> spam reason: (Bayesian) [Payment Confirmation for Support]
>>> spam reason: (Bayesian) [Your Progressive auto quote confirmation]
>>>
>>> These are just a few examples I found. There are others from banks,
>>> insurance companies, customers, payroll services etc. Looking at
>>> some of the emails that got blocked, it doesn't seem to me that they
>>> should have been. If I use the built-in analyzer, ASSP thinks they
>>> are spam. I know I can just whitelist these, but if the users don't
>>> inform me or they don't check the blocked report on a daily basis
>>> they have no idea. We have the spam filter set up so that if an
>>> email gets rejected it sends a response email back to the sender
>>> with instructions on how to get safe listed. Basically they contact
>>> an email address at my office that is unfiltered and request to be
>>> whitelisted. The biggest issue we deal with is that either the
>>> originating email address is an unmonitored email address or the
>>> user just ignores it. I know I will still run into this from time to
>>> time, but what concerns me is that our spam database is poisoned and
>>> is flagging emails that would otherwise have gone through
>>> previously. It seems this issue has cropped up within the last 2
>>> months. We did have an issue where one of our users' email accounts
>>> got compromised and sent out spam. So I am thinking this is where
>>> the issue started and poisoned the spam database.
>>>
>>> We currently use MYSQL as the database for the spam database. Is
>>> there an easy way to just wipe the database clean and start over
>>> from a previous spamdb file?
>>> Is this as easy as just grabbing the spamdb from one of the ASSP
>>> installations and dropping it into the import folder for MYSQL? Or
>>> is this the case where I have to purge the database in MYSQL and
>>> recreate it fresh. Any help/advice is greatly appreciated. Thanks.
>>>
>>> _______________________________________________
>>> Assp-user mailing list
>>> [email protected]
>>> https://lists.sourceforge.net/lists/listinfo/assp-user
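
The point made in the thread, that the corpus norm reflects counts and
not corpus quality, can be checked against the figures in the quoted
rebuild report. What follows is a minimal Python sketch, not ASSP code.
It assumes the norm is simply the spam weight divided by the not-spam
weight; that assumption reproduces the logged 0.9997 but is not stated
anywhere in the thread, and the variable names are illustrative only.

```python
# Figures copied from the Jul-21-15 rebuild report quoted in the thread.
spam_weight = 2_333_146
not_spam_weight = 2_333_802

# ASSUMPTION (not confirmed by the thread or by ASSP's documentation):
# the reported "Corpus norm" is the ratio of the two weights.
corpus_norm = spam_weight / not_spam_weight

# Format to four decimals, the precision used in the rebuild report.
print(f"Corpus norm: {corpus_norm:.4f}")
```

Under that assumption only the two totals enter the ratio, so a corpus
in which outgoing spam was stored as not-spam could still look balanced.
That is consistent with the advice to maintain the corpus and not rely
on the norm or confidence values.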
