On Thu, 28 Jan 2010 19:05:13 -0500 Roman Gelfand <[email protected]> wrote:
> I don't think I have sent you the following message I found in syslog.
> Actually, this message appears in spurts of maybe 50 lines. I am
> not sure why dspam is looking for hash tables.
>
> mail dspam[2272]: hash table /usr/local/var/dspam/data/[email protected]/[email protected] full
>
Could it be that you use the cron job from contrib? If so, then you should
use the one currently available in GIT HEAD.

> On Thu, Jan 28, 2010 at 6:55 PM, Stevan Bajić <[email protected]> wrote:
> > On Thu, 28 Jan 2010 18:26:15 -0500
> > Roman Gelfand <[email protected]> wrote:
> >
> >> I haven't really dealt with the utilities much. When you say drop the
> >> old data, do you mean physically go into the db and delete data on all
> >> dspam tables or use a utility? If use a utility, which one?
> >>
> > TRUNCATE `dspam_signature_data`;
> > TRUNCATE `dspam_stats`;
> > TRUNCATE `dspam_token_data`;
> >
> >
> >>
> >> On Thu, Jan 28, 2010 at 6:14 PM, Stevan Bajić <[email protected]> wrote:
> >> > On Thu, 28 Jan 2010 11:55:48 -0500
> >> > Roman Gelfand <[email protected]> wrote:
> >> >
> >> >> #
> >> >> # Training Mode: The default training mode to use for all operations,
> >> >> # when one has not been specified on the commandline or in the user's
> >> >> # preferences.
> >> >> # Acceptable values are:
> >> >> #   toe      Train on Error (Only)
> >> >> #   teft     Train Everything (Trains on every message)
> >> >> #   tum      Train Until Mature (Train only tokens without enough data)
> >> >> #   notrain  Do not train or store signatures (large ISP systems,
> >> >> #            post-train)
> >> >> #
> >> >> TrainingMode teft
> >> >>
> >> > Please switch that to "toe"! Using "teft" is old school and one part of
> >> > your problem.
> >> >
> >> >
> >> >> #
> >> >> # Features: Specify features to activate by default; can also be
> >> >> # specified on the commandline. See the documentation for a list of
> >> >> # available features.
> >> >> # If _any_ features are specified on the commandline, these are ignored.
> >> >> #
> >> >> #Feature noise
> >> >> Feature whitelist
> >> >>
> >> > Enable "noise". It's a good thing that will help you.
> >> >
> >> >
> >> >> # Training Buffer: The training buffer waters down statistics during
> >> >> # training. It is designed to prevent false positives, but can also
> >> >> # dramatically reduce dspam's catch rate during initial training. This
> >> >> # can be a number from 0 (no buffering) to 10 (maximum buffering). If
> >> >> # you are paranoid about false positives, you should probably enable
> >> >> # this option.
> >> >> #
> >> >> #Feature tb=5
> >> >>
> >> > Depending on the data you already have learned, it could be beneficial
> >> > to enable this option.
> >> >
> >> >
> >> >> #
> >> >> # Tokenizer: Specify the tokenizer to use. The tokenizer is the piece
> >> >> # responsible for parsing the message into individual tokens. Depending
> >> >> # on how many resources you are willing to trade off vs. accuracy, you
> >> >> # may choose to use a less or more detailed tokenizer:
> >> >> #   word    uniGram (single word) tokenizer
> >> >> #           Tokenizes message into single individual words/tokens
> >> >> #           example: "free" and "viagra"
> >> >> #   chain   biGram (chained tokens) tokenizer (default)
> >> >> #           Single words + chains adjacent tokens together
> >> >> #           example: "free" and "viagra" and "free viagra"
> >> >> #   sbph    Sparse Binary Polynomial Hashing tokenizer
> >> >> #           Creates sparse token patterns across sliding window of
> >> >> #           5-tokens
> >> >> #           example: "the quick * fox jumped" and "the * * fox jumped"
> >> >> #   osb     Orthogonal Sparse biGram tokenizer
> >> >> #           Similar to SBPH, but only uses the biGrams
> >> >> #           example: "the * * fox" and "the * * * jumped"
> >> >> #
> >> >> Tokenizer chain
> >> >>
> >> > That is the main part of your problem. It is no surprise that you
> >> > retrain and retrain and retrain and still don't get the data to flip the
> >> > state. Please use "osb".
> >> > It's way better for your situation.
> >> >
> >> >
> >> >> #
> >> >> # Preferences: Specify any preferences to set by default, unless
> >> >> # otherwise overridden by the user (see next section) or a
> >> >> # default.prefs file.
> >> >> # If user or default.prefs are found, the user's preferences will
> >> >> # override any defaults.
> >> >> #
> >> >> Preference "trainingMode=TEFT"  # { TOE | TUM | TEFT | NOTRAIN } -> default:teft
> >> >>
> >> > Set this to "TOE"
> >> >
> >> > ------------------------------------------------------------------------------
> >> > The Planet: dedicated and managed hosting, cloud storage, colocation
> >> > Stay online with enterprise data centers and the best network in the
> >> > business
> >> > Choose flexible plans and management services without long-term contracts
> >> > Personal 24x7 support from experience hosting pros just a phone call
> >> > away.
> >> > http://p.sf.net/sfu/theplanet-com
> >> > _______________________________________________
> >> > Dspam-user mailing list
> >> > [email protected]
> >> > https://lists.sourceforge.net/lists/listinfo/dspam-user
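
For illustration only: the tokenizer examples quoted from dspam.conf in this
thread ("free viagra", "the * * fox") can be reproduced with a few lines of
Python. This is a rough sketch of the idea and not DSPAM's actual C
implementation. The function names word_tokens, chain_tokens and osb_tokens
are made up for this note, and the sketch assumes plain whitespace splitting,
whereas the real tokenizers also deal with headers, delimiters and hashing.

```python
def word_tokens(text):
    """uniGram ("word"): every whitespace-separated word is one token."""
    return text.lower().split()


def chain_tokens(text):
    """biGram ("chain"): single words plus each pair of adjacent words."""
    words = word_tokens(text)
    pairs = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return words + pairs


def osb_tokens(text, window=5):
    """OSB-style pairs: join each word with every later word inside the
    window, writing "*" for each skipped position (assumed layout, chosen
    to match the examples in the dspam.conf comments)."""
    words = word_tokens(text)
    tokens = []
    for i, first in enumerate(words):
        for gap in range(window - 1):
            j = i + gap + 1
            if j >= len(words):
                break
            tokens.append(" ".join([first] + ["*"] * gap + [words[j]]))
    return tokens


if __name__ == "__main__":
    print(chain_tokens("free viagra"))
    print(osb_tokens("the quick brown fox jumped"))
```

In this sketch the OSB variant produces pairs such as "the * * fox" in
addition to adjacent pairs, so the same message yields tokens that relate
words further apart than "chain" does.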
