On Thu, 28 Jan 2010 19:05:13 -0500
Roman Gelfand <[email protected]> wrote:

> I don't think I have sent you the following message I found in syslog.
> Actually, this message appears in spurts of maybe 50 lines.  I am
> not sure why dspam is looking for hash tables.
> 
> mail dspam[2272]: hash table /usr/local/var/dspam/data/[email protected]/[email protected] full
> 
Could it be that you are using the cron job from contrib? If so, you should
switch to the version currently available in Git HEAD.

> On Thu, Jan 28, 2010 at 6:55 PM, Stevan Bajić <[email protected]> wrote:
> > On Thu, 28 Jan 2010 18:26:15 -0500
> > Roman Gelfand <[email protected]> wrote:
> >
> >> I haven't really dealt with the utilities much.  When you say drop the
> >> old data, do you mean physically go into the db delete data on all
> >> dspam tables or use a utility?  If use a utility, which one?
> >>
> > TRUNCATE `dspam_signature_data`;
> > TRUNCATE `dspam_stats`;
> > TRUNCATE `dspam_token_data`;
> >
> >
> >>
> >> On Thu, Jan 28, 2010 at 6:14 PM, Stevan Bajić <[email protected]> wrote:
> >> > On Thu, 28 Jan 2010 11:55:48 -0500
> >> > Roman Gelfand <[email protected]> wrote:
> >> >
> >> >> #
> >> >> # Training Mode: The default training mode to use for all operations, when
> >> >> # one has not been specified on the commandline or in the user's preferences.
> >> >> # Acceptable values are:
> >> >> #     toe     Train on Error (Only)
> >> >> #     teft    Train Everything (Trains on every message)
> >> >> #     tum     Train Until Mature (Train only tokens without enough data)
> >> >> #     notrain Do not train or store signatures (large ISP systems, post-train)
> >> >> #
> >> >> TrainingMode teft
> >> >>
> >> > Please switch that to "toe"! Using "teft" is old school and one part of 
> >> > your problem.
> >> >
> >> >
> >> >> #
> >> >> # Features: Specify features to activate by default; can also be specified
> >> >> # on the commandline. See the documentation for a list of available features.
> >> >> # If _any_ features are specified on the commandline, these are ignored.
> >> >> #
> >> >> #Feature noise
> >> >> Feature whitelist
> >> >>
> >> > Enable "noise". It's a good thing that will help you.
> >> >
> >> >
> >> >> # Training Buffer: The training buffer waters down statistics during training.
> >> >> # It is designed to prevent false positives, but can also dramatically reduce
> >> >> # dspam's catch rate during initial training. This can be a number from 0
> >> >> # (no buffering) to 10 (maximum buffering). If you are paranoid about false
> >> >> # positives, you should probably enable this option.
> >> >> #
> >> >> #Feature tb=5
> >> >>
> >> > Depending on the data you already have learned, it could be beneficial 
> >> > to enable this option.
> >> >
> >> >
> >> >> #
> >> >> # Tokenizer: Specify the tokenizer to use. The tokenizer is the piece
> >> >> # responsible for parsing the message into individual tokens. Depending on
> >> >> # how many resources you are willing to trade off vs. accuracy, you may
> >> >> # choose to use a less or more detailed tokenizer:
> >> >> #   word    uniGram (single word) tokenizer
> >> >> #           Tokenizes message into single individual words/tokens
> >> >> #           example: "free" and "viagra"
> >> >> #   chain   biGram (chained tokens) tokenizer (default)
> >> >> #           Single words + chains adjacent tokens together
> >> >> #           example: "free" and "viagra" and "free viagra"
> >> >> #   sbph    Sparse Binary Polynomial Hashing tokenizer
> >> >> #           Creates sparse token patterns across sliding window of 5-tokens
> >> >> #           example: "the quick * fox jumped" and "the * * fox jumped"
> >> >> #   osb     Orthogonal Sparse biGram tokenizer
> >> >> #           Similar to SBPH, but only uses the biGrams
> >> >> #           example: "the * * fox" and "the * * * jumped"
> >> >> #
> >> >> Tokenizer chain
> >> >>
> >> > That is the main part of your problem. It is no surprise that you 
> >> > retrain and retrain and retrain and still don't get the data to flip the 
> >> > state. Please use "osb". It's way better for your situation.
> >> >
> >> >
> >> >> #
> >> >> # Preferences: Specify any preferences to set by default, unless otherwise
> >> >> # overridden by the user (see next section) or a default.prefs file.
> >> >> # If user or default.prefs are found, the user's preferences will override any
> >> >> # defaults.
> >> >> #
> >> >> Preference "trainingMode=TEFT"                # { TOE | TUM | TEFT | NOTRAIN } -> default:teft
> >> >>
> >> > Set this to "TOE"
> >> >
> >> > ------------------------------------------------------------------------------
> >> > The Planet: dedicated and managed hosting, cloud storage, colocation
> >> > Stay online with enterprise data centers and the best network in the 
> >> > business
> >> > Choose flexible plans and management services without long-term contracts
> >> > Personal 24x7 support from experience hosting pros just a phone call 
> >> > away.
> >> > http://p.sf.net/sfu/theplanet-com
> >> > _______________________________________________
> >> > Dspam-user mailing list
> >> > [email protected]
> >> > https://lists.sourceforge.net/lists/listinfo/dspam-user
> >> >
> >>
> >
> 
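
As a rough illustration of the training modes described in the quoted
dspam.conf comments, here is a minimal sketch in Python. It is not DSPAM's
code: the function name should_train and the message counts are made up for
this example, and "tum" is left out because the config comment does not spell
out its per-token rule.

```python
def should_train(mode: str, misclassified: bool) -> bool:
    """Decide whether a message is learned, following the dspam.conf comment text."""
    mode = mode.lower()
    if mode == "teft":       # Train Everything: learn from every message
        return True
    if mode == "toe":        # Train on Error: learn only when the result was wrong
        return misclassified
    if mode == "notrain":    # Do not train at all
        return False
    raise ValueError(f"mode not covered by this sketch: {mode}")


# Hypothetical stream: 1000 messages, 20 of them misclassified.
outcomes = [True] * 20 + [False] * 980

for mode in ("teft", "toe", "notrain"):
    trained = sum(should_train(mode, wrong) for wrong in outcomes)
    print(f"{mode}: {trained} training operations")
```

With these made-up numbers, teft trains on all 1000 messages, toe on the 20
errors, and notrain on none.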

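To make the tokenizer descriptions quoted above more concrete, here is a small
Python sketch of "word", "chain" and "osb" as the dspam.conf comments describe
them. It is only an illustration under assumptions: splitting on whitespace,
the function names, and the window handling are mine, not DSPAM's actual
implementation, and "sbph" is omitted.

```python
from typing import List


def word_tokens(text: str) -> List[str]:
    """uniGram: every single word is a token."""
    return text.split()


def chain_tokens(text: str) -> List[str]:
    """biGram: single words plus each pair of adjacent words."""
    words = text.split()
    pairs = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return words + pairs


def osb_tokens(text: str, window: int = 5) -> List[str]:
    """Orthogonal sparse bigrams: pair each word with every later word
    inside the window, marking skipped positions with '*'."""
    words = text.split()
    tokens = []
    for i, first in enumerate(words):
        for gap in range(window - 1):      # number of skipped words
            j = i + gap + 1
            if j >= len(words):
                break
            tokens.append(" ".join([first] + ["*"] * gap + [words[j]]))
    return tokens


print(chain_tokens("free viagra"))
print(osb_tokens("the quick brown fox jumped"))
```

The chain output contains "free", "viagra" and "free viagra", and the osb
output contains "the * * fox" and "the * * * jumped", matching the examples in
the config comments.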