I don't think I have sent you the following message, which I found in
syslog. It appears in bursts of maybe 50 lines at a time. I am not sure
why dspam is looking for hash tables.

mail dspam[2272]: hash table /usr/local/var/dspam/data/[email protected]/[email protected] full

On Thu, Jan 28, 2010 at 6:55 PM, Stevan Bajić <[email protected]> wrote:
> On Thu, 28 Jan 2010 18:26:15 -0500
> Roman Gelfand <[email protected]> wrote:
>
>> I haven't really dealt with the utilities much.  When you say drop the
>> old data, do you mean physically go into the db delete data on all
>> dspam tables or use a utility?  If use a utility, which one?
>>
> TRUNCATE `dspam_signature_data`;
> TRUNCATE `dspam_stats`;
> TRUNCATE `dspam_token_data`;
>
>
>>
>> On Thu, Jan 28, 2010 at 6:14 PM, Stevan Bajić <[email protected]> wrote:
>> > On Thu, 28 Jan 2010 11:55:48 -0500
>> > Roman Gelfand <[email protected]> wrote:
>> >
>> >> #
>> >> # Training Mode: The default training mode to use for all operations, when
>> >> # one has not been specified on the commandline or in the user's preferences.
>> >> # Acceptable values are:
>> >> #     toe     Train on Error (Only)
>> >> #     teft    Train Everything (Trains on every message)
>> >> #     tum     Train Until Mature (Train only tokens without enough data)
>> >> #     notrain Do not train or store signatures (large ISP systems, post-train)
>> >> #
>> >> TrainingMode teft
>> >>
>> > Please switch that to "toe"! Using "teft" is old school and one part of 
>> > your problem.
>> >
>> >
>> >> #
>> >> # Features: Specify features to activate by default; can also be specified
>> >> # on the commandline. See the documentation for a list of available features.
>> >> # If _any_ features are specified on the commandline, these are ignored.
>> >> #
>> >> #Feature noise
>> >> Feature whitelist
>> >>
>> > Enable "noise". It's a good thing that will help you.
>> >
>> >
>> >> # Training Buffer: The training buffer waters down statistics during training.
>> >> # It is designed to prevent false positives, but can also dramatically reduce
>> >> # dspam's catch rate during initial training. This can be a number from 0
>> >> # (no buffering) to 10 (maximum buffering). If you are paranoid about false
>> >> # positives, you should probably enable this option.
>> >> #
>> >> #Feature tb=5
>> >>
>> > Depending on the data you already have learned, it could be beneficial to 
>> > enable this option.
>> >
>> >
>> >> #
>> >> # Tokenizer: Specify the tokenizer to use. The tokenizer is the piece
>> >> # responsible for parsing the message into individual tokens. Depending on
>> >> # how many resources you are willing to trade off vs. accuracy, you may
>> >> # choose to use a less or more detailed tokenizer:
>> >> #   word    uniGram (single word) tokenizer
>> >> #           Tokenizes message into single individual words/tokens
>> >> #           example: "free" and "viagra"
>> >> #   chain   biGram (chained tokens) tokenizer (default)
>> >> #           Single words + chains adjacent tokens together
>> >> #           example: "free" and "viagra" and "free viagra"
>> >> #   sbph    Sparse Binary Polynomial Hashing tokenizer
>> >> #           Creates sparse token patterns across sliding window of 5-tokens
>> >> #           example: "the quick * fox jumped" and "the * * fox jumped"
>> >> #   osb     Orthogonal Sparse biGram tokenizer
>> >> #           Similar to SBPH, but only uses the biGrams
>> >> #           example: "the * * fox" and "the * * * jumped"
>> >> #
>> >> Tokenizer chain
>> >>
>> > That is the main part of your problem. It is no surprise that you retrain 
>> > and retrain and retrain and still don't get the data to flip the state. 
>> > Please use "osb". It's way better for your situation.
>> >
>> >
>> >> #
>> >> # Preferences: Specify any preferences to set by default, unless otherwise
>> >> # overridden by the user (see next section) or a default.prefs file.
>> >> # If user or default.prefs are found, the user's preferences will override any
>> >> # defaults.
>> >> #
>> >> Preference "trainingMode=TEFT"                # { TOE | TUM | TEFT | NOTRAIN } -> default:teft
>> >>
>> > Set this to "TOE"
>> >
>> > ------------------------------------------------------------------------------
>> > The Planet: dedicated and managed hosting, cloud storage, colocation
>> > Stay online with enterprise data centers and the best network in the 
>> > business
>> > Choose flexible plans and management services without long-term contracts
>> > Personal 24x7 support from experience hosting pros just a phone call away.
>> > http://p.sf.net/sfu/theplanet-com
>> > _______________________________________________
>> > Dspam-user mailing list
>> > [email protected]
>> > https://lists.sourceforge.net/lists/listinfo/dspam-user
>> >
>>
>
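
To keep the suggestions quoted above in one place, here is how I read them as dspam.conf lines. This is only a sketch put together from the replies in this thread, not a tested configuration. It shows just the directives that were discussed and assumes everything else in the file stays as it is. The tb=5 line is left commented out because the advice was only that it could help, depending on the data already learned.

```conf
# Training mode: switched from teft to toe, as suggested.
TrainingMode toe

# Noise feature: uncommented, as suggested. Whitelist is unchanged.
Feature noise
Feature whitelist

# Training buffer: optional, still disabled here.
#Feature tb=5

# Tokenizer: switched from chain to osb, as suggested.
Tokenizer osb

# Default user preference: switched from TEFT to TOE, as suggested.
Preference "trainingMode=TOE"
```

I assume dspam has to be restarted after these changes, and that the TRUNCATE statements quoted above are what is meant by dropping the old data.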
