[SNIP]
>>>> Hi there,
>>>>
>>>>
>>>> This is the table I have in mysql, and the one I intend to populate with
>>>> data:-
>>>>
>>>> mysql> describe bayes_vars;
>>>> +--------------------+--------------+------+-----+------------+----------------+
>>>> | Field              | Type         | Null | Key | Default    | Extra          |
>>>> +--------------------+--------------+------+-----+------------+----------------+
>>>> | id                 | int(11)      | NO   | PRI | NULL       | auto_increment |
>>>> | username           | varchar(200) | NO   | UNI |            |                |
>>>> | spam_count         | int(11)      | NO   |     | 0          |                |
>>>> | ham_count          | int(11)      | NO   |     | 0          |                |
>>>> | token_count        | int(11)      | NO   |     | 0          |                |
>>>> | last_expire        | int(11)      | NO   |     | 0          |                |
>>>> | last_atime_delta   | int(11)      | NO   |     | 0          |                |
>>>> | last_expire_reduce | int(11)      | NO   |     | 0          |                |
>>>> | oldest_token_age   | int(11)      | NO   |     | 2147483647 |                |
>>>> | newest_token_age   | int(11)      | NO   |     | 0          |                |
>>>> +--------------------+--------------+------+-----+------------+----------------+
>>>> 10 rows in set (0.00 sec)
>>>>
>>>>
>>>> The configuration I intend to use for Bayes is:
>>>>
>>>> -------------------- START local.cf -------------------------------
>>>> rewrite_header Subject *****SPAM*****
>>>> report_safe 0
>>>> report_hostname xxx.xxx.com
>>>> dns_available yes
>>>> use_dcc 1
>>>> dcc_path /usr/local/bin/dccproc
>>>> dcc_home /var/dcc
>>>> use_pyzor 1
>>>> pyzor_path /usr/bin/pyzor
>>>> pyzor_timeout 5
>>>> use_razor2 1
>>>> razor_config /etc/razor/razor-agent.conf
>>>> razor_timeout 5
>>>>
>>>> required_score 6.0
>>>>
>>>> use_bayes 1
>>>> skip_rbl_checks 1
>>>> bayes_auto_learn 0
>>>> # bayes_auto_learn_threshold_nonspam    0.1
>>>> # bayes_auto_learn_threshold_spam       13.0
>>>>
>>>> bayes_expiry_max_db_size                300000
>>>> bayes_auto_expire                       1
>>>>
>>>> bayes_sql_override_username postfix 
>>>> # I don't understand what this setting does, nor why it's postfix.
>>>> Postfix has no interaction with SA in my set-up, as postfix pipes the
>>>> mail into dovecot, and dovecot handles the spamc portion before filing
>>>> the email.
>>>>
>>>> |bayes_store_module              Mail::SpamAssassin::BayesStore::MySQL
>>>> bayes_sql_dsn                   DBI:mysql:spamassassin:localhost
>>>> bayes_sql_username              |shamster_user
>>>> |bayes_sql_password              shamster||_password|
>>>>
>>>> ifplugin Mail::SpamAssassin::Plugin::Shortcircuit
>>>> shortcircuit USER_IN_WHITELIST       on
>>>> shortcircuit SUBJECT_IN_WHITELIST    on
>>>> shortcircuit USER_IN_BLACKLIST       on
>>>> shortcircuit SUBJECT_IN_BLACKLIST    on
>>>>
>>>> loadplugin Mail::SpamAssassin::Plugin::Rule2XSBody
>>>> endif
>>>>
>>>> score RDNS_DYNAMIC 2.639 0.363 1.663 1.700
>>>> meta __PILL_PRICE_1  (0)
>>>> meta __PILL_PRICE_2  (0)
>>>> meta __PILL_PRICE_3  (0)
>>>> -------------------- END local.cf -------------------------------
>>>>
>>>> N.B. Yes, I know there are some custom rules in the local.cf and these
>>>> will be lost after an upgrade of SA, but I have reasonable backups.
>>>>
>>>> * Questions
>>>> Does the configuration above look correct?
>>>> Will SA only write into the table bayes_vars, or will it touch other 
>>>> tables?
>>> Seems that some process butchered part of the config by discovering some
>>> pipe characters.
>>> [SNIP]
>>>
>>> Other question: If the above looks correct, is there something else that I
>>> ought to enable?  e.g. plugins for mysql, or a particular perl module
>>> that I might have omitted?
>>>
>>> Regards, S.
>> Regarding local.cf
>>
>> Should the password be quoted, for example in single quotes?
>>
>> The password has many strange chars in it, e.g.
>>     bayes_sql_password    fg$%-)_()(Wsuisrt{^%TEST
> RTFM problem... Apologies.
>
>     Jun 30 16:10:11.628 [2220] dbg: bayes: found bayes db version 3
>     Jun 30 16:10:11.628 [2220] dbg: bayes: Using userid: 186
>     Jun 30 16:10:11.628 [2220] dbg: bayes: not available for scanning, only 0 spam(s) in bayes DB < 200
>
> Solved by feeding one piece of spam to init the database:
>     sa-learn --spam gtube.txt
>
> However, I added some messages, but the detail from --dump magic shows
> nothing:
> # sa-learn --ham cur/
> Learned tokens from 25 message(s) (26 message(s) examined)
> # sa-learn --dump magic
> 0.000          0          3          0  non-token data: bayes db version
> 0.000          0          0          0  non-token data: nspam
> 0.000          0          0          0  non-token data: nham
> 0.000          0          0          0  non-token data: ntokens
> 0.000          0 2147483647          0  non-token data: oldest atime
> 0.000          0          0          0  non-token data: newest atime
> 0.000          0          0          0  non-token data: last journal sync atime
> 0.000          0          0          0  non-token data: last expiry atime
> 0.000          0          0          0  non-token data: last expire atime delta
> 0.000          0          0          0  non-token data: last expire reduction count
>
> I checked whether the postfix entry was created in bayes_vars:
> | postfix                       |          0 |         0 |
> +-------------------------------+------------+-----------+
>
> Does this look correct?
>
>
>
>
I loaded a substantial number of messages via sa-learn:


mysql> select * from bayes_vars where username='postfix';
+-----+----------+------------+-----------+-------------+-------------+------------------+--------------------+------------------+------------------+
| id  | username | spam_count | ham_count | token_count | last_expire | last_atime_delta | last_expire_reduce | oldest_token_age | newest_token_age |
+-----+----------+------------+-----------+-------------+-------------+------------------+--------------------+------------------+------------------+
| 186 | postfix  |          0 |         0 |           0 |           0 |                0 |                  0 |       2147483647 |                0 |
+-----+----------+------------+-----------+-------------+-------------+------------------+--------------------+------------------+------------------+
1 row in set (0.00 sec)


# sa-learn --dump magic
0.000          0          3          0  non-token data: bayes db version
0.000          0          0          0  non-token data: nspam
0.000          0          0          0  non-token data: nham
0.000          0          0          0  non-token data: ntokens
0.000          0 2147483647          0  non-token data: oldest atime
0.000          0          0          0  non-token data: newest atime
0.000          0          0          0  non-token data: last journal sync atime
0.000          0          0          0  non-token data: last expiry atime
0.000          0          0          0  non-token data: last expire atime delta
0.000          0          0          0  non-token data: last expire reduction count
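
For anyone who wants to check these counters without reading the columns by eye, here is a minimal Python sketch that parses the text printed by sa-learn --dump magic. The function names parse_dump_magic and bayes_ready are made up for this example. The 200 threshold for spam is taken from the debug log quoted above; applying the same minimum to ham is an assumption. The sample text is typed with one entry per line, so lines wrapped by a mail client would not match.

```python
import re

# One "non-token data" line: probability, three integer columns, then the label.
# The value of interest sits in the second integer column (3 for the db version).
MAGIC_LINE = re.compile(
    r"^\s*[\d.]+\s+\d+\s+(\d+)\s+\d+\s+non-token data: (.+?)\s*$"
)

def parse_dump_magic(text):
    """Return {label: value} for every 'non-token data' line in the text."""
    values = {}
    for line in text.splitlines():
        match = MAGIC_LINE.match(line)
        if match:
            values[match.group(2)] = int(match.group(1))
    return values

def bayes_ready(values, minimum=200):
    """True if both nspam and nham reach the minimum (assumed equal for both)."""
    return values.get("nspam", 0) >= minimum and values.get("nham", 0) >= minimum

SAMPLE = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0          0          0  non-token data: nspam
0.000          0          0          0  non-token data: nham
0.000          0          0          0  non-token data: ntokens
0.000          0 2147483647          0  non-token data: oldest atime
"""

values = parse_dump_magic(SAMPLE)
print(values["nspam"], values["nham"], values["ntokens"])
print(bayes_ready(values))
```

With the dump shown here, all three counters come back as 0 and bayes_ready reports False, which matches the "not available for scanning" debug message.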


Still, the data was not put into it. I would be interested to know where
it did store the data, because there might well be a file on the disk
that is growing for no real reason.
Does anyone know where sa-learn would put the data if it's not loading
it into mysql?
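
One place that could be checked, assuming sa-learn fell back to the default file-based store (which may not be what happened here): the documented default for bayes_path is ~/.spamassassin/bayes, which gives files such as bayes_toks, bayes_seen and bayes_journal in the home directory of the user that ran sa-learn. The sketch below only lists such files with their sizes. The function name find_bayes_files is made up for this example.

```python
from pathlib import Path

def find_bayes_files(directory):
    """Return sorted (name, size_in_bytes) pairs for bayes_* files in directory."""
    base = Path(directory).expanduser()
    if not base.is_dir():
        return []
    return sorted(
        (path.name, path.stat().st_size)
        for path in base.glob("bayes_*")
        if path.is_file()
    )

if __name__ == "__main__":
    # Assumption: default bayes_path; adjust if local.cf sets another one.
    for name, size in find_bayes_files("~/.spamassassin"):
        print(f"{name}\t{size} bytes")
```

Running it once before and once after an sa-learn run, as the same user, would show whether any of those files is growing.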

Regards
