http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4052
[EMAIL PROTECTED] changed:
What|Removed |Added
Status|NEW |RESOLVED
http://bugzilla.spamassassin.org/show_bug.cgi?id=4052
--- Additional Comments From [EMAIL PROTECTED] 2005-08-23 20:54 ---
We released a dataset of email written in Chinese for the purpose of research:
http://www.ccert.edu.cn/spam/sa/datasets.htm
best,
qa
--- You are
http://bugzilla.spamassassin.org/show_bug.cgi?id=4052
--- Additional Comments From [EMAIL PROTECTED] 2005-07-17 19:10 ---
Created an attachment (id=3026)
-- (http://bugzilla.spamassassin.org/attachment.cgi?id=3026&action=view)
mass-check results
Attached are the results of my
http://bugzilla.spamassassin.org/show_bug.cgi?id=4052
--- Additional Comments From [EMAIL PROTECTED] 2005-07-17 20:29 ---
Created an attachment (id=3027)
-- (http://bugzilla.spamassassin.org/attachment.cgi?id=3027&action=view)
Masscheck for the Chinese_rules.cf updated Jul. 13
http://bugzilla.spamassassin.org/show_bug.cgi?id=4052
--- Additional Comments From [EMAIL PROTECTED] 2005-07-17 20:34 ---
Here are our mass-check results (the spam set is from within the last 6 months).
I wonder whether your spam set is the latest?
--- You are receiving this mail because: ---
http://bugzilla.spamassassin.org/show_bug.cgi?id=4052
[EMAIL PROTECTED] changed:
What|Removed |Added
Target Milestone|Undefined |3.2.0
--- Additional
http://bugzilla.spamassassin.org/show_bug.cgi?id=4052
--- Additional Comments From [EMAIL PROTECTED] 2005-03-20 17:21 ---
hi, guys in CERNET, there is one thing you've forgotten: the Chinese subject
header is in QB encoded format, which looks like: Subject: =?gb2312?B?zbdt=-
sa=2
so
http://bugzilla.spamassassin.org/show_bug.cgi?id=4052
--- Additional Comments From [EMAIL PROTECTED] 2005-03-20 19:59 ---
hi, guys in CERNET, there is one thing you've forgotten: the Chinese subject
header is in QB encoded format, which looks like: Subject: =?gb2312?B?zbdt=-
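For illustration, here is a minimal Python sketch of decoding a B-encoded (RFC 2047) Subject header before looking for Chinese text in it. The sample header is built inside the code from a made-up two-character subject; it is not the truncated header quoted above. `decode_subject` is a hypothetical helper name and is not part of SpamAssassin.

```python
import base64
from email.header import decode_header


def decode_subject(raw):
    """Decode RFC 2047 encoded-words in a Subject value to a Unicode string."""
    parts = []
    for chunk, charset in decode_header(raw):
        if isinstance(chunk, bytes):
            # Encoded-words come back as bytes plus the declared charset.
            parts.append(chunk.decode(charset or "ascii", errors="replace"))
        else:
            # Plain, unencoded text comes back as str.
            parts.append(chunk)
    return "".join(parts)


# Assumption: a made-up sample subject, encoded as GB2312 and then base64.
sample_text = "中文"
payload = base64.b64encode(sample_text.encode("gb2312")).decode("ascii")
raw_subject = "=?gb2312?B?" + payload + "?="

decoded_subject = decode_subject(raw_subject)
print(raw_subject)
print(decoded_subject)
```

The point of the sketch is only that a pattern written against the Chinese characters cannot match the raw header, because the raw header contains base64 text rather than the characters themselves.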
http://bugzilla.spamassassin.org/show_bug.cgi?id=4052
--- Additional Comments From [EMAIL PROTECTED] 2005-01-22 00:20 ---
CLA received
http://bugzilla.spamassassin.org/show_bug.cgi?id=4052
[EMAIL PROTECTED] changed:
What|Removed |Added
CC|[EMAIL PROTECTED]|
--- Additional Comments From
http://bugzilla.spamassassin.org/show_bug.cgi?id=4052
--- Additional Comments From [EMAIL PROTECTED] 2005-01-09 23:39 ---
Would it be possible for you to sign and fax an Apache CLA so that we can
incorporate these (or at least test them)?
We have faxed a signed CCLA (Duan Haixin
http://bugzilla.spamassassin.org/show_bug.cgi?id=4052
--- Additional Comments From [EMAIL PROTECTED] 2005-01-10 00:47 ---
Subject: Re: SpamAssassin rules file: Chinese subject and body tests
We have faxed a signed CCLA (Duan Haixin signed) to the ASF. Please
notify me when you
http://bugzilla.spamassassin.org/show_bug.cgi?id=4052
--- Additional Comments From [EMAIL PROTECTED] 2005-01-08 06:36 ---
Hi,
two questions:
1. Each rule in SpamAssassin has 4 scores. How does SpamAssassin set the last 3
scores?
2. If the last 3 scores are absent, will
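As background to the question, a hedged Python sketch of how a rule's score list is selected, based on my reading of the Mail::SpamAssassin::Conf documentation for the `score` setting: a single score applies in every configuration, while four scores correspond, in order, to (no Bayes, no network tests), (no Bayes, network tests), (Bayes, no network tests), and (Bayes, network tests). `pick_score` and its parameters are hypothetical names for illustration, not SpamAssassin code, and the sketch says nothing about how the score values themselves are generated.

```python
def pick_score(scores, use_bayes, use_net):
    """Return the score that applies for the given scanner configuration.

    scores    -- list with either 1 or 4 numeric scores for one rule
    use_bayes -- True if the Bayes classifier is enabled
    use_net   -- True if network tests are enabled
    """
    if len(scores) == 1:
        # A single score is used regardless of configuration.
        return scores[0]
    if len(scores) != 4:
        raise ValueError("expected 1 or 4 scores, got %d" % len(scores))
    # Score set index: 0 = neither, 1 = net only, 2 = Bayes only, 3 = both.
    index = (2 if use_bayes else 0) + (1 if use_net else 0)
    return scores[index]


# Made-up example values for a hypothetical rule.
example_scores = [1.0, 1.5, 2.0, 2.5]
print(pick_score(example_scores, use_bayes=True, use_net=False))
```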
http://bugzilla.spamassassin.org/show_bug.cgi?id=4052
--- Additional Comments From [EMAIL PROTECTED] 2005-01-06 17:45 ---
'In future, if SpamAssassin converts all text strings in mails to UTF-8 before
applying the ruleset, the current version of Chinese_rules.cf will not work.
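A small Python sketch of why that would happen, assuming the rules are byte patterns written against GB2312-encoded text. The keyword and message below are made up for illustration and are not taken from Chinese_rules.cf.

```python
import re

# Assumption: a hypothetical keyword, expressed as a GB2312 byte pattern.
keyword = "免费"
gb2312_rule = re.compile(re.escape(keyword.encode("gb2312")))

# The same made-up message in two encodings.
message = "全部免费"
as_gb2312 = message.encode("gb2312")
as_utf8 = message.encode("utf-8")

# The byte pattern matches the GB2312 bytes but not the UTF-8 bytes,
# even though both encode the same characters.
matches_gb2312 = gb2312_rule.search(as_gb2312) is not None
matches_utf8 = gb2312_rule.search(as_utf8) is not None
print(matches_gb2312, matches_utf8)
```

A rule that should keep working after such a conversion would need to be written against the UTF-8 bytes (or against decoded characters) instead.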
http://bugzilla.spamassassin.org/show_bug.cgi?id=4052
--- Additional Comments From [EMAIL PROTECTED] 2005-01-06 17:53 ---
Subject: Re: SpamAssassin rules file: Chinese subject and body tests
Minor thought Justin - would it be feasible to have a utf option that
could be set in the
http://bugzilla.spamassassin.org/show_bug.cgi?id=4052
--- Additional Comments From [EMAIL PROTECTED] 2005-01-06 18:12 ---
yeah, I think tflags would be the best option -- it's a feature of the rule, not
of the user or the scanning host.
but no need to worry too much about it right
http://bugzilla.spamassassin.org/show_bug.cgi?id=4052
--- Additional Comments From [EMAIL PROTECTED] 2005-01-05 02:59 ---
1. In our experience, patterns that span 4 or more words are often more
effective at catching a small set of spam, but with very low false positive
rates,
http://bugzilla.spamassassin.org/show_bug.cgi?id=4052
--- Additional Comments From [EMAIL PROTECTED] 2005-01-04 02:31 ---
Dear Colleagues,
Thank you very much for your attention to the Chinese_rules.cf.
Yes, the recall/error rates have improved considerably since I started using the perceptron code
http://bugzilla.spamassassin.org/show_bug.cgi?id=4052
--- Additional Comments From [EMAIL PROTECTED] 2005-01-03 11:04 ---
Hi --
these look very interesting, and I like the methodology! (I also notice that
the recall/error rates have improved from the figures quoted in the