Re: [SAtalk] unlearning a lot of ham

2003-06-19 Thread Tony Earnshaw
jeff covey wrote:

> on Wed, Jun 18, 2003 at 10:50:22AM -0400%, Ross Vandegrift said:
>
> ross> if you retrain Bayes with the messages in question
>
> apparently, this thread has wandered far enough that my original
> message has been forgotten.  it's here:
> http://marc.theaimsgroup.com/?l=spamassassin-talk&m=105586966928433&w=2
>
> here's the question again:
>
> jeff> 75 spam messages have been recorded as ham.  i don't have
> jeff> the original messages, so i can't use sa-learn to unlearn
> jeff> them.  how can i keep the spam i've given the filter but
> jeff> wipe out all the ham, so i can start training ham from
> jeff> scratch?

AFAIK you can't. Either wipe out your Bayes database and begin again, or
wait for the token bias to cure itself - which, given time, it will, of
course. Depending on how much spam and non-spam you have in the database
and the daily volume of mail, the bias from 75 spam messages mislearned
as ham shouldn't take too long to decrease.
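
A rough sketch of why the bias should fade as more genuine ham is learned.
It uses a simplified relative-frequency token probability, which is an
assumption for illustration and not SpamAssassin's exact formula; the
counts (300 of 1000 spam, 60 hits among the mislearned messages) are made
up.

```python
def spam_probability(spam_hits, ham_hits, n_spam, n_ham):
    """Simplified per-token spam probability from relative frequencies."""
    spam_freq = spam_hits / n_spam if n_spam else 0.0
    ham_freq = ham_hits / n_ham if n_ham else 0.0
    if spam_freq + ham_freq == 0:
        return 0.5  # token never seen in either class: neutral
    return spam_freq / (spam_freq + ham_freq)


def bias_over_time(ham_totals, spam_hits=300, n_spam=1000, bad_ham_hits=60):
    """Probability of one spammy token as the total learned ham grows.

    bad_ham_hits is how often the token occurs in the spam that was
    mislearned as ham; it stays fixed while real ham keeps arriving.
    """
    return [
        round(spam_probability(spam_hits, bad_ham_hits, n_spam, n_ham), 3)
        for n_ham in ham_totals
    ]


if __name__ == "__main__":
    # Ham corpus grows from 500 to 10000 messages; the bogus hits are diluted.
    print(bias_over_time([500, 2000, 10000]))  # [0.714, 0.909, 0.98]
```

The bogus ham counts never go away in this toy model; they just matter
less as the ham total grows.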

Best,

Tony

--
Tony Earnshaw
Working to get a life

http://j-walk.com/blog/docs/conference.htm
http://www.billy.demon.nl
Mail: [EMAIL PROTECTED]


---
This SF.Net email is sponsored by: INetU
Attention Web Developers & Consultants: Become An INetU Hosting Partner.
Refer Dedicated Servers. We Manage Them. You Get 10% Monthly Commission!
INetU Dedicated Managed Hosting http://www.inetu.net/partner/index.php
___
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk


Re: [SAtalk] unlearning a lot of ham

2003-06-18 Thread Ross Vandegrift
On Tue, Jun 17, 2003 at 09:43:55PM -0400, jeff covey wrote:
> while the side issues are interesting, does anyone have answers for
> the actual questions that started this thread?  :)

LOL, yes - if you retrain Bayes with the messages in question and
specifically mention that it's spam, they should be removed from the ham
database.
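
A toy sketch of the relearning idea, assuming the learner remembers which
class each message ID was filed under. ToyBayesStore and its fields are
hypothetical names for illustration, not SpamAssassin's storage format. It
also shows why the original message is needed: without its tokens there is
nothing to subtract from the ham side.

```python
from collections import Counter


class ToyBayesStore:
    """Minimal token store that can move a message between classes."""

    def __init__(self):
        self.tokens = {"spam": Counter(), "ham": Counter()}
        self.totals = {"spam": 0, "ham": 0}
        self.seen = {}  # message id -> class it was last learned as

    def learn(self, msg_id, text, label):
        """Learn text as label; undo a previous opposite learning first."""
        previous = self.seen.get(msg_id)
        if previous == label:
            return False  # already learned this way, nothing to do
        words = set(text.lower().split())
        if previous is not None:
            self._adjust(previous, words, -1)  # forget the old class
        self._adjust(label, words, +1)
        self.seen[msg_id] = label
        return True

    def _adjust(self, label, words, delta):
        self.totals[label] += delta
        for word in words:
            self.tokens[label][word] += delta
            if self.tokens[label][word] <= 0:
                del self.tokens[label][word]


if __name__ == "__main__":
    store = ToyBayesStore()
    store.learn("m1", "hot stock tip", "ham")   # the mistake
    store.learn("m1", "hot stock tip", "spam")  # the correction
    print(store.totals)
```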



-- 
Ross Vandegrift
[EMAIL PROTECTED]

A Pope has a Water Cannon.   It is a Water Cannon.
He fires Holy-Water from it.It is a Holy-Water Cannon.
He Blesses it. It is a Holy Holy-Water Cannon.
He Blesses the Hell out of it.  It is a Wholly Holy Holy-Water Cannon.
He has it pierced.It is a Holey Wholly Holy Holy-Water Cannon.
He makes it official.   It is a Canon Holey Wholly Holy Holy-Water Cannon.
Batman and Robin arrive.   He shoots them.




Re: [SAtalk] unlearning a lot of ham

2003-06-18 Thread jeff covey
on Wed, Jun 18, 2003 at 10:50:22AM -0400%, Ross Vandegrift said:

ross> if you retrain Bayes with the messages in question

apparently, this thread has wandered far enough that my original
message has been forgotten.  it's here:

http://marc.theaimsgroup.com/?l=spamassassin-talk&m=105586966928433&w=2

here's the question again:

jeff> 75 spam messages have been recorded as ham.  i don't have
jeff> the original messages, so i can't use sa-learn to unlearn
jeff> them.  how can i keep the spam i've given the filter but
jeff> wipe out all the ham, so i can start training ham from
jeff> scratch?

thanks,

-- 
++
| jeff covey [EMAIL PROTECTED] http://pobox.com/~jeff.covey/ 410-869-8088 |
||
| I expect you will send me a great hep.-- freshmeat.net contributor |
++




Re: [SAtalk] unlearning a lot of ham

2003-06-17 Thread Jim Ford
On Tue, Jun 17, 2003 at 12:53:57PM -0400, jeff covey wrote:
> hello, all.

> # train spamassassin's bayesian filter
> macro index S | sa-learn --single --spam -D
> macro pager S | sa-learn --single --spam -D
> # report spam to razor
> macro index Z | spamassassin -r -D
> macro pager Z | spamassassin -r -D

From what I understand 'spamassassin -r' updates the bayesian database as
well as reporting to Razor, so the first 2 macros are redundant. I don't think
that this has anything to do with your problem, though.

Regards: Jim Ford


-- 
Spam poison - don't use! --- [EMAIL PROTECTED] ---




Re: [SAtalk] unlearning a lot of ham

2003-06-17 Thread Jim Ford
On Tue, Jun 17, 2003 at 02:26:14PM -0400, jeff covey wrote:
> on Tue, Jun 17, 2003 at 07:06:52PM +0100%, Jim Ford said:
>
> jim> so the first 2 macros are redundant.
>
> i don't understand what you're trying to say.  one learns spam (for
> spam i don't want to report) and one learns and reports it.  you'll
> need to explain what's redundant.

I didn't realise you'd want to learn spam but not report it!
Why would you want to do this - I always report spam that slips through to
Razor?

Regards: Jim Ford

-- 
Spam poison - don't use! --- [EMAIL PROTECTED] ---




Re: [SAtalk] unlearning a lot of ham

2003-06-17 Thread jeff covey
on Tue, Jun 17, 2003 at 09:22:18PM +0100%, Jim Ford said:

jim> I didn't realise you'd want to learn spam but not report it!
jim> Why would you want to do this - I always report spam that
jim> slips through to Razor?

i assume that other people could use reports of generic spam i've
received, but that it won't help them to have a checksum of a message
which is customized with my name, email address, etc.  it's not going
to help me to know that you received a spam which says "dear jim ford,
here's a hot stock tip just for you..."; i doubt that i'll ever
receive a copy with your name.  am i not understanding something about
how this works?

-- 
++
| jeff covey [EMAIL PROTECTED] http://pobox.com/~jeff.covey/ 410-869-8088 |
||
|   You have reached the freshmeat.net staff, and we don't understand you.   |
|  -- Ray Shaw   |
++




Re: [SAtalk] unlearning a lot of ham

2003-06-17 Thread Jim Ford
On Tue, Jun 17, 2003 at 06:44:06PM -0400, jeff covey wrote:

> i assume that other people could use reports of generic spam i've
> received, but that it won't help them to have a checksum of a message
> which is customized with my name, email address, etc.  it's not going
> to help me to know that you received a spam which says "dear jim ford,
> here's a hot stock tip just for you..."; i doubt that i'll ever
> receive a copy with your name.  am i not understanding something about
> how this works?

Interesting point! I understand DCC uses checksums, so the above point should
apply. I wonder if Razor would also be affected?

Regards: Jim Ford

-- 
Spam poison - don't use! --- [EMAIL PROTECTED] ---




RE: [SAtalk] unlearning a lot of ham

2003-06-17 Thread Balam Willemsen
?!? 

Both Razor and DCC are supposed to be largely insensitive to this.

E.g., from http://www.rhyolite.com/anti-spam/dcc/FAQ.html:

  Do the fuzzy checksums ignore personalizations?

  Yes, they ignore many so called personalizations.
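
To make "ignoring personalizations" concrete, here is a toy sketch of a
normalizing checksum. It is not DCC's or Razor's actual algorithm; the
fuzzy_digest name and the normalization rules (dropping addresses, URLs,
salutation names and digits before hashing) are assumptions chosen for
illustration.

```python
import hashlib
import re


def fuzzy_digest(body):
    """Hash a message body after stripping likely per-recipient details."""
    text = body.lower()
    text = re.sub(r"\S+@\S+", " ", text)               # email addresses
    text = re.sub(r"https?://\S+", " ", text)          # URLs, incl. tracking ids
    text = re.sub(r"dear\s+[^,\n]*,", "dear ,", text)  # name in the salutation
    text = re.sub(r"\d+", " ", text)                   # stray numbers
    text = re.sub(r"[^a-z]+", " ", text).strip()       # collapse what is left
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    a = ("Dear Jim Ford, here's a hot stock tip just for you: "
         "http://example.com/?id=123 jim@example.org")
    b = ("Dear Jeff Covey, here's a hot stock tip just for you: "
         "http://example.com/?id=456 jeff@example.org")
    print(fuzzy_digest(a) == fuzzy_digest(b))
```

Two copies that differ only in name, address and tracking ID reduce to the
same normalized text here, so a report of one would match the other.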

Balam

> -----Original Message-----
> From: Jim Ford [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, June 17, 2003 5:08 PM
> To: [EMAIL PROTECTED]
> Subject: Re: [SAtalk] unlearning a lot of ham
>
> On Tue, Jun 17, 2003 at 06:44:06PM -0400, jeff covey wrote:
> > i assume that other people could use reports of generic spam i've
> > received, but that it won't help them to have a checksum of a message
> > which is customized
>
> Interesting point! I understand DCC uses checksums, so the above point
> should apply. I wonder if Razor would also be affected?




Re: [SAtalk] unlearning a lot of ham

2003-06-17 Thread jeff covey
while the side issues are interesting, does anyone have answers for
the actual questions that started this thread?  :)

most importantly:  can i remove all the (bogus) ham data in my bayes
database without removing any of the (valid) spam data i've collected?

as for reporting leading to learning as ham, i was directed to:

   http://bugzilla.spamassassin.org/show_bug.cgi?id=2027

and replied to theo privately with:

   http://jeffcovey.net/tmp/sa/

i hope that helps.

thanks,

-- 
++
| jeff covey [EMAIL PROTECTED] http://pobox.com/~jeff.covey/ 410-869-8088 |
||
|Hello there, I looked up on your website and I saw many good product there  |
|and I think I interest on your product : -do you have in stock Casio|
|Calculators CASIO GRAPH CALC? -CASIO FX7400G ?  -- freshmeat.net contributor|
++




Re: [SAtalk] unlearning a lot of ham

2003-06-17 Thread jeff covey
on Tue, Jun 17, 2003 at 05:30:52PM -0700%, Balam Willemsen said:

q  Do the fuzzy checksums ignore personalizations?

a  Yes, they ignore many so called personalizations.

err.  thanks, but that's not really enough information to tell me
anything.  what is considered a "personalization"?  does that just
mean it will ignore email addresses and "dear $name,", or will it also
deal well with something like:

http://www.DirectPartners.net/rotation.asp?RefId=78868&OptinId=117181814

?

well, i don't really need to know the details.  if it's true that
razor, pyzor, etc. somehow do the right thing(tm) with all spam, no
matter how it's tailored, just knowing that would be enough for me,
and i'll be happy to report it all.  i just don't want to abuse
reporting services by submitting information no one else can use.

thanks,

-- 
++
| jeff covey [EMAIL PROTECTED] http://pobox.com/~jeff.covey/ 410-869-8088 |
||
| Hi!  I think that this site, would be more photos about them.  I hope for  |
| answer!!!...  ...If you could send some photo... I´ll thank´s      |
|-- freshmeat.net contributor|
++

