Re: What's the best way to set up POPFile with The Bat!

2004-09-06 Thread MAU
Hello Doug,

 I'm thinking particularly about whether it is best to have it edit the
 subject line or add an x-header, but any other tips will be welcome!

Use x-header. You can filter on it just as well and, if you reply, you
don't bother anyone with a modified subject.
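
A minimal sketch of what filtering on the header amounts to, written in Python rather than as a The Bat! sorting rule. It assumes the header is named X-Text-Classification and holds the bucket name; the folder names and the folder_for helper are hypothetical and only illustrate the idea.

```python
from email import message_from_string

# Hypothetical mapping from POPFile bucket name to a mail folder.
FOLDERS = {"spam": "Junk", "newsletters": "Lists"}

def folder_for(raw_message: str, default: str = "Inbox") -> str:
    """Pick a folder from the X-Text-Classification header, if present."""
    msg = message_from_string(raw_message)
    # Header lookup is case-insensitive; a missing header yields None.
    bucket = (msg.get("X-Text-Classification") or "").strip().lower()
    return FOLDERS.get(bucket, default)

sample = (
    "From: someone@example.com\n"
    "Subject: Cheap watches\n"
    "X-Text-Classification: spam\n"
    "\n"
    "Buy now\n"
)
print(folder_for(sample))  # prints "Junk"
```

The subject stays untouched, so nothing needs cleaning up when replying or forwarding.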

-- 
Best regards,

Miguel A. Urech (El Escorial - Spain)
Using The Bat! v3.0.0.7





Current version is 3.00.00 | 'Using TBUDL' information:
http://www.silverstones.com/thebat/TBUDLInfo.html


Re: What's the best way to set up POPFile with The Bat!

2004-09-06 Thread admin

 Hello,

   I'm thinking particularly about whether it is best to have it edit the
   subject line or add an x-header, but any other tips will be welcome!

I make it alter the subject line so that:

a) I know what I am sending if I forward or reply to the email
(manually delete any addition)...

b) I can click on it to get directly to the change classification
page of POPFile.

   Can I expect it to be faster than the current default plugin in V3.0?

In my experience you can expect it to be 100 per cent faster, i.e.
BayesIt kills The Bat! every time and POPFile doesn't. And I find POPFile
so much easier to control - I didn't have a clue what was going on with
BayesIt.

-- 
Marten Gallagher
Annery Kiln Web Design
www.annerykiln.co.uk
Using The Bat! 3.0
with POPFile 0.21.1
on Windows XP 5.1 








Re: What's the best way to set up POPFile with The Bat!

2004-09-06 Thread Alexander S. Kunz
Hello Doug Weller,

06-Sep-2004 20:37, you wrote:

 I'm thinking particularly about whether it is best to have it edit the
 subject line or add an x-header, but any other tips will be welcome!

I don't like the subjects to be altered (you have to think of it each time
when you reply/forward unless you clean the subject in TB! with a
regex-enhanced template for replies and forwards) - obviously, I'm using
the x-text-classification header. :)
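
For anyone who does let POPFile alter the subject, the cleanup mentioned above comes down to one regular expression. The sketch below shows it in Python, not in The Bat!'s template macro syntax. It assumes the tag is the bucket name in square brackets at the start of the subject; the BUCKETS list and the clean_subject name are hypothetical.

```python
import re

# Hypothetical bucket names; only these tags are stripped, so list tags
# such as "[TBUDL]" are left alone.
BUCKETS = ("spam", "genuine")

# Optional Re:/Fwd: prefixes are captured and kept, the [bucket] tag is dropped.
TAG = re.compile(
    r"^((?:(?:re|fwd?):\s*)*)\[(?:%s)\]\s*" % "|".join(map(re.escape, BUCKETS)),
    re.IGNORECASE,
)

def clean_subject(subject: str) -> str:
    """Remove one leading [bucket] tag, keeping any Re:/Fwd: prefix."""
    return TAG.sub(r"\1", subject, count=1)

print(clean_subject("Re: [genuine] TB! templates"))  # prints "Re: TB! templates"
```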

This has other advantages: you can configure TB! to show the header, as
well as the reclassification URL of PopFile in the header pane of the
preview window, thus checking whether the classification is correct and, in
case PopFile made a wrong classification, reclassify is only one click
away. *nice*


 Can I expect it to be faster than the current default plugin in V3.0?

Depending on how much mail you get (well, being on this list alone brings
quite some each day ATM...) it may be slower (but I used BayesIt only a
very short time). After all PopFile is written in Perl, and Bayes analysis
requires a database (mine is 1.4MB right now).

And be aware that PopFile can be quite a resource hog - using concurrent
POP connections results in 100% CPU-load peaks during mail retrieval, and
memory usage is something like 20MB here, always.

Another note: of course, every Bayes filter requires training - it needs to
know which messages are spam, which are ham, at least (but keeping only spam
and ham apart is not all PopFile will do for you, of course). That is so
much easier in PopFile with the use of magnets. For example, set up a
magnet that will put every mail from this list into a bucket genuine (or
whatever you want to call it) - PopFile learns from every magneted message
and you have a big ham database in no time.
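
A toy sketch of the magnet idea in Python, assuming only what is described above: a magnet decides the bucket outright, and the magneted message still counts as training data. This is not POPFile's code. A plain word-count score stands in for the real Bayesian maths, and ToyFilter and its method names are hypothetical.

```python
from collections import Counter, defaultdict

class ToyFilter:
    """Toy word-count classifier with magnets; not POPFile's implementation."""

    def __init__(self):
        self.words = defaultdict(Counter)  # bucket -> word counts
        self.magnets = {}                  # sender substring -> bucket

    def add_magnet(self, sender_part, bucket):
        self.magnets[sender_part.lower()] = bucket

    def train(self, bucket, text):
        self.words[bucket].update(text.lower().split())

    def classify(self, sender, text):
        # A matching magnet fixes the bucket and the message is
        # also fed back in as training data.
        for part, bucket in self.magnets.items():
            if part in sender.lower():
                self.train(bucket, text)
                return bucket
        # Otherwise pick the bucket whose trained words overlap most.
        tokens = text.lower().split()
        scores = {b: sum(c[t] for t in tokens) for b, c in self.words.items()}
        return max(scores, key=scores.get) if scores else "unclassified"

f = ToyFilter()
f.add_magnet("tbudl", "genuine")
f.train("spam", "cheap watches buy now")
print(f.classify("tbudl@example.com", "regex template for replies"))  # magnet hit
print(f.classify("stranger@example.com", "template for forwards"))    # learned words
```

Both calls print "genuine": the first because of the magnet, the second because the magneted message already taught the filter the words "template" and "for".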

JFYI: I have seven buckets in PopFile (english, german, spam, invoices,
newsletters, ebay, orders) and have run the program since 21-Jul-2004. Accuracy
is 98.51% at the moment, with a total of 4313 messages classified and 64
classification errors.
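
The accuracy figure can be checked from the two counts. The sketch below assumes the percentage is truncated to two decimals rather than rounded (rounding would give 98.52%); that is an inference from the numbers above, not something confirmed about how POPFile computes it. The function name is hypothetical.

```python
def accuracy_percent(classified: int, errors: int) -> float:
    """Accuracy in percent, truncated (not rounded) to two decimals."""
    correct = classified - errors
    # Integer arithmetic avoids float rounding before the truncation.
    return (correct * 10000 // classified) / 100

print(accuracy_percent(4313, 64))  # prints 98.51
```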

-- 
Best regards,
 Alexander (http://www.neurowerx.de - ICQ 238153981)

I'm afraid of losing my obscurity. Genuineness only thrives in the dark.
Like celery. (Aldous Huxley, 1894-1963, British author)





Re: What's the best way to set up POPFile with The Bat!

2004-09-06 Thread MAU
Hello Alexander,

 JFYI: I have seven buckets in PopFile (english, german, spam, invoices,
 newsletters, ebay, orders) and have run the program since 21-Jul-2004. Accuracy
 is 98.51% at the moment, with a total of 4313 messages classified and 64
 classification errors.

Similarly to you, I use 9 buckets, two of them for Spanish and English,
and this is my performance:

Classification Accuracy
Messages classified: 192,991 
Classification errors: 323
Accuracy: 99.83% 

I use one of the buckets as a virus _detector_ and, including the
training period, these are the results so far:

Bucket virus
Messages classified: 8,219 (4.25%)
False Positives: 21
False negatives: 56

And it hasn't had any wrong classification in many months now.

-- 
Best regards,

Miguel A. Urech (El Escorial - Spain)
Using The Bat! v3.0.0.7







Re: What's the best way to set up POPFile with The Bat!

2004-09-06 Thread Doug Weller
Hi Alexander,


Monday, September 6, 2004, 9:45:09 PM, you wrote:

Alexander Hello Doug Weller,

Alexander 06-Sep-2004 20:37, you wrote:

 I'm thinking particularly about whether it is best to have it edit the
 subject line or add an x-header, but any other tips will be welcome!

Alexander I don't like the subjects to be altered (you have to think of it each time
Alexander when you reply/forward unless you clean the subject in TB! with a
Alexander regex-enhanced template for replies and forwards) - obviously, I'm using
Alexander the x-text-classification header. :)

I agree.  My ISP filters spam and adds a header which I can configure,
but I think I shall turn that off as I hate having ham which has
[probable spam] in the subject line!

Alexander This has other advantages: you can configure TB! to show the header, as
Alexander well as the reclassification URL of PopFile in the header pane of the
Alexander preview window, thus checking whether the classification is correct and, in
Alexander case PopFile made a wrong classification, reclassify is only one click
Alexander away. *nice*

Good idea.


 Can I expect it to be faster than the current default plugin in V3.0?

Alexander Depending on how much mail you get (well, being on this list alone brings
Alexander quite some each day ATM...) it may be slower (but I used BayesIt only a
Alexander very short time). After all PopFile is written in Perl, and Bayes analysis
Alexander requires a database (mine is 1.4MB right now).

I hope it's not slower; filtering already seems to be slowing down email
delivery for me. And my wife's is terrible: 23 emails take 3 minutes, and
I have no idea why, as she has a very small database. She's also using
v.3, but that made no difference, nor did getting rid of the dozen .tmp
files, compressing, etc.

Alexander And be aware that PopFile can be quite a resource hog - using concurrent
Alexander POP connections results in 100% CPU-load peaks during mail retrieval, and
Alexander memory usage is something like 20MB here, always.

Alexander Another note: of course, every Bayes filter requires training - it needs to
Alexander know which messages are spam, which are ham, at least (but keeping only spam
Alexander and ham apart is not all PopFile will do for you, of course). That is so
Alexander much easier in PopFile with the use of magnets. For example, set up a
Alexander magnet that will put every mail from this list into a bucket genuine (or
Alexander whatever you want to call it) - PopFile learns from every magneted message
Alexander and you have a big ham database in no time.

I shall do that, good idea.

[SNIP]

Thanks to everyone who replied.

Doug


-- 
Doug Weller  Moderator, sci.archaeology.moderated
The Bat! 3.0
Doug and Helen's Dogs: http://www.dougandhelen.com
Doug's Archaeology Site: http://www.ramtops.co.uk


