TBDEV Mission Statement

2003-03-31 Thread Marck D Pearlstone
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Greetings Dev Listers,

This is your monthly message from the moderation team to remind you
of the primary purpose of this discussion list.

To review the list rules, follow the link at the end of this message.


 TBDEV Mission statement

The TBDEV list has been set up for the purpose of discussing
programming issues relating to The Bat!, particularly issues that
deal with writing plug-ins.

Simply send a message to [EMAIL PROTECTED] to post it to the
whole list.

See the notes at the end of this message for details about how to
manage your list membership or to leave the list.


It would probably be a good idea for you to set up a folder to keep
TBDEV messages in. If you do this, the next most useful thing to
have in place is an automatic filter to move mail from your inbox
into your TBDEV folder.

Set up a filter for incoming mail which looks for:

Strings              Location   Presence
Reply-to:.*TBDEV@    Kludges    Yes

Options: Regular expressions (Checked).
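
To see what the regular expression above would match, here is a
minimal sketch in Python. It is only an illustration and not how
The Bat! evaluates its filters; the function name and the
case-insensitive matching are assumptions.

```python
import re

# Pattern taken from the filter above. Case-insensitive matching is an
# assumption made for this sketch, not documented behaviour of The Bat!.
LIST_HEADER = re.compile(r"Reply-to:.*TBDEV@", re.IGNORECASE)


def is_tbdev_message(raw_headers: str) -> bool:
    """Return True if any header line matches the TBDEV filter pattern."""
    return any(LIST_HEADER.search(line) for line in raw_headers.splitlines())
```

The addresses used to try it out can be any placeholders; only the
part before the "@" matters to the pattern.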


TBOT - The Bat off topic discussion list

One of our members has created a list for those occasional off topic
discussions of public interest. Please feel free to join this list,
where many of our readership currently participate.

Addresses:
Post message: [EMAIL PROTECTED]
Subscribe:    [EMAIL PROTECTED]
Unsubscribe:  [EMAIL PROTECTED]


Important disclaimer

The moderators and list administrators are not affiliated with RIT
Labs or the development of TB, although the developers are themselves
members of the list and will sometimes chip in. If you wish to contact
the developers, please use The Bat! main menu Help .. Feedback options
or, if you need to write to the programmers directly, send mail to:

   Stefan Tanurkov [EMAIL PROTECTED] or
   Max Masyutin [EMAIL PROTECTED]

Thank you for joining, and we hope that you find this list of use.


Contacts

 THE FOLLOWING E-MAIL ADDRESSES SHOULD ONLY BE USED WHEN YOU NEED TO
 CONTACT THE ** LIST MODERATORS OR LIST ADMINISTRATORS **. DO NOT
 SEND THE BAT! RELATED QUESTIONS HERE; THEY WILL *NOT* GET ANSWERED.

If you are having difficulties with the list, or one of its members,
please contact one of the list moderators. If you need to send a
message to the list moderation team, please send mail to:
[EMAIL PROTECTED] or [EMAIL PROTECTED]


This list is moderated by the following persons of ill repute:

  Marck D Pearlstone [EMAIL PROTECTED]
  Leif Gregory [EMAIL PROTECTED]

Primary list administration is performed by
  Marck D Pearlstone [EMAIL PROTECTED]


   The TBDEV list is hosted free of charge by Johannes Posel.

For a full list of the rules for the use of this list please refer 
to: http://www.silverstones.com/thebat/subtbdev.txt or click here:
mailto:[EMAIL PROTECTED]


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1rc1-nr1 (Windows 2000) - GPGshell v2.60

iD8DBQE+JstkOeQkq5KdzaARAg46AJ0YPxoifRrafTbUxfmJAZQL/OBYhQCeO2oA
WoLFsOAycdoz7Rx/EaiW1/Q=
=IPCM
-----END PGP SIGNATURE-----




Current version is 1.62 | Using TBDEV information:
http://www.silverstones.com/thebat/TBUDLInfo.html


Bayesian filter - bug fixed (still a test pre-release!)

2003-03-31 Thread Alexey N. Vinogradov
Hello, tbdev.

One bug has been fixed in the Bayesian filter.
The bug was that if a letter contained a token consisting entirely of
"!" characters, an error occurred during degeneration and the filter
failed. So any letter that included this kind of token appeared to be
non-spam because of this failure.
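
To illustrate the kind of failure described above, here is a minimal
sketch in Python. It is not the plugin's code. It assumes that
"degeneration" means progressively simplifying a token (dropping
trailing "!" characters and lowercasing) until a known variant is
found; the function name is hypothetical.

```python
def degenerate(token: str) -> list[str]:
    """Return progressively simpler variants of a token, most specific first.

    Assumed scheme (illustration only): strip one trailing "!" at a time
    and also try the lowercase form of each step.
    """
    variants: list[str] = []
    current = token
    while current:
        for candidate in (current, current.lower()):
            if candidate not in variants:
                variants.append(candidate)
        if not current.endswith("!"):
            break
        current = current[:-1]
    # A token made up entirely of "!" leaves no word behind once the
    # exclamation marks are stripped, so it yields no usable variants.
    # Returning an empty list lets the caller skip the token instead of
    # failing on an empty string.
    return [v for v in variants if v.strip("!")]
```

With this guard, a token such as "!!!" produces an empty list, and the
caller can ignore it rather than abort the whole letter.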

You can download the fixed version here:

http://klirik.narod.ru/arc/baesnolog.tbp

http://klirik.narod.ru/arc/baes.tbp

(I still recommend that you use the latest (logging) version, so that
you can send me a log if any bug arises).

So far, no other serious errors have been found.

In my own testing: since the first build I have received 92 spam letters
and about 25 non-spam letters (now you understand why I began to write
the filter :). Among these letters I had no false positives (i.e. none
of my good mail was accidentally deleted as spam) and 1 false negative
(i.e. one spam letter came through to my mailbox). There were also about
10 spam letters misclassified as non-spam because of the just-fixed bug.
I have since refiltered these letters and all of them were recognized as
spam. So the overall effectiveness (at the moment) is:

0%   (0 of 25) false positives and
1.1% (1 of 92) false negatives.
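
For reference, the percentages above are simple ratios of
misclassified letters to all letters of that kind. A small Python
sketch of the arithmetic follows; the function name is only
illustrative.

```python
def error_rate(errors: int, total: int) -> float:
    """Return the share of misclassified letters as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * errors / total


# Counts taken from the message above.
false_positive_rate = error_rate(0, 25)  # good mail marked as spam
false_negative_rate = error_rate(1, 92)  # spam that reached the mailbox
print(f"{false_positive_rate:.1f}% false positives, "
      f"{false_negative_rate:.1f}% false negatives")
```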

I use a regarding base of 650 spam and about 800 non-spam letters.

Future plans:

   1. A new rbd-generating engine (the principle is the same, but the
   user interface will change and some options will be added). It also
   seems a good idea to recognize PGP- or S/MIME-encrypted messages
   automatically and do something with them: either discard them
   entirely or at least keep them as hash values in order to reduce
   the dictionary.

   2. Filter settings will be stored in the registry. Alternatively, I
   found that if TBP_NeedConfig returns -1, The Bat! itself adds a
   [Filterdata] section to TBPlugin.INI. This section is empty for
   now, but I think that in future The Bat! developers will make it
   possible to store settings locally for every mailbox (settings in
   the registry would be global).

   3. Adapt rbd generation to other mailbase formats, because as far
   as I know SecureBat also exists and keeps its mailbases encrypted.
   For that particular program the problem can be solved by supporting
   other mailbase formats, for example unix-mailbox.

   4. A self-training feature. My current guess is that it could ask
   the user after every 50 received letters (for example) to confirm
   the grade of all letters or, alternatively, to confirm only the
   questionable letters whose automatically assigned spaminess falls
   in some definite interval (21-80%, for example). After that the
   new grades will be appended to regard.rbd. So the base will always
   be fresh and it won't be necessary to use the rbd-generating
   engine to refresh it.
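
The selection step from idea 4 could look roughly like the Python
sketch below. It only shows how letters inside the questionable
interval might be picked out for confirmation. The function names and
the dictionary of scores are assumptions for illustration, and the
21-80% bounds are just the example values mentioned above.

```python
def needs_confirmation(spaminess: float, low: float = 21.0,
                       high: float = 80.0) -> bool:
    """Return True if the spaminess (in percent) is in the unsure interval."""
    return low <= spaminess <= high


def triage(scores: dict[str, float]) -> tuple[list[str], list[str]]:
    """Split {letter id: spaminess %} into (questionable, confident) ids."""
    questionable: list[str] = []
    confident: list[str] = []
    for letter_id, spaminess in scores.items():
        if needs_confirmation(spaminess):
            questionable.append(letter_id)
        else:
            confident.append(letter_id)
    return questionable, confident
```

Only the letters in the first list would be shown to the user; the
rest could be appended to the base with their automatic grade.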

   These are my own ideas. Does anyone else have any?
   

-- 
Sincerely,
 Alexey.
Using TB 1.63b7 on WinXP SP1 Corp + MUI RU, spelling by ORFO2002
  mailto:[EMAIL PROTECTED]





Re[2]: problems with bayesian filter

2003-03-31 Thread Alexey N. Vinogradov
Hello, rhabib001. 
You wrote in mid:[EMAIL PROTECTED]


ryc One thing I don't understand about the log is that if multiple
ryc messages are downloaded, the log doesn't reflect this.  Is this a bug?

As far as I understand, The Bat! can run multiple instances of the
filtering procedure at the same time. For this reason it gives every
call a number, usually starting from 1001. That is why I also write
this unique number into the log (it is unique within the current
The Bat! session). You can see these numbers at the beginning of every
line logged during a mail check. There are no such numbers on lines
for global plugin functions like getname or getversion. Your log
combines the logs of the first version and the latest one, so there
are no such numbers at all at the beginning of the log.
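
If it helps when reading such a log, lines can be grouped by their
leading call number, as in the Python sketch below. The exact line
layout ("<number> <text>") is an assumption made for illustration; the
real log format may differ, and the function name is hypothetical.

```python
import re
from collections import defaultdict

# Assumed line shape: "<call number> <text>", e.g. "1001 checking message".
CALL_LINE = re.compile(r"^(\d+)\s+(.*)$")


def group_by_call(log_lines):
    """Group log lines by leading call number; unnumbered lines go under None."""
    groups = defaultdict(list)
    for line in log_lines:
        match = CALL_LINE.match(line)
        if match:
            groups[int(match.group(1))].append(match.group(2))
        else:
            # Global plugin calls carry no number in this assumed layout.
            groups[None].append(line)
    return dict(groups)
```

Each numbered group then corresponds to one filtering call, so several
messages downloaded in one mail check show up as separate groups.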

 If any of them includes spam, then the real problem is in your
 regarding base: either you confused the spam and non-spam corpora
 when you created your regarding base, or your regarding base is not
 large enough yet.

ryc I have trained the good dictionary on 745 letters, but the spam
ryc dictionary on only 35 letters.  Could this be the problem (I have
ryc attached the regard.rdb file).

This is a feature of the method itself (you can investigate it from
a mathematical viewpoint): the sizes of the spam and non-spam bases
(counted in letters) ought to be equal. Simply speaking, your base
knows very well what is not spam, but has a relatively hazy idea of
what is spam. You need more spam for it to work, but this is a general
problem with this method of filtering! You can, of course, download
a spam base from somewhere, but the problem is that spam differs from
country to country. The main strength of this method is that all
users' regarding bases are different, because their grades also
include knowledge of each user's own private mail. So it is very hard
for spammers to cheat many such filters simultaneously. On the other
hand, the spam base does not seem to differ so much from user to user,
because spam is a mass mailing. So you can ask a friend to send you a
lot of (real) spam and build a better base. Or you can just take a
subset of your good letters and build a new base with roughly equal
quantities of spam and non-spam.
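
The last suggestion (taking only some of the good letters so that both
corpora are about the same size) can be sketched in Python as below.
This is an illustration only and not part of the plugin or its
rbd-generating engine; the function name and the fixed random seed
are assumptions.

```python
import random


def balance_corpora(good, spam, seed=0):
    """Return equally sized random samples of good and spam letters.

    The larger corpus is cut down to the size of the smaller one.
    A fixed seed keeps the selection repeatable.
    """
    size = min(len(good), len(spam))
    rng = random.Random(seed)
    return rng.sample(list(good), size), rng.sample(list(spam), size)
```

With 745 good letters and 35 spam letters this keeps all 35 spam
letters and picks 35 of the good ones. Collecting more spam remains
the better fix, since a base built from so few letters is small.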

-- 
Sincerely,
 Alexey.
Using TB 1.63b7 on WinXP SP1 Corp + MUI RU, spelling by ORFO2002
   mailto:[EMAIL PROTECTED]


