-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

>2. The code is very monolithic and tangled. That makes it hard to extend
>the app with "plug-ins". I personally (and I think somebody else said so
>too) would like to see SA not only as "the application" SpamAssassin with
>its stable codebase but as something like a framework on which other
>people might easily develop (eg. without reimplementing all that
>mail-parsing stuff) their own anti-spam solutions. That would also fit
>together with us becoming an own toplevel project (or what that's called)
>in the ASF.

Yes!  I have been planning to work on this.  There's a bug open ;)

I will post a spec on the proposed plugin API, for eval tests at
least.  I may not be able to include new learner methods in *that*
API just yet.  let's see.

> As a first concrete step I propose (something like) this move of modules 
> (annotations on the line below the module, you should have tabs of eight 
> spaces):
> ArchiveIterator.pm    Util/ArchiveIterator.pm
>                               The backend is split into the following
>                               modules:
>                       Util/Archive.pm
>                       Util/Archive/MBox.pm
>                       Util/Archive/Maildir.pm

+1 on that. although see below re "split-level".

> AutoWhitelist.pm      Rules/Autowhateverlist.pm
>                               We really need to rename that one :)

not sure if this is still required.

> Bayes.pm              Learner/Bayes.pm
> BayesStore.pm         Learner/Bayes/Store.pm
>                               The above is a factory for the correct 
>                               Storage module.
>                       Learner/Bayes/StoreDBM.pm
>                               That's DB_File or whatever we currently 
>                               use.
>                       Learner/Bayes/StoreSQL.pm
>                               Not yet available :)
>                       Learner/Fitz.pm
>                               Just a sample for possible future 
>                               plug-ins

OK -- one piece of feedback here.  Having two levels:

        Learner/Bayes.pm
        Learner/Bayes/Foo.pm

I really do *not* like this split-level approach -- that causes lots of
trouble already with the existing Mail::SpamAssassin class, which is only
done like that because CPAN requires it.

It should be e.g.

        Learner/Bayes/Bayes.pm
        Learner/Bayes/Foo.pm

this is more amenable to grepping and comprehension IMO.

I haven't bothered commenting on the remaining instances of "split-level"
below, but consider them all objected-to on these grounds ;)

> CmdLearn.pm           App/Learner.pm
>                               This isn't really a lib but an application
>                               of its own. I don't like this, some stuff 
>                               really ought to go into sa-learn.pl while
>                               some "lib" stuff might stay here and 
>                               shareable code should go under Util/ or so.

the policy was to keep as much perl code as possible outside of the .pl's
in future.   I like the "App" prefix though.

>                       App/Daemon/*.pm
>                               spamd is currently very monolithic, I'd 
>                               like to put stuff like the qmail and BSMTP 
>                               features into their own modules so spamd 
>                               is smaller and easier to maintain.

+1

> Conf.pm                       Conf.pm
>                               The rewritten Conf.pm.
>                       Conf/ConfSQL.pm
>                       Conf/Conf*.pm
>                               "optional" features/plug-ins should get 
>                               their own config module which define the
>                               options available.
>                       Conf/Store.pm
>                               A factory for the config storage backend.
>                       Conf/StoreFile.pm
>                               Implementation of the standard config file
>                               format.
> ConfSourceSQL.pm      Conf/StoreSQL.pm
>                               And for SQL.

storage? eek.

How will we support storage of Conf settings?  we don't yet.

> DBBasedAddrList.pm    Rules/AddrList.pm
>                               Anybody got a better name for this?
>                       Rules/AddrList/StoreDBM.pm
>                       Rules/AddrList/StoreSQL.pm
>                               The Backends.
> Dns.pm                        Rules/DNSDBs.pm

Rule/DNSBL.pm I suggest.

An approach: we should avoid plurals if possible.

>                       Rules/SPF.pm
>                       Rules/Habeas.pm
>                       Rules/?.pm
>                               "DNS" is too generic, those all use DNS.
>                               The first name is jst a suggestion, see [3].
>                       Util/DNS.pm
>                               Split rules and helper routines. (If still
>                               necessary, I didn't look at that file for 
>                               some time.)

+1.  

> EvalTests.pm          Rules.pm
>                               I think we need a factory or some general 
>                               management module for all the rules.
>                       Rules/Default.pm
>                               Anybody a better name for this?
> HTML.pm                       Parser/HTML.pm
> Locales.pm            Rules/Locales.pm
> Locker.pm             Util/Locker.pm
>                               This is probably just a factory or 
>                               something, see below.
> MIME.pm                       Message.pm
>                               No PerMsg* or Msg* stuff please. Just 
>                               Message.

ok, agreed on this name I think.  But regarding the PerMsg prefix;
that exists for a good reason -- see PerMsgStatus below.

> MIME/Parser.pm                Parser/MIME.pm
> MailingList.pm                Rules/Mailinglists.pm

should be gone by now!

> NetSet.pm             Util/NetSet.pm
>                               WTF is this? Better name? Other location?

no, that's right. +1
> NoMailAudit.pm
>                       Will end in Message.pm, I think?

> PerMsgLearner.pm      Learner.pm
>                               A factory? I'm not sure.
> PerMsgStatus.pm               Status.pm
>                               Shorter is better :)

Now -- the issue with the PerMsg prefix is this.   Both of these are
created by the Mail::SpamAssassin factory on a per-message basis.
I would *strongly* like to keep that clear in the object name.

In fact, IMO we should *not* change this as it's a *very* user-visible
change and will require code changes on their behalf, for little
benefit.

Suggestions?

> PersistentAddrList.pm
>                               Dunno. Looks like a factory but don't ask 
>                               me where to put it.

now that we only support DBBasedAddrList, I don't think we need to
keep this as a factory interface -- and can drop this class.

> Received.pm           Parser/Trace.pm
>                               I think we parse other trace headers (like 
>                               Delivered-To) here, too.

nope ;)  just Received hdrs, so Parser/Received.pm is better.

> Reporter.pm           
>                               Ummm... dunno.

Report/Reporter.pm?  and split out the pyzor/razor reporting into
their own classes.

> SHA1.pm                       Util/DigestSHA1.pm
> TextCat.pm            Rules/TextCat.pm
>                               or Util/TextCat.pm?

Rules IMO

> UnixLocker.pm         Util/Locker/Unix.pm
>                               See above.
ok

> Util.pm                       Util.pm
>                               Some general routines?
>                       Util/*
>                               There might be other stuff...

Util/Util.pm ;)

> Win32Locker.pm                Util/Locker/Windows.pm
>                               See above.
>                       Store.pm
>                       Store/DBM.pm
>                       Store/SQL.pm
>                               And finally the backends for general storage
>                               access.
>                                 
> Whew. If that above confused you a bit, it would gave us a nice and clean 
> structure/API like
>       Mail/
>               SpamAssassin.pm
>               SpamAssassin/
>                       Status.pm
>                       Message.pm
>                       Conf.pm
>                       Conf/Conf*.pm
>                       Conf/Store*.pm
>                       Learner.pm
>                       Learner/
>                               Bayes.pm
>                               Bayes/Store*.pm
>                       Rules.pm
>                       Rules/*.pm
>                       Parser.pm
>                       Parser/
>                               MIME.pm
>                               Trace.pm
>                               HTML.pm
>                       Util.pm
>                       Util/*.pm
>                       Store.pm
>                       /Store/*.pm
>                       App/*.pm
> 
> 
> To continue the original discussion, one would do a
> my $engine  = new Mail::SpamAssassin->new();
> my $message = new Mail::SpamAssassin::Message->new(
>                     data => [EMAIL PROTECTED],
>                   );
> my $status  = $engine->check($message); # Mail::SpamAssassin::Status
> $engine->destroy(); # It should be possible to free some mem.

finish().   That's the std we've already chosen, let's not
change that ;)

> Hm. I think that was it. Comments, flames, patches? :)

- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFADwQwQTcbUG5Y7woRAgxhAKCapOtJl8zhjucNhAMaOtUxGfvW1gCffijm
jJJUlfd/79bo+/2I4dU0yww=
=Ebew
-----END PGP SIGNATURE-----

Reply via email to