-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
>2. The code is very monolithic and tangled. That makes it hard to extend
>the app with "plug-ins". I personally (and I think somebody else said so
>too) would like to see SA not only as "the application" SpamAssassin with
>its stable codebase but as something like a framework on which other
>people might easily develop (eg. without reimplementing all that
>mail-parsing stuff) their own anti-spam solutions. That would also fit
>together with us becoming an own toplevel project (or what that's called)
>in the ASF.
Yes! I have been planning to work on this. There's a bug open ;)
I will post a spec on the proposed plugin API, for eval tests at
least. I may not be able to include new learner methods in *that*
API just yet. let's see.
> As a first concrete step I propose (something like) this move of modules
> (annotations on the line below the module, you should have tabs of eight
> spaces):
> ArchiveIterator.pm Util/ArchiveIterator.pm
> The backend is split into the following
> modules:
> Util/Archive.pm
> Util/Archive/MBox.pm
> Util/Archive/Maildir.pm
+1 on that. although see below re "split-level".
> AutoWhitelist.pm Rules/Autowhateverlist.pm
> We really need to rename that one :)
not sure if this is still required.
> Bayes.pm Learner/Bayes.pm
> BayesStore.pm Learner/Bayes/Store.pm
> The above is a factory for the correct
> Storage module.
> Learner/Bayes/StoreDBM.pm
> That's DB_File or whatever we currently
> use.
> Learner/Bayes/StoreSQL.pm
> Not yet available :)
> Learner/Fitz.pm
> Just a sample for possible future
> plug-ins
OK -- one piece of feedback here. Having two levels:
Learner/Bayes.pm
Learner/Bayes/Foo.pm
I really do *not* like this split-level approach -- that causes lots of
trouble already with the existing Mail::SpamAssassin class, which is only
done like that because CPAN requires it.
It should be e.g.
Learner/Bayes/Bayes.pm
Learner/Bayes/Foo.pm
this is more amenable to grepping and comprehension IMO.
I haven't bothered commenting on the remaining instances of "split-level"
below, but consider them all objected-to on these grounds ;)
> CmdLearn.pm App/Learner.pm
> This isn't really a lib but an application
> of its own. I don't like this, some stuff
> really ought to go into sa-learn.pl while
> some "lib" stuff might stay here and
> shareable code should go under Util/ or so.
the policy was to keep as much perl code as possible outside of the .pl's
in future. I like the "App" prefix though.
> App/Daemon/*.pm
> spamd is currently very monolithic, I'd
> like to put stuff like the qmail and BSMTP
> features into their own modules so spamd
> is smaller and easier to maintain.
+1
> Conf.pm Conf.pm
> The rewritten Conf.pm.
> Conf/ConfSQL.pm
> Conf/Conf*.pm
> "optional" features/plug-ins should get
> their own config module which define the
> options available.
> Conf/Store.pm
> A factory for the config storage backend.
> Conf/StoreFile.pm
> Implementation of the standard config file
> format.
> ConfSourceSQL.pm Conf/StoreSQL.pm
> And for SQL.
storage? eek.
How will we support storage of Conf settings? we don't yet.
> DBBasedAddrList.pm Rules/AddrList.pm
> Anybody got a better name for this?
> Rules/AddrList/StoreDBM.pm
> Rules/AddrList/StoreSQL.pm
> The Backends.
> Dns.pm Rules/DNSDBs.pm
Rule/DNSBL.pm I suggest.
An approach: we should avoid plurals if possible.
> Rules/SPF.pm
> Rules/Habeas.pm
> Rules/?.pm
> "DNS" is too generic, those all use DNS.
> The first name is jst a suggestion, see [3].
> Util/DNS.pm
> Split rules and helper routines. (If still
> necessary, I didn't look at that file for
> some time.)
+1.
> EvalTests.pm Rules.pm
> I think we need a factory or some general
> management module for all the rules.
> Rules/Default.pm
> Anybody a better name for this?
> HTML.pm Parser/HTML.pm
> Locales.pm Rules/Locales.pm
> Locker.pm Util/Locker.pm
> This is probably just a factory or
> something, see below.
> MIME.pm Message.pm
> No PerMsg* or Msg* stuff please. Just
> Message.
ok, agreed on this name I think. But regarding the PerMsg prefix;
that exists for a good reason -- see PerMsgStatus below.
> MIME/Parser.pm Parser/MIME.pm
> MailingList.pm Rules/Mailinglists.pm
should be gone by now!
> NetSet.pm Util/NetSet.pm
> WTF is this? Better name? Other location?
no, that's right. +1
> NoMailAudit.pm
> Will end in Message.pm, I think?
> PerMsgLearner.pm Learner.pm
> A factory? I'm not sure.
> PerMsgStatus.pm Status.pm
> Shorter is better :)
Now -- the issue with the PerMsg prefix is this. Both of these are
created by the Mail::SpamAssassin factory on a per-message basis.
I would *strongly* like to keep that clear in the object name.
In fact, IMO we should *not* change this as it's a *very* user-visible
change and will require code changes on their behalf, for little
benefit.
Suggestions?
> PersistentAddrList.pm
> Dunno. Looks like a factory but don't ask
> me where to put it.
now that we only support DBBasedAddrList, I don't think we need to
keep this as a factory interface -- and can drop this class.
> Received.pm Parser/Trace.pm
> I think we parse other trace headers (like
> Delivered-To) here, too.
nope ;) just Received hdrs, so Parser/Received.pm is better.
> Reporter.pm
> Ummm... dunno.
Report/Reporter.pm? and split out the pyzor/razor reporting into
their own classes.
> SHA1.pm Util/DigestSHA1.pm
> TextCat.pm Rules/TextCat.pm
> or Util/TextCat.pm?
Rules IMO
> UnixLocker.pm Util/Locker/Unix.pm
> See above.
ok
> Util.pm Util.pm
> Some general routines?
> Util/*
> There might be other stuff...
Util/Util.pm ;)
> Win32Locker.pm Util/Locker/Windows.pm
> See above.
> Store.pm
> Store/DBM.pm
> Store/SQL.pm
> And finally the backends for general storage
> access.
>
> Whew. If that above confused you a bit, it would gave us a nice and clean
> structure/API like
> Mail/
> SpamAssassin.pm
> SpamAssassin/
> Status.pm
> Message.pm
> Conf.pm
> Conf/Conf*.pm
> Conf/Store*.pm
> Learner.pm
> Learner/
> Bayes.pm
> Bayes/Store*.pm
> Rules.pm
> Rules/*.pm
> Parser.pm
> Parser/
> MIME.pm
> Trace.pm
> HTML.pm
> Util.pm
> Util/*.pm
> Store.pm
> /Store/*.pm
> App/*.pm
>
>
> To continue the original discussion, one would do a
> my $engine = new Mail::SpamAssassin->new();
> my $message = new Mail::SpamAssassin::Message->new(
> data => [EMAIL PROTECTED],
> );
> my $status = $engine->check($message); # Mail::SpamAssassin::Status
> $engine->destroy(); # It should be possible to free some mem.
finish(). That's the std we've already chosen, let's not
change that ;)
> Hm. I think that was it. Comments, flames, patches? :)
- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)
Comment: Exmh CVS
iD8DBQFADwQwQTcbUG5Y7woRAgxhAKCapOtJl8zhjucNhAMaOtUxGfvW1gCffijm
jJJUlfd/79bo+/2I4dU0yww=
=Ebew
-----END PGP SIGNATURE-----