On Thursday 22 January 2004 00:02 CET Michael Parker wrote:
>[...]
> > Another candidate for this are the storage backends -- it should be
> > possible to store (all) your stuff into an SQL database or wherever you
> > want. The SQL stuff is currently scattered through the whole codebase and
> > some parts are AFAICS heavily outdated.
>
> My recent Bayes Storage work has greatly improved this.  I was able to
> add a new RPC based storage module in a very short time.  It can be
> made even better, but that was too great a leap IMO for the first pass
> at making it easier to extend.  My plan is that once I've got things
> stable and working well (close) and hopefully merged in, I'll continue
> to extract implementation-specific methods and whatnot out of the
> storage code.
>
> Interestingly, the AWL stuff was VERY easy to expand, but it is much
> simpler than the bayes storage code.

Where is that stuff? If it's in a bug it must have slipped through under my 
radar. Could you give me a pointer?

>[...]
> > * Some cleanup of the frontends (like getting rid of some command line
> > parameters and moving them to the config files).
>
> One thing in this area I wouldn't mind seeing is the ability to
> specify a username on the command line for tools such as sa-learn and
> expanding the ability to fetch user config data from SQL. There are
> a couple of bugs that cover this sort of thing.

Might be worth thinking about. What I wanted to get rid of are spamd 
switches like --syslog-facility and --socketpath. Btw, every config option 
will be available via something like
  spamd -Olog:facility=mail
or similar.
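
A rough sketch of how an option of that shape could be split into a section, a key and a value. Python is used only for illustration; parse_option and apply_options are made-up names, and nothing here is actual spamd code:

```python
def parse_option(arg):
    """Split a 'section:key=value' string into (section, key, value)."""
    name, sep, value = arg.partition("=")
    if not sep:
        raise ValueError(f"missing '=' in option: {arg!r}")
    section, sep, key = name.partition(":")
    if not sep:
        # No section prefix given: treat the whole name as a top-level key.
        section, key = "", name
    return section, key, value


def apply_options(config, args):
    """Store each parsed option in a nested dict keyed by section."""
    for arg in args:
        section, key, value = parse_option(arg)
        config.setdefault(section, {})[key] = value
    return config


# The example from above, as it might arrive after stripping the "-O" prefix.
config = apply_options({}, ["log:facility=mail"])
```

With one generic switch like this, the frontend would not need a dedicated command line parameter per setting.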

> > * Rename the Autowhitelist :)
>
> Something like NormalizeAddr? AddrAvgScore? BalanceAddrScore?
> BalanceScore? I know they are all TOO long, I stink at coming up with
> module names.
>
> No idea, agree with a rename, it's confusing to folks who don't
> understand.

I like the word "Balance" :)

> > * ... other ideas?
>
> I've been contemplating an apache/httpd RPC/SOAP based implementation
> of spamd, but haven't taken it very far.  It would be nice to be able
> to leverage some of the other Apache projects in the server area.  Of
> course, that assumes you don't lose the speed that the current
> spamc/spamd combination gives.

Sounds interesting.

> I like this layout for the most part.  A few notes below.
>
> > Bayes.pm            Learner/Bayes.pm
> > BayesStore.pm               Learner/Bayes/Store.pm
> >                             The above is a factory for the correct
> >                             Storage module.
> >                     Learner/Bayes/StoreDBM.pm
> >                             That's DB_File or whatever we currently
> >                             use.
> >                     Learner/Bayes/StoreSQL.pm
> >                             Not yet available :)
>
> This can be done now, assuming my changes are folded in.

Cool.
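
To make the "factory" part of the layout concrete, here is a minimal sketch of the idea: one entry point that hands back the right storage backend by name. It is written in Python only for brevity; the class names mirror the proposed module names, while make_store, the backend keys and the constructor argument are assumptions for illustration:

```python
class StoreDBM:
    """Stand-in for a DBM-file backed store."""
    def __init__(self, location):
        self.location = location


class StoreSQL:
    """Stand-in for an SQL backed store."""
    def __init__(self, location):
        self.location = location


# Registry mapping a configured backend name to its implementation.
_BACKENDS = {"dbm": StoreDBM, "sql": StoreSQL}


def make_store(kind, location):
    """Factory: return a store instance for the configured backend name."""
    try:
        backend = _BACKENDS[kind]
    except KeyError:
        raise ValueError(f"unknown storage backend: {kind!r}") from None
    return backend(location)


store = make_store("dbm", "/path/to/bayes")
```

Callers only ever talk to the factory, so adding another backend means adding one class and one registry entry.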

> > DBBasedAddrList.pm  Rules/AddrList.pm
> >                             Anybody got a better name for this?
> >                     Rules/AddrList/StoreDBM.pm
> >                     Rules/AddrList/StoreSQL.pm
> >                             The Backends.
>
> This can also be done now.
>
> >                     Store.pm
> >                     Store/DBM.pm
> >                     Store/SQL.pm
> >                             And finally the backends for general storage
> >                             access.
>
> I've also been thinking about this.  We've got two general ways we
> access data: key/value pairs and more specialized access (i.e.
> BayesStorage).

Here I was thinking of very low-level functions like opening the database, 
error handling and "raw" reading. Everything else should go into the 
other Store*.pm modules.

> It shouldn't be too difficult to make a quick and easy
> Store module that assumes key/value pairs and stores them in a DBM file
> (BTW what were the downsides to only supporting DB_File and not the
> other DBM implementations?) and also in a generic table in a SQL
> database.  The table might be something like:
>
> subsystem  username  key                value
> conf       parker    use_bayes          1
> conf       parker    whitelist_from     [EMAIL PROTECTED]
> hashcash   parker    foo                blah
>
> (Sorry, not familiar with hashcash.)  So on and so forth.  Even the AWL
> (or whatever it's renamed to) could use this and not need the custom SQL
> implementation, although I think it might be a tad bit faster if you had
> it.

I'm no database expert, but it looks good, visually :)
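
For what it's worth, the generic table can be tried out quickly with SQLite. This is only a sketch under assumptions: the table name userdata, the helper names put and get, and the choice of SQLite are all made up for illustration. There is deliberately no uniqueness constraint, since a key like whitelist_from can occur more than once per user:

```python
import sqlite3

# In-memory database, just to exercise the table layout.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE userdata (
        subsystem TEXT NOT NULL,
        username  TEXT NOT NULL,
        "key"     TEXT NOT NULL,
        "value"   TEXT
    )
""")


def put(subsystem, username, key, value):
    """Add one key/value pair for a user within a subsystem."""
    conn.execute(
        "INSERT INTO userdata VALUES (?, ?, ?, ?)",
        (subsystem, username, key, value),
    )


def get(subsystem, username, key):
    """Return all values stored for the key, in insertion order."""
    rows = conn.execute(
        'SELECT "value" FROM userdata '
        'WHERE subsystem = ? AND username = ? AND "key" = ? '
        "ORDER BY rowid",
        (subsystem, username, key),
    ).fetchall()
    return [row[0] for row in rows]


put("conf", "parker", "use_bayes", "1")
put("hashcash", "parker", "foo", "blah")
```

Reading a setting back is then `get("conf", "parker", "use_bayes")`, which returns a list because multi-valued keys are allowed.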

> > Hm. I think that was it. Comments, flames, patches? :)
>
> Sorry for the length of the reply.  I will say that I like your ideas
> and am more than happy to do what I can to help.

Cheers,
Malte

-- 
[SGT] Simon G. Tatham: "How to Report Bugs Effectively"
      <http://www.chiark.greenend.org.uk/~sgtatham/bugs.html>
[ESR] Eric S. Raymond: "How To Ask Questions The Smart Way"
      <http://www.catb.org/~esr/faqs/smart-questions.html>
