Re: [Bug 4603] RFE: Apache::SpamD module, to run spamd from httpd

Justin Mason Tue, 08 Aug 2006 06:47:09 -0700

Radoslaw Zielinski writes:
> Justin Mason <[EMAIL PROTECTED]> [27-07-2006 19:43]:
> > Radoslaw Zielinski writes:
> >>> http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4603
> >>> ------- Additional Comments From [EMAIL PROTECTED]  2006-07-26 18:17 -------
> >> I dislike the idea of using Bugzilla as a replacement for a mailing list
> >> (bleh, why doesn't ASF use RT); let's move here, if you don't mind...
> > OK, as long as you find the thread on nabble and post a pointer to the
> > bug; it's a *lot* easier to track down a BZ discussion 6 months down the
> > line, than find a mailing list thread.
> 
> Done.


cool -- thanks.

> >> [...]
> >>> Using IPC::Open3 is a nightmare for portability, btw -- I'm pretty sure it
> >>> doesn't work on win32 at least -- but maybe there are other issues there 
> >>> anyway?
> >> I avoided using shell... well, this can be easily changed.
> > Yep, perl's own 'open "...|"' shell escapes are actually more portable.
> > sa-update's code is worth looking at, for an example.
> 
> But that's still shell -- source of bugs and nasty surprises...
> How about IPC::Run?

Shell is actually pretty sane, once you know the little issues to watch
out for (or have someone reviewing who knows them ;).  In this case, there
are no user-supplied command-line switches to sanitise, so that makes
things a lot simpler.

Speaking from experience -- new CPAN dependencies are generally *more*
trouble to deal with than shell escapes.

> >>> how does it compare to current spamd, in speed terms?
> >> 174%, crushes the hacky 0.0002s optimizations like cockroaches.
> > ha! I suspect these numbers are without any ruleset, though ;) Also, worth
> > noting that spamd does some time-consuming tasks that apache-spamd doesn't
> > (like log via syslog).
> 
> Only default rules indeed.  I've implemented logging; recent results
> (spamd >file 2>&1; no syslog):
> 
>     Apache, prefork
>   parsed 2000 messages in 00:04:28 (268.642257 s),
>   7.4448 msgs/s (447 msgs/min, 26801 msgs/h)
> 
>     spamd
>   parsed 2000 messages in 00:07:02 (422.312356 s),
>   4.7358 msgs/s (284 msgs/min, 17049 msgs/h)

That is *very* nice ;)  excellent!
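
For anyone wanting to sanity-check the figures quoted above, here is a small Python sketch (not part of the benchmark itself; the helper name `throughput` is made up for illustration) that recomputes the rates from the message count and elapsed times. It puts the Apache prefork run at roughly 1.57x the spamd throughput.

```python
def throughput(messages, seconds):
    """Return (msgs/s, msgs/min, msgs/h) for a run."""
    per_second = messages / seconds
    return per_second, per_second * 60, per_second * 3600


# Message count and elapsed times as quoted in the thread.
apache = throughput(2000, 268.642257)
spamd = throughput(2000, 422.312356)

# Ratio of Apache prefork throughput to spamd throughput.
speedup = apache[0] / spamd[0]

for name, (per_s, per_min, per_h) in (("apache", apache), ("spamd", spamd)):
    print(f"{name}: {per_s:.4f} msgs/s ({per_min:.0f} msgs/min, {per_h:.0f} msgs/h)")
print(f"apache/spamd: {speedup:.2f}x")
```

Both runs used only the default rules and no syslog, so the ratio says nothing about behaviour with a larger ruleset.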

> >   - open ">>" filehandles have atomic writes for inter-process contention,
> [...]
> 
> I have separated four functions:
> 
>   $ grep sub\ log Spamd.pm 
>   sub log_connection {
>   sub log_start_work {
>   sub log_end_work {
>   sub log_result {
> 
> They're creating the string to be logged (just like spamd) and calling
> info().  I'd change all but log_result() to dbg(), but that's your call.
> 
> So, all of it ends up in the error_log, along with the startup stuff.
> Can be changed easily just by tweaking these subroutines.

OK, that sounds great.

> >> Right now, this is written as a PerlProcessConnectionHandler (mod_perl
> >> handler for custom protocols).  I just figured out it *can* be done
> >> using the more popular HTTP handlers (PerlResponseHandler and friends)
> >> and I'm experimenting with it right now.
> [...]
> 
> Update: I have spent days pushing mp2 to its limits... and I sort
> of bounced off these limits; details on the mp users list (in short:
> connection filters happen after the core filter which reads headers,
> and a TransHandler can't really change what Apache thinks about the
> protocol used for the connection).  The tricky part happens to be the
> compat layer; it can be done in a clumsy way with a performance hit
> for 1.x clients.
> 
> [...]
> >>   POST /?method=PROCESS HTTP/1.1
> >> The more I think about it, the more I like the idea.
> > Wow.  That's scary. ;) I'll have to think about that one.
> 
> > I'm not sure I see *sufficient* benefit, in terms of the other parts of
> > the code, though.  The two protocols are both very, very simple; I think
> > there'd be more code needing to be written to support HTTP (with a new
> > URL-based, CGI-style parameter-passing scheme), than the existing lines of
> > code for supporting SPAMD!
> 
> There is plenty of efficient, tested code implementing HTTP clients.
> neon, libcurl, libghttp, w3c-libwww -- that's what I have installed.
> Just do a library call, no need to write anything fancy.  CGI-style
> parameter-passing would be handy for future extensibility; for now,
> /[?&;]method=([A-Z]+)/ would do for the parsing.
> 
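
As a rough illustration of the parsing suggested above, here is a Python sketch that applies the quoted regex to the target of a request line. This is not code from Apache::SpamD or spamd; the function name `parse_method` and the three-token check on the request line are assumptions made for the example.

```python
import re

# Pattern quoted in the thread: the parameter must follow '?', '&' or ';'.
METHOD_RE = re.compile(r"[?&;]method=([A-Z]+)")


def parse_method(request_line):
    """Return the 'method' parameter from an HTTP request line, or None."""
    parts = request_line.split()
    if len(parts) != 3:
        return None  # not of the form "VERB target HTTP/x.y"
    match = METHOD_RE.search(parts[1])
    return match.group(1) if match else None


print(parse_method("POST /?method=PROCESS HTTP/1.1"))  # prints PROCESS
```

The pattern is not anchored on the right, so a value such as `PROCESS2` would still yield `PROCESS`; a stricter version would check what follows the match.
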
> What do we have for SPAMD/1.3?  Just the undocumented libspamc and
> M::SA::Client (alpha version, for robustness seek elsewhere).
> 
> 
> 
> I thought mod_perl was substantially more powerful.  Considering the
> technical difficulties and the fact that two protocols in use is worse
> than having only one, rolling this out would only have a point if v2.0
> was strongly pushed, marking v1.x as obsolete and soon-to-be unsupported.
> 
> Since there's no enthusiasm, I doubt you would do that.  Therefore,
> I'll stay with the PerlProcessConnectionHandler, abstracting things
> when I see opportunity.

ok.  sorry about that, but I don't think there's enough enthusiasm for
such a major change -- myself included.

--j.
