> http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4603
> ------- Additional Comments From [EMAIL PROTECTED] 2006-07-26 18:17 -------

I dislike the idea of using Bugzilla as a replacement for a mailing list
(bleh, why doesn't ASF use RT); let's move here, if you don't mind...

[...]

> Using IPC::Open3 is a nightmare for portability, btw -- I'm pretty sure it
> doesn't work on win32 at least -- but maybe there are other issues there
> anyway?

I avoided using shell... well, this can be easily changed.

> how does it compare to current spamd, in speed terms?

174% of spamd's throughput; it crushes the hacky 0.0002s optimizations like
cockroaches.

$ tail -n1 *.log
==> prefork.log <==
parsed 2000 messages in 00:04:32 (272.930377 s), 7.3279 msgs/s (440 msgs/min, 26380 msgs/h)

==> spamd.log <==
parsed 2000 messages in 00:08:00 (480.140767 s), 4.1654 msgs/s (250 msgs/min, 14996 msgs/h)

==> worker.log <==
parsed 2000 messages in 00:04:35 (275.170448 s), 7.2682 msgs/s (436 msgs/min, 26166 msgs/h)

Apache-spamd / spamd run with -x -m 5, Bench-spamd.pl with -c 3 -m 2000.
Hardware: Athlon 1.7xp, 700MB RAM.

> Regarding logging. What's the issue? (I couldn't actually spot any logging
> in that tarball.)

Apache redirects stderr to error_log; I don't know how to capture it (OTOH,
I haven't been looking for it, but I don't think it's a good idea). The
ErrorLog directive doesn't support redirecting to syslog.

So, all the debug messages from SA and some startup errors detected at the
config phase are logged. This isn't:

[5273] info: spamd: connection from localhost [127.0.0.1] at port 2347
[5273] info: spamd: checking message <[EMAIL PROTECTED]> for (unknown):500
[5273] info: spamd: clean message (0.0/5.0) for (unknown):500 in 0.2 seconds, 5978 bytes.
[5273] info: spamd: result: . 0 - scantime=0.2,size=5978,user=(unknown),uid=500,required_score=5.0,rhost=localhost,raddr=127.0.0.1,rport=2347,mid=<[EMAIL PROTECTED]>,autolearn=disabled

I have not attained enlightenment about the correct way to do it yet. That
would require opening a file for writing at some stage, passing the
filehandle somehow (global var probably), locking...
If a syslog socket has been requested, I guess separate connections are
needed... Complex and error prone. Adding complexity is easy; keeping it
simple and obvious makes a worthy challenge.

> Should it be integrated into the main distro, or kept as a separate module
> with its own Makefile.PL, do you think? (I think I'd prefer to integrate, if
> possible.)

If it's not integrated... it will be lost, in time.

> And finally, I think it could do with more documentation and tests ;) a lot
> of that would probably make more sense after the integration-into-distro
> question is resolved (e.g. "what README does it go into").

I'd go for a separate README.apache to keep things transparent.

Right now, this is written as a PerlProcessConnectionHandler (mod_perl
handler for custom protocols). I just figured out it *can* be done using the
more popular HTTP handlers (PerlResponseHandler and friends) and I'm
experimenting with it right now. That would have two benefits I can see
(I doubt it'd change anything regarding performance).

The first is the possibility to use mod_log_config (the CustomLog
directive). If we agree to compress those four log lines per connection into
one, it would be a clean and efficient way to get the access logging done.

The second... Well, here it is; try to keep an open mind. ;-)

I'm reading http://catb.org/esr/writings/taoup/ right now; around the
chapter about protocol design it bugged me: why isn't the spamd protocol
based on HTTP?

Gain: forget the fancy libspamc, forget Mail::SpamAssassin::Client, get rid
of parts of spamd's network-related code ("sysread not ready" anyone?),
reduce trash code in various spamc implementations (exim, whatever)... Just
use an HTTP library to do a simple POST (and make sure the library allows you
to read the Spam header after a 2xx response).

So. If I used the mod_perl HTTP handlers, that would get us very close to
rolling out the SPAMD/2.0 protocol [1].

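To make the "simple POST" idea concrete, here is a rough client sketch. It
is in Python rather than Perl, only to show how little code a stock HTTP
library needs. Everything about the wire format is an assumption: no
HTTP-based spamd exists, the /?method=PROCESS request form is just the one
floated in this message, the Content-Type is a guess, and the Spam header
value is assumed to keep the "True ; 15.0 / 5.0" shape of the current spamd
reply. check_message and parse_spam_header are made-up names.

```python
import http.client
import re

# Value format borrowed from the current spamd reply header, e.g.
# "Spam: True ; 15.0 / 5.0"; assumed (not specified anywhere) to carry over.
_SPAM_VALUE = re.compile(
    r"^\s*(true|false)\s*;\s*(-?\d+(?:\.\d+)?)\s*/\s*(-?\d+(?:\.\d+)?)\s*$",
    re.IGNORECASE,
)


def parse_spam_header(value):
    """Turn 'True ; 15.0 / 5.0' into (is_spam, score, required_score)."""
    match = _SPAM_VALUE.match(value)
    if match is None:
        raise ValueError("unrecognised Spam header: %r" % value)
    verdict, score, required = match.groups()
    return verdict.lower() == "true", float(score), float(required)


def check_message(host, port, message, method="PROCESS"):
    """POST one raw message (bytes) to a hypothetical HTTP-speaking spamd."""
    conn = http.client.HTTPConnection(host, port, timeout=30)
    try:
        conn.request(
            "POST",
            "/?method=" + method,  # request form assumed from this proposal
            body=message,
            headers={"Content-Type": "message/rfc822"},  # assumed
        )
        response = conn.getresponse()
        body = response.read()
        if not 200 <= response.status < 300:
            raise RuntimeError("spamd returned HTTP %d" % response.status)
        spam = response.getheader("Spam")
        if spam is None:
            raise RuntimeError("2xx response without a Spam header")
        return parse_spam_header(spam), body
    finally:
        conn.close()
```

With a server on the other end that actually spoke this, check_message(host,
port, raw_bytes) would hand back the parsed verdict plus the response body.
The only protocol-specific code left in the client is the header parser.
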
After some code refactoring, it'd be possible to use spamd as FastCGI (or
regular CGI, if someone wishes) with any HTTP server. Authentication? Just
get a mod_auth* module. Compression? mod_deflate. Whatever? mod_whatever.

    POST /?method=PROCESS HTTP/1.1

The more I think about it, the more I like the idea.

[1] Actually, it would probably be easier to implement SPAMD/2.0 and add the
compatibility layer.

-- 
Radosław Zieliński <[EMAIL PROTECTED]>
