Re: Trie optimisation of simple alternations for blead perl.

2005-02-15 Thread Justin Mason

Yitzchak Scott-Thoennes writes:
 On Tue, Feb 15, 2005 at 12:40:50PM -0800, Justin Mason wrote:
  FWIW, this looks like it'd be excellent for SpamAssassin ;)
 Maybe somebody with SpamAssassin could do some benchmarks?

I'm a little concerned it might be misleading -- we've optimised our code
quite a lot around the *existing*, sans-trie, re engine, like so:

foreach $line (@body) {
  foreach $rule (@ruleset) {
    next unless ($scores{$rule} != 0);  # skip zero-scoring rules
    $line =~ /$rule/ and $score += $scores{$rule};
  }
}


If this patch were included in perl, we'd rewrite our code to optimise
its behaviour for *this* engine, like so:

my $allrules = eval 'qr/(?:' . join('|', @ruleset) . ')/';
# probably a bit more sophisticated than that, but you
# get the idea

foreach $line (@body) {
  next unless $line =~ $allrules;
  foreach $rule (@ruleset) { ... }   # as above
}

So a simple run of SpamAssassin may not illustrate the speedups
without more changes on our side.
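
A rough sketch of the two-stage approach described above, for illustration only: one combined alternation acts as a cheap pre-filter, and the per-rule loop runs only on lines that pass it. The rule patterns, the scores and the score_body name are made up for this example; they are not SpamAssassin's real rules or API.

```perl
use strict;
use warnings;

# Hypothetical rule set: pattern source => score (not real SpamAssassin rules).
my %scores = (
    'viagra'            => 2.5,
    'free\s+money'      => 1.5,
    'unsubscribe\s+now' => 0,      # zero-scoring rule, dropped below
);
my @ruleset = grep { $scores{$_} != 0 } sort keys %scores;

# One combined alternation, used only as a cheap pre-filter.
my $alt      = join '|', map { "(?:$_)" } @ruleset;
my $allrules = qr/$alt/;

# Compile each rule once rather than on every match attempt.
my %compiled = map { $_ => qr/$_/ } @ruleset;

sub score_body {
    my (@body) = @_;
    my $score = 0;
    foreach my $line (@body) {
        next unless $line =~ $allrules;    # most lines should stop here
        foreach my $rule (@ruleset) {
            $score += $scores{$rule} if $line =~ $compiled{$rule};
        }
    }
    return $score;
}

my $total = score_body(
    'hello there',
    'get free   money today',
    'cheap viagra and free money',
);
print "score: $total\n";
```

How much this helps would depend on how many lines match at least one rule: if most lines hit, the pre-filter is pure overhead.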

Plus (as far as I know) none of us have ever built bleadperl ;)
I can try it out if there's a tarball though, and give some
rough ideas of the speedup without our code changes?

--j.



Re: svn commit: r153972 - spamassassin/trunk/masses/mass-check

2005-02-15 Thread Justin Mason

I would suggest just running perl parse-rules-for-masses -d $(RULES) -s
$(SCORESET) directly, just in case make dependencies make things
go haywire.

--j.

[EMAIL PROTECTED] writes:
 Author: quinlan
 Date: Tue Feb 15 15:10:09 2005
 New Revision: 153972
 
 URL: http://svn.apache.org/viewcvs?view=rev&rev=153972
 Log:
 generate tmp/rules.pl if and only if it doesn't exist
 
 Modified:
 spamassassin/trunk/masses/mass-check
 
 Modified: spamassassin/trunk/masses/mass-check
 URL: http://svn.apache.org/viewcvs/spamassassin/trunk/masses/mass-check?view=diff&r1=153971&r2=153972
 ==============================================================================
 --- spamassassin/trunk/masses/mass-check (original)
 +++ spamassassin/trunk/masses/mass-check Tue Feb 15 15:10:09 2005
 @@ -99,7 +99,10 @@
  use constant HAS_TIME_PARSEDATE => eval { require Time::ParseDate; };
  use Config;
  
 -# for reuse, score set doesn't matter
 +if (! -f 'tmp/rules.pl') {
 +  system("make tmp/rules.pl");
 +}
 +# note: for reuse, score set doesn't matter
  require "rules.pl";
  
  # default settings



Re: Trie optimisation of simple alternations for blead perl. (Updated patch, supports /i modifier)

2005-02-16 Thread Justin Mason

[de-cc'ing p5p on this one since it's just sa-related.]

demerphq writes:
 Incidentally I did notice that many of the patterns used could be
 reworked to be more efficient assuming the patch is applied. Also I
 was kinda wondering about the code generated. It seems odd that each
 regex rule gets its own subroutine, wouldn't it be better to precompile
 the regexes into a single routine? There appears to be low hanging
 optimization fruit in the SA code generator stuff.  Ie:
 
   if ($self->{conf}->{scores}->{q{__NIGERIAN_BODY_18}}) {
     # call procedurally as it is faster.
     __NIGERIAN_BODY_18_body_test($self,@_);
   }
 
 is an example of low hanging fruit. That should read:
 
 my $lu=$self->{conf}->{scores};
 if ($lu->{q{__NIGERIAN_BODY_18}}) {
   # call procedurally as it is faster.
   __NIGERIAN_BODY_18_body_test($self,@_);
 }
 
 which would save two redundant hash lookups per rule.

That's very true ;)

 A further
 optimization would be to outright eliminate the subroutine call so
 that this would look like:
 
 my $lu=$self->{conf}->{scores};
 if ($lu->{q{__NIGERIAN_BODY_18}}) {
   # don't call a sub at all, as it's faster
   foreach (@_) {
     if (/\bSEVERAL ATTEMPTS HAVE BEEN MADE WITH OUT SUCCESS\b/i) {
       $self->got_pattern_hit (q{__NIGERIAN_BODY_18}, "BODY: ");
       dbg ("Ran body-text regex rule __NIGERIAN_BODY_18 ==> got hit: match='$&'", "rulesrun", 2);
       # Ok, we hit, stop now.
       last;
     }
   }
 }
 
 Since each subroutine gets called once per line per mail the reduction
 in call stack overhead should represent a pretty clear run time
 improvement. I assume this logic is duplicated in the other code
 generators and not just in the one I was trying to debug.

yep, this is deliberate -- although suboptimal.  The idea is so
that slow-running regexps can be identified with Devel::DProf.

that's a good point btw.  both of those should be reconsidered...
they certainly don't help the runtime speed.
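
For illustration, here is a minimal sketch of what a single routine with the score lookup hoisted out of the per-rule checks could look like. The rule names, patterns and the run_body_tests function are invented for the example and are not what the SpamAssassin code generator emits. Note the trade-off mentioned above: with one routine, a profiler such as Devel::DProf can no longer attribute time to individual rules.

```perl
use strict;
use warnings;

# Hypothetical rule table; the names and patterns are illustrative only.
my %rules = (
    RULE_ATTEMPTS => qr/\bSEVERAL ATTEMPTS HAVE BEEN MADE\b/i,
    RULE_URGENT   => qr/\burgent reply needed\b/i,
);

sub run_body_tests {
    my ($self, @lines) = @_;
    my $scores = $self->{conf}{scores};    # hoisted: looked up once, not per rule
    my @hits;
    RULE: foreach my $name (sort keys %rules) {
        next RULE unless $scores->{$name};    # skip zero-scoring rules
        foreach my $line (@lines) {
            if ($line =~ $rules{$name}) {
                push @hits, $name;
                next RULE;                    # first hit is enough for this rule
            }
        }
    }
    return @hits;
}

my $self = { conf => { scores => { RULE_ATTEMPTS => 1.2, RULE_URGENT => 0 } } };
my @hits = run_body_tests($self,
    'Several attempts have been made with out success',
    'Urgent reply needed');
print "hits: @hits\n";
```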

--j.



Re: [Bug 4124] New: New spamassassin script doesn't work due to tainting

2005-02-16 Thread Justin Mason

Sidney Markowitz writes:
 Daniel Quinlan wrote:
  We support nmake?
 
 That's the Microsoft nmake, not to be confused with any other make
 program of the same name. It's what is available on Windows. For
 compatibility we have to put all the fancy logic in the perl of
 Makefile.PL so the resulting makefile is written to a dumbed down common
 denominator.

ExtUtils::MakeMaker uses nmake (or make on UNIX platforms).

FWIW, Module::Build eschews using any external make-ish tool
at all, instead going for a Python-ish "write everything
in native perl" approach.

--j.



Re: svn commit: r153377 - in spamassassin/trunk: MANIFEST Makefile.PL build/do spamassassin spamassassin.pod

2005-02-22 Thread Justin Mason

Daniel Quinlan writes:
 Michael Parker [EMAIL PROTECTED] writes:
 
  Now you're adding in a whole other subsystem.  Using this logic, it is
  pretty easy to argue that everything should be in one executable, and
  not split out.  A little cross use is allowed.  Here is my proposed
  set of commands and their arguments:
 
  [...usages removed...]
  sa-filter
  sa-report
  sa-learn (maybe sa-bayes)
  sa-update
  sa-history/sa-awl (maybe sa-whitelist)
  sa-lint
 
 Hmmm... my thinking is that we have several basic actions people
 perform.
 
  - filtering and removal of markup
  - learning/forgetting
  - reporting/revoking
  - plugin-specific stuff that cannot be generalized
 
 I'd strongly favor moving the basic learn functionality (learn, forget)
 into a single command regardless of the learning subsystem and the
 plugin-specific stuff like bayes, whitelist, and history into
 plugin-specific commands.

I agree with Daniel that we should model the command line UI around the
use cases, not around the code subsystems involved.

But then "plugin-specific stuff like bayes" -- Daniel, I'm not sure what
you mean here; this seems to contradict the previous idea (UI based on
use-cases).

Regarding the idea of splitting plugin-specific stuff into its own
command-line UI, I'm not sure I can go for that.  For one thing, having
two command-line APIs to control Bayes seems confusing; for another, the
plugins should hook into the *existing* user interface, not create new,
plugin-specific UIs that'll only work with that plugin.  If we do the
latter, we'll wind up down the road with multiple, conflicting scripts
that are nearly the same in UI, but not quite, for each different plugin
of a given type.


On another aspect --- I like Michael's idea of leaving the spamassassin
script's UI intact as it is now for a few 3.x releases. There is a lot of
third-party code that hooks into SpamAssassin that will need to be updated
before that interface can be removed safely without causing a lot
of needless pain.

--j.



Re: DK/IIM/MAILSIG and header ordering

2005-02-28 Thread Justin Mason

Regarding the answer to this question -- I let this slip. Here are Jim's and
Mark's responses. I'd still prefer to shift to prepending, as discussed,
for reasons similar to those Mark outlines.

--j.

Jim Fenton writes:
 I agree pretty much with what Mark said.  If it's easy to get 
 SpamAssassin to prepend headers, why not do it.  But I suspect this is 
 only a short-term problem because you need h= for a lot of other things 
 besides SpamAssassin (many virus checkers, for example).
 
 On the other hand, [and I will show my ignorance here about how 
 SpamAssassin is run, because I have been fortunate to have my mail 
 providers manage it for me], if you're depending on the Sendmail milter 
 capability to prepend headers, you need to be running sendmail 
 8.13.0.Beta3 or later to get that functionality, and from what I have 
 seen most of the world is still at 8.12 or earlier (even fairly recent 
 Linux distributions like Fedora).
 
 You should already be OK with messages from gmail and Earthlink, because 
 they sign specific headers with h=.  I'm hoping Yahoo! will change over 
 to that; as Mark said the direction we are going is to get everyone to 
 use h=.
 
 I'm not sure I understood your last question about moving SA headers to 
 the top, but after a From line.  A From header is one of the first 
 that you do want to sign; if you're depending on header ordering, having 
 it before the DK signature means it's outside the signed content.  So I 
 don't think this is a good idea.
 
 Hope this helps and isn't too rambling.
 
 -Jim
 
 Mark Delany wrote:
 
 On Fri, Feb 11, 2005 at 04:18:01PM -0800, Justin Mason allegedly wrote:
   
 Mark, Jim --
 
 hi, Justin Mason here from SpamAssassin.  I have a quick question
 regarding SpamAssassin's behaviour in how it rewrites message
 headers, that I hope you can help with.
 
 
 
 Justin.
 
 The general consensus seems to be that we should make h= mandatory.
 
 Having said that I encourage you to pre-pend rather than post-pend
 headers. The next version of sendmail's milter supports this, in part
 because of DK.
 
 My philosophy is this. While we live with a messy email environment, I
 want to try and encourage/allow behaviour that moves us to a clean
 email environment. To me, clean means don't mess with content, don't
 intersperse headers, etc.
 
 Heck we manage to send data bit-for-bit via http and ftp and tcp, why
 not smtp?
 
 To that end, we would like to persist with c=simple and similar for no
 other reason than to define the desired end-goal. Even if that
 end-goal is ten years out!
 
 
 Mark.



Re: svn commit: r153377 - in spamassassin/trunk: MANIFEST Makefile.PL build/do spamassassin spamassassin.pod

2005-02-28 Thread Justin Mason

Daniel Quinlan writes:
 Justin Mason [EMAIL PROTECTED] writes:
  On another aspect --- I like Michael's idea of leaving the
  spamassassin script's UI intact as it is now for a few 3.x
  releases. There is a lot of third-party code that hooks into
  SpamAssassin that will need to be updated before that interface can be
  removed safely without causing a lot of needless pain.
 
 Well, it might be cleaner to junk the sa-filter idea and leave
 sa-filter as a 100% clean future program.  Move the code back into
 spamassassin, but document it under the meta document or in a separate
 pod.

resurrecting this thread --

actually that does strike me as a good idea; and new features (flags
etc.) go only into sa-filter et al.

--j.



Re: make test failures

2005-02-28 Thread Justin Mason

Sidney Markowitz writes:
 I fixed the test failure in t/debug.t checking in to r155617.
 
 The test was just missing a new dbg message tag, replacetags, so I added
 it to the list.
 
 I'm less sure about what is the correct thing to do for the failure in
 t/spf.t. In that case there is a test for SPF_HELO_FAIL in the test
 spam. But as far as I can tell, spamassassin.org has a ?all in its SPF
 record, which should mean that the result code of 'neutral' for the helo
 test is correct. Should it fail? Do we need a new test case that
 generates an SPF HELO failure?

Oops -- that was me.   According to the SPF people, we shouldn't
be using -all on a domain that may possibly emit mail.   So I changed
the record...

--j.



Re: make test failures

2005-03-01 Thread Justin Mason

Sidney Markowitz writes:
 Justin Mason wrote:
   According to the SPF people, we shouldn't
  be using -all on a domain that may possibly emit mail
 
 Even if, as I think, ~all is correct if you can enumerate all legal
 senders for the domain, there still is a problem with making our test
 depend on the current configuration of something that is being used for
 some other purpose. There is always the risk that there will be a reason
 for changing the configuration.
 
 I got an idea from the tests in Mail::SPF::Query. How about if you
 define a spf-test.spamassassin.org domain with an SPF record with ~all.
 Then you are guaranteed that it will generate a fail but it can't mess
 up any real email.

+1



Re: make test failures

2005-03-02 Thread Justin Mason

Daryl C. W. O'Shea writes:
 Sidney Markowitz wrote:
  Justin Mason wrote:
  
 According to the SPF people, we shouldn't
 be using -all on a domain that may possibly emit mail. So I changed
 the record...
  
 <snip>
 
  If you can list all sending domains, sending ip addresses, and ISP mail
  servers that are allowed to send mail from a spamassassin.org address,
  then you can use ~all and we can use from spamassassin.org in the SPF
  test for a failed HELO. If you can't list all of them in the record, we
  are forced to use ?all and we need a different domain to use for the test.
 
 It's more like:
 
 ?all if you don't think you've listed all the hosts that may send mail
 
 ~all if you *think* you've listed all the hosts that may send mail
 
 -all if you *know* you've listed all the hosts that may send mail
 
 The wizard doesn't give you the option -all since they don't want to 
 'wizardize' you having your mail rejected.  If you don't list all your 
 hosts and the record contains ~all, it'll generate a soft fail... which 
 means the receiving server should still accept the mail.
 
 If you forget to list all your hosts and your record contains -all, it 
 generates a hard fail... which means the receiving server should feel 
 free to reject, or drop the message.
 
 I've got many domains using -all (with all of their sending hosts 
 listed) and have had no problems.

Yes, that's how it was *supposed* to work ;)

However the SPF mavens nowadays are taking the forwarding problem into
account, and recommending that -all not be used even if you've listed all
the hosts that may send mail, since a recipient address may forward
to another, SPF-checking, address without rewriting the env sender.
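
As a small illustration of the three qualifiers discussed above, the sketch below maps the trailing "all" term of an SPF record to the result an unlisted sending host would get. The result_for_unlisted_host helper is hypothetical and deliberately simplistic: it looks only at a final "all" term and ignores every other mechanism and modifier (redirect, include and so on).

```perl
use strict;
use warnings;

# Result produced when the trailing "all" mechanism matches,
# keyed by its qualifier ("+" is the default when none is given).
my %all_result = (
    '+' => 'pass',
    '-' => 'fail',
    '~' => 'softfail',
    '?' => 'neutral',
);

# Hypothetical helper: inspect only the final "all" term of a record.
sub result_for_unlisted_host {
    my ($record) = @_;
    # With no "all" term and nothing else matching, the result is neutral.
    return 'neutral' unless $record =~ /(?:^|\s)([-+~?]?)all\s*$/;
    return $all_result{ $1 eq '' ? '+' : $1 };
}

for my $rec ('v=spf1 mx ?all', 'v=spf1 mx ~all', 'v=spf1 mx -all') {
    printf "%-16s => %s\n", $rec, result_for_unlisted_host($rec);
}
```

In the forwarding case described above, the forwarding host is by definition unlisted, so a -all record turns an otherwise legitimate forwarded message into a hard fail.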

--j.



Re: svn commit: r156398 - spamassassin/trunk/lib/Mail/SpamAssassin/PerMsgStatus.pm

2005-03-07 Thread Justin Mason

Daniel Quinlan writes:
 [EMAIL PROTECTED] writes:
 
  plugins can now set tags to subroutine references, to return dynamic
  data easily
 
 Excellent.
 
 Could/should we require them to be named starting with PLUGIN_ ?

in my opinion, no -- it'd mean we couldn't move builtin tags into
plugins [*] without renaming them, which seems nasty.

(*: like that Spammy-Tokens stuff for example, we should really move
that out)
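
To show the general idea of a tag whose value is a subroutine reference, here is a minimal sketch. It is not the PerMsgStatus implementation: the tag table, get_tag and expand_tags are invented names, and the only assumption borrowed from SpamAssassin is the _TAGNAME_ template style.

```perl
use strict;
use warnings;

# Hypothetical tag table: a value is either a plain string or a
# subroutine reference that is called when the tag is expanded.
my %tags = (
    VERSION => '3.1.0',
    TESTS   => sub { join ',', sort qw(RULE_B RULE_A) },
);

sub get_tag {
    my ($name) = @_;
    my $value = $tags{$name};
    return '' unless defined $value;
    return ref($value) eq 'CODE' ? $value->() : $value;
}

# Expand _TAGNAME_ markers in a template string.
sub expand_tags {
    my ($text) = @_;
    $text =~ s/_([A-Z][A-Z0-9]*)_/get_tag($1)/ge;
    return $text;
}

my $header = expand_tags('version=_VERSION_ tests=_TESTS_');
print "$header\n";
```

Because the lookup is by name only, a builtin tag could later be supplied by a plugin under the same name, which is the point made above against a mandatory PLUGIN_ prefix.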

--j.



recommending SQL on large sites in 3.1.0?

2005-03-07 Thread Justin Mason
Hey all --

looking at users@ traffic recently, there's been a *lot* of reports of
people running into various problems with DB_File: spamd memory usage
ballooning, deep recursion errors, hangs, etc.

I'm thinking we should rewrite some of our doco for 3.1.0 to suggest that
most sites running spamd use SQL for storage if possible. No changes to
defaults, and no removal of features, however; just doco changes to point
people at SQL as a more reliable, safer way to implement bayes and AWL.
It really sounds like DB_File isn't quite reliable.  There haven't
been reports of that kind from SQL users, who in fact haven't
been reporting many problems at all as far as I can see.

What do you all think?

--j.


Re: client SMTP authorization

2005-03-10 Thread Justin Mason

Tony Finch writes:
 On Thu, 10 Mar 2005, Daniel Quinlan wrote:
  Tony Finch [EMAIL PROTECTED] writes:
 
   Is anyone planning to implement CSA for SpamAssassin?
   http://mipassoc.org/csv/
 
  I have not heard of any plans to implement Comma Separated Values.
 
 Ha :-) I'll add it to my todo list after Net::DNS then, which will allow
 plenty of time for someone else to do it first...

I think [EMAIL PROTECTED] may have just opened a bug on the BZ about
it, or something related to it.

Sadly I haven't been tracking CSA/CSV's list so I have no idea
what it is ;)

--j.



Re: BC in libspamc (was: svn commit: r158029 [...])

2005-03-21 Thread Justin Mason

Michael Parker writes:
 On Mon, Mar 21, 2005 at 01:18:48AM +0100, Malte S. Stretz wrote:
  On Friday 18 March 2005 08:46 CET [EMAIL PROTECTED] wrote:
   Author: parker
   Date: Thu Mar 17 23:46:45 2005
   New Revision: 158029
  
   URL: http://svn.apache.org/viewcvs?view=rev&rev=158029
   Log:
   Bug 1201: Add learning support to spamd/spamc
  
  [...]
   +++ spamassassin/trunk/spamc/libspamc.c Thu Mar 17 23:46:45 2005
  [...]
   -int message_filter(struct transport *tp, const char *username,
   +int message_filter(struct transport *tp, const char *username, int
   learntype, int flags, struct message *m)
  [...]
   -int message_process(struct transport *trans, char *username, int
   max_size,
   +int message_process(struct transport *trans, char *username, 
   int learntype, int max_size, int in_fd, int out_fd, const int flags)
  [...]
/* Aug 14, 2002 bj: Obsolete! */
   -int process_message(struct transport *tp, char *username, int max_size,
   -int in_fd, int out_fd, const int my_check_only,
   -const int my_safe_fallback)
   +int process_message(struct transport *tp, char *username, int learntype,
   + int max_size, int in_fd, int out_fd,
   + const int my_check_only, const int my_safe_fallback)
  
  message_foo are our public routines in libspamc, aren't they?  And changing 
  the parameter list is not Binary Compatible (especially in C), isn't it?  
  The same is true for the structs I guess (am no C expert).
 
 *grumble*

oops -- good point Malte!  yes, that'd break ABI compatibility, which 
is a big deal with C (like coredumps).

 I'm also no C expert and really wasn't focused much on the
 spamc/libspamc code.  I struggled to change how the options are given
 for spamc.
 
 Possibly the best course of action would be to add a message_learn
 call to libspamc and call that from spamc instead of passing through
 the learntype in message_filter/process.  In extension, should we ever
 add the ability to report we can add message_report.

+1

  I think we did not define yet how public the libspamc interface is, but 
  IMO we should try to keep BC in libspamc.  If we tend to break it, we 
  should introduce some kind of versioning for libspamc.
  
  Keeping BC is a real PITA, but breaking all third party apps which link 
  against libspamc just because the user updated his version of SpamAssassin 
  is really rude.  (Think OpenSSL if you want a bad example.)
  
 
 Agreed, it was unintentional, and on further reflection, could probably be
 done in a better way.  I'll see if I can get something workable.
 
  Oh, and I guess we should just remove old stuff like process_message as it 
  was marked obsolete since 2002 and changing the fingerprint will break 
  backwards compatibility anyway ;~)
 
 How do people feel about removing it?

well, 3 years is plenty of time by now ;)  but if it's painless to
leave it in, might as well leave it in in my opinion.

--j.



Re: svn commit: r158541 - spamassassin/trunk/lib/Mail/SpamAssassin.pm spamassassin/trunk/lib/Mail/SpamAssassin/PerMsgStatus.pm

2005-03-22 Thread Justin Mason

Theo Van Dinter writes:
 On Tue, Mar 22, 2005 at 06:00:36AM -, [EMAIL PROTECTED] wrote:
  bug 3409: modify header ordering for DomainKeys compatibility, by placing 
  markup headers at the top of the message
 
 Hrm.  I don't think we want our headers at the very top, since adding Received
 headers will get broken up, etc?

that's part of the idea of treating them as possibly tracking headers.
in other words a recipient can then see where the X-Spam- headers were
inserted.

 DK doesn't sign Received headers, right?

yes, it can, assuming the DK signing machine is a step beyond the machines
that inserted the Received headers.
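
To make the ordering question concrete, here is a minimal sketch contrasting the two placements of markup headers. The helper names are invented, and header folding and CRLF line endings are ignored for brevity; this is not the code from the commit under discussion.

```perl
use strict;
use warnings;

# Split a message into its header block and body at the first blank line.
sub split_message {
    my ($msg) = @_;
    my ($head, $body) = split /\n\n/, $msg, 2;
    return ("$head\n", defined $body ? $body : '');
}

# Prepend: new markup goes above everything already present,
# so the existing headers keep their relative order and adjacency.
sub prepend_markup {
    my ($msg, @markup) = @_;
    return join('', map { "$_\n" } @markup) . $msg;
}

# Append: new markup goes at the end of the header block.
sub append_markup {
    my ($msg, @markup) = @_;
    my ($head, $body) = split_message($msg);
    return $head . join('', map { "$_\n" } @markup) . "\n" . $body;
}

my $msg = "Received: from a by b\nFrom: x\@example.com\nSubject: hi\n\nbody\n";
my $top = prepend_markup($msg, 'X-Spam-Flag: NO');
my $end = append_markup($msg, 'X-Spam-Flag: NO');
print $top, "----\n", $end;
```

With prepending, a later hop that adds its own Received header lands above the X-Spam- lines, which is what lets a recipient see where in the delivery path the markup was inserted.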

--j.



Re: svn commit: r158541 - spamassassin/trunk/lib/Mail/SpamAssassin.pm spamassassin/trunk/lib/Mail/SpamAssassin/PerMsgStatus.pm

2005-03-22 Thread Justin Mason

Malte S. Stretz writes:
 On Tuesday 22 March 2005 19:15 CET Justin Mason wrote:
  Theo Van Dinter writes:
   On Tue, Mar 22, 2005 at 06:00:36AM -, [EMAIL PROTECTED] wrote:
bug 3409: modify header ordering for DomainKeys compatibility, by
placing markup headers at the top of the message
  
   Hrm.  I don't think we want our headers at the very top, since adding
   Received headers will get broken up, etc?
 
  that's part of the idea of treating them as possibly tracking headers.
  in other words a recipient can then see where the X-Spam- headers were
  inserted.
 
 Why do you check against /^Return-[pP]ath:/ instead of /^Return-Path:/i?  I 
 think these headers were case sensitive in RFC 822 but since 2822 all 
 headers should be matched case insensitive (even if not, we should do so 
 IMO).

good point.  will fix that now...
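
A quick comparison of the two patterns on some made-up header lines shows the difference:

```perl
use strict;
use warnings;

# Sample header lines, invented for the example.
my @headers = (
    'Return-Path: <a@example.com>',
    'Return-path: <b@example.com>',
    'RETURN-PATH: <c@example.com>',
    'Received: from x',
);

my @old = grep { /^Return-[pP]ath:/ } @headers;   # misses other capitalisations
my @new = grep { /^Return-Path:/i }   @headers;   # matches any case

printf "old pattern: %d, /i pattern: %d\n", scalar @old, scalar @new;
```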

--j.



Re: svn commit: r158011 - in spamassassin/trunk/lib/Mail/SpamAssassin: Conf.pm PerMsgLearner.pm

2005-03-23 Thread Justin Mason

Daniel Quinlan writes:
 Daryl C. W. O'Shea [EMAIL PROTECTED] writes:
 
  Should I remove this from trunk?  I'm indifferent on the option... I 
  just saw the open bug and finished the code someone at one time 
  apparently thought was a good idea.
 
 Well, my inclination would be to revert unless we're sure about the
 destination location.  I'm -0.7 for now.

I'd say a good way to do it might just be to put it in a plugin,
even if that plugin is just a holder for the option right now.

--j.



Re: closing in on 3.1

2005-04-04 Thread Justin Mason
On Mon, Apr 04, 2005 at 12:35:34AM -0700, Daniel Quinlan wrote:
 We only have about 40 bugs remaining, a large number of which can be
 moved forward to 3.2.0, 3.1.1, or Future.  I whacked about a dozen
 bugs over the weekend.  ;-)
 
 A few open issues and questions:
 
  * History plugin: I think we can wait to add this, right?
 
  * SURBL changes: we should definitely finish the SURBL-related bugs
related to efficacy (except the mix-up problem, unfortunately).
 
  * spamd instability: the new spamd code does not seem as reliable as
the old code

BTW, I'm out of action until next weekend at least, due to a
sudden bereavement. :(  

Getting access to email has been pretty tricky since I'm in rural
Ireland at the mo -- as such I probably won't be much help until post-
next weekend.

However I'd like to get someone to look at those spamd issues; IMO,
the new preforking system is a whole lot more stable and reliable
than 3.0.0's, due to the greatly improved memory profile it gives
us, and I'd be seriously negative about releasing 3.1.0 without
it.

--j.


Re: ArchiveIterator fault handling

2005-04-12 Thread Justin Mason

Daniel Quinlan writes:
 I'm cleaning up the fault-handling in mass-check a bit for some of my
 own scripting.  My main question: should ArchiveIterator return a fatal
 error (that would propagate up) if a target is not accessible?  All of
 those reading/scanning functions just return; regardless of errors.

in my opinion, if a target listed by the user (e.g. a directory or mbox
name listed on the command line) does not exist, then a reasonable way to
deal with that would be to die() and let the calling code deal with it.

Why have the caller deal with it? In some cases, this would not be
appropriate as a fatal error, as I've often run into it myself when
removing an old mail folder and I don't want the entire
mass-check/sa-learn/etc. to fail; however, if I'm removing markup
using spamassassin, I *would* want the script to fail with a clear
exit code in that case.

(It might be appropriate to carry on and return an exit code in the
spamassassin/sa-learn cases.)

However, if it's just that something *inside* one of those user-listed
targets disappears, that's definitely non-fatal.  This happens regularly
if you scan live mail folder Maildirs, and delete a message between
mass-check start time and when it gets to scanning that file.
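
The distinction drawn above could be sketched roughly as follows. This is not ArchiveIterator's code: scan_targets is a hypothetical stand-in that dies when a user-listed target is missing, only warns when a file inside a target vanishes mid-run, and leaves it to the caller to decide whether a die is fatal.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical scanner: fatal for a missing target, non-fatal for a
# file that disappears between listing and scanning.
sub scan_targets {
    my ($callback, @targets) = @_;
    foreach my $dir (@targets) {
        die "target not accessible: $dir\n" unless -d $dir;
        opendir(my $dh, $dir) or die "cannot read $dir: $!\n";
        my @files = grep { -f "$dir/$_" } readdir $dh;
        closedir $dh;
        foreach my $file (sort @files) {
            my $path = "$dir/$file";
            unless (-e $path) {              # e.g. deleted since we listed it
                warn "skipping vanished file: $path\n";
                next;
            }
            $callback->($path);
        }
    }
    return 1;
}

# Set up a scratch folder with two messages.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(1 2)) {
    open(my $fh, '>', "$dir/$name") or die "cannot create $dir/$name: $!";
    print {$fh} "test\n";
    close $fh;
}

# Scanning message 1 deletes message 2, simulating a live Maildir.
my @seen;
scan_targets(sub { push @seen, $_[0]; unlink "$dir/2" }, $dir);

# A missing target propagates to the caller, which picks the policy.
my $failed = !eval { scan_targets(sub { }, "$dir/no-such-folder"); 1 };
my $err    = $@;
print "scanned: ", scalar(@seen), ", missing target fatal: ", ($failed ? 'yes' : 'no'), "\n";
```

A mass-check style caller could catch the error and carry on, while a caller that rewrites mail could let it propagate and exit non-zero.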

--j.



Storable and 3.1.0

2005-04-14 Thread Justin Mason
BTW, I think we should try to get the code that relies on
Storable out of SpamAssassin for 3.1.0.  Why?

Because as far as I can see we still have a lot of users reporting
problems with spamd hanging, and (last I heard) it looked like Storable
was the issue.

We've hacked around it with some alarm() timeouts, but we're not sure if
they'll work or not in all cases.

as far as I can see, we are possibly the biggest perl project using
Storable for such an essential and complex piece of code, and I think we
may be running up against the poorly-debugged edge case that nobody else
has had to deal with.

On top of that, I don't see exactly *why* Storable is required to
implement what it's doing (keeping a copy of the basic system-wide Conf
object's data).  as far as I can see, we can do that a la

%{$conf->{tests}} = %{$basic_conf->{tests}}

ie. direct copying and assignment, in pretty much exactly the same
way we use Storable, but without the worries.

We have spamd reliability issues, I think, in the field -- and
cutting out one possible cause of that seems like a very good idea.
thoughts?

--j.


Re: Storable and 3.1.0

2005-04-15 Thread Justin Mason

Theo Van Dinter writes:
 On Thu, Apr 14, 2005 at 12:07:00PM -0700, Justin Mason wrote:
  On top of that, I don't see exactly *why* Storable is required to
  implement what it's doing (keeping a copy of the basic system-wide Conf
  object's data).  as far as I can see, we can do that a la
  
  %{$conf->{tests}} = %{$basic_conf->{tests}}
  
  ie. direct copying and assignment, in pretty much exactly the same
  way we use Storable, but without the worries.
 
 Well, the short version is that it works fine for simple scalars, but any
 form of reference breaks things horribly.  For example:
 
 # perl -e 'use Data::Dumper; $conf{a}={foo=>"bar"}; $conf{b}={bar=>"baz"}; %back = %conf; $conf{b}->{baz} = "schmoo"; print Dumper(\%back);'
 $VAR1 = {
           'a' => {
                    'foo' => 'bar'
                  },
           'b' => {
                    'bar' => 'baz',
                    'baz' => 'schmoo'
                  }
         };
 
 copy_config() goes through %conf one at a time and copies, but there's no
 guarantee the lower data structures are simple.  We wanted dclone() since
 it'll copy complex structures recursively.  We don't need to worry about
 references, etc.
 
  We have spamd reliability issues, I think, in the field -- and
  cutting out one possible cause of that seems like a very good idea.
  thoughts?
 
 I'm +1 for getting rid of Storable in general, and we've even discussed
 how to do this before so as to get rid of Storable, increase memory
 sharing between parent and client, decrease usage (via not needing a
 backup conf hash), etc.
 
 I think that's going to be a bit of work for 3.1, especially if we want
 it out sooner rather than later.

I think we can do a quick fix that just replaces use of Storable with a
similar non-Storable copying system, keeping the backup conf hash model.
the full hog of increasing RAM sharing etc. is certainly a longer-term
thing that shouldn't be put into 3.1.
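
As a sketch of what a non-Storable copying system might look like, the recursive copy below handles plain nested hashes, arrays and scalars. It is only an illustration: deep_copy is an invented name, and it assumes the data contains no blessed objects, code references or cyclic structures, which a real replacement for dclone() would have to consider.

```perl
use strict;
use warnings;

# Recursive copy of plain nested hashes, arrays and scalars.
# Assumption: no blessed objects, code refs or cyclic structures.
sub deep_copy {
    my ($data) = @_;
    my $type = ref $data;
    return $data unless $type;               # plain scalar: copy by value
    if ($type eq 'HASH') {
        my %copy;
        $copy{$_} = deep_copy($data->{$_}) for keys %$data;
        return \%copy;
    }
    if ($type eq 'ARRAY') {
        return [ map { deep_copy($_) } @$data ];
    }
    die "deep_copy: unsupported reference type $type\n";
}

my %conf    = (a => { foo => 'bar' }, b => { bar => 'baz' });
my %shallow = %conf;                    # copies only the top-level references
my %deep    = %{ deep_copy(\%conf) };   # copies the nested hashes too

$conf{b}{baz} = 'schmoo';               # modify the original afterwards

print "shallow sees change: ", (exists $shallow{b}{baz} ? 'yes' : 'no'), "\n";
print "deep sees change:    ", (exists $deep{b}{baz}    ? 'yes' : 'no'), "\n";
```

The shallow copy reproduces the problem Theo demonstrates, while the recursive copy keeps the backup independent of later changes.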

--j.



Re: uridnsbl: bogus rr run ...

2005-04-26 Thread Justin Mason

Sidney Markowitz writes:
 Matt Sergeant wrote:
  I didn't think you could do that because in newer versions of Net::DNS
  the id is a lexical variable. The only way to reinitialise it is to
  reload the module.
 
 If I remember it correctly, Justin's code keeps its own counter and sets the
 packet ID after creating the packet, making it independent of Net::DNS's 
 counter.

Yep, our code just uses its own counter, and avoids using Net::DNS'
counter code as much as possible.

 I guess that would break down if there are any uses of Net::DNS by the same
 process that do not go through his code. If that is what is happening and it
 results in ID collision, the fix would be to use code like yours to reload
 the module and rely on its own counter. I'll try that now that I can
 reproduce the problem myself (painful as it is).

That should not be a problem, as long as Net::DNS uses a different part
of the ID space and increments separately:

  Net::DNS:   123456789...
  our code: 123456789...

I have a hard time figuring out how Net::DNS could be using the same
counter value, since it just uses rand() to seed it and counts
consecutively from there by default -- plus there are very few places
we still use Net::DNS::Resolver::search() or bgsend().

I think collisions with Net::DNS are a red herring.

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCbpe8MJF5cimLx9ARArW6AJ9cla2VSn6TKzomdMrBptXhdUPl/wCeLmJi
qmt5sNEb2NKbvIuV+H7NlR8=
=nJpD
-END PGP SIGNATURE-



Re: uridnsbl: bogus rr run ...

2005-04-26 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Theo Van Dinter writes:
 On Tue, Apr 26, 2005 at 12:34:20PM -0700, Justin Mason wrote:
  I have a hard time figuring out how Net::DNS could be using the same
  counter value, since it just uses rand() to seed it and counts
  consecutively from there by default -- plus there are very few places
  we still use Net::DNS::Resolver::search() or bgsend().
 
 Unless I missed something, no it doesn't.  It sets the initial counter
 based on pid (no rand), then increments each time.

no, Net::DNS uses rand() -- as you can see ;), *we* don't.

 $DNS_ID_COUNTER = (($$  10) ^ (($$  6)  0x));
 
 My only theory right now is that there are problems once the counter wraps
 around the full 16-bits.  The current code may work fine in spamd, whose
 children don't tend to have a huge lifespan.  However, in mass-check,
 the children (unless --restart is used) live for the entirety of the
 run so looping is much more likely.

 I'm trying a small patch which basically calls the reinit function when
 the counter wraps to 0, as well as using rand when initializing.  This way
 it'll get a new random starting point and a new socket occasionally.
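
The idea in the paragraph above (reseed with a random value whenever the
16-bit counter wraps) can be sketched roughly as follows. This is a Python
illustration only, not the patch under discussion; IdCounter, next_id and the
on_wrap hook are hypothetical names, and the hook merely stands in for
whatever reinit/new-socket step the real code would perform.

```python
import random

ID_SPACE = 1 << 16  # DNS packet IDs are 16 bits wide


class IdCounter:
    """Hand out consecutive 16-bit IDs, reseeding randomly on wrap-around."""

    def __init__(self, rng=None, on_wrap=None):
        self._rng = rng if rng is not None else random.Random()
        self._on_wrap = on_wrap
        # Random starting point instead of one derived from the pid.
        self._value = self._rng.randrange(1, ID_SPACE)

    def next_id(self):
        current = self._value
        self._value = (self._value + 1) % ID_SPACE
        if self._value == 0:
            # Counter wrapped to 0: pick a fresh random start and let the
            # caller reinitialise (for example, open a new socket).
            self._value = self._rng.randrange(1, ID_SPACE)
            if self._on_wrap is not None:
                self._on_wrap()
        return current
```

A long-lived process such as a mass-check child would hit the wrap branch
every 65536 lookups at most, so the cost of the reinit hook stays small.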

deffo worth a try.  I'm also doing a run here, logging all query IDs
and responses using warn(), and logging IDs for uridnsbl collisions:

Index: lib/Mail/SpamAssassin/DnsResolver.pm
===
- --- lib/Mail/SpamAssassin/DnsResolver.pm  (revision 164864)
+++ lib/Mail/SpamAssassin/DnsResolver.pm(working copy)
@@ -206,7 +206,7 @@
 $packet->header()->id($DNS_ID_COUNTER);
 
 # a bit noisy, so commented by default...
- -# dbg("dns: new DNS packet h=$host t=$type id=$DNS_ID_COUNTER");
+warn("dns: new DNS packet h=$host t=$type id=$DNS_ID_COUNTER");
   };
 
   if ($@) {
@@ -297,7 +297,7 @@
 
 my $cb = delete $self->{id_to_callback}->{$id};
 if (!$cb) {
- -  dbg("dns: no callback for id number: $id, ignored; packet: ".
+  warn("dns: no callback for id=$id, ignored; packet: ".
 $packet->string);
   return 0;
 }
Index: lib/Mail/SpamAssassin/Plugin/URIDNSBL.pm
===
- --- lib/Mail/SpamAssassin/Plugin/URIDNSBL.pm  (revision 164864)
+++ lib/Mail/SpamAssassin/Plugin/URIDNSBL.pm(working copy)
@@ -620,7 +620,8 @@
   # this zone is a simple rule, not a set of subrules
   # skip any A record that isn't on 127/8
   if ($rr->type eq 'A' && $rr->rdatastr !~ /^127\./) {
- - warn("uridnsbl: bogus rr for domain=$dom, rule=$rulename, rr=" . $rr->string);
+   warn("uridnsbl: bogus rr for domain=$dom, rule=$rulename, id=" .
+$packet->header->id." rr=".$rr->string);
 next;
   }
   $self->got_dnsbl_hit($scanstate, $ent, $rdatastr, $dom, $rulename);
@@ -631,7 +632,8 @@
   if ($rr->type eq 'A' && $rr->rdatastr !~ /^127\./ &&
  !($uridnsbl_subs_bits & 0xff00))
   {
- - warn("uridnsbl: bogus rr: domain=$dom, zone=$ent->{zone}, rr=" . $rr->string);
+   warn("uridnsbl: bogus rr: domain=$dom, zone=$ent->{zone}, id=" .
+$packet->header->id." rr=".$rr->string);
next;
   }
   foreach my $subtest (keys (%{$uridnsbl_subs}))
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCbqM1MJF5cimLx9ARAmixAKCKlpam9q1V5vhi01Nv2n0+8+UkwACfUrAI
Ink3yZy7ycNFvDZkTOMh8OM=
=ZK2r
-END PGP SIGNATURE-



Re: Bayes scores for 3.0.3

2005-04-27 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Michael Parker writes:
 On Wed, Apr 27, 2005 at 02:06:12AM -0700, Daniel Quinlan wrote:
  I propose we change this:
  
  score BAYES_50 0 0 1.567 0.001
  score BAYES_60 0 0 3.515 0.372
  score BAYES_80 0 0 3.608 2.087
  score BAYES_95 0 0 3.514 2.063
  score BAYES_99 0 0 4.070 1.886
  
  to
  
  score BAYES_50 0 0 1.567 0.001
  score BAYES_60 0 0 3.515 1.0
  score BAYES_80 0 0 3.608 2.0
  score BAYES_95 0 0 3.514 3.0
  score BAYES_99 0 0 4.070 3.5
  
  trivial enough?
 
 +1, I've been running with similar scores for a while now.

+1
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCb8pSMJF5cimLx9ARAreAAJ9tWl/mGcs3Z59RwYoIYlkHcIghLACcDd8s
mSSyf4xKrJQmK2fzNUk1j24=
=Sfoo
-END PGP SIGNATURE-



Re: Bayes scores for 3.0.3

2005-04-27 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Daniel Quinlan writes:
 Theo Van Dinter [EMAIL PROTECTED] writes:
 
  Arguably, then, shouldn't we do a new score generation run for 3.0.3?
 
 As long as hell has frozen over, I have no problem with that.

yeah, I think we can afford a shortcut here ;)

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCb/CDMJF5cimLx9ARAgUDAJ9uU5o9PBR6NZmRIY5+A1BA6b3UiACfXVFl
mt+W02/z9p8OjC1cN85lByM=
=S+Pl
-END PGP SIGNATURE-



Re: Please Test 3.0.3 For Release

2005-04-27 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


+1 -- both look good to me.

- --j.

Michael Parker writes:
 Howdy,
 
 I've generated the tarballs for the 3.0.3 release.  Please download
 and test for release.  Once we have three +1s I'll move the files over
 to dist and upload to CPAN, then announce the release 12 or so hours
 later to allow for mirror propagation.  Also, I've included a draft of
 the release announcement, please review and suggest any changes you
 would like to see.
 
 You can pick up the files here:
 http://people.apache.org/~parker/release/
 
 Here is the draft announcement:
 
 SpamAssassin 3.0.3 is released!  SpamAssassin 3.0.3 contains some
 important bug fixes, and is recommended for use over previous
 versions.
 
 SpamAssassin is a mail filter which uses advanced statistical and
 heuristic tests to identify spam (also known as unsolicited bulk email).
 
 Highlights of the release
 -
 
  - Fixed possible memory bloat from large AutoWhitelist db files
 
  - Fixed a bug where user-defined rule scores were ignored
 
  - Updated parsing code for several Received: header formats
 
  - Increased some BAYES_* scores for the network+bayes score set
 
  - Document set_tag for Plugin API and added get_tag
 
  - Additional bug fixes.
 
 Downloading
 ---
 
 Pick it up at http://spamassassin.apache.org/
 
 You can also find it on your favorite CPAN mirror (you may need to
 wait a day or so for the release to propagate).
 
 md5sum of archive files:
 c9028e72958909285e43feb806d948dc  Mail-SpamAssassin-3.0.3.tar.bz2
 ca96f23cd1eb7d663ab55db98ef8090c  Mail-SpamAssassin-3.0.3.tar.gz
 d7292ec75eb61e0fa2ceb6aa5b20fed9  Mail-SpamAssassin-3.0.3.zip
 
 sha1sum of archive files:
 324763dd7b344b68ad9ab73fd68b8f779c801aab  Mail-SpamAssassin-3.0.3.tar.bz2
 e31407b68bf362dfe53814c0af867e8134c9808b  Mail-SpamAssassin-3.0.3.tar.gz
 c1aa1583eebc0771ee053b8a484a42fc22b8630c  Mail-SpamAssassin-3.0.3.zip
 
 The release files also have a .asc accompanying them.  The file serves
 as an external GPG signature for the given release file.  The signing
 key is available via the wwwkeys.pgp.net key server, as well as
 http://spamassassin.apache.org/released/GPG-SIGNING-KEY
 
 The key information is:
 
 pub  1024D/265FA05B 2003-06-09 SpamAssassin Signing Key [EMAIL PROTECTED]
  Key fingerprint = 26C9 00A4 6DD4 0CD5 AD24  F6D7 DEE0 1987 265F A05B
 
 Note:  GnuPG 1.4.0, and possibly 1.3.x versions, seem to have problems
 verifying certain signature files, including the type as used for
 SpamAssassin releases. If you are running an affected version, please
 verify the code using both MD5 and SHA1 sum values instead.
 
 The SpamAssassin Developers
 
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCcAf/MJF5cimLx9ARAmLBAJ9zuiNFf4KorI0fylHpNpbKMWY/kgCgk4KY
4wl3Cm4kmgHMo9Qk0xbFmKk=
=E0TF
-END PGP SIGNATURE-



Re: Please Test 3.0.3 For Release

2005-04-28 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Theo Van Dinter writes:
 On Wed, Apr 27, 2005 at 06:24:40PM -0500, Michael Parker wrote:
  There actually have been 3 +1s, quinlan's just hasn't made it to the
  list yet.  I've already moved the files over to dist.
 
 So yeah, this is what happened with 3.0.2 and it's annoying.  For 3.0.2,
 everything was decided on IRC, people posted to dev@ to make it official
 within a 5 minute span, and a release happened within an hour.
 
 Votes need to occur on the list to be considered official, votes on
 IRC or whatever don't count.  Votes are also supposed to run for at
 least 24 hours to give everyone a chance to see it and make comments.
 Releases can't be vetoed, but the procedures we have documented explain
 why this is still useful.
 
 If we don't want to put the patch in, that's fine, but I'd like to
 have release votes go as they're supposed to before the release is
 actually done.

OK, that sounds very logical to me.

I think we are too late for 3.0.3 (and it isn't a massive bug, fwiw), but
for future releases we should adopt the 24-hour policy, as a guideline
at least, to catch last-minute stuff.

And strongly +1 on votes on the list *only*.  The IRC channel is
nice, but we're not all there ;)

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCcDG5MJF5cimLx9ARAg9uAKCXp0EUPKjfU7MzeQjkpQcwyhSf6QCdHZry
9smHOueY67g7wSLxPxAsGkE=
=I5Fe
-END PGP SIGNATURE-



Re: Please Test 3.0.3 For Release

2005-04-28 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


BTW, I don't think there's a need to redo a new tarball.  Once
the tarball's out there (FSVO out), a point of no return has
been reached, and I don't think the bug is *that* urgent.

- --j.

Michael Parker writes:
 On Wed, Apr 27, 2005 at 05:43:37PM -0700, Justin Mason wrote:
  
  And strongly +1 on votes on the list *only*.  The IRC channel is
  nice, but we're not all there ;)
  
 
 I think it's a shame that IRC has become less useful for developers.
 It used to be a very good tool for everyone.  Evidently Daniel hasn't
 been able to send his vote to the list, here is a small transcript
 from IRC:
 [17:21:51] Herk quinlan: so +1?
 [17:32:32] quinlan yeah, +1
 [17:32:54] Herk quinlan: can you mail dev, just to make it official
 
 Once 3 or more +1 votes have been received for a release any decision
 for ultimate release lies with the release manager.  There is no
 mandated soak time, if the release is ready it should go.
 
 I agree, that in most cases, we should allow 24 hrs or so for someone
 to register a veto on a patch before applying it to the stable
 branch.  That obviously doesn't always happen.  I make sure to do that
 for any patches of mine that might be controversial or if I'm unsure
 for any reason.
 
 We are talking about a bug that came in after the 3.0.3 release had
 been built and made public (not in an official directory, but public
 nonetheless).  So moving forward and putting anything into the stable
 branch will entail a bump in the release version for a release.
 
 3.0.3 has not been officially released, but there is a tarball in the
 wild that claims it is 3.0.3.  The only steps I've taken thus far are
 to move things over to dist so the Apache mirror system can begin
 syncing the release out to the various mirrors.
 
 I'm willing to scrap 3.0.3 altogether and move on to 3.0.4, after all
 the whole point of this exercise is to get something out there and
 stable.  However, please keep in mind when it comes to release, we not
 only need to get things built and tested, but we should allow time for
 tarballs to sync out to mirror sites before announcing.  For
 maintenance releases this is pretty easy because we're dealing with a
 fairly stable set of code.
 
 We're in no different place than if the release had received no +1
 votes so we might as well move forward, bump the release num, declare
 3.0.3 dead and get a nice stable 3.0.4 out the door.
 
 Michael
 
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCcDiXMJF5cimLx9ARAmMiAJ9N4F+8zcbngMskG2EXgq/LwCe9EACeLOIg
1LT0W3HI0haajAMjDGOVGCs=
=WUGu
-END PGP SIGNATURE-



Re: Moving on to 3.0.4

2005-04-28 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Michael Parker writes:
 On Thu, Apr 28, 2005 at 11:41:16AM -0400, Theo Van Dinter wrote:
  
  But otherwise, I agree with Warren completely.  I wanted to get the patch
  in for 3.0.3 because it was trivial and the release wasn't complete at
  that point in time.  Now, however, the release has been completed and
  there isn't anything else compelling us to do a 3.0.4 at this point.
 
 I had already made the 3.0.3 tarballs available for testing at
 least a half hour before I received the initial mail from Bugzilla for
 4287.  40 mins later you expressed a desire to get 4287 into the 3.0
 branch, well you said 3.0.3 but that ship had already sailed.  So if
 it falls to 3.0.4, no big deal, we vote, the patch is committed, and
 we do it all over again for 3.0.4.
 
 Consider the 3.0.3 release gone, so the same things that compelled the
 3.0.3 release are in place for 3.0.4.  Have you changed your mind about
 getting 4287 into the 3.0 branch?

I'm with Theo on this one -- I by no means consider 4287 a worthwhile
reason to go to the bother of cutting a 3.0.4.  and it *is* a bother!

4287 can get checked into the 3.0 branch, just in case there's something
in the future.  but it's by no means urgent.

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCcRqTMJF5cimLx9ARAtZnAKCx5FDO3p9/jhHGWHgCNYnaR4oODACgojG2
w7baFBTVMnA86qnlGBywIX8=
=r5H2
-END PGP SIGNATURE-



Re: Moving on to 3.0.4

2005-04-28 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Michael Parker writes:
 On Thu, Apr 28, 2005 at 10:17:07AM -0700, Justin Mason wrote:
  
  I'm with Theo on this one -- I by no means consider 4287 a worthwhile
  reason to go to the bother of cutting a 3.0.4.  and it *is* a bother!
  
  4287 can get checked into the 3.0 branch, just in case there's something
  in the future.  but it's by no means urgent.
 
 Then why the hell was such a big deal made over it yesterday?

Well, dunno about everyone else, but what I mailed above is me sticking to
my guns from y'day -- I quote: 'BTW, I don't think there's a need to redo
a new tarball.  Once the tarball's out there (FSVO out), a point of no
return has been reached, and I don't think the bug is *that* urgent.'

;)

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCcTvaMJF5cimLx9ARAo7fAKCIrc/xlGcBFKZEum+bS1kWmZutLACfVU/F
KxiJA8n/L0CZr6TyUrv8sew=
=jqoq
-END PGP SIGNATURE-



Re: Purpose of Mail-SpamAssassin-current*

2005-04-29 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Michael Parker writes:
 Do we actually use the Mail-SpamAssassin-current* files?  If not, do
 they serve any other purpose?

check out the links in the build/README doc -- they're recommended there
by an ASF document on how /dist should be laid out.

However, I think we can probably change that if you like; it appears that
half of the recommended things to do in /dist are outright wrong, such as
using symlinks (which aren't supported by some mirrors. wonderful).  I
asked infrastructure about this once, and it fell into a bit bucket. ;)

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCcYvYMJF5cimLx9ARAgEiAJ9aZRs9GPH5v2WzoU5D/gtRMf4AQgCePhn0
RaVe5+UlO2kjltw6WLo99KY=
=hvTz
-END PGP SIGNATURE-



Re: Broken nightly run

2005-05-01 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Theo Van Dinter writes:
 165354 jm if (sysopen ($tmpfile, $reportfile, 
 O_RDWR|O_CREAT|O_EXCL, 0600))
 165354 jm {
 165354 jm   last;
 165354 jm }
 
 "last" can't be used in a "do", at least according to the POD.

well, you learn something new every day!  I wonder why not? ;)
anyway, yep, my bad.

(although Daniel's change could have done with less indentation
and whitespace-tweaking ;)

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCdVmxMJF5cimLx9ARAvXNAJ0SQR6YSzXXYjHVyBRzj3WAk4q7KQCgjDuN
IYB89V+FUFNeYzrMkMxfmTk=
=W6ff
-END PGP SIGNATURE-



Re: ReleasePolicy

2005-05-03 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Daniel Quinlan writes:
 This needs PMC votes to change it from draft to official status and make
 what it says policy.  No reason to do this PMC vote out of the public
 view, though.
 
   http://wiki.apache.org/spamassassin/ReleasePolicy
 
 I have attempted to clarify some of the release process and also beefed
 up the build/README (thanks Michael for some input on these).
 
 My vote: +1

see my other mail; -0.5 right now.

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCdrFcMJF5cimLx9ARAi0dAJ9YaGbewFIjbsPqsImzqyfwNvZL/QCghhMI
jPAFoFaWg53KSm6RE87ibv4=
=uUo5
-END PGP SIGNATURE-



Re: svn commit: r165704 - /spamassassin/trunk/build/README

2005-05-03 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Daniel Quinlan writes:
 [EMAIL PROTECTED] (Justin Mason) writes:
 
  OK, I do not agree with this.  in my opinion, a release number is only
  burned in stone once the file is announced, uploaded to CPAN, etc.
  I'd prefer to avoid burning too early, as it makes for less flexibility.
 
 I'm open to moving it to the public tarball stage if we add the
 procedure for how to back out the tags, Changes file (if needed?), etc.

Yeah, I'd agree.  Backing out the tags is unnecessary -- afaik it'll just
update them; and Changes, that's just a matter of recommitting a new change
that says "oops, made a boo-boo, THIS IS THE REAL 3.0.X RELEASE".  Pretty
much like I did for 3.0.0-pre1 ;)

Michael -- btw -- my take is that someone could be running svn trunk or a
nightly snapshot, so attaching "there can be only one" status to a tarball
that never got onto /dist is silly under the circumstances.

  Let's not make rules for the sake of making rules!
 
 No, my intent is not to do that.  I just want to avoid consternation at
 release time when the RM is already under enough pressure.  Make the
 decision beforehand, one we can stick with and won't want to change or
 argue later.

OK, that's cool.

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCdsj+MJF5cimLx9ARAtcwAKCfjGgOuvuSXgRnwSy2/Dymt72hCgCdG8fe
2/p88i8lhk+F4aYmXkKsEUE=
=3+io
-END PGP SIGNATURE-



Re: ReleasePolicy

2005-05-03 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Daniel Quinlan writes:
 [EMAIL PROTECTED] (Justin Mason) writes:
 
  see my other mail; -0.5 right now.
 
 Still?

+1 now ;)

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCd+xMMJF5cimLx9ARAu59AJ4vuveZNKCXgP0k1TulG310k7LvpACgi30U
dljGd6BghpI1AAmtYSwhGG4=
=GeED
-END PGP SIGNATURE-



Re: svn commit: r168050 - /spamassassin/trunk/lib/Mail/SpamAssassin/PerMsgStatus.pm

2005-05-04 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


[EMAIL PROTECTED] writes:
 Author: quinlan
 Date: Tue May  3 19:31:07 2005
 New Revision: 168050
 ... 
 don't allow _ since that's not allowed in hostnames

I'm wondering if this is a good idea?

http://issues.apache.org/bugzilla/show_bug.cgi?id=21133 is an Apache httpd
bug asking to refuse ServerNames with underscores -- as you can see, it
was closed WONTFIX since they explicitly decided to permit this.

in my opinion, it strikes me as a good way for a spammer to evade
SpamAssassin if we get that wrong.

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCeDldMJF5cimLx9ARAlg4AJwM/PnJPklwkI7kNuRzXE7DSQq5twCZAfM3
zRoXOjW38ur3fZfRdq4cgms=
=ZVe8
-END PGP SIGNATURE-



Re: svn commit: r168050 - /spamassassin/trunk/lib/Mail/SpamAssassin/PerMsgStatus.pm

2005-05-04 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Daniel Quinlan writes:
  don't allow _ since that's not allowed in hostnames
 
 [EMAIL PROTECTED] (Justin Mason) writes:
  
  I'm wondering if this is a good idea?
 
 Well, it's not allowed in the RFC and SURBL doesn't contain a single _
 hostname, but if you want to add it back, I'm okay with that.

let's see what Jeff thinks ;)

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCeFC1MJF5cimLx9ARAi5YAJ0XN4zAcDHVl26SK3d91mPxzE8QwACeMMQh
dZeDdNGasp+viumM9ERC5so=
=61FM
-END PGP SIGNATURE-



Re: boosting

2005-05-05 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Sidney Markowitz writes:
 Fred wrote:
  There was similar work being done in the past to identify rules to be
  grouped into new meta rules, this (w|c)ould achieve similar results.
  http://bugzilla.spamassassin.org/show_bug.cgi?id=1363
 
 I think I'm missing something here. Are you saying that automatically
 grouping rules into meta rules that have similar classification properties
 is equivalent to boosting? Or do you mean that it is another approach that
 also can improve performance of weak learners?
 
 In any case, you have given me an idea for the microarray gene expression
 problem, so thanks! :-)

It does seem applicable, esp. with SpamAssassin having its own precooked
ruleset (with perceptron-generated, *not* hand-generated scores -- btw
what websites claim we use hand-generation?) and Bayes (with
naive-Bayes-like probability combining) -- both of those are currently
combined in a very simplistic way.

But the learning curve was too great for me.

Henry is definitely the guy who should comment, he may have investigated
this too...

Boosting, using the two classifiers, would certainly be an interesting
application.

The bug 1363 meta rule thing is quite separate, although related in a
different way -- that is, is there a way to combine a subset of bayesian
rules to come up with *new* high-reliability rules?

A related question is, are there good algorithms to determine a rule's
effectiveness on a corpus when considered in conjunction with multiple
other rules?  Right now we use a simplistic information-gain type
algorithm which effectively considers each rule in isolation, instead of
how they interact with (and may be redundant to) other rules.

To explain:  let's say we have three rules hitting the following
mails:

rule  ham1  ham2  ham3  spam1  spam2  spam3
R1    x     .     .     x      x      .
R2    .     .     .     x      x      .
R3    .     .     .     .      .      x

R2 is obviously more effective than R1, since it hits the same spam and
less ham.  But right now, we consider R3 to be less effective than R2,
because it has a lower hitrate on spam than R2 does.  However, it's the
only rule of the three that managed to hit the spam3 section of the
corpus, so *in conjunction with* R2, it's very effective.

That's the problem that's hard to solve algorithmically.
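
To make the three-rule example concrete, here is a small Python sketch (not
anything SpamAssassin does today, and not the information-gain code mentioned
above). It contrasts ranking each rule in isolation by raw hit counts with a
simple greedy ranking that scores a rule by the *new* spam it adds on top of
the rules already chosen. The function names and the greedy heuristic are
assumptions made for illustration.

```python
# Hit matrix from the example above: rule -> set of messages it hits.
HITS = {
    "R1": {"ham1", "spam1", "spam2"},
    "R2": {"spam1", "spam2"},
    "R3": {"spam3"},
}
SPAM = {"spam1", "spam2", "spam3"}


def isolated_rank(hits, spam):
    """Rank rules one at a time: most spam hits first, fewest ham hits next."""
    def key(rule):
        spam_hits = len(hits[rule] & spam)
        ham_hits = len(hits[rule] - spam)
        return (-spam_hits, ham_hits, rule)
    return sorted(hits, key=key)


def greedy_rank(hits, spam):
    """Rank rules by the spam each one adds beyond what is already covered."""
    remaining = dict(hits)
    covered = set()
    order = []
    while remaining:
        def key(rule):
            gain = len((remaining[rule] & spam) - covered)
            ham_hits = len(remaining[rule] - spam)
            return (-gain, ham_hits, rule)
        best = min(remaining, key=key)
        gain = len((remaining[best] & spam) - covered)
        order.append((best, gain))
        covered |= remaining.pop(best) & spam
    return order
```

On this data the isolated ranking puts R3 last, while the greedy ranking
picks R2 and then R3, leaving R1 with nothing new to contribute. Greedy
selection is only a heuristic; it says nothing about how scores should be
assigned once the rules are chosen.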

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCeVloMJF5cimLx9ARAmlDAJ4zhIO/YsfBIMdlYXy0OPSyENF5YACfWYwv
sDS8zZ+4mcwhbUdUBXufJ6Y=
=mcqL
-END PGP SIGNATURE-



Re: registrar boundary inconsistencies

2005-05-05 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Chris Santerre writes:
   - change the domain code in SA to consider the domain a registry like
   eu.org or demon.co.uk (let us know and we'll change our code as long
   as it makes sense ;-).  This means we don't expect to blacklist the
   entire registry.
  
   - SURBL (or your data provider) blacklists the entire domain
 
   - remove the hostname.domain listings ... why bother if nothing's
   going to hit them
 
 Daniel
 
 I vote for changing the domain code to recognise these domains.
 Blacklisting the entire domain can have too many problems. Removing the
 whole thing would let spammers game these domains. 

heh, I did say this would happen last year ;)   I also think we should
consider these private registries equivalent to TLD registries, as I said
back then.  Here's the bug -- it's still open:

  http://bugzilla.spamassassin.org/show_bug.cgi?id=3549

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCelT3MJF5cimLx9ARArd1AKCNWNX8Rw2vgQTVWgMqY78Vb29geQCfW7MR
IPh+BIIqohKXaBmqSXliZPI=
=s1rU
-END PGP SIGNATURE-



Re: [OT] Boosting and other potential research topics

2005-05-05 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


there's been quite a bit of research into N-gram bayesian phrases;
I'd recommend reading the spambayes list archives, the bogofilter
archives, and I think Gordon Cormack covered its accuracy too.

summary: you'll massively expand database size for not a huge
gain, iirc ;)

Dobly noise reduction is slightly different though.

- --j.

Ted Markowitz writes:
 Thanks for the thoughts, Chris. I've been thinking along some of these 
 same lines myself as to exactly how much more effective N-gram phrases 
 (with some arbitrary N) would be vis a vis the Bayesian classifier. I
 seem to remember some research along these lines described at the 2005 
 MIT Spam Conference by Jonathan Zdziarski of DSPAM where he talks about 
 Bayesian Noise Reduction using N-gram sized phrases as meta-tokens 
 which can then be fed into some spam/ham classifier.
 
 Cheers,
 
 --ted
 
 Chris Santerre wrote:
 
  Well... this kind of goes along with the idea of bayes chains. You can
  look into which pairs/trios of bayes tokens hit the most spam and
  least ham. Same goes for rules. There are some scripts around the
  community to give top-hitting rules, which might come in very useful.
   
  Once you find these magic pairs/trios, it should be relatively easy to
  meta them together. Although I'm not sure how you would do that on the
  bayes token side, as I think it kind of already is handled. It's public
  knowledge that I dislike bayes and don't use it :)
   
  It's a good idea, and probably the next best step to look at.
   
  HTH,
   
 
  Chris Santerre
  System Admin and SARE/URIBL Ninja
  http://www.rulesemporium.com/
  http://www.uribl.com/
 
  -Original Message-
  *From:* Ted Markowitz [mailto:[EMAIL PROTECTED]
  *Sent:* Wednesday, May 04, 2005 6:55 PM
  *To:* dev@spamassassin.apache.org
  *Subject:* [OT] Boosting and other potential research topics
 
  In this same vein of exploring concepts like the application of
  boosting algorithms or using meta rulesets to enhance the SA
  classification process, I've been looking for an interesting
  doctoral dissertation topic in the spam domain for some time now
  and was wondering if folks in the SA community had some ideas
  rolling around in the back of their minds that would lend
  themselves to doctoral-level research? Perhaps some area you'd
  really like to explore yourself, if only you had the time.:-)
 
  My program in CS is especially geared towards folks with a lot of
  hands-on, real world IT experience, and so topics with an applied
  research & development bent and a serious coding component are
  quite OK. Any ideas, interesting leads, or useful pointers would
  be much appreciated.
 
  Thanks muchly for your thoughts.
 
  --ted
 
  Sidney Markowitz wrote:
 
 Fred wrote:
   
 
 There was similar work being done in the past to identify rules to be
 grouped into new meta rules, this (w|c)ould achieve similar results.
 http://bugzilla.spamassassin.org/show_bug.cgi?id=1363
 
 
 
 I think I'm missing something here. Are you saying that automatically
 grouping rules into meta rules that have similar classification properties
 is equivalent to boosting? Or do you mean that it is another approach that
 also can improve performance of weak learners?
 
 In any case, you have given me an idea for the microarray gene expression
 problem, so thanks! :-)
 
  -- sidney
   
 
 
 -- 
 
 ===Ted Markowitz
 Chief Architect
 Cognosys LLC (http://www.cognosys.net)
 10 Hamilton Lane, Darien, CT 06820-2809, USA
 
 203-655-2400 (phone/fax) 203-984-6565 (cell)
 [EMAIL PROTECTED] (email)TJMarkowitz (AIM ID)
 === NOTICE: 
 This e-mail, including attachments, is intended solely
  for the person(s) or organization(s) shown in the message's
  header and may contain confidential and/or legally privileged
  information.  Any unauthorized disclosure, copying, or other
  unapproved use or retransmission of this information may be
  unlawful and is strictly prohibited.  If you are not the
  intended recipient, please delete this message immediately.
 ===
 
 
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCelV3MJF5cimLx9ARAmumAKCuT1EnKrDlYlZKLx3J+2YKoo+83gCfc+wb
ZthhE6q23GrXnfRFDFr0KBc=
=h+Tr
-END PGP SIGNATURE-



Re: registrar boundary inconsistencies

2005-05-05 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


John Gardiner Myers writes:
 Daniel Quinlan wrote:
 
  We can't just add them willy-nilly.
 
 Why not?  Treat them like .us -- do two queries.

we don't currently do that.  but that may be a good option, actually!
allow url_to_domain to return more than one datum, and query all of them.

In the case of .us, and these private registrars, return 2
domains, foo.eu.org and eu.org, or foo.state.us and
bar.foo.state.us.
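
A minimal sketch of the "return more than one domain and query all of them"
idea, in Python for brevity. It is not the real url_to_domain; the function
name domains_to_query and the one-entry PRIVATE_REGISTRIES set are
hypothetical, and the .us case is left out to keep it short.

```python
# Hypothetical list of private registries; eu.org is the one named in
# this thread.  A real list would be much longer.
PRIVATE_REGISTRIES = {"eu.org"}


def domains_to_query(hostname):
    """Return every domain worth looking up on a URI blacklist.

    For a host under a private registry, return both the registrant's
    domain (foo.eu.org) and the registry itself (eu.org).  Otherwise fall
    back to the last two labels, a simplification for this sketch.
    """
    labels = hostname.lower().rstrip(".").split(".")
    for index in range(len(labels) - 1):
        suffix = ".".join(labels[index + 1:])
        if suffix in PRIVATE_REGISTRIES:
            registrant = ".".join(labels[index:])
            return [registrant, suffix]
    return [".".join(labels[-2:])]
```

With this shape the caller simply loops over the returned list, so a
blacklist can list either the whole registry or a single registrant and
still get a hit.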

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD4DBQFCemjHMJF5cimLx9ARAsWsAJ91vAjk0Mn7J7M+TbFUKxn3b1bDOwCWKbuw
b/NvALdeCXRn600SsZ4trw==
=6YpK
-END PGP SIGNATURE-



Re: registrar boundary inconsistencies

2005-05-06 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Jeff Chan writes:
 On Thursday, May 5, 2005, 11:41:11 AM, Justin Mason wrote:
 That list does currently have some non-country code domains like:
 
 eu.org
 au.com
 br.com
 cn.com
 de.com
 de.net
 eu.com
 [...]
 
 Is SpamAssassin using that list?  If so, is it nearly
 sufficient to make this judgement about what level to check on?
 Can we improve it just by adding more private registries?

Yes.  Except in the bug I posted earlier, I was about the only
person who was +1 on that idea I think ;)

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCe6e1MJF5cimLx9ARAp6BAJ9qzqcvgQezfhpYALsvFf8OtMXdUQCfV7EX
HHKhpRmuBo9fTKf7MyR3WMA=
=pHAQ
-END PGP SIGNATURE-



Re: svn commit: r169038 - /spamassassin/trunk/lib/Mail/SpamAssassin/Logger.pm

2005-05-07 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


OK, can we discuss this?

what is the difference between info() and dbg() with this change?
is there a UI in spamd to enable info() output without dbg() messages?
is warn() still the only way to output if debug is off?
if so, what's the point of info() as it stands?

- --j.

[EMAIL PROTECTED] writes:
 Author: quinlan
 Date: Fri May  6 22:34:09 2005
 New Revision: 169038
 
 URL: http://svn.apache.org/viewcvs?rev=169038&view=rev
 Log:
 this isn't needed as far as I know
 
 Modified:
 spamassassin/trunk/lib/Mail/SpamAssassin/Logger.pm
 
 Modified: spamassassin/trunk/lib/Mail/SpamAssassin/Logger.pm
 URL: http://svn.apache.org/viewcvs/spamassassin/trunk/lib/Mail/SpamAssassin/Logger.pm?rev=169038&r1=169037&r2=169038&view=diff
 ==============================================================================
 --- spamassassin/trunk/lib/Mail/SpamAssassin/Logger.pm (original)
 +++ spamassassin/trunk/lib/Mail/SpamAssassin/Logger.pm Fri May  6 22:34:09 2005
 @@ -62,7 +62,7 @@
  our %LOG_SA;
  
  # defaults
 -$LOG_SA{level} = INFO;   # log info, warnings and errors
 +$LOG_SA{level} = DBG;# log info, warnings and errors
  $LOG_SA{facility} = {};  # no dbg facilities turned on
  
  # always log to stderr initially
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCfScgMJF5cimLx9ARAk1bAKC9ncBleZHtb8WqAz4TyfZVQwP8PQCeI0hm
qLWIuIEqGSu+6lx7ZK33SpI=
=CWpN
-END PGP SIGNATURE-



Re: Question about a proposed change

2005-05-10 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Sidney Markowitz writes:
 Does anyone have any objection to my checking in the following change? It
 makes the code in Dns.pm independent of the format of the key that is used
 to check the reply packets so that it will be easier to play with using
 different keys such as hashes by changing only code in DnsResolver.pm.

looks fine to me -- however there are other calls to that bgsend() method
elsewhere.  it may need to be made there too.

- --j.

   -- sidney
 
   --
 
 Index: lib/Mail/SpamAssassin/Dns.pm
 ===================================================================
 --- lib/Mail/SpamAssassin/Dns.pm (revision 169513)
 +++ lib/Mail/SpamAssassin/Dns.pm (working copy)
 @@ -145,7 +145,8 @@
 
    return $self->{resolver}->bgsend($host, $type, undef, sub {
      my $pkt = shift;
 -    $self->{dnsfinished}->{$pkt->header->id} = $pkt;
 +    my $id = shift;
 +    $self->{dnsfinished}->{$id} = $pkt;
    });
  }
 
 Index: lib/Mail/SpamAssassin/DnsResolver.pm
 ===================================================================
 --- lib/Mail/SpamAssassin/DnsResolver.pm (revision 169513)
 +++ lib/Mail/SpamAssassin/DnsResolver.pm (working copy)
 @@ -296,7 +296,7 @@
        return 0;
      }
 
 -    $cb->($packet);
 +    $cb->($packet, $id);
      return 1;
    }
    else {
 
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCgRkEMJF5cimLx9ARAntPAKC0hbCW0ecCEuuPRDHIFMfF5ZXsqgCdHnAq
Szqwx2foqxhRReIhSQZY9Qw=
=JYSG
-END PGP SIGNATURE-



test suite warnings

2005-05-10 Thread Justin Mason
... not failures (although they prob should be):

t/spf...[4635] warn: plugin: eval failed: Not an ARRAY
reference at ../blib/lib/Mail/SpamAssassin/Plugin/URIDNSBL.pm line 207.
t/spf...ok 2/8[4638] warn: plugin: eval failed: Not an
ARRAY reference at ../blib/lib/Mail/SpamAssassin/Plugin/URIDNSBL.pm line
207.
t/spf...ok 4/8[4641] warn: plugin: eval failed: Not an
ARRAY reference at ../blib/lib/Mail/SpamAssassin/Plugin/URIDNSBL.pm line
207.
t/spf...ok 6/8[4644] warn: plugin: eval failed: Not an
ARRAY reference at ../blib/lib/Mail/SpamAssassin/Plugin/URIDNSBL.pm line
207.


--j.


Re: svn commit: r169596 - /spamassassin/trunk/lib/Mail/SpamAssassin/Conf/Parser.pm

2005-05-11 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


[EMAIL PROTECTED] writes:
 -  if (eval { ("" =~ m{$re}); 1; }) {
 +  my $evalstr = '("" =~ ' . $re . '); 1;';
 +  if (eval $evalstr) {
      return 1;
    }

that's not going to work -- it has to be an interpolation of a var inside
the pattern, as in m{$re}, otherwise perl's regexp security checks will
not take place -- which is half the purpose of that function.  (and
the most important half, at that ;)

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCgkN2MJF5cimLx9ARAhPJAJ9ULuiZz5e5mkMhLPxqcOz2eBLNZQCeO3ub
QIqxRMc/XZsf8oj1n4SNA/s=
=qst+
-END PGP SIGNATURE-



Re: svn commit: r169596 - /spamassassin/trunk/lib/Mail/SpamAssassin/Conf/Parser.pm

2005-05-11 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Daniel Quinlan writes:
 [EMAIL PROTECTED] writes:
 
  -  if (eval { ("" =~ m{$re}); 1; }) {
  +  my $evalstr = '("" =~ ' . $re . '); 1;';
  +  if (eval $evalstr) {
       return 1;
     }
 
 [EMAIL PROTECTED] (Justin Mason) writes:
 
  that's not going to work -- it has to be an interpolation of a var inside
  the pattern, as in m{$re}, otherwise perl's regexp security checks will
  not take place -- which is half the purpose of that function.  (and
  the most important half, at that ;)
 
 Well, the problem was that $re isn't exactly how it would be later run,
 so the pattern being tested was something like:
 
   m{/foo/}
 
 We could do *both* evals.  The more significant part of my fix was
 changing the head test code to strip off the Subject =~ and
 [if-unset: foo] stuff.

We could do both, but is this the cause of the t/mimeheader.t
test failures?  in that case, it just needs to be fixed imo.

Are there tests in the test suite for the redirector usage case btw?

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCgk1JMJF5cimLx9ARAt48AKClilreDNG3dY2c8Dzhso5dj4wAPgCgiV1Z
0jRElI6ChGhPae4y3y75BdM=
=TBbB
-END PGP SIGNATURE-



Re: svn commit: r169596 - /spamassassin/trunk/lib/Mail/SpamAssassin/Conf/Parser.pm

2005-05-11 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Sidney Markowitz writes:
 Justin Mason wrote:
  Are there tests in the test suite for the redirector usage case btw?
 
 Excuse me if I'm misunderstanding the question in my fog-before-first-coffee
 of the morning...
 
 The redirector patterns are hardcoded in sub try_canon in uri.t so any
 change to them in 20_uri.cf has to be copied there.
 
 The redirector patterns in 20_uri.cf are tested by one case in uri_html.t
 which does not appear in uri_text.t. Once it is working, we should probably
 add a case for each pattern and have them be in both uri_html.t and uri_text.t.

I was just wondering if the usage of is_regexp_valid(), in the redirector
pattern case, is tested -- ie. the configuration parameter it uses.   Not
if the redirectors are tested themselves. If they are, though, that's even
  better. ;)

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCglubMJF5cimLx9ARAhE0AJ0TwQZMrNM4cuwKLUG7+GI31SXULQCeLus4
FmXYImOHBlP5HrCt7is87CY=
=5ufY
-END PGP SIGNATURE-



Re: redirector code broken

2005-05-11 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Daniel Quinlan writes:
 Daryl C W O'Shea [EMAIL PROTECTED] writes:
 
  So you'd rather not qr// the regexp (in Conf.pm) and do an eval in PMS
  instead?  This would allow for modifiers to be included but will be a
  little slower.
 
 I'm fine with qr// of the regexp *if* we can get at it cleanly
 (supporting any legal perl regexp syntax).  We don't manage to do that
 anywhere currently.
 
 It doesn't have to be an eval, but I think an eval does solve/avoid the
 problem.

It *HAS* to be what it was previously.  That's the only way to test
REs for security issues.

Code to convert a qr// into a regexp string that can be inserted
into this test, is what's needed.

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCgnHMMJF5cimLx9ARAipwAJ985vpf3Am1O1b8prRmXj6n8/EqQwCeOqS5
HofGcQnucCPB8sphwgBJAwY=
=2HWG
-END PGP SIGNATURE-



Re: t/dnsbl.t failing

2005-05-11 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Theo Van Dinter writes:
 On Thu, May 12, 2005 at 03:11:49PM +1200, Sidney Markowitz wrote:
  I've been seeing that with varying regularity depending on which network
  I'm on. It's pretty consistently bad on my home DSL, but works if I run
  it again immediately after. I guess then the queries are cached
  somewhere.
  
  Could it be that bugzilla just isn't a good machine to be using as the
  spamassassin.org nameserver as it's too slow to respond?
 
 I'm not too worried about latency to the nameservers, my dev box is the
 other authoritative NS for dnsbltest. ;)

fwiw, it fails every time here too.  I can't think what change it was
that did that, though...

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCgt1RMJF5cimLx9ARAhZ2AJ9INYKWHIGYGfowJc1HMO2RNc8TvwCfRjRP
OhOmrUQICAitf+0hj62Rju0=
=u9S/
-END PGP SIGNATURE-



Re: svn commit: r169749 - in /spamassassin/trunk: MANIFEST lib/Mail/SpamAssassin/Conf/Parser.pm t/regexp_valid.t

2005-05-11 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Daryl C. W. O'Shea writes:
  This barfs on   /test//
  The last version included Quinlan's eval to catch that.

thanks -- reinstated some of your code to fix that ;)

 It also barfs on   .* (anything without delimiters).  The regexp has 
 to have delimiters as they are required to do the actual test later.

ok -- that's not the case for other inputs to that method.  I'd
add a test for a delimiter being present in the specific caller that
requires them -- or an additional argument specifying whether
they should be required in the is_regexp_valid() test.
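
The "additional argument" variant might look roughly like the sketch below. It is Python, not the Perl in Conf/Parser.pm, and it only illustrates the delimiter handling: Python's re.compile stands in for the real check and does not reproduce Perl's eval-group security test. The helper name parse_rule_regexp and the require_delimiters parameter are made up for this example, and bracket-style delimiters such as m{...} are not handled.

```python
import re


def parse_rule_regexp(pattern):
    """Split a pattern into (body, had_delimiters); raise ValueError if malformed."""
    match = re.match(r"m([^\w\s])", pattern)
    if match:
        delim, rest = match.group(1), pattern[2:]
    elif pattern.startswith("/"):
        delim, rest = "/", pattern[1:]
    else:
        return pattern, False          # bare pattern such as .*
    end = rest.rfind(delim)
    if end < 0:
        raise ValueError("missing closing delimiter")
    body = rest[:end]                  # anything after the last delimiter is flags
    if delim in body.replace("\\" + delim, ""):
        raise ValueError("unescaped delimiter inside pattern")
    return body, True


def is_regexp_valid(pattern, require_delimiters=False):
    """Validity check with an opt-in requirement that delimiters be present."""
    try:
        body, delimited = parse_rule_regexp(pattern)
        if require_delimiters and not delimited:
            return False
        re.compile(body)
    except (ValueError, re.error):
        return False
    return True
```

Only the caller that later needs the delimiters would pass require_delimiters=True; every other caller keeps today's behaviour.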

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCguCZMJF5cimLx9ARAlJsAKCMSFUHycAp8cdp0Jwji57TunytjwCgop/7
hY66q0Yll+Od0CshiRzkC4Y=
=9/G8
-END PGP SIGNATURE-



keep an eye out for automated svn builds: they'll fail

2005-05-12 Thread Justin Mason
due to the addition of spamc.h as a file generated by configure, you need
to do "make distclean; perl Makefile.PL; make" to rebuild any automatic
checkouts of svn trunk; simply running "make" will not rebuild, as it doesn't
handle dependency info for the configure-generated files, and therefore
won't be able to find spamc.h.

This bit me on my mass-check checkout last night.

---j.


Buildbot slave update

2005-05-12 Thread Justin Mason
I'm moving the buildbot master from bugzilla onto our new Solaris zone at
spamassassin.zones.apache.org.  

This means that, if you're running a Buildbot slave, you'll need to change
its configuration to use that hostname instead of
bugzilla.spamassassin.org. The easiest way to do this is just to
recreate the slave dir.

Here's the command to do that:

  name=bugz-561 ; buildbot slave /home/buildbot/slaves/$name \
spamassassin.zones.apache.org:9989 \
$name $password

replace name with whatever each slave is called.

BTW, if you've forgotten the slave password, it's in the buildbot.tap
file in Python pickle format, on the line following passwordq#U.
You should see something like this:

usernameq!U
bugz-585-thrqU
passwordq#U
PASSWORDq$ubU   connectorq%NU

the q..U\n stuff seems to be delimiters in the pickle format, so you can
see the bit before that on the line is the password (in this case
PASSWORD, obv not the real one).

In fact, you can see the slave name there too! the username bit.
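
For the record, the bytes are pickle opcodes rather than delimiters: "U" is SHORT_BINSTRING (followed by a one-byte length) and "q" is BINPUT (followed by a one-byte memo index). A small sketch that walks the opcodes without unpickling anything, so it needs neither Twisted nor Buildbot installed, could look like this. The function name is invented, and it assumes the key string is immediately followed by its value in the stream, as in the dump above.

```python
import pickletools


def string_after(pickle_bytes, key):
    """Return the string constant that follows `key` in a pickle stream, or None."""
    previous = None
    for _opcode, arg, _pos in pickletools.genops(pickle_bytes):
        if not isinstance(arg, str):
            continue  # skip opcodes with no string argument (memo writes, marks)
        if previous == key:
            return arg
        previous = arg
    return None
```

Something like string_after(open("buildbot.tap", "rb").read(), "password") should then print the same value you would otherwise pick out by eye.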

I'm waiting for some info from Theo, but we should be able to set up a
CNAME in the sa.org zone to do this without user intervention in future...
the good news is, we now have a Solaris 10 slave as well.  Hmm, is that
good news?

--j.


Re: Thread Safe

2005-05-13 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Cliff Stanford writes:
 I remember finding something in Bugzilla about version 3 not being 
 thread-safe currently with a fix pending.  I now can't find it.
 
 Anyone know if the current version of Mail::SpamAssassin is thread-safe?

if I recall correctly, someone reported a bug in thread-safety, due to a
perl bug, and the fix was applied to svn trunk.

  --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFChOBMMJF5cimLx9ARAs4NAJ9xJogX8caAmIUoaYByNhGo1qDsRgCeLEgL
XbeEnK1OvvhCXq4f5IU7iOU=
=baS1
-END PGP SIGNATURE-



Re: make test oddity

2005-05-13 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Theo Van Dinter writes:
 t/regexp_valid..ok 12/24config: invalid regexp for rule test:
 foo(bar: Unmatched ( in regex; marked by -- HERE in m/foo( -- HERE bar/
 
 config: invalid regexp for rule test: foo(?{1})bar: Eval-group not allowed
 
 config: invalid regexp for rule test: /foo(?{1})bar/: Eval-group not allowed
 
 config: invalid regexp for rule test: m!foo(?{1})bar!: Eval-group not allowed
 
 config: invalid regexp for rule test: /test//: syntax error
 
 So there seem to be a bunch of errors, but the test returns ok ...?

yep, those are tests to ensure that the validity check *fails* ;)
I need to get it to shut up the warns though.

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFChPUPMJF5cimLx9ARAl4xAJ9wOWvGQ5JvUpVMYfZ954OYrPpJTgCgsCYh
WtNKdGpscFIEc+K6tPWGIdg=
=ymTU
-END PGP SIGNATURE-



Re: Shouldn't bad config options be dbg() not info() ?

2005-05-13 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Theo Van Dinter writes:
 Just upgraded to the latest trunk (170072) and now I get these:
 
 [14805] info: config: failed to parse line, skipping: use_auto_whitelist 0
 [14805] info: config: failed to parse line, skipping: use_pyzor 0
 [14805] info: config: failed to parse line, skipping: ok_languages__en
 
 Which used to work fine.   Bad config options used to be a dbg() since it's
 not a critical issue which the user needs to know about, but they ought to
 come out in a --lint run.

I think it makes sense for those to be info(); it's useful for them not to
be down in the dbg() noise level where they're almost never seen.

This is from spamd, right?  if so, they'll only appear in the log, and
won't appear in spamassassin output.  I think that's a reasonable way to
do it.

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFChQlCMJF5cimLx9ARAituAJ4shcF5VoZ+P/+WwbgOgXy3fY3N6gCfe9KZ
IQCKUe8qJ+RcA1tGd1TsqZM=
=wfl5
-END PGP SIGNATURE-



Buildbot update

2005-05-13 Thread Justin Mason
OK, the hostname has changed again -- please use
buildbot.SpamAssassin.org.   if you've already changed them to
SpamAssassin.zones.apache.org, you can leave it at that, since
it's the same host anyway. ;)

--j.


translations and PO files

2005-05-13 Thread Justin Mason
Quick question for anyone who may know -- is a .po (gettext-style)
file more usable for translators out there?

In particular, Ubuntu's Rosetta -- https://launchpad.ubuntu.com/rosetta/
-- uses this format.  That looks cool, but I'm curious about other
tools that can be used to edit .po files as well, and what
the people who write translations think of them.

--j.


Re: How does one disable logging!?!

2005-05-14 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Daniel Quinlan writes:
 Theo Van Dinter [EMAIL PROTECTED] writes:
  Since we've already overloaded the add_facilities() function to also
  change level w/ info, we could change it so that one can override
  for all the levels.  That would make it easier for people to just keep
  using -D, but it does overload the facility API.
 
 If I supported lowering/raising more so, I'd add a new API.

I'd say the easiest way to do it would be to add a new -v (verbose)
switch -- either that or turn info into something that's logged by default
unless the user uses a new -q (quiet) switch. Those are the generic
UIs generally used in UNIX utilities for this.

I'm not fond of the -Dinfo special case.  It reminds me too much of
the frankly bizarre -Drbl=-255 incantation.

I know you don't like adding new switches, but adding a *special case* to
an existing switch, which takes an entirely different action, is in my
opinion even *more* confusing for users.


  It seems like the breakdown ought to be something like:
  
  dbg      messages only seen with -D, nothing major
  info     messages only seen in logs (not STDERR), nothing major
  warning  messages logged and seen via STDERR, but processing continues (warn)
  error    messages logged and seen via STDERR, processing stops (die)
 
 I can change the code to set different levels per facility.  That's not
 too hard.  I'll try to check in a fix this weekend.

What does this mean?  I don't understand what different levels per
facility refers to.


 The other possible addition would be to add a notice() level for most
 log-centric messages in the spamd and preforking code.

I like Theo's definition table above, a lot.  Could we add that to the POD
as a guideline?  ...And where would notice fit in?
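
As a starting point for such a guideline, Theo's table can be written down as a tiny routing function. This is a Python sketch only, not Logger.pm: the names are invented, and sending dbg output to STDERR when -D is given is an assumption, since the table only says when dbg is seen, not where. A notice level is deliberately absent because its place is still an open question.

```python
import enum


class Level(enum.IntEnum):
    ERROR = 0
    WARNING = 1
    INFO = 2
    DBG = 3


def route(level, debug_enabled=False):
    """Return (destinations, stops_processing) for a message at `level`."""
    if level is Level.ERROR:
        return {"log", "stderr"}, True    # logged and on STDERR, then die
    if level is Level.WARNING:
        return {"log", "stderr"}, False   # logged and on STDERR, carry on
    if level is Level.INFO:
        return {"log"}, False             # logs only, never STDERR
    # dbg: only visible with -D; the STDERR destination is an assumption
    return ({"stderr"} if debug_enabled else set()), False
```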

  FWIW: I've already had users complaining to me about the info output.
  They'll run sa-learn, see some info output saying that a config line
  can't be parsed, and think sa-learn failed.  So the new verboseness is
  going to cause issues.
 
 Well, they are errors!  Most Unix programs would show an error.

Yes, I think it's acceptable for a command run manually.   We need
only be silent for the filtering case (ie the spamassassin cmd).

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFChnuWMJF5cimLx9ARAvw+AJoC95JCF94GX+HPirLa3hneIr4i/gCeLio3
cnbFskqZsOo3zAnCTgn6BdE=
=C6Xi
-END PGP SIGNATURE-



Re: TELL Spamd Protocol Command

2005-05-17 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


+1 looks good to me!

- --j.

Michael Parker writes:
 Howdy, as I've written previously, the LEARN and COLLABREPORT commands
 are being replaced by a single TELL command.  Here is the latest diff
 that implements the TELL command.  I'm pretty happy with everything,
 even the C portion, which was broken but I've just discovered the
 problem so is now ready to go.  It has the added benefit that it also
 fixes some wonkiness on the Solaris platform with the report/revoke
 spamc stuffs (probably bad C code).
 
 Feel free to take a look, I probably want to clean up the Client.pm
 interface a bit, but other than that I believe it is ready to go.  If I
 don't hear anything in the next day or so I'll go ahead and commit.
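
For readers skimming the diff that follows, the header combinations it sends can be summarised in a short Python sketch. This is not the Client.pm code: the action names, the function and the CRLF line ending are assumptions, and the protocol version string is passed in by the caller because its value does not appear in the diff.

```python
EOL = "\r\n"  # assumed line ending

# Headers per action, as read from the quoted Client.pm diff.
TELL_HEADERS = {
    "learn_spam": ["Message-class: spam", "Set: local"],
    "learn_ham": ["Message-class: ham", "Set: local"],
    "forget": ["Remove: local"],
    "report": ["Message-class: spam", "Set: local,remote"],
    "revoke": ["Message-class: ham", "Set: local", "Remove: remote"],
}


def tell_request(action, message, protoversion, username=None):
    """Build the text of a TELL request; raises KeyError for an unknown action."""
    headers = TELL_HEADERS[action]
    lines = ["TELL " + protoversion,
             "Content-length: %d" % len(message + EOL)]
    if username:
        lines.append("User: " + username)
    lines.extend(headers)
    # Header lines, a blank line, then the message followed by a final EOL.
    return EOL.join(lines) + EOL + EOL + message + EOL
```

Keeping the table separate from the request builder makes it easy to see at a glance that revoke is the only action that both sets and removes.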
 
 Michael
 
 Index: lib/Mail/SpamAssassin/Client.pm
 ===================================================================
 --- lib/Mail/SpamAssassin/Client.pm (revision 170485)
 +++ lib/Mail/SpamAssassin/Client.pm (working copy)
 @@ -219,10 +219,27 @@
  
    my $msgsize = length($msg.$EOL);
  
 -  print $remote "LEARN $PROTOVERSION$EOL";
 +  print $remote "TELL $PROTOVERSION$EOL";
    print $remote "Content-length: $msgsize$EOL";
    print $remote "User: $self->{username}$EOL" if ($self->{username});
 -  print $remote "Learn-type: $learntype$EOL";
 +
 +  if ($learntype == 0) {
 +    print $remote "Message-class: spam$EOL";
 +    print $remote "Set: local$EOL";
 +  }
 +  elsif ($learntype == 1) {
 +    print $remote "Message-class: ham$EOL";
 +    print $remote "Set: local$EOL";
 +  }
 +  elsif ($learntype == 2) {
 +    print $remote "Remove: local$EOL";
 +  }
 +  else { # bad learntype
 +    $self->{resp_code} = 00;
 +    $self->{resp_msg} = 'do not know';
 +    return undef;
 +  }
 +
    print $remote $EOL;
    print $remote $msg;
    print $remote $EOL;
 @@ -236,17 +253,19 @@
  
    return undef unless ($resp_code == 0);
  
 -  my $learned_p = 0;
    my $found_blank_line_p = 0;
  
 +  my $did_set;
 +  my $did_remove;
 +
    while (!$found_blank_line_p) {
      $line = <$remote>;
  
 -    if ($line =~ /Learned: yes/i) {
 -      $learned_p = 1;
 +    if ($line =~ /DidSet: (.*)/i) {
 +      $did_set = $1;
      }
 -    elsif ($line =~ /Learned: no/i) {
 -      $learned_p = 0;
 +    elsif ($line =~ /DidRemove: (.*)/i) {
 +      $did_remove = $1;
      }
      elsif ($line =~ /$EOL/) {
        $found_blank_line_p = 1;
 @@ -255,7 +274,12 @@
  
    close $remote;
  
 -  return $learned_p;
 +  if ($learntype == 0 || $learntype == 1) {
 +    return $did_set =~ /local/;
 +  }
 +  else { #safe since we've already checked the $learntype values
 +    return $did_remove =~ /local/;
 +  }
  }
  
  =head2 report
 @@ -270,7 +294,49 @@
  sub report {
    my ($self, $msg) = @_;
  
 -  return $self->_report_or_revoke($msg, 0);
 +  $self->_clear_errors();
 +
 +  my $remote = $self->_create_connection();
 +
 +  return undef unless ($remote);
 +
 +  my $msgsize = length($msg.$EOL);
 +
 +  print $remote "TELL $PROTOVERSION$EOL";
 +  print $remote "Content-length: $msgsize$EOL";
 +  print $remote "User: $self->{username}$EOL" if ($self->{username});
 +  print $remote "Message-class: spam$EOL";
 +  print $remote "Set: local,remote$EOL";
 +  print $remote $EOL;
 +  print $remote $msg;
 +  print $remote $EOL;
 +
 +  my $line = <$remote>;
 +
 +  my ($version, $resp_code, $resp_msg) = $self->_parse_response_line($line);
 +
 +  $self->{resp_code} = $resp_code;
 +  $self->{resp_msg} = $resp_msg;
 +
 +  return undef unless ($resp_code == 0);
 +
 +  my $reported_p = 0;
 +  my $found_blank_line_p = 0;
 +
 +  while (!$reported_p && !$found_blank_line_p) {
 +    $line = <$remote>;
 +
 +    if ($line =~ /DidSet:\s+.*remote/i) {
 +      $reported_p = 1;
 +    }
 +    elsif ($line =~ /^$EOL$/) {
 +      $found_blank_line_p = 1;
 +    }
 +  }
 +
 +  close $remote;
 +
 +  return $reported_p;
  }
  
  =head2 revoke
 @@ -285,7 +351,50 @@
  sub revoke {
    my ($self, $msg) = @_;
  
 -  return $self->_report_or_revoke($msg, 1);
 +  $self->_clear_errors();
 +
 +  my $remote = $self->_create_connection();
 +
 +  return undef unless ($remote);
 +
 +  my $msgsize = length($msg.$EOL);
 +
 +  print $remote "TELL $PROTOVERSION$EOL";
 +  print $remote "Content-length: $msgsize$EOL";
 +  print $remote "User: $self->{username}$EOL" if ($self->{username});
 +  print $remote "Message-class: ham$EOL";
 +  print $remote "Set: local$EOL";
 +  print $remote "Remove: remote$EOL";
 +  print $remote $EOL;
 +  print $remote $msg;
 +  print $remote $EOL;
 +
 +  my $line = <$remote>;
 +
 +  my ($version, $resp_code, $resp_msg) = 

Re: svn commit: r170509 - in /spamassassin/trunk/t: spamc_optL.t whitelist_addrs.t

2005-05-17 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Michael Parker writes:
 I think this is too broad, AWL should work with multiple different db
 modules, SDBM_File is in the list of usable ones.  I think the better
 fix would be to figure out why it isn't picking that up and using it.
 
 -1 for this change, it's unnecessary.

fair enough -- I was under the impression AWL just used DB_File
and no other.

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCikpKMJF5cimLx9ARApYyAJ0ST2fLRg1IKJNZ5HlN2X+Fy/NghACfZcJf
rKKqBXdXDnxBEK5xd/E9Svg=
=Wef+
-END PGP SIGNATURE-



Re: spamc Config File Patch

2005-05-18 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Michael Parker writes:
 Here is a quick benchmark to illustrate my point, spamc.old was the revision
 immediately before the conf patch, spamc.new is the conf patch revision:
 
 Benchmark: timing 1 iterations of spamc.new, spamc.old...
  spamc.new: 97 wallclock secs ( 0.61 usr  2.70 sys + 14.37 cusr 26.04
  csys = 43.72 CPU) @ 3021.15/s (n=1)
  spamc.old: 39 wallclock secs ( 0.48 usr  2.36 sys + 13.73 cusr 23.52
  csys = 40.09 CPU) @ 3521.13/s (n=1)

ouch.  that *is* slow.

OK, we have the following options imo:

- - leave it that it will look for a default config file in one location
  only, with one stat() operation.  This is my preferred option.

- - same as above, with a possible $SPAMC_CONFIG env variable allowed
  to set the location.  still only one stat() op, so it's fast,
  but it does allow people with odd installs to specify the file
  location, if that's a desirable feature.  in my opinion, this isn't
  really required, since they can just use -F.

- - remove the code.  I'm not too happy with this, since it's a long-desired
  feature and fixes another reported bug.

- - only check for a config file if the -F command line switch is present;
  in other words, no default config file.  that doesn't fix the reported
  bug, either.
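
The first two options boil down to "pick exactly one candidate path, then stat it once". A minimal Python sketch of that lookup order follows; spamc itself is C, the default path shown is hypothetical, and treating a missing default as silent while a missing explicit file is an error is this sketch's own choice, not something the thread settled.

```python
import os

DEFAULT_CONFIG = "/etc/mail/spamassassin/spamc.conf"  # hypothetical default


def find_config(cli_path=None, environ=os.environ, default=DEFAULT_CONFIG):
    """Choose one candidate path and stat it once; never search a directory list."""
    explicit = cli_path or environ.get("SPAMC_CONFIG")
    path = explicit or default
    try:
        os.stat(path)                  # the single stat() call
    except OSError:
        if explicit:
            raise                      # the user asked for this file by name
        return None                    # no default config: stay quiet
    return path
```

Whichever option wins, the cost per spamc invocation stays at one stat() call, which is the property the benchmark above argues for.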

ps: yes, that config file cannot be found warning has to go; there's a
bug open about that already.

pps: would probably have been better to discuss this on the bugzilla entry
btw.

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCi7Z8MJF5cimLx9ARAmgfAJ97yPLxfsj0KEk8LferNkOK5FlX0gCfVA1y
OX9hrJLne5ayp4G7W9hmmIc=
=tfq8
-END PGP SIGNATURE-



Re: svn commit: r170745 - /spamassassin/trunk/t/regexp_valid.t

2005-05-18 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Michael Parker writes:
  +my $fh = IO::File-new_tmpfile();
  +open(LOGERR, .fileno($fh)) || die Cannot create LOGERR temp file;

should be  I think.

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCi9+UMJF5cimLx9ARAuGcAJwJSKm8YB5JtKZOoegQR/lLRUSSuQCgtY4+
qDPBD9HHC+nRWqDVqCMGtZs=
=jMi0
-END PGP SIGNATURE-



Logger eating warnings

2005-05-20 Thread Justin Mason
I think it's the line that looks for a prefix in Logger.pm.
e.g.

warn foo: bar

will probably get through, but

warn bar

probably will not.  untested, just a hunch,
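
To make the hunch concrete, here is a Python sketch of how a prefix-matching step can swallow a message. The regexp and both function names are assumptions for illustration and are not copied from Logger.pm.

```python
import re

# Assumed shape of the prefix check: "facility: message".
PREFIX = re.compile(r"^([a-z0-9_-]+):\s*(.*)", re.S)


def parse_strict(message):
    """Return (facility, text), or None when no prefix is found (message lost)."""
    match = PREFIX.match(message)
    return (match.group(1), match.group(2)) if match else None


def parse_lenient(message, default_facility="generic"):
    """Same parse, but unprefixed messages fall back to a default facility."""
    match = PREFIX.match(message)
    if match:
        return match.group(1), match.group(2)
    return default_facility, message
```

If the real code behaves like parse_strict, falling back to a default facility as in parse_lenient would let unprefixed warnings through.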

--j.


Re: svn commit: r171210 - in /spamassassin/trunk/lib/Mail/SpamAssassin: Dns.pm DnsResolver.pm Plugin/URIDNSBL.pm

2005-05-21 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


[EMAIL PROTECTED] writes:
 @@ -145,11 +145,20 @@
    return if $self->{no_resolver};
  
    $self->{sock}->close() if $self->{sock};
 -  $self->{sock} = IO::Socket::INET->new (
 -    Proto => 'udp',
 -    Type => SOCK_DGRAM,
 -  );
 +  my $sock;
 +  # find next available unprivileged port (1024 - 65535)
 +  # starting at a random value to spread out use of ports
 +  my $port_offset = int(rand(64511));  # 65535 - 1024
 +  for (my $i = 0; (!$sock && ($i<64511)); $i++) {
 +    my $lport = 1024 + (($port_offset + $i) % 64511);
 +    $sock = IO::Socket::INET->new (
 +       Proto => 'udp',
 +       LocalPort => $lport,
 +       Type => SOCK_DGRAM,
 +       );
 +  }
  
 +  $self->{sock} = $sock;
    $self->{dest} = sockaddr_in($self->{res}->{port},
              inet_aton($self->{res}->{nameservers}[0]));

as a matter of interest -- what will happen here if the port *is*
already in use?
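
For comparison, the same search loop in Python, where a failed bind raises OSError and the loop simply moves on to the next port. In the Perl above, IO::Socket::INET->new returns undef on failure, so the !$sock test presumably plays the same role. The function name and the loopback default are assumptions of this sketch.

```python
import random
import socket

PORT_BASE = 1024
PORT_SPAN = 64511  # 65535 - 1024, as in the patch


def open_udp_socket(start_offset=None, host="127.0.0.1"):
    """Bind a UDP socket to the first free port at or after a random offset."""
    if start_offset is None:
        start_offset = random.randrange(PORT_SPAN)
    for i in range(PORT_SPAN):
        port = PORT_BASE + (start_offset + i) % PORT_SPAN
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            sock.bind((host, port))
            return sock
        except OSError:
            sock.close()               # port busy or not permitted: try the next
    return None                        # every port in the range was unavailable
```

The only way to end up without a socket is for every port in the range to fail, which the caller would still need to handle.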

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCj9uuMJF5cimLx9ARAvGfAKCAeR95CuYy3pw1Y+ixXWiVWt1doACbB07/
0FngMLA28wGdWVro3INa9kc=
=FUXR
-END PGP SIGNATURE-



Re: Buildbot question

2005-05-23 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Sidney Markowitz writes:
 Does anyone know if we should be able to use the latest version of
 Buildbot, 0.6.5 with buildbot.spamassassin.org? I know that I could just
 try it, but I don't want to spend time trying to get it to work only to
 find that the master has to be upgraded first.

afaik you can.   if it needs to be upgraded, c'est la vie, I'll
do that.

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCki7nMJF5cimLx9ARAncoAJ9KnIXxAjCFAwXkfTpWBupnSWSa3QCgnHB8
LWYZa9sHq0pwGfOSgmB1U2o=
=q/Ne
-END PGP SIGNATURE-



Re: Buildbot question

2005-05-23 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Sidney Markowitz writes:
 Justin Mason said:
  afaik you can.
 
 Ok, I'll try it. First I'll confirm that I can get the 0.6.2 that I have
 installed running again, as I've had it down for a while.
 
 Another question -- Can we have a way of enabling network test for the
 buildbot runs? I can see how it should be an option, as some people might
 not want to load their network every time they run an automated test, but
 I don't mind, and the network stuff should be tested too.

yep, should be possible -- create a t/config file that enables it in
the buildbot slave's checkout ;)

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCkjsRMJF5cimLx9ARAqILAJ9fCuwHPKgTq1xvLCYyf9e4DrupkQCdGEPI
oT4mqCj06elxViDZEDz/g5E=
=FVX1
-END PGP SIGNATURE-



Re: Buildbot question

2005-05-23 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Sidney Markowitz writes:
 I got past the setuid problem but now I don't see my slave appearing in
 http://buildbot.spamassassin.org:8010/
 
 I'm trying to run trunk-sidney-cygwin
 
 It got the message from master: attached and then said it was doing a
 keepalive, but nothing else.

ok, I had them commented to avoid filling up the waterfall page.  should
be showing up now soon... might be worth restarting it.

t-sidney-fedora3 at least is showing up.

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCklYtMJF5cimLx9ARAhR3AJ9oeqqDmX1Mu5tdD+QEgs21Al96qQCfQGii
HvEdnhkHR2i1eF+pVHaFBys=
=o3Pf
-END PGP SIGNATURE-



Re: Additional SPAM recognition method

2005-05-23 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Theo Van Dinter writes:
 On Mon, May 23, 2005 at 06:45:12PM -0500, [EMAIL PROTECTED] wrote:
  Here's the algorithm:
  
1  Decode any URL-encoding in the message
2  Un-MIME the message
 
 Wrong order?
 
3  Scan all parts of the message for URLs and email addresses (this can be
  links, IMG tags, mailto:'s, or even just something that looks like a web
  address or email address).  Do NOT scan the headers.
 
 get_uri_list().
 
4  For each address, resolve the hostname to an IP and then look up that 
  IP
  in your favorite DNS RBL - I use sbl-xbl.spamhaus.org as it caches the 
  most,
  but you can also add bl.spamcop.net and relays.ordb.net
 
 SURBL?

A bit more like URIBL_SBL, although in URIBL_SBL, we use the NS of the
domains (because they're harder to switch to new servers in the spammer
shell-game style).

We did actually have an A of domain name test during 3.0.0 development,
I think, but dropped it for various reasons:

- - if a spammer were to use a hostname like
  jm_at_jmason_dot_org.spamdomain.com, they get a free backchannel to
  verify that I was (a) using SpamAssassin to filter my mail, and (b)
  that that address is valid.  So blindly resolving the full hostname was
  judged as unsafe.   However, replacing hostname portions with another
  token is not useful: assuming that jm_at_jmason_dot_org.spamdomain.com
  will have the same A as spamdomain.com or www.spamdomain.com is
  naive and easily evaded.

- - more importantly, the results weren't very good. ;)   Not as good as
  URIBL_SBL and the SURBL rules, at least.  iirc, the hits mapped very
  closely to URIBL_SBL, esp since Spamhaus explicitly list nameservers of
  spammed domains.

The details should be on bugzilla somewhere.
Thanks anyway though!

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCkm5RMJF5cimLx9ARAgdbAJ9ji51PEG0MDlZc3XkG04JepiP6tQCdHhq6
xzicut+LZT7YmjyaZmQmCdg=
=U4oZ
-END PGP SIGNATURE-



[Buildbot-devel] buildbot-0.6.6 released (fwd)

2005-05-23 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


- --- Forwarded Message
 Date:Mon, 23 May 2005 17:14:57 -0700
 From:Brian Warner [EMAIL PROTECTED]
 To:  [EMAIL PROTECTED]
 Subject: [Buildbot-devel] buildbot-0.6.6 released
 
 I've just released buildbot-0.6.6, it's available on the sf.net download
 site: http://buildbot.sourceforge.net/ as usual. The release is signed with
 my GPG key (0x1514a7bd), and the tarball checksums are as follows:
 
  md5: a47f01a169fc3bd8ae25e2c487e241d2  buildbot-0.6.6.tar.gz
 sha1: 3c46129ea073de31d75cb7bdb974d69eb0748d09  buildbot-0.6.6.tar.gz
 
 This release fixes a number of small but annoying bugs in several
 /usr/bin/buildbot subcommands. It also fixes some problems when upgrading
 from buildbot-0.6.4 or earlier. Complete release notes are attached below.
 
 I've got some large-scale changes planned for the way the buildmaster
 configures Builders, which I will start implementing in CVS now that this
 release is out. I'll post a separate message describing my plans.
 
 Have an (anti-)entomological day,
  -Brian
 
 * Release 0.6.6 (23 May 2005)
 
 ** bugs fixed
 
 The 'sendchange', 'stop', and 'sighup' subcommands were broken, simple bugs
 that were not caught by the test suite. Sorry.
 
 The 'buildbot master' command now uses raw strings to create .tac files
 that will still function under windows (since we must put directory names
 that contain backslashes into that file).
 
 The keep-on-disk behavior added in 0.6.5 included the ability to upgrade old
 in-pickle LogFile instances. This upgrade function was not added to the
 HTMLLogFile class, so an exception would be raised when attempting to load or
 display any build with one of these logs (which are normally used only for
 showing build exceptions). This has been fixed.
 
 Several unnecessary imports were removed, so the Buildbot should function
 normally with just Twisted-2.0.0's Core module installed. (of course you
 will need TwistedWeb, TwistedWords, and/or TwistedMail if you use status
 targets that require them). The test suite should skip all tests that cannot
 be run because of missing Twisted modules.
 
 The master/slave's basedir is now prepended to sys.path before starting the
 daemon. This used to happen implicitly (as a result of twistd's setup
 preamble), but 0.6.5 internalized the invocation of twistd and did not copy
 this behavior. This change restores the ability to access private.py-style
 modules in the basedir from the master.cfg file with a simple import
 private statement. Thanks to Thomas Vander Stichele for the catch.
 
 
 --- End of Forwarded Message
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCkntCMJF5cimLx9ARAr4/AJ9mFGjbFlMUt7GEYN2rokWU6QEkjACfZiQC
LROsC6PVpDJn5hYsW3aLnpg=
=MBed
-END PGP SIGNATURE-



spamc/spamd tests on Solaris all failing

2005-05-24 Thread Justin Mason
looks like it's related to a recent change on the spamc config
front.

http://buildbot.spamassassin.org:8010/t-solaris-10/builds/6/test/0

failures

http://buildbot.spamassassin.org:8010/t-solaris-10/builds/6/test_3/0

passes, from make disttest when t/data/spamc_blank.cf was NON-existent.

--j.


stability

2005-05-26 Thread Justin Mason
has returned to the trunk. ;)  All buildbots that are online -- 

t-red-hat-73    t-debian-stable    t-585thr    t-solaris-10
t-sol10-561     t-sidney-fedora3

are now reporting successful builds, make test and make disttest.

http://buildbot.spamassassin.org:8010/

--j.


Re: Renaming DnsResolver::search to send

2005-05-31 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Dale Luck writes:
 SPF has a mode whereby a lookup could return data that requires *another*
 DNS lookup, e.g. the include:foo.com directive.
 
 Currently we have to use Mail::SPF::Query as a single method call, and
 allow it to run all necessary queries by itself and simply wait for it to
 return.  A more efficient way would be to have a polling mode, so we can
 process our own code while waiting for results, and poll for results
 intermittently -- similar to how we do DNSBL and URIBL lookups.
 
 This is exactly the stuff I rewrote inside the check function. It's
 the exact opposite of what's needed to be done if you can use any of
 these results to 'block' without needing any other scores.
 
 I run all the rbl lookups and then wait for the results to come back.
 If we get a block/hit we return without running any more rules. This
 gave us a large performance boost in a server environment since we
 can eliminate rules that are essentially ignored early on. There are
 better things to do with the cpu.

well that's interesting!  So you're saying you let the checks run with a
longer total run time, in exchange for being able to early-exit based on
DNSBL lookup results to save CPU time?

I think we've always been heading in the opposite direction -- minimum
total runtime.

It's a good argument to make access to parts of the check() function more
fine-grained for users of the Mail::SpamAssassin::PerMsgStatus class.
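
As a rough illustration of the trade-off above (this is not
SpamAssassin's actual code, and Python is used only as a stand-in), the
sketch below waits for the blocking lookups first and skips the scored
rules entirely on a hit. The names check_message, blocklist_lookups and
rules are hypothetical.

```python
from typing import Callable, Iterable, Tuple

Lookup = Callable[[str], bool]   # hypothetical: True means "listed"
Rule = Callable[[str], float]    # hypothetical: returns a score contribution


def check_message(msg: str,
                  blocklist_lookups: Iterable[Lookup],
                  rules: Iterable[Rule]) -> Tuple[str, float, int]:
    """Return (verdict, score, number_of_rules_run)."""
    # Wait for every blocking lookup up front; any hit ends the check
    # before a single scored rule has used CPU time.
    for lookup in blocklist_lookups:
        if lookup(msg):
            return ("blocked", 0.0, 0)

    # No hit: fall through to the ordinary scored rules.
    score = 0.0
    rules_run = 0
    for rule in rules:
        score += rule(msg)
        rules_run += 1
    return ("scored", score, rules_run)
```

The cost is that a message with no hit pays the full lookup latency
before any rule runs, which is the longer total run time mentioned
above.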

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCnNo8MJF5cimLx9ARAvtbAJ9lke3xMD7nRJ5pdikjex8YpCMmaACfU+mX
yh8F1dZnPFGstPEYBP0LmN4=
=EHrD
-END PGP SIGNATURE-



Re: svn commit: r179354 - /spamassassin/trunk/sa-learn.raw /spamassassin/trunk/spamassassin.raw

2005-06-01 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Theo Van Dinter writes:
 On Wed, Jun 01, 2005 at 04:30:31AM -, [EMAIL PROTECTED] wrote:
  +if (Mail::SpamAssassin::Util::am_running_on_windows()) {
  +  binmode(STDIN);   # bug 4363
  +  binmode(STDOUT);
  +}
 
 Since this is the default on UNIX anyway, should we bother with the if() ?

good point!

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCnV4QMJF5cimLx9ARAi6NAJ4l3DB0MDurKVpYcZAq15YE6CfYDwCZAc92
yOifliG55ekndXLWN3sVfn4=
=i/ep
-END PGP SIGNATURE-



Re: 3.1.0 prerelease?

2005-06-01 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


I don't think this is random.  I *do* see it in your log though ;)
naturally I'm not seeing that on the other build slaves (or here).

why is it picking up debugs in t/spamd_allow_user_rules.t in the first
place?  that test does not use -D.

also, the server pid line is what it's waiting for, and as you
can see it never gets to see it.  could the machine be very
loaded?  or is there a bug in the new Logger code?

I've reordered the server pid line to see if that improves
matters.

- --j.

Michael Parker writes:
 Also, I'm randomly getting this sort of error in the spamc/spamd tests:
 
 spamd start failed: log: [29808] dbg: logger: adding facilities: all
 [29808] dbg: logger: logging level is DBG
 [29808] dbg: spamd: creating INET socket:
 [29808] dbg: spamd: _Listen: 128
 [29808] dbg: spamd: _LocalAddr: 127.0.0.1
 [29808] dbg: spamd: _LocalPort: 62572
 [29808] dbg: spamd: _Proto: 6
 [29808] dbg: spamd: _ReuseAddr: 1
 [29808] dbg: spamd: _Type: 1
 [29808] dbg: logger: adding facilities: all
 [29808] dbg: logger: logging level is DBG
 lots more debug and then this:
 [29924] info: spamd: server started on port 62688/tcp (running version
 3.1.0-r179480)
 [29924] info: spamd: server successfully spawned child process, pid 29928
 [29924] dbg: prefork: child 29928: entering state 0
 [29928] dbg: prefork: sysread(7) not ready, wait max 300 secs
 [29924] info: spamd: server successfully spawned child process, pid 29929
 [29924] dbg: prefork: child 29929: entering state 0
 [29929] dbg: prefork: sysread(8) not ready, wait max 300 secs
 [29924] dbg: prefork: child 29928: entering state 1
 [29924] dbg: prefork: child reports idle
 [29924] dbg: prefork: child 29929: entering state 1
 [29924] dbg: prefork: child reports idle
 [29924] info: prefork: child states: II
 
 This leaves a few perl/spamd processes around and other tests fail
 after that.
 
 Michael
 -BEGIN PGP SIGNATURE-
 Version: GnuPG v1.2.4 (Darwin)
 Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org
 
 iD8DBQFCnpLAG4km+uS4gOIRArHbAJ0Wfa8xT+JB85QTHOJT38rDwYCsEQCgqE3p
 sslM8YQ0JTJcIFPyl5YGUVk=VSzT
 -END PGP SIGNATURE-
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCnpZvMJF5cimLx9ARAo0uAKC2Lmce3s7TmqYsl9oArS5M90L92QCghICw
yOHQqiqlngxypxJtywLgIXM=
=ywGE
-END PGP SIGNATURE-



Summer of Code

2005-06-02 Thread Justin Mason
can anyone think of good SpamAssassin projects for this?
I'm a bit stumped ;)

--j.


Re: svn commit: r179467 - /spamassassin/trunk/lib/Mail/SpamAssassin.pm

2005-06-06 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Daniel Quinlan writes:
 [EMAIL PROTECTED] writes:
 
 if (open (IN, "<".$path)) {
   $txt .= join ('', <IN>);
 
 This seems like memory bloat.  Why don't we just stream this into the
 configuration parser?

that'd be nice, but I don't think we have an API to do that on Conf.
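
To make the difference concrete, here is a small sketch of slurping
versus streaming. It is written in Python as a stand-in, not against
the real Conf API; parse_line is a hypothetical per-line callback of
the kind a streaming parser would need to expose.

```python
from typing import Callable, IO, Iterable, Iterator


def slurp(handles: Iterable[IO[str]]) -> str:
    """Read every file fully and join the text: simple, but the whole
    configuration sits in memory at once."""
    return "".join(h.read() for h in handles)


def stream_lines(handles: Iterable[IO[str]]) -> Iterator[str]:
    """Yield one line at a time, so only the current line is held."""
    for h in handles:
        for line in h:
            yield line


def parse_streaming(handles: Iterable[IO[str]],
                    parse_line: Callable[[str], None]) -> int:
    """Feed each line to the (hypothetical) parser callback and return
    the number of lines processed."""
    count = 0
    for line in stream_lines(handles):
        parse_line(line)
        count += 1
    return count
```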

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCpOm0MJF5cimLx9ARAg9ZAJ9bAi/YSvpataYaso/563SXzVeOuwCfa8cl
Q3pdgfJUuBQ+tSof2/THFbs=
=9cXt
-END PGP SIGNATURE-



Re: Returned post for announce@spamassassin.apache.org

2005-06-13 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Duncan Findlay writes:
 On Mon, Jun 13, 2005 at 10:09:16AM -0400, Theo Van Dinter wrote:
  On Mon, Jun 13, 2005 at 04:08:00AM -, [EMAIL PROTECTED] wrote:
   Hi! This is the ezmlm program. I'm managing the
   [EMAIL PROTECTED] mailing list.
   
   I'm sorry, the list moderators for the announce list
   have failed to act on your post. Thus, I'm returning it to you.
   If you feel that this is in error, please repost the message
   or contact a list moderator directly.
   
   --- Enclosed, please find the message you sent.
 
 Who moderates this or any of the spamassassin lists?

I do, and there are others on the team who do.  and in fact, we all should
be doing so. 

In addition, one of the checklist items on the release checklist is to
moderate your own submission iirc ;)

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCrjVHMJF5cimLx9ARAiP7AJ4q4yGe+AOV4DozUaitEUq80u1kewCfdm/R
W4c/8HcUzzvNmyzAaEK9yss=
=7ee7
-END PGP SIGNATURE-



3.1.0?

2005-06-15 Thread Justin Mason
Are we ready for a prerelease?  Here's all that's remaining in
the 3.1.0 queue:

2307  maj  P4  TRACKER_ID is no good with foreign (ie. non-English) lang...
4347  maj  P4  many config options are not validated
4346  maj  P5  sa-learn: massive memory usage on large messages
3563  nor  P5  Odd errors if tieing Bayes DB while learning...
4344  nor  P5  RFE: Enhance spamc under Win32 to same timeout behaviour ...
4363  nor  P5  rfe: windows line ending support
4364  nor  P5  False positive with DIET_1
4321  min  P5  XBL now includes NJABL_PROXY; may need a mod to rules
 692  enh  P5  RFE: write faster version of TextCat
3714  enh  P5  New ruleset to catch penny stock advisories


nothing serious left there, in my opinion.

I propose that we cut a tarball of 3.1.0-pre1 and start thinking about
mass-checks.  Votes please...

--j.


Re: buildbot failure in t-sidney-fedora3

2005-06-16 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Michael Parker writes:
  is this some kind of NAT problem?  I notice it's affecting Sidney and
  Michael's bots only.
 
 Possibly, but it's also some sort of SVN is really down problem.
 
 There also seems to be a mailer problem, multiple copies of msgs are
 getting sent and kicking off buildbot a bunch of times in a row.

hmm, yes, it really did go down there for a little bit.  that
makes sense then...

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCse6tMJF5cimLx9ARAsbeAJ9xNtK4xIamzdhRGX9nyvGtcCwm/gCgs8au
zOpSexymTbRRY2O58xnHYRc=
=7yzv
-END PGP SIGNATURE-



SpamAssassin 3.1.0pre1 PRERELEASE available!

2005-06-17 Thread Justin Mason
hi all --

it's time to broaden the pool of 3.1.0 testing -- so here's a prerelease.
It's functionally quite close to what 3.1.0 will be, although we haven't
yet done the rescoring mass-checks and Perceptron run, and there
may be one or two more patches going in before the full release.

We'd really appreciate it if you could take this for a spin and
(possibly) spot any issues...

It should be *quite* stable, but it hasn't seen much action in really
large sites yet, so a little caution is advisable.

URL:
  http://SpamAssassin.apache.org/devel/

you may have to wait for a mirror update before the files appear, it
seems!

md5sum of archive files:
  64ec405b8ac4c49209fe2be199c9adcf  Mail-SpamAssassin-3.1.0pre1.tar.bz2
  612987472203c85b34ac0f9715fe4dd0  Mail-SpamAssassin-3.1.0pre1.tar.gz
  726bad32f42715c2256ef4ab90747641  Mail-SpamAssassin-3.1.0pre1.zip

sha1sum of archive files:
  00c05495f146e0fcfaecad29a86d83be4e34c8ce  Mail-SpamAssassin-3.1.0pre1.tar.bz2
  a9bd82d9eeb92e127e14a1f0066699004544923b  Mail-SpamAssassin-3.1.0pre1.tar.gz
  fce976b6ff153de29b45639538e5fab65d0474c1  Mail-SpamAssassin-3.1.0pre1.zip
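
For anyone who wants to script the comparison against the sums above,
a minimal sketch using Python's standard hashlib follows. The helper
names file_digest and matches are made up for illustration; the file
path and expected digest are whatever you downloaded and copied from
the list above.

```python
import hashlib


def file_digest(path: str, algorithm: str = "sha1",
                chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large tarballs need not fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches(path: str, expected_hex: str, algorithm: str = "sha1") -> bool:
    """Compare the computed digest with the published one."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

Usage would be along the lines of
matches("Mail-SpamAssassin-3.1.0pre1.tar.gz", "<published sha1>"),
or with algorithm="md5" for the md5sum list.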


(ps: also, if you're planning to submit mass-check results, now's the
time to start getting those corpora in order! details on the wiki.)

(pps: devs, I left the IS_DEVEL_BUILD line uncommented deliberately.
it is one. ;)

--j.


rsync area

2005-06-18 Thread Justin Mason
how's about moving those over to the Solaris zone soon, before
the mass-checks?   get more stuff transitioned off bugzilla.

--j.


Re: svn commit: r191362 - /spamassassin/trunk/build/README

2005-06-19 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


actually, I'd prefer to keep it since 3.0.0 started -- there's a gzipped
old changelog in the distro which lists all changes up to 3.0.0,
so listing from 3.0.0 on in Changes keeps continuity.

- --j.

[EMAIL PROTECTED] writes:
 Author: felicity
 Date: Sun Jun 19 13:16:32 2005
 New Revision: 191362
 
 URL: http://svn.apache.org/viewcvs?rev=191362&view=rev
 Log:
 update build doc to point at when 3.1 development started, not 3.0...
 
 Modified:
 spamassassin/trunk/build/README
 
 Modified: spamassassin/trunk/build/README
 ==============================================================================
 --- spamassassin/trunk/build/README (original)
 +++ spamassassin/trunk/build/README Sun Jun 19 13:16:32 2005
 @@ -72,10 +72,11 @@
  
- For releases on the trunk (e.g. a .0 release):
  
 -  TZ=UTC svn log -r HEAD:5810 --non-interactive > Changes
 +  TZ=UTC svn log -r HEAD:46030 --non-interactive > Changes
  
 -r5810 was the start of the 3.0.0 (2.70 at the time) work; replace
 -with the correct rev number for the point you want to start at.
 +r46030 was the start of the 3.1.0 work (created 3.0 branch); replace
 +with the correct rev number for the point you want to start at
 +if different.
  
  - Check in the updated Changes file.
  
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCtjFaMJF5cimLx9ARAq+4AJ0UXtnhuxF5qKJQkoaU38U2EurrzQCaAwmD
VWlojUp8Wd5gat+rntvcI5s=
=Hj2C
-END PGP SIGNATURE-



Re: SpamAssassin 3.1.0pre1 PRERELEASE available!

2005-06-19 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Daniel Quinlan writes:
64ec405b8ac4c49209fe2be199c9adcf  Mail-SpamAssassin-3.1.0pre1.tar.bz2
 
 One question: how did the '-' between '0' and 'p' get left out?

I've always used that formatting for prereleases... if there's another
way, it should be documented in build/README.  personally I prefer
X.Y.ZpreN anyway ;)

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCtjQnMJF5cimLx9ARAoMzAJ42Q3I/crRFYfOqBMdswtnZeSxlxwCgvRJo
PSVpqt3uxzE2as7TmJv+ldc=
=7sCq
-END PGP SIGNATURE-



3.1.0 schedule

2005-06-21 Thread Justin Mason
let's get this properly underway... how's about this.

  - today to Mon, 2005-06-27:

clean up our corpora, get ready for mass-checking, try out
mass-check to spot any big memory leaks or whatnot.

  - Mon, 2005-06-27 to Wed, 2005-07-06:

mass-checks; move to C-T-R?

  - Wed, 2005-07-06: (Monday is July 4, let's wait 'til a little
after that weekend!)
  
collate mass-check results, generate logs for all scoresets
from those (Daniel, we can now get all scoresets from one
mass-check run, right?)

Start perceptron, check in results (will almost definitely need
Henry's help here)

  - Wed, 2005-07-06 to Wed 2005-07-13: tweak those scores if
necessary.

  - Wed 2005-07-13: release.

That's pretty relaxed -- 3 weeks.  With the single mass-check
run, it's more doable.

BTW I've done a bit of guessing here;
http://wiki.apache.org/spamassassin/RescoreMassCheck needs to be updated
on what the new procedure is.

So what do you all think?  It'd be nice to release 3.1.0 in time for
CEAS (July 21/22)...

--j.


Re: Normalized text ruletype

2005-06-21 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


One problem is that we've already added something for those mails in 3.1.0 --
but from the other direction ;)

Namely, Theo wrote a plugin which allows rules to be written which
are then translated into more complex rules, that match the variety
of obfuscations observed.  The two modes kind of clash... but
we should compare one against the other.

FWIW, I quite like the idea of massively normalising as you do there --
lowercasing, dropping spaces, etc.   I can see one problem with doing it
that way though.  If you approach it from the normalization angle, there
are issues with some kinds of obfuscation, e.g. the ones where a char in a
string has been replaced by multiple chars:

the quick brown fox jumped
the quick brow|\| fox jumped

coming from the other angle, by munging the rule strings, you *can*
match that.
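
A toy sketch of the two angles, using the example above. This is only
an illustration in Python, not Theo's plugin or the proposed rule
type; the OBFUSCATIONS table and both function names are assumptions.

```python
import re

# Hypothetical table: letter -> known multi-character disguises.
OBFUSCATIONS = {"n": ["|\\|"], "o": ["0"]}


def normalize(text: str) -> str:
    """Normalization angle: lowercase, drop everything but a-z and 0-9."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def munge_rule(phrase: str) -> re.Pattern:
    """Rule-munging angle: expand each letter of the rule into an
    alternation of itself and its known disguises."""
    parts = []
    for ch in phrase.lower():
        if ch == " ":
            parts.append(r"\s*")
            continue
        alts = [re.escape(ch)]
        alts += [re.escape(a) for a in OBFUSCATIONS.get(ch, [])]
        parts.append("(?:" + "|".join(alts) + ")")
    return re.compile("".join(parts), re.IGNORECASE)
```

With this table, normalizing "the quick brow|\| fox jumped" drops the
three-character stand-in and leaves "brow", so a plain "brown" rule
misses it, while munge_rule("brown") still matches the raw text.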

anyway, I'll let Theo comment...

- --j.

Loren Wilton writes:
 Wow, neat!  I've been looking at something like this for quite some
 time.
 
 Adding in pipes and some of the other characters known to be used for
 obfuscations could well drastically increase your hit ratios, they
 are really common.
 
 I think this is quite possibly a good start on a new rule type.
 
 Loren
 
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCuNNWMJF5cimLx9ARAhIXAJ9JdpBxQDWyc8AxRsXHkr9z6Db3lQCfRjhb
7+t77dN8g1uaS0n+lJSqwz8=
=QeQ0
-END PGP SIGNATURE-



3.1.0 schedule rev 2

2005-06-21 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


OK, a redo after a little chat -- with an extra optional week at the end.

- - today to Mon, 2005-06-27:

  clean up our corpora, get ready for mass-checking, try out mass-check to
  spot any big memory leaks or whatnot.

- - Mon, 2005-06-27 to Wed, 2005-07-06:
  
  mass-checks; move to C-T-R?

- - Wed, 2005-07-06: (Monday is July 4, let's wait 'til a little after that
  weekend!)

  collate mass-check results, generate logs for all scoresets from those
  (Daniel, we can now get all scoresets from one mass-check run, right?)
  Start perceptron, check in results (will almost definitely need Henry's
  help here)

- - Wed, 2005-07-06 to Wed 2005-07-13: tweak those scores if necessary.

- - Wed 2005-07-13: decide whether to release or delay an (optional)
  additional week, to 2005-07-20.

also, http://wiki.apache.org/spamassassin/RescoreMassCheck still needs to
be updated on what the new procedure is.

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCuOd7MJF5cimLx9ARApIvAKCru5o536KwRwnmEtOuDdv6YtCZ1gCeI+lq
fq2iPCSgP7Sr2Fk1bKA8K2s=
=b+wi
-END PGP SIGNATURE-



Re: CEAS chat and a hackathon

2005-06-22 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Duncan Findlay writes:
 On Tue, Jun 21, 2005 at 07:40:59PM -0700, Justin Mason wrote:
  CEAS is on this year on July 21st and 22nd.   I'm planning to arrive
  Wed evening (the 20th), and leave on Sat (the 23rd)...
  
  Daniel was suggesting a brief hackathon, probably on the Saturday. if
  we're going to do that, we should decide on timing soon, before I have to
  go booking flights and all that ;)
 
 +1
 
 It'd be nice to get as many developers as possible in the same room -
 in fact it'll probably be a record. I think there'll be 5 of us in the
 area during CEAS?
 
 Not sure how much hacking we'll do, but I'm sure it'll be useful
 and/or fun!
 
  I'm +1 on doing this on Saturday; I'll probably scoot from wherever we do
  that direct to the airport then.
 
 Saturday's fine with me.

Yeah, it looks like Saturday it is.  cool!

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCuZVWMJF5cimLx9ARAjDgAJoDYR95kRg46Pd9bNE6HxVnZzvOIwCfTbVR
4r23eK8PDTxDTPXClWOhi7s=
=Tdly
-END PGP SIGNATURE-



Re: CEAS chat and a hackathon

2005-06-22 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


me too.

Michael Parker writes:
 Theo Van Dinter wrote:
 
 So this brings up the question of: who's planning to go to CEAS?
 
 I keep going back and forth about going, currently leaning towards
 not going but ...
 
   
 
 I'll be there.
 
 Michael
 
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCudy3MJF5cimLx9ARAuxeAJ0Y4kOpg4PGqkGvZFAxESbcEIc3iQCfR3N9
+UijuTRAPBfSG05cmiLNqPM=
=9JMp
-END PGP SIGNATURE-



Re: Want to get 3.0.4 out ... Still need reviews.

2005-06-23 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Theo Van Dinter writes:
 On Thu, Jun 23, 2005 at 03:26:20PM -0700, Dale Luck wrote:
  It appears to me that the link on the website for the spamassassin
  download for 3.0.4 in tar.gz format is actually to a tar file instead
  of a gzipped tar file.
 
 What's the URL?  The file is definitely a tar/gzip file, but a mirror may be
 messed up.

I've noticed this occasionally; it's arguable if the mirror is messed up
or not, if I recall correctly some HTTP UAs can transparently gunzip data.
It could be that some mirror is using mod_deflate, recognises the gzipped
data, and sends a header noting that its content transfer encoding is
gzipped, or whatever.
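
One quick way to tell what actually landed on disk is to look at the
leading bytes: gzip streams start with 0x1f 0x8b, and ustar-style tar
archives carry the string "ustar" at offset 257. A small sketch (the
function name classify is made up for illustration, and old pre-ustar
tar files would come back as "unknown"):

```python
GZIP_MAGIC = b"\x1f\x8b"
TAR_MAGIC = b"ustar"
TAR_MAGIC_OFFSET = 257


def classify(data: bytes) -> str:
    """Return 'gzip', 'tar' or 'unknown' based on magic bytes only."""
    if data[:2] == GZIP_MAGIC:
        return "gzip"
    if data[TAR_MAGIC_OFFSET:TAR_MAGIC_OFFSET + len(TAR_MAGIC)] == TAR_MAGIC:
        return "tar"
    return "unknown"
```

If a file named .tar.gz classifies as "tar", something between the
mirror and the disk has already gunzipped it.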

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCuzo9MJF5cimLx9ARArPrAJsEMzK8b1GH1wkeLmo1BSnZ1R098ACff7FA
dudt0B/45AG53uls95dUHPo=
=/buY
-END PGP SIGNATURE-



Re: 3.1.0 schedule

2005-06-25 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Hey -- I presume we won't be going ahead with this schedule, since
nobody's voted, explicitly given a thumbs-up, or updated the
details on how mass-checks now work in 3.1.0...

- --j.

Daniel Quinlan writes:
 [EMAIL PROTECTED] (Justin Mason) writes:
 
- Mon, 2005-06-27 to Wed, 2005-07-06:
  
  mass-checks; move to C-T-R?
 
 One week is enough.  It's single pass now, remember, so we could say
 Tuesday.  Either way...
  
  (Daniel, we can now get all scoresets from one
  mass-check run, right?)
 
 Yes.  We do have to add the --sample flag and the --reuse flag which are
 new this time.  We definitely should do some trial runs.  Of course,
 it's the slowest mass-check, but we can do it once!  :-)
 
  So what do you all think?  It'd be nice to release 3.1.0 in time for
  CEAS (July 21/22)...
 
 Sure.  Do we want to do any sort of PR?  Minor releases of software are
 not that big of deal, but whatever we want to do, we should plan ahead.
 
 Daniel
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCvgUIMJF5cimLx9ARAu6PAJ0QzoPqPcKpWI0meGGX/fRqSFM+UwCgl/Wg
aZO+40bmjkFKgXuwLtkOFWI=
=Wk0c
-END PGP SIGNATURE-



Re: zones' /etc being changed...?

2005-06-26 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Mads Toftum writes:
 On Sat, Jun 25, 2005 at 01:40:38PM -0400, Theo Van Dinter wrote:
  As for named, the rc script I wrote up and the symlinks I made in rc3.d are
  gone.  I recreated them, and named is now back up as well.
  
  I don't think anyone manually went in and messed up our /etc files, so it
  looks more like a global host change.  Has anyone else seen this?
  
 I did notice a couple of the patches being installed last week saying
 that they were patching the zones - this is probably the cause. All I
 could check at the time was that all zones started normally, but without
 knowing what each zone is supposed to run (and not wanting to poke about
 too much) I assumed all was ok.

is there some alternative thing we should be doing with rc scripts
in /etc?  I seem to recall something about some Solaris thing
to do with service management... is it expected that it'd nuke
rc scripts if they're not managed by something?

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCvw28MJF5cimLx9ARAlNDAJsGAzLksT5wtj+xsAVrRmhoN3Im5gCfS3/h
HM2wAKtOAEqRL/Eizmf+QG4=
=1PEZ
-END PGP SIGNATURE-



NOTICE: 3.1.0 mass-checks heads up

2005-06-26 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Hi all --

we're going to be starting the mass-checks for 3.1.0 RSN.   These will be
used to generate an up-to-date score set for that release.

If you have a hand-classified set of mail corpora [1], and are able and
willing to run mass-check over them [2] and submit the results via rsync,
that would be very helpful!  (BTW things are simpler this time around; we
should be able to do it with just one mass-check run.)

  [1]: http://wiki.apache.org/spamassassin/HandClassifiedCorpora
  [2]: http://wiki.apache.org/spamassassin/MassCheck

The current plan is to start the mass-checks this coming Wednesday (or
thereabouts).  See http://wiki.apache.org/spamassassin/Release310Schedule
.

It might be worthwhile getting your corpora in shape, running a test
mass-check, and taking a look at the high and low-scoring messages to spot
any FPs or FNs you may have missed.   Also, request an rsync account
if you haven't already got one.

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCvyYvMJF5cimLx9ARAphnAJ0ZK3o93y4AHaFq3TEv+Ojk8rgRBACguvw5
T6z6g+1yRa5Eozp4CId26kU=
=g6Gq
-END PGP SIGNATURE-



Re: 3.1.0 schedule

2005-06-26 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Nix writes:
 On Sun, 26 Jun 2005, Theo Van Dinter spake:
  On Sat, Jun 25, 2005 at 06:29:44PM -0700, Justin Mason wrote:
  Hey -- I presume we won't be going ahead with this schedule, since
  nobody's voted, explicitly given a thumbs-up, or updated the
  details on how mass-checks now work in 3.1.0...
  
  Ok, so the first step is to announce this is coming up and have interested
  parties get accounts.  That part hasn't really changed.
 
 [waves]
 
 I may still have an account (username `nix'): but that was a long, long
 time ago --- pre-Apache, I think --- and I'm not sure if it's still
 there.
 
 The hiatus has ended as I've found time to automate spam-corpus
 de-virus, de-bounce, and de-duping at last.
 
 I'm still not sure how intensely to de-dupe: should I zap articles with
 identical bodies?  identical bodies except for MIME headers? identical
 bodies except for identifiable bayes poison? Until the obfu rules came
 in, I'd have said the latter... but now I'm just zapping articles with
 identical bodies and rule hits, as the obfu rules make it very likely
 that two articles differing only in bayes poison will end in different
 rule-hit partitions anyway.

Yeah, I think de-duping is a bit of a lost cause.  I'd say if you see 50
copies of the same message arriving one after the other, go ahead and
de-dupe, but in general, the volume is just too high to be able to humanly
achieve this any more, so let's just not worry about it :(

I've changed the guidelines on the wiki to reflect this.
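
For those who do want to automate the "identical bodies and rule hits"
style of de-duping described above, a minimal sketch follows. It is
not part of mass-check; the function names and the (raw message, rule
hits) input shape are assumptions made for the example.

```python
import hashlib
from typing import FrozenSet, Iterable, List, Tuple

Message = Tuple[str, FrozenSet[str]]   # (raw text, names of rules hit)


def body_of(raw: str) -> str:
    """Everything after the first blank line, i.e. ignoring headers."""
    _head, sep, body = raw.partition("\n\n")
    return body if sep else ""


def dedupe(messages: Iterable[Message]) -> List[Message]:
    """Keep the first message for each (body hash, rule hits) pair."""
    seen = set()
    kept = []
    for raw, hits in messages:
        digest = hashlib.sha1(body_of(raw).encode("utf-8")).hexdigest()
        key = (digest, frozenset(hits))
        if key in seen:
            continue
        seen.add(key)
        kept.append((raw, hits))
    return kept
```

Messages whose bodies differ only in bayes poison hash differently, so
they are kept, which matches the looser policy suggested above.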

  While waiting for that to complete (until Wednesday?), we can update
  the docs and do test runs to make sure it's all cool.
 
 Update docs, please! I've still got to work out what --reuse is for:
 reusing hits on net rules from pre-existing spam-status lines? (If so,
 how does this cater for newly added RBLs/URIBLs?)

/me points at Daniel...  he needs to update the doco.

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCv0b7MJF5cimLx9ARAtSQAKCQTe9Neol+eOFyQJ42TQFo3W/CKgCeMO1J
4xMfTt2W9Tgoy27KT/ZhTgY=
=KPwj
-END PGP SIGNATURE-



Re: NOTICE: 3.1.0 mass-checks heads up

2005-06-28 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Theo Van Dinter writes:
 On Sun, Jun 26, 2005 at 03:03:27PM -0700, Justin Mason wrote:
  any FPs or FNs you may have missed.   Also, request an rsync account
  if you haven't already got one.
 
 FYI:
 
 I copied over the nightly accounts to the submission account list and setup
 the new submit directory to handle the results.  So rsync is ready.
 
 No one has asked for an account yet, btw.
 
 Oh, and we ought to clean out the nightly account list post 3.1.
 Right now we're only seeing nightly results from 5 or 6 people, but
 there are 27 accounts.

are there different accounts for nightly vs rescoring mass-checks?  I
thought they were all the same accounts.

We typically see many more rescore submissions than nightlies (for obvious
reasons).

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCwgK6MJF5cimLx9ARAtYoAJ4kNKySUhBZpdEaIBCVhlYqplMkAQCeKwt7
ZXHONmSnkXPYnNfikTXNiCs=
=1c2u
-END PGP SIGNATURE-



Re: NOTICE: 3.1.0 mass-checks heads up

2005-06-28 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Doc Schneider writes:
 Theo Van Dinter wrote:
  On Tue, Jun 28, 2005 at 07:08:58PM -0700, Justin Mason wrote:
  
 are there different accounts for nightly vs rescoring mass-checks?  I
 thought they were all the same accounts.
  
  Yes, they're separate accounts.  We always just copy the one file to
  the other file when we're ready to do the rescore run and add new accounts 
  as
  necessary.
 
 I can help do masschecks for 3.1.0 if needed. I have a good corpus of 
 ham and spam and add daily to it. All hand checked.
 
 Whom do I request an account from? Or is there a form?

Hi Doc --

sounds good!  http://wiki.apache.org/spamassassin/RsyncAccounts has all
the details. (we should be moving that stuff out of the distro imo.)

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCwhclMJF5cimLx9ARAgHgAJ9Y+XZxGeVmgtyUl6r6QnicgdV3PQCfaGkQ
3F2esAdoBrLDLisARbikMv0=
=C0Wm
-END PGP SIGNATURE-



Re: autofoo guru needed :)

2005-06-28 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


strchr() should be fine -- I haven't seen a system that didn't support it,
either.   Regarding autoheader et al -- I have no idea why it wants to get
rid of HAVE_LIBSSL, but iirc autoscan is destructive, and only suitable as
a first try which you're supposed to spend some time hand-editing
afterwards, so I wouldn't say the removal of that holds too much water.

- --j.

Malte S. Stretz writes:
 Moin,
 
 I just remembered that I used strchr(3) in my last commit to spamc and
 according to the man page that one is part of C99, so it might be missing on
 some systems (?).
 
 I wondered what autofoo might say about it and got this:
 
 [EMAIL PROTECTED] ~/projects/current/spamassassin/3.1.clean/spamc $ autoscan
 configure.in: warning: missing AC_CHECK_FUNCS([alarm]) wanted by: utils.c:79
 configure.in: warning: missing AC_CHECK_FUNCS([dup2]) wanted by: spamc.c:513
 configure.in: warning: missing AC_CHECK_FUNCS([gethostbyname]) wanted by: libspamc.c:1493
 configure.in: warning: missing AC_CHECK_FUNCS([inet_ntoa]) wanted by: libspamc.c:360
 configure.in: warning: missing AC_CHECK_FUNCS([memset]) wanted by: libspamc.c:303
 configure.in: warning: missing AC_CHECK_FUNCS([strcasecmp]) wanted by: libspamc.c:805
 configure.in: warning: missing AC_CHECK_FUNCS([strchr]) wanted by: libspamc.c:1488
 configure.in: warning: missing AC_CHECK_FUNCS([strerror]) wanted by: libspamc.c:211
 configure.in: warning: missing AC_CHECK_FUNCS([strstr]) wanted by: libspamc.c:828
 configure.in: warning: missing AC_CHECK_HEADERS([arpa/inet.h]) wanted by: libspamc.c:40
 configure.in: warning: missing AC_CHECK_HEADERS([fcntl.h]) wanted by: spamc.c:30
 configure.in: warning: missing AC_FUNC_FORK wanted by: qmail-spamc.c:87
 configure.in: warning: missing AC_FUNC_MALLOC wanted by: libspamc.c:465
 configure.in: warning: missing AC_TYPE_SIGNAL wanted by: spamc.c:625
 [EMAIL PROTECTED] ~/projects/current/spamassassin/3.1.clean/spamc $ autoheader
 autoheader-2.59: WARNING: Using auxiliary files such as `acconfig.h', `config.h.bot'
 autoheader-2.59: WARNING: and `config.h.top', to define templates for `config.h.in'
 autoheader-2.59: WARNING: is deprecated and discouraged.
 autoheader-2.59:
 autoheader-2.59: WARNING: Using the third argument of `AC_DEFINE' and
 autoheader-2.59: WARNING: `AC_DEFINE_UNQUOTED' allows to define a template without
 autoheader-2.59: WARNING: `acconfig.h':
 autoheader-2.59:
 autoheader-2.59: WARNING:   AC_DEFINE([NEED_FUNC_MAIN], 1,
 autoheader-2.59:[Define if a function `main' is needed.])
 autoheader-2.59:
 autoheader-2.59: WARNING: More sophisticated templates can also be produced, see the
 autoheader-2.59: WARNING: documentation.
 [EMAIL PROTECTED] ~/projects/current/spamassassin/3.1.clean/spamc $ svn diff config.h.in
 Index: config.h.in
 ===================================================================
 --- config.h.in (revision 202147)
 +++ config.h.in (working copy)
 @@ -25,9 +25,6 @@
  /* Define to 1 if you have the inttypes.h header file. */
  #undef HAVE_INTTYPES_H
 
 -/* Define to 1 if you have the `crypto' library (-lcrypto). */
 -#undef HAVE_LIBCRYPTO
 -
  /* Define to 1 if you have the `dl' library (-ldl). */
  #undef HAVE_LIBDL
 
 @@ -40,9 +37,6 @@
  /* Define to 1 if you have the `socket' library (-lsocket). */
  #undef HAVE_LIBSOCKET
 
 -/* Define to 1 if you have the `ssl' library (-lssl). */
 -#undef HAVE_LIBSSL
 -
  /* Define to 1 if you have the memory.h header file. */
  #undef HAVE_MEMORY_H
 
 Do we care about the missing checks?  Why does autoheader want to remove the 
 SSL/crypto defines?
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCwilnMJF5cimLx9ARAqKgAJsFpiLHV5JtFQ1bbgrE3jT8yGfRQgCghuKV
VVnGlLLQ0Mrm6YkbLg5dHq8=
=HaRf
-END PGP SIGNATURE-



amavisd update

2005-06-29 Thread Justin Mason
http://freshmeat.net/projects/amavisd-new/?branch_id=41554&release_id=200273

'The program is ready for the coming version 3.1 of SpamAssassin.'

which is nice ;)

--j.


[VOTE] pre2, and R-T-C on the tree

2005-06-29 Thread Justin Mason
OK, for the mass-checks, we need to cut another prerelease, pre2.
I also think we should go R-T-C on the tree.  Please vote.

- Vote for the cutting of pre2, a tarball to do mass-checks with.

- Vote to go R-T-C on svn trunk.

(I'm +1 on both)

--j.


Re: [VOTE] pre2, and R-T-C on the tree

2005-06-29 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Michael Parker writes:
 Justin Mason wrote:
 
 OK, for the mass-checks, we need to cut another prerelease, pre2.
 I also think we should go R-T-C on the tree.  Please vote.
 
 - Vote for the cutting of pre2, a tarball to do mass-checks with.
 
 - Vote to go R-T-C on svn trunk.
 
 I would rather vote on an already generated pre2 tarball, but +1 for
 both.  I'll be happy to review a tarball once you've got one built.

please do!

  http://people.apache.org/~jm/devel/Mail-SpamAssassin-3.1.0-pre2.tar.gz
  http://people.apache.org/~jm/devel/Mail-SpamAssassin-3.1.0-pre2.tar.bz2
  http://people.apache.org/~jm/devel/Mail-SpamAssassin-3.1.0-pre2.zip

md5sum:

  d90ea805d073385059db7deadf1acde9  Mail-SpamAssassin-3.1.0-pre2.tar.bz2
  5ecb7b43863c7e093e26eba06fc749b6  Mail-SpamAssassin-3.1.0-pre2.tar.gz
  9a8b82b6fafae4c538a70bf6e5ccb25c  Mail-SpamAssassin-3.1.0-pre2.zip

sha1sum:

  12d908eba8f7e22608e4f1c4e14379b8d133b208  Mail-SpamAssassin-3.1.0-pre2.tar.bz2
  1e3e1e357443247c83712eea8a29e3f507ae15ec  Mail-SpamAssassin-3.1.0-pre2.tar.gz
  ade4ee3c4183204d78b715437b295e37fd1ce3e8  Mail-SpamAssassin-3.1.0-pre2.zip

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCw0arMJF5cimLx9ARAmOoAJ0UekWXbxqnnG+eps/3tTFpD8XGaQCeNKtx
IyRkh/41tsXWSCapmqZYb/4=
=+4+A
-END PGP SIGNATURE-
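
For anyone reviewing the tarballs, the md5sum and sha1sum lists above can be
checked locally. The following is only a minimal sketch in Python, not part of
the release tooling: the helper names file_digest and matches_published are
made up for illustration, and it assumes the tarball has already been
downloaded into the current directory.

```python
import hashlib


def file_digest(path, algorithm="sha1", chunk_size=65536):
    """Return the hex digest of the file at `path` (hypothetical helper)."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as fh:
        # Read in chunks so large tarballs are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_published(path, expected_hex, algorithm="sha1"):
    """Compare a file's digest with a published checksum (hypothetical helper)."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

For example, matches_published("Mail-SpamAssassin-3.1.0-pre2.tar.gz",
"1e3e1e357443247c83712eea8a29e3f507ae15ec") should return True for an intact
download; pass algorithm="md5" to check against the md5sum list instead.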


