Re: [FEEDBACK REQUESTED] SA Handling of Relays..

2024-04-03 Thread Bill Cole

On 2024-04-02 at 19:02:35 UTC-0400 (Tue, 2 Apr 2024 16:02:35 -0700)
Michael Peddemors 
is rumored to have said:

Noticed that the External Relays includes what looks to be an 
erroneous entry..


Notice the last entry.. looking at the email headers, the only 
conclusion that I can make is that somehow X-Originating-IP gets 
treated as an external relay, and I don't think that should be..


I have a vague deep memory of this issue being discussed in depth in the 
past on the Users list. I believe the use of X-Originating-IP to 
identify an external relay is correct, although it is of minimal 
utility. I.e. it could easily be an attempt to mask the real origin. 
OTOH, because that header is not part of any sort of standard, it is 
never clear what exactly it means unless you have specific knowledge of 
how the writer of that header defines it.


Of course many headers can be forged, but notice in this case that 
header was injected by the second external relay.. there were no 
relays before the relay involved in accepting the email.


What role did 169.239.218.51 play in the transport? It is not clear to 
me from your description or included SA relay records...




Comments? (Let me know if I am not clear, I can always include raw 
headers if needed, but I think my point is obvious)


Clear as mud.


Should that have created a record in the External Relay array?


Probably. A full explanation of the mail path as you know it and the 
headers which SA used to generate these would help here.


Note that the VERACITY of relay records not written by trusted systems 
is known to be unverifiable and cannot be trusted.


  [ ip=169.239.218.195 rdns=se-filter03-195.tld-mx.com 
helo=se-filter03-195.tld-mx.com by=REDACTED ident= envfrom= intl=0 
id= auth= msa=0 ]
  [ ip=169.239.218.51 rdns=cp51.domains.co.za helo=cp51.domains.co.za 
by=se-filter03.tld-mx.com ident= envfrom=rev...@cde.co.za intl=0 
id=1rOAd4-007W0q-6X auth= msa=0 ]
  [ ip=216.73.163.102 rdns= helo=WIN-9UDRVPAB9FG by=cp51.domains.co.za 
ident= envfrom=rev...@cde.co.za intl=0 id=1rO6N0-0002Xr-0i 
auth=esmtpsa msa=0 ]
  [ ip=169.239.218.51 rdns= helo= by= ident= envfrom= intl=0 id= auth= 
msa=0 ] (from X-Originating-IP)


--
"Catch the Magic of Linux..."

Michael Peddemors, President/CEO LinuxMagic Inc.
Visit us at http://www.linuxmagic.com @linuxmagic
A Wizard IT Company - For More Info http://www.wizard.ca
"LinuxMagic" a Reg. TradeMark of Wizard Tower TechnoServices Ltd.

604-682-0300 Beautiful British Columbia, Canada



--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Re: [ANNOUNCE] Apache SpamAssassin 4.0.1 available

2024-03-30 Thread Bill Cole

On 2024-03-29 at 14:08:30 UTC-0400 (Fri, 29 Mar 2024 19:08:30 +0100)
Benny Pedersen 
is rumored to have said:


Sidney Markowitz skrev den 2024-03-29 15:25:

7e6093c8514e1b18f3b47215dc97d51b7b70142ca2fe7242362c021bf770b2c1c1e99a8227d1c5b9b5d303e405ab9e6a7c67a60b5b03dcb6588bd68c733e2448 
  Mail-SpamAssassin-rules-4.0.1.r1916528.tgz


replaced fine with sa-update no ?


Yes. However, for completeness we always include a specific snapshot of 
the rules with each release. If you use sa-update, as almost everyone 
should, you do not need this rule tarball.


is admins just using this one time and not ever croned update and 
after upgrade major versions never issue a sa-update :(


*PARSER ERROR* (I've read it thrice, including aloud, and I don't 
understand what you mean.)


now that gentoo have binhost, i see more problems to other distro that 
release precompiled problems


*PARSER ERROR*


anyway thanks for release finaly now


We do what we can. This time around Sidney did the release management, 
which is a heavier lift than it may seem, given the myriad dependencies, 
platforms, and Perl versions supported. I think I speak for the whole 
PMC in saying that we are grateful for his diligence and responsibility 
in getting the release done properly.



--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Re: [VOTE] Release of 4.0.1 - vote will close on Friday, March 29, 2024 06:30am UTC

2024-03-26 Thread Bill Cole

On 2024-03-26 at 02:26:40 UTC-0400 (Tue, 26 Mar 2024 19:26:40 +1300)
Sidney Markowitz 
is rumored to have said:


[This email is bcc'd to the Apache PMC to ensure they notice it]

Hello everyone,

Calling for a vote on the release of Apache SpamAssassin 4.0.1

Only votes from members of the PMC will be binding, but everyone is
encouraged to thoroughly test that these files properly install and
pass the make test checks on your platform and any other tests that
you can perform.

Here are the files that will be released as Apache SpamAssassin 4.0.1
if there are at least three binding +1 votes and more +1 votes
than -1 binding votes at the end of the 72 hour voting period,
Friday, March 29, 2024 06:30am UTC.

Note that the policy on voting for releases is not the same as for
voting for code committing where a single binding -1 is a veto.

Files are in  https://dist.apache.org/repos/dist/dev/spamassassin/

There have been a few minor commits since the release of 4.0.1-rc1.
We think those changes are safe, but definitely test these release
files to be sure.

As per 
https://www.apache.org/legal/release-policy.html#release-approval

please download and check the files on your platform(s) before voting
+1 if you approve the release, or -1 and state a technical reason why
these files are not ready for release.

The current draft of the release announcement can be seen at
https://svn.apache.org/repos/asf/spamassassin/trunk/build/announcements/4.0.1.txt

I vote +1 for release after checking on Ubuntu 22.04, macOS 13, and 
Windows 10.


+1

Signatures of all 4 archives confirmed.

Built, tested, and running in production on MacOS X 10.11.6 (running 
trunk for weeks...)

Built and tested on macOS 12.7.4

(Both using the MacPorts build model and packaged dependencies, with 
some CPAN additions.)


--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


New dependency on Devel::Cycle???

2023-11-20 Thread Bill Cole



Just tried a build on trunk, got this:

NOTE: settings for "make test" are now controlled using "t/config.dist".
See that file if you wish to customize what tests are run, and how.

checking module dependencies and their versions...
checking binary dependencies and their versions...
dependency check complete...

Warning: prerequisite Devel::Cycle 0 not found.

Looks like it is needed for a new test?
Not sure how to make that optional for tests only but we need to do 
something, even if it is just documentation.




--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Re: bing.com redirector

2023-10-03 Thread Bill Cole
On 2023-10-03 at 11:02:47 UTC-0400 (Tue, 3 Oct 2023 08:02:47 -0700 (PDT))
John Hardin 
is rumored to have said:


On Tue, 3 Oct 2023, Giovanni Bechis wrote:


Hi,
I've received an email with a link like

https://www.bing.com/ck/a?!&=XXX=3=3=XXX=XXX=1

that redirects to another bing.com url that finally redirects to a 
phishing url. As a workaround I've added "url_shortener bing.com" and 
it works (but it's not correct because Bing is not a shortener). 
Should we add search engines to the url shortener configuration as well, 
or should we implement something else?


This should be "something else".


+1

IMHO, this (presumably a tracking URL for a search result?) strikes me 
as much more solidly indicative of bad intent than a shortener.




--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Re: Question about branching svn

2022-12-17 Thread Bill Cole

On 2022-12-17 at 15:42:08 UTC-0500 (Sun, 18 Dec 2022 09:42:08 +1300)
Sidney Markowitz 
is rumored to have said:

Now that we have released 4.0.0, we can if we want create a 4.0 branch 
and have trunk be for a 4.1 branch.


Pros: If we do it now, then if something comes up that is important 
enough to require a 4.0.1 patch release, we will not have commits 
already in svn that are not suitable for being 4.0.1, either because 
they are not yet finished, or not tested, or too extensive. We can 
only commit to the 4.0 branch things that are completed and tested and 
deemed safe enough that 4.0 branch will always be stable and available 
for a 4.0.1 release if the need arises. Branching now means we don't 
have to worry that what we commit to trunk may end up unsuitable for a 
future 4.0 branch.


Cons: Until someone actually works on something that we would want in 
4.1.0 and not in a hypothetical 4.0.1, we have to consider every 
commit to trunk and replicate it in the 4.0 branch. We could instead 
put off branching until there is something to commit to trunk that we 
might not want in 4.0.1 or that would have to be reverted as part of 
the process of creating a branch.


Thoughts on this? Branch now or branch later?


Later, when we have compelling reasons for a 4.1. Is anyone actively 
working on things that couldn't go into a 4.0.x release?


As for moving to git, mark me down for a whiny non-binding -0.5. I 
recognize that there are sound arguments for doing so, especially in 
reducing friction for new contributors. But I will probably never 
reshape my brain to be comfortable using git, or to do it right...





--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Re: [VOTE] Release of 4.0.0 - vote will close on Saturday, December 17, 2022 09:00am UTC

2022-12-14 Thread Bill Cole
On 2022-12-14 at 03:22:58 UTC-0500 (Wed, 14 Dec 2022 21:22:58 +1300)
Sidney Markowitz 
is rumored to have said:

> [This email is bcc'd to the Apache PMC to ensure they notice it]
>
> Hello everyone,
>
> Calling for a vote on the release of Apache SpamAssassin 4.0.0

I vote +1 for releasing 4.0.0.

> Files are in  https://people.apache.org/~sidney/devel/

I have verified that all 3 archives match the GPG signatures and both hashes.

I have verified that all 3 archives expand to a source tree with identical 
contents.

I have verified that the unpacked source builds and tests cleanly on OS X El 
Capitan (v10.11.6/Darwin 15.6/Perl 5.34) and macOS Catalina (v10.15.7/Darwin 
19.6/Perl 5.34) using the Makefile.PL arguments and build environment of 
MacPorts (i.e. /opt/local/ prefix et al.)

I have been running trunk in production for weeks without incident.
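
The hash part of that verification can be sketched in Perl with the core Digest::SHA module. This is only an illustration, not the procedure actually used for the vote: the helper name digest_matches is made up here, and it does not cover the GPG signature check, which needs gpg and the release manager's public key.

```perl
use strict;
use warnings;
use Digest::SHA;

# Hypothetical helper: true when the file's SHA digest matches the
# published hex string. $alg is the SHA variant (256 or 512 here).
sub digest_matches {
    my ($path, $alg, $expected_hex) = @_;
    my $sha = Digest::SHA->new($alg)
        or die "unsupported SHA variant: $alg\n";
    $sha->addfile($path, 'b');    # 'b' reads the file in binary mode
    return lc($sha->hexdigest) eq lc($expected_hex);
}

# Example call, using the sha256 value quoted in the vote mail
# (left commented out so the sketch runs without the archive present):
# digest_matches('Mail-SpamAssassin-4.0.0.tar.gz', 256,
#     '65979da7d103e3c37563f23a1a24f470090afb33664348968a00bf3d09a84f36')
#     or die "checksum mismatch\n";
```

Comparing in lower case avoids a false mismatch when a published hash is written in upper case.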

> The only code change since the release of 4.0.0-rc4 has been the
> commit in Bug 8087 https://svn.apache.org/viewvc?view=rev=1905867
>
> As per https://www.apache.org/legal/release-policy.html#release-approval
> please download and check the files on your platform(s) before voting
> +1 if you approve the release, or -1 and state a technical reason why
> these files are not ready for release.
>
> The current draft of the release announcement can be seen at
> https://svn.apache.org/repos/asf/spamassassin/trunk/build/announcements/4.0.0.txt
>
> I vote +1 for the release after checking on Ubuntu 22.04, macOS 13,
> macOS 12, and Windows 10.
>
> sha256sum of archive files:
>
> e5aa17050a30bc72baa86afdc6048cadea4d1ec2ecc61d787717a059b8319e88  
> Mail-SpamAssassin-4.0.0.tar.bz2
> 65979da7d103e3c37563f23a1a24f470090afb33664348968a00bf3d09a84f36  
> Mail-SpamAssassin-4.0.0.tar.gz
> 063d59ab2c7a67c1707b5b6a6063f97bdc9e3e8ae1246f1d43aa3dd32bf35d06  
> Mail-SpamAssassin-4.0.0.zip
> ae4ffbb917ebc7fefa7240fc5bb5151dda663f8e4059161ad7c9b42eed1bac6d  
> Mail-SpamAssassin-rules-4.0.0.r1905950.tgz
>
> sha512sum of archive files:
>
> a0fe5f6953c9df355bfa011e8a617101687eb156831a057504656921fe76c2a4eb37b5383861aac579e66a20c4454068e81a39826a35eb0266148771567bad5f
>   Mail-SpamAssassin-4.0.0.tar.bz2
> db8e5d0249d9fabfa89bc4c7309a7eafd103ae07617ed9bd32e6b35473c5efc05b1a913b4a3d4bb0ff19093400e3510ae602bf9e96290c63e7946a1d0df6de47
>   Mail-SpamAssassin-4.0.0.tar.gz
> d907d59fd6af1560b0817d5397affeb096feaffd01614481b22a172976798f0ab438a7fb4d6878dfbb8338961f888dd69c2f7d9e743a48164e2842fa6f804571
>   Mail-SpamAssassin-4.0.0.zip
> 8ff0e68e18dc52a88fec83239bb9dc3a1d34f2dcb4c03cd6c566b97fa91242e3c8d006612aeb4df0acf43929eaaa59d542eb5cf904498343adf5eadefcb89255
>   Mail-SpamAssassin-rules-4.0.0.r1905950.tgz
>
> Regards,
>
> Sidney Markowitz
> Chair, Apache SpamAssassin PMC
> sid...@apache.org
> sid...@sidney.com


-- 
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire




Re: Next steps for RC4

2022-12-02 Thread Bill Cole

On 2022-12-02 at 20:03:58 UTC-0500 (Sat, 3 Dec 2022 14:03:58 +1300)
Sidney Markowitz 
is rumored to have said:


The remaining open bugs targeted for 4.0.0 are 8077 and 8078.

8077 has the votes, ready for Giovanni to commit and close the issue.

8078 is a blocker for 4.0.0 and is pending Henrik getting time to 
spend on investigation. So that is more open-ended.


I am ready to build an RC4 as soon as the open issues are cleared, and 
I would like to move very quickly from RC4 to full release with 
minimal further testing other than the full build and regression 
testing.


To that end I propose that we only target any new issues for 4.0.0 if 
they can be closed very safely and quickly, and if there is reason not 
to put them off to a future 4.0.1 release. The goal will be to really 
be finished with 4.0.0 after 8078 is finally done.


Agreed.



--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Can someone explain what happened to my rules?

2022-11-04 Thread Bill Cole
I recently added multiple rules to my sandbox (in 80test.cf) and tweaked 
them over the space of a few weeks, including renaming some to remove 
the leading 'T_'


And QA seems to not see all of them and is showing hits on both the test 
and non-test versions. Example: 
https://ruleqa.spamassassin.org/?daterev=20221103-r1905040-n=%2FSCC_SPAMMER_ADDR_2


I'm *GUESSING* that some testers don't regularly update their rules or 
something like that. However, there are other rules 
(__SCC_HASHBUST_{1..3,6}) that have seemingly just vanished.


I've waited for ~3 weeks for this to all shake out, but it hasn't.


Re: Ready to move to RTC mode and produce a pre-3?

2022-08-19 Thread Bill Cole

On 2022-08-19 at 18:35:46 UTC-0400 (Sat, 20 Aug 2022 10:35:46 +1200)
Sidney Markowitz 
is rumored to have said:

We are down to no more open bugs and sufficient success on the CPAN 
test machines. My understanding is that testing of trunk in production 
environments has gone well.


If nobody has anything that they are working on that they intend to 
commit, or any other issues to bring up, I would like to go to RTC 
mode and I'll then cut a pre-release 3 that people can test with. If 
that passes a short time for testing I'll cut a release candidate in 
preparation for actual release.


I don't think we need a formal vote. Let's use lazy consensus over the 
next 72 hours for anyone who has a reason not to switch to RTC to 
speak up. I'm going to call it 22:22:22 UTC Monday 22 Aug 2022 because 
that's close enough to 72 hours and it is kind of a cool time and 
date.


+1

--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Re: HashBL enabled by default?

2022-05-26 Thread Bill Cole

On 2022-05-25 at 23:59:36 UTC-0400 (Thu, 26 May 2022 06:59:36 +0300)
Henrik K 
is rumored to have said:


Any objections enabling HashBL plugin by default for new installs?

I think it's going to be useful in the future with all the new features.
Only rule in stock rules using it at the moment is GB_HASHBL_BTC, but there
will probably be more and we should make sure it can be easily used if
needed.


I am mildly concerned that Gio may not be ready to have his future 
married to the operation of a public blacklist enabled by default in SA. 
Once it is switched on by default, it will be difficult to end or even 
substantially change in operation. We've handled DNSBLs dying well in 
the past on the project side, but it is also tough on the operator of a 
DNSBL.



--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Re: Esp module discussion

2022-05-14 Thread Bill Cole
On 2022-05-14 at 06:52:02 UTC-0400 (Sat, 14 May 2022 12:52:02 +0200)
 
is rumored to have said:

> The Esp module may be effectively outdated, and SpamAssassin releases are not 
> as frequent as I would like. For me there is no problem in removing the 
> module from the SpamAssassin src tree and working on it out-of-tree.

+1




Re: Perlcritic for make test

2022-05-10 Thread Bill Cole

On 2022-05-10 at 00:21:36 UTC-0400 (Tue, 10 May 2022 07:21:36 +0300)
Henrik K 
is rumored to have said:

A quick hack to run it without taint, created t/perlcritic.t which 
contains:


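#!/usr/bin/perl
# Re-exec the perlcritic test in a fresh perl started without taint mode.
# Taint mode refuses exec() with a tainted PATH, so set a fixed one.
$ENV{'PATH'} = '/bin:/usr/bin';
# Launder the command string through a regex capture; works from the
# top of the tree or from inside t/.
-d "xt" && "$^X xt/60_perlcritic.t" =~ /(.*)/ ||
   "$^X ../xt/60_perlcritic.t" =~ /(.*)/;
exec($1);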

Let me know if you think it can be committed.


+1


--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Re: bayes_ignore_header policy?

2022-05-08 Thread Bill Cole
On 2022-05-07 at 04:42:25 UTC-0400 (Sat, 7 May 2022 11:42:25 +0300)
Henrik K 
is rumored to have said:

> There's lots of common headers that are basically huge base64 strings,
> creating stupid amounts of random Bayes tokens.
>
> Apparently rulesrc/sandbox/axb/23_bayes_ignore_header.cf was created to
> handle some of these already?
>
> I've found at least these missing:
>
> IronPort-SDR
> X-Exchange-Antispam-Report-CFA-Test
> X-Forefront-Antispam-Report-Untrusted
> X-Gm-Message-State
> X-MS-Exchange-AntiSpam-MessageData
> X-MS-Exchange-AntiSpam-MessageData-0
> X-MS-Exchange-CrossTenant-UserPrincipalName
> X-MS-Exchange-SLBlob-MailProps
> X-MSFBL
> X-Microsoft-Antispam-Message-Info
> X-Microsoft-Antispam-Message-Info-Original
> X-Microsoft-Antispam-Untrusted
> X-Microsoft-Exchange-Diagnostics
> X-Provags-ID
> X-SG-EID
> X-SG-ID
>
> Wouldn't these be better put directly into bayes/23_bayes.cf instead of some
> sandbox, that's intended more for testing rules than changing SA config?

Yes.

However, I'm not convinced that all of those are unhelpful for Bayes. Some will 
never repeat and so are pure noise, but those which identify specific senders 
may be useful. The MS anti-spam headers may be tokenized into useful pieces 
(e.g. "NSPM" or "SPM") even if the headers as a whole are opaque.

> Any objections 1) adding these new ones

I have not researched all of those, but I believe that some of those should in 
theory be useful in Bayes.

> 2) moving everything to 23_bayes.cf?

+1
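
For illustration, entries in 23_bayes.cf would use the existing bayes_ignore_header setting, one header per line. Which headers go in is the open question above. The three below are picked only as examples of headers that look like opaque blobs, which is an assumption and not a researched list.

```
# Sketch only: headers assumed (not verified) to be pure noise for Bayes.
bayes_ignore_header IronPort-SDR
bayes_ignore_header X-MS-Exchange-AntiSpam-MessageData
bayes_ignore_header X-MS-Exchange-AntiSpam-MessageData-0
```

Headers that might tokenize into something useful, such as the MS anti-spam report headers, would simply be left out until someone has data on them.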


-- 
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Re: svn commit: r1900446 - in /spamassassin/trunk/lib/Mail/SpamAssassin: Message/Node.pm Plugin/DecodeShortURLs.pm Plugin/OLEVBMacro.pm

2022-05-01 Thread Bill Cole

On 2022-05-01 at 16:33:13 UTC-0400 (Sun, 1 May 2022 16:33:13 -0400)
Kevin A. McGrail 
is rumored to have said:


On 5/1/2022 4:12 PM, Michael Storz wrote:
Kevin, the change from 'return undef' to "return" is correct because 
return returns undef in the scalar context. "return undef" should 
only be used when evaluated in the array context and the value undef 
is needed instead of ().


I only looked at the case for detect_utf16. If the change is ok for 
the other returns, you still have to investigate if you don't trust 
the change.

It's not a matter of trust, it's a matter of changing return undef 
where there is a comment that says "# let perl figure it out from the 
BOM".  That comment worries me that the return undef was purposeful.


I don't see how the explicit 'return undef' can have any specific 
relationship to that comment. The code checks for a UTF16 BOM and if it 
finds one, does not bother trying to detect whether the data is LE or 
BE, because with a BOM, Perl (Encode::Detect::Detector) determines that 
without help when asked to, later in the block that called 
detect_utf16(). Without a BOM, detect_utf16() is needed to look at the 
data and try to guess the endianness. With a BOM, detect_utf16() is a 
no-op and ultimately Encode::Detect::Detector gets used (and should work 
just fine.)


It helps to understand that detect_utf16() is called in exactly one 
place: in another part of Node.pm. It is called there in scalar context. 
In scalar context, 'return' without an argument returns undef. There's 
no logical reason for it to ever be called in list context, since what 
it is returning is (a pointer to) an Encode object.
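
The context behavior is easy to check in isolation. The following is a standalone sketch with made-up sub names, not code from Node.pm:

```perl
use strict;
use warnings;

# Two hypothetical subs that differ only in how they return "nothing".
sub bare_return    { return }          # undef in scalar context, () in list context
sub explicit_undef { return undef }    # undef in scalar context, (undef) in list context

my $s1 = bare_return();       # scalar context
my $s2 = explicit_undef();    # scalar context
my @l1 = bare_return();       # list context
my @l2 = explicit_undef();    # list context

printf "scalar: bare=%s explicit=%s\n",
    defined $s1 ? 'defined' : 'undef',
    defined $s2 ? 'defined' : 'undef';
printf "list: bare has %d element(s), explicit has %d element(s)\n",
    scalar @l1, scalar @l2;
```

Both forms yield undef in scalar context, which is the only way detect_utf16() is called. They differ only in list context, where 'return undef' produces a one-element list that tests as true.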


Re: Preparing a 4.0.0 release

2022-04-22 Thread Bill Cole

On 2022-04-22 at 13:32:23 UTC-0400 (Fri, 22 Apr 2022 19:32:23 +0200)
David Bürgin 
is rumored to have said:


Sidney Markowitz:
We are down to six open issues on Bugzilla with a 4.0 milestone, none 
listed as blockers. Let's see if we can get a release out. I'll take 
care of the release manager duties. Henrik volunteered to spend some 
time wrapping up issues during the next few weeks. Of course anyone 
else can help too.


Have you considered migrating to Git before the next major release?


-1

This would be the worst possible time to switch version control 
platforms.



It
has many advantages over Subversion, and would make development more
accessible and transparent for other developers.


Asserts facts not in evidence...

What advantages would Git have over Subversion **for SpamAssassin**?

How would you propose refactoring the RuleQA and rescoring processes to 
work with Git instead of Subversion?


Re: [Bug 7477] Direct DNS Querying Per DNSBL Zone

2022-04-11 Thread Bill Cole

On 2022-04-11 at 11:23:40 UTC-0400 (Mon, 11 Apr 2022 08:23:40 -0700)
Michael Peddemors 
is rumored to have said:


On 2022-04-11 06:18, bugzilla-dae...@spamassassin.apache.org wrote:

https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7477

Henrik Krohns  changed:

What|Removed |Added

Target Milestone|4.0.0   |Future

--- Comment #6 from Henrik Krohns  ---
Would need some cleaning up, but no time to look for 4.0.0. Is there 
_really_ demand for this feature, when resolvers are configurable pretty much
everywhere.. postponing..



There is demand out there I am sure, with recent changes to some RBL 
operators' policies, and things like DoH out in the wild.


There can be cases where certain RBLs are blocked from queries at the 
upstream provider.


Just an opinion.. Sometimes SA administrators may not have operational 
control over resolver decisions.


I think it would be irresponsible to put this in SA and that it would 
cause more trouble than it can be worth.


Mail systems should use local caching resolvers under common control 
with the mail system. They SHOULD NOT use any resolver with 'safety' 
features designed to protect end users by breaking resolution. They 
SHOULD NOT use unaccountable free resolvers. FOR ANYTHING.


This sort of thing belongs in a dedicated resolver with forwarding 
options like dnsmasq or unbound, NOT in SA. I don't think we should be 
providing support for systems that are fundamentally misdesigned in 
order to make the use of a 3rd-party resolver a top priority.
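
As a rough sketch of what doing it in the resolver looks like, assuming unbound runs on the mail host itself: the zone name and forwarder address below are placeholders from the documentation ranges, not any real DNSBL.

```
# unbound.conf fragment (sketch): local caching resolver for the mail system
server:
    interface: 127.0.0.1
    access-control: 127.0.0.0/8 allow

# Send queries for one DNSBL zone to a specific upstream.
forward-zone:
    name: "dnsbl.example.org."
    forward-addr: 192.0.2.53
```

SA then needs nothing more than to use the local resolver, either through the system resolver configuration or with "dns_server 127.0.0.1" in its own config.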



--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Re: svn commit: r1898654 - in /spamassassin/trunk: INSTALL UPGRADE debian/spamassassin.README.Debian lib/Mail/SpamAssassin/Util/DependencyInfo.pm spamd/spamd.raw

2022-03-06 Thread Bill Cole
On 2022-03-06 at 10:15:11 UTC-0500 (Sun, 6 Mar 2022 16:15:11 +0100)
Giovanni Bechis 
is rumored to have said:

> On 3/6/22 14:42, h...@apache.org wrote:
>> Author: hege
>> Date: Sun Mar  6 13:42:39 2022
>> New Revision: 1898654
>>
>> URL: http://svn.apache.org/viewvc?rev=1898654=rev
>> Log:
>> Remove deprecated --auth-ident from spamd (Bug 7599)
>>
> [...]
> auth-ident support is still present on "spamd-apache2" directory,
> do we want to still support spamd-apache2 ?

It isn't clear to me why it ever existed. I cannot imagine why anyone would 
WANT a spamd implementation inside httpd. It almost sounds like a software 
stunt akin to poetry implemented in sendmail.cf.




-- 
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire




Re: bayes_auto_learn default value

2022-02-08 Thread Bill Cole
On 2022-02-08 at 07:46:17 UTC-0500 (Tue, 8 Feb 2022 13:46:17 +0100)
Axb 
is rumored to have said:

> On 2/8/22 11:33, Kevin A. McGrail wrote:
>> Auto learning is something that should never have existed. All it does is
>> reinforce misclassification and slowly spirals the database into having
>> wrong answers be more wrong.
>
> I don't agree - I've been running auto-learn for years and my bayes results 
> have always been solid.
> (and I'm speaking of a global bayes redis DB in a 200k user setup)

With substantially smaller systems (my own personal server and those I manage 
for my employer) I have the same benign experience. I don't think we should 
disable auto-learn by default *in any way* without actual research and hard 
data beyond anecdotal experience.


> Where I see potential is in optimizing auto expiration when using a file 
> based DB. Very often DB is locked and tokens cannot be expired which leads to 
> what you call "reinforce misclassification". If tokens are expired regularly, 
> skewing is very improbable.
> Thankfully, using Redis, it's way more controllable.

I think that's also not a problem for systems that are not persistently loaded 
with in-process mail.

All we see as SA maintainers are our own systems and cases that people are 
having problems with. I don't think we really know whether auto-learn works 
well generally or why/how it breaks when it does.

>> Since we don't seem to have consensus on changing the default does anybody
>> object to a pre-file that disables it? That would be more clearly
>> documented, and people will look at the pre-file for V4.
>
> I'm -1 for disabling, one way or another.

Same. It would substantially change how peoples' existing stable systems 
operate.

I'm less averse to tweaking default auto-learning parameters. In ALL cases 
where I use auto-learn I have reduced both thresholds, so I learn as ham ONLY 
mail with negative scores (< -0.1, so effectively at least 2 ham-signs...) and 
learn as spam substantially more than just the absurdly spammy stuff. This 
sacrifices some overall effectiveness in theory but I think it also helps make 
Bayes less brittle. I have NOT done rigorous testing to prove that.
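
In config terms that tweak looks roughly like this. The nonspam value is the one described above; the spam value is only a placeholder to show the direction of the change (the stock defaults are 0.1 and 12.0), not a recommendation.

```
# Sketch: auto-learn stays on, with both thresholds lowered.
bayes_auto_learn 1
# Learn as ham only mail scoring below -0.1.
bayes_auto_learn_threshold_nonspam -0.1
# Hypothetical value, chosen only to be lower than the default of 12.0.
bayes_auto_learn_threshold_spam 8.0
```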

I believe that SA has reached the point of broad use where we should be making 
substantial change decisions based on hard data rather than anecdote and lore.

-- 
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Re: bayes_auto_learn default value

2022-02-07 Thread Bill Cole
On 2022-02-07 at 12:32:18 UTC-0500 (Mon, 7 Feb 2022 18:32:18 +0100)
Giovanni Bechis 
is rumored to have said:

> Hi,
> as per Mail::SpamAssassin::Conf(3), bayes_auto_learn defaults to 1/true.
> Is anybody against changing its default value to 0/false on trunk (aka 
> SpamAssassin 4.x) ?

-1

I know that there is a broad consensus among people who pay close attention to 
SA that auto-learning is risky, but having it enabled has been the default for 
long enough that a change will be unexpected and will break systems where 
auto-learning is enabled, is working well, and is generally ignored.

I should probably disclose that I have auto-learn enabled on my personal system 
and on those of my primary employer and it has been quite harmless, although I 
cannot say definitively that it is making a significant difference. The biggest 
strength I see from it is a steady stream of ham, which is hard to obtain 
otherwise.



-- 
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire




Re: svn commit: r1893514 - /spamassassin/trunk/lib/Mail/SpamAssassin/Plugin/PDFInfo.pm

2021-09-22 Thread Bill Cole
On 2021-09-22 at 10:59:53 UTC-0400 (Wed, 22 Sep 2021 14:59:53 -)
 
is rumored to have said:

> Author: mmartinec
> Date: Wed Sep 22 14:59:53 2021
> New Revision: 1893514
>
> URL: http://svn.apache.org/viewvc?rev=1893514=rev
> Log:
> Plugin/PDFInfo.pm: fix the "no such facility warn", triping the t/debug.t
>
> Modified:
> spamassassin/trunk/lib/Mail/SpamAssassin/Plugin/PDFInfo.pm
>
> Modified: spamassassin/trunk/lib/Mail/SpamAssassin/Plugin/PDFInfo.pm
> URL: 
> http://svn.apache.org/viewvc/spamassassin/trunk/lib/Mail/SpamAssassin/Plugin/PDFInfo.pm?rev=1893514=1893513=1893514=diff
> ==
> --- spamassassin/trunk/lib/Mail/SpamAssassin/Plugin/PDFInfo.pm (original)
> +++ spamassassin/trunk/lib/Mail/SpamAssassin/Plugin/PDFInfo.pm Wed Sep 22 
> 14:59:53 2021
> @@ -143,7 +143,7 @@ package Mail::SpamAssassin::Plugin::PDFI
>  use Mail::SpamAssassin::Plugin;
>  use Mail::SpamAssassin::Logger;
>  use Mail::SpamAssassin::Util qw(compile_regexp);
> -use strict;
> +use strict; use feature qw(refaliasing state evalbytes say fc current_sub); 
> no feature qw(indirect);

Why?

I believe this will break compatibility with older Perl (<5.22) without 
actually needing all those 'features.'


-- 
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Re: spamassassin 3.4.5 wide chars

2021-08-16 Thread Bill Cole
On 2021-08-11 at 22:03:24 UTC-0400 (Thu, 12 Aug 2021 04:03:24 +0200)
Benny Pedersen 
is rumored to have said:

> https://bugs.gentoo.org/807781
>
> is it solved in 3.4.6 ?

That's not a SA bug report. It's a Gentoo bug report.

Fix your rules.


-- 
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Re: [auto] bad sandbox rules report

2021-05-05 Thread Bill Cole
On 2021-05-05 at 04:30:14 UTC-0400 (Wed,  5 May 2021 08:30:14 + (UTC))
Rules Report Cron 
is rumored to have said:


HTTP get: https://ruleqa.spamassassin.org/1-days-ago?xml=1
HTTP get: https://ruleqa.spamassassin.org/2-days-ago?xml=1
HTTP get: https://ruleqa.spamassassin.org/3-days-ago?xml=1
Invalid \0 character in pathname for ftdir: 
rulesrc/core\0rulesrc/sandbox at lib/Mail/SpamAssassin.pm line 2010.


That seems BAD but I do not understand what's causing it.



--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Re: svn commit: r1889308 - in /spamassassin/trunk: rules/10_hasbase.cf rules/20_ratware.cf rulesrc/sandbox/billcole/80_test.cf

2021-04-30 Thread Bill Cole

On 30 Apr 2021, at 15:03, Benny Pedersen wrote:

hopefully things can be stablinge in using the above module then just 
make another header check, if header check is not supported in the 
perl module it could be added


Doing header checks, even in complex combinations, is better done with 
rules than to hide them away in Perl modules. If a particular tool (e.g. 
Mailman 3, which is newer than that module) needs a new pattern of 
headers, we can have it in widespread use in a matter of days if we 
define it in rules. If it is embedded in code, it won't get used by most 
people until their distro updates to a version newer than the current 
trunk.


--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Re: svn commit: r1889308 - in /spamassassin/trunk: rules/10_hasbase.cf rules/20_ratware.cf rulesrc/sandbox/billcole/80_test.cf

2021-04-30 Thread Bill Cole

On 30 Apr 2021, at 12:05, Benny Pedersen wrote:


On 2021-04-30 17:38, Bill Cole wrote:

On 30 Apr 2021, at 10:29, John Hardin wrote:

On Fri, 30 Apr 2021, Henrik K wrote:

Please do not commit anything without make/lint check. :-(


+1


-header  __HAS_LIST_ID   exists:List-Id
+meta    __HAS_LIST_ID   __ML2


Also, this should be the other way around - be consistent with 
__HAS_{headername} subrules being simply "the header exists", and if 
you want to alias it then make the *other* rule with the nonstandard 
name the meta.


I don't really have a preference one way or the other, only for not
having 2 identical but independent rules.


maillist.pm exists ?


No, it does not. At least not anywhere I can find...


imho detection on maillists should be made in this core module


Do you mean Mail::SpamAssassin::MailingList.pm?

That is an undocumented module which was last given specific attention 
in 2004, is not covered by the test suite, and is not used by any rule 
or any other SA module. I confess to being unaware of its existence when 
putting together the MAILING_LIST_MULTI rule but even now that I know, I 
do not see what benefit it would provide to use that (or any) module 
where a meta rule can suffice and be dynamically maintained. It seems 
more reasonable to just drop that module, as it seems likely that it has 
not been used by anyone for the past 10+ years, across many SA versions.


--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Re: svn commit: r1889308 - in /spamassassin/trunk: rules/10_hasbase.cf rules/20_ratware.cf rulesrc/sandbox/billcole/80_test.cf

2021-04-30 Thread Bill Cole

On 30 Apr 2021, at 10:29, John Hardin wrote:


On Fri, 30 Apr 2021, Henrik K wrote:



Please do not commit anything without make/lint check. :-(



-header  __HAS_LIST_ID   exists:List-Id
+meta    __HAS_LIST_ID   __ML2


Also, this should be the other way around - be consistent with 
__HAS_{headername} subrules being simply "the header exists", and if 
you want to alias it then make the *other* rule with the nonstandard 
name the meta.


I don't really have a preference one way or the other, only for not 
having 2 identical but independent rules.


--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


ATTN: BUG BUMP: [Bug 7656] UTF8 rules, normalize_charset etc overhaul

2021-04-15 Thread Bill Cole
This bug is part of the complex related to smoothing out all the edge 
and corner cases of character set encoding for v4. There is some concern 
about changing the default for normalize_charset (to enable it), or even 
removing the switch altogether, in order to nail down documentation of 
how to match problem characters like the Latin-1 "extended ASCII" range: 
basically any 8-bit character >127.


Making the change requires some work on rules that look for those 
high-bit-set characters by people who understand encoding issues and 
common failings (e.g. using a 1-byte high-bit-set character in a 
notionally UTF-8 document.) My personal opinion is that the change is 
worth the work, but I admit that I've not completely audited the default 
rules for problematic cases. I have been writing rules to work with 
normalize_charset for many years however. With reasonably modern Perl, 
there's no strong argument for normalize_charset=0 beyond the technical 
debt of code and rules written to accommodate it.



On 15 Apr 2021, at 8:55, bugzilla-dae...@spamassassin.apache.org wrote:


https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7656

Bill Cole  changed:

   What    |Removed    |Added
   --------+-----------+--------------------
   CC      |           |billc...@apache.org

--- Comment #15 from Bill Cole  ---
(In reply to Henrik Krohns from comment #12)

Bumping this bug. Comments? Monologs are getting a bit tiresome.. :-)


+1

The minor pain of revamping rules that match non-ASCII characters is
compensated by the fact that this is a *normalization* and so reduces the
frequency of edge cases that escape rules written (perhaps inadvertently) to
depend on a particular subset of possible encodings. My personal experience
running SA instances that see a lot of non-ASCII messages is that enabling
normalize_charset is a best practice, and the default is basically tech debt.

As for requiring discussion on-list, these comments are sent to the dev list.
I'm going to bump it there to get the attention of anyone filtering out
Bugzilla mail (!? if that's a thing...) and will also post on the Users list
to get a broader audience.

--
You are receiving this mail because:
You are the assignee for the bug.



--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Re: Test failures in 3.4 branch

2021-04-07 Thread Bill Cole

On 7 Apr 2021, at 23:33, Bill Cole wrote:


On 7 Apr 2021, at 20:56, Sidney Markowitz wrote:

Here are failures I got running in the 3.4 branch with the rules, 
rulesrc, and t.rules symlinked to trunk.


 sudo make test TEST_FILES="xt/*.t"

in macOS 11.2.3

I'm not deep enough into the code to address the issue right now, 
just following release steps.


Can someone please look at this and comment?

xt/50_lang_lint.t ... Apr  8 12:24:46.964 [94600] 
warn: config: invalid meta T_KHOP_BOTNET_4 token: .8
Apr  8 12:24:47.821 [94600] warn: lint: 1 issues detected, please 
rerun with debug enabled for more information

Not found: anything =at t/lang_lint.t line 19.


[... more errors with T_KHOP_BOTNET_4 elided ...]

The string T_KHOP_BOTNET_4 (apparently a rule name) does not occur 
anywhere in the 3.4 branch or in trunk.


Theory: it's seeing a bogus rule file that isn't in the distribution. 
Maybe a user_prefs file?


DATA POINT: I just grabbed fresh 3.4 and trunk trees with the rules, 
rulesrc, and t.rules symlinked and 'sudo make test TEST_FILES="xt/*.t"' 
has made it through xt/50_spamc_x_E_R.t without issues, no mention of 
bad rules.


--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Re: Test failures in 3.4 branch

2021-04-07 Thread Bill Cole

On 7 Apr 2021, at 20:56, Sidney Markowitz wrote:

Here are failures I got running in the 3.4 branch with the rules, 
rulesrc, and t.rules symlinked to trunk.


 sudo make test TEST_FILES="xt/*.t"

in macOS 11.2.3

I'm not deep enough into the code to address the issue right now, just 
following release steps.


Can someone please look at this and comment?

xt/50_lang_lint.t ... Apr  8 12:24:46.964 [94600] 
warn: config: invalid meta T_KHOP_BOTNET_4 token: .8
Apr  8 12:24:47.821 [94600] warn: lint: 1 issues detected, please 
rerun with debug enabled for more information

Not found: anything =at t/lang_lint.t line 19.


[... more errors with T_KHOP_BOTNET_4 elided ...]

The string T_KHOP_BOTNET_4 (apparently a rule name) does not occur 
anywhere in the 3.4 branch or in trunk.


Theory: it's seeing a bogus rule file that isn't in the distribution. 
Maybe a user_prefs file?





xt/50_lang_lint.t ... 1/?
#   Failed test at t/SATest.pm line 811.
Apr  8 12:24:48.646 [94602] warn: config: invalid meta T_KHOP_BOTNET_4 
token: .8
Apr  8 12:24:49.292 [94602] warn: lint: 1 issues detected, please 
rerun with debug enabled for more information

Not found: anything =at t/lang_lint.t line 19.
xt/50_lang_lint.t ... 2/?
#   Failed test at t/SATest.pm line 811.
Apr  8 12:24:50.076 [94604] warn: config: invalid meta T_KHOP_BOTNET_4 
token: .8
Apr  8 12:24:50.710 [94604] warn: lint: 1 issues detected, please 
rerun with debug enabled for more information

Not found: anything =at t/lang_lint.t line 19.
xt/50_lang_lint.t ... 3/?
#   Failed test at t/SATest.pm line 811.
Apr  8 12:24:51.497 [94606] warn: config: invalid meta T_KHOP_BOTNET_4 
token: .8
Apr  8 12:24:52.141 [94606] warn: lint: 1 issues detected, please 
rerun with debug enabled for more information

Not found: anything =at t/lang_lint.t line 19.
xt/50_lang_lint.t ... 4/?
#   Failed test at t/SATest.pm line 811.
Apr  8 12:24:52.946 [94608] warn: config: invalid meta T_KHOP_BOTNET_4 
token: .8
Apr  8 12:24:53.591 [94608] warn: lint: 1 issues detected, please 
rerun with debug enabled for more information

Not found: anything =at t/lang_lint.t line 19.
xt/50_lang_lint.t ... 5/?
#   Failed test at t/SATest.pm line 811.
Apr  8 12:24:54.369 [94610] warn: config: invalid meta T_KHOP_BOTNET_4 
token: .8
Apr  8 12:24:55.014 [94610] warn: lint: 1 issues detected, please 
rerun with debug enabled for more information

Not found: anything =at t/lang_lint.t line 19.
xt/50_lang_lint.t ... 6/?
#   Failed test at t/SATest.pm line 811.
Apr  8 12:24:55.814 [94612] warn: config: invalid meta T_KHOP_BOTNET_4 
token: .8
Apr  8 12:24:56.466 [94612] warn: lint: 1 issues detected, please 
rerun with debug enabled for more information

Not found: anything =at t/lang_lint.t line 19.
xt/50_lang_lint.t ... 7/?
#   Failed test at t/SATest.pm line 811.
Apr  8 12:24:57.255 [94614] warn: config: invalid meta T_KHOP_BOTNET_4 
token: .8
Apr  8 12:24:57.918 [94614] warn: lint: 1 issues detected, please 
rerun with debug enabled for more information

Not found: anything =at t/lang_lint.t line 19.
xt/50_lang_lint.t ... 8/?
#   Failed test at t/SATest.pm line 811.
# Looks like you failed 8 tests of 8.
exec failed at xt/50_lang_lint.t line 6.
xt/50_lang_lint.t ... Dubious, test returned 2 (wstat 
512, 0x200)

Failed 8/8 subtests

 -

Sidney



--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Re: ANNOUNCE: Apache SpamAssassin 3.4.5 available

2021-04-07 Thread Bill Cole

On 7 Apr 2021, at 17:37, Sidney Markowitz wrote:


Henrik K wrote on 7/04/21 1:31 am:

I think we should release 3.4.6 because of this bug:

https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7897

Metas depending on uribl rules do not currently hit.. :-(


Though it seems there aren't any metas in stock ruleset that are 
affected.

I do have on my own rules.  So dunno, what is the timeline for a 4.0
release..



Henrik, are you saying that this isn't enough reason to release 3.4.6 
after all because it doesn't affect any metas in stock ruleset? I'm ok 
with doing the release manager work for 3.4.6 if there is consensus 
that it is necessary. Also, to be clear, is bug 7897 more important to 
fix than the bug 7822 that would be re-introduced by reverting its 
patch?


I think fixing 7897, which causes the failure of fairly common (albeit 
not default) rules, is MUCH more important than 7822, which is a bug in 
a disabled-by-default plugin that is relatively new. For years I've been 
using and suggesting to others the use of DNSBL synergy rules adding 
significant bonus points when a message hits 2 or more blocklists from a 
collection that may each have occasional FPs.
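
To make the "synergy" idea concrete, here is a small stand-alone sketch of the arithmetic. In SpamAssassin itself this would be written as a meta rule over the individual DNSBL rules, not as Perl; the rule names, the bonus value, and the helper `synergy_bonus` below are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical rule names standing in for individual DNSBL checks.
my @BLOCKLIST_RULES = qw(LOCAL_BL_A LOCAL_BL_B LOCAL_BL_C);

# Return the extra score to add: $bonus when at least $threshold of the
# blocklist rules are among the rules the message hit, otherwise 0.
sub synergy_bonus {
    my ($hits, $bonus, $threshold) = @_;
    my %hit   = map { $_ => 1 } @$hits;
    my $count = grep { $hit{$_} } @BLOCKLIST_RULES;
    return $count >= $threshold ? $bonus : 0;
}

print synergy_bonus([qw(LOCAL_BL_A HTML_MESSAGE)], 2.5, 2), "\n";  # 0
print synergy_bonus([qw(LOCAL_BL_A LOCAL_BL_C)],   2.5, 2), "\n";  # 2.5
```

A single listing on a list with occasional FPs adds nothing extra, while two independent listings together earn the bonus.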


I believe we should release a 3.4.6 in the near term (days not months) 
to fix 7897 even if we have to do it with 7822 unfixed. I'd rather both 
were fixed but I don't have the free cycles right now to get a good 
enough understanding of the entanglements to fix it myself.


--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Re: Next steps in the 3.4.5 release

2021-03-17 Thread Bill Cole

Just to confirm:

I have been running 3.4-HEAD for most of the past year on a small 
production system. I am confident that it is solid, and my +1 for the 
last vote was to freeze and release.



On 17 Mar 2021, at 17:10, Sidney Markowitz wrote:


Hi everyone,

It's awfully quiet regarding the testing of 3.4.5-rc2. I'm making some 
assumptions in taking the next steps in the final release.


Please, speak up if any of these assumptions are wrong and we should 
not go on to a quick 3.4.5 build and vote.


I assume:

 1. There are people who have been running branch 3.4 in a production 
environment since at least when it was tagged as 3.4.5-rc1 last 
January (even though there was no announcement of 3.4.5-rc1 builds)


 2. The few commits to two minor issues that were committed since then 
have been sufficiently reviewed and are deemed safe.


 3. Enough committers do think that what we have in 3.4.5-rc2 is ready 
for release so nothing is left in the process except to build 3.4.5 
and call for a release vote.


 4. Nobody has a -1 vote against proceeding with release that they can 
explain with a technical reason.



I'll give this another two days. If there are no negative responses to 
consider, I'll proceed with the next steps in the release by building 
what we currently have as 3.4.5.


Regards,

 Sidney Markowitz
 Chair, Apache SpamAssassin PMC
 sid...@apache.org
 sid...@sidney.com



--
Bill Cole


Re: svn commit: r1885178 - /spamassassin/trunk/rulesrc/sandbox/jhardin/40_local_419replyto.cf

2021-01-05 Thread Bill Cole
John,

1st: thank you very much for working on generated rules and for all the rest of 
your work on rules.

I am curious about whether these very long regexes have been proven to actually 
work in full, or if it is possible that they are getting mishandled quietly. I 
don't see any hits on either rule on any of the mail systems I work with going 
back a month, so I am wondering if it is worthwhile to construct test messages 
that should hit due to elements in the latter parts of the patterns or if 
you've already done such tests.

-- 
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire




Apache SpamAssassin updates mirror at SECNAP is down

2020-09-17 Thread Bill Cole
Since approximately 14:15 UTC 2020-09-14, our monitoring has shown the 
web server at sa-update.secnap.net (204.89.241.6) as down. Our 
documentation shows "Wazir Shpoon and John Meyer" as contacts for this 
mirror, but lacks email addresses for either individual. The mirror has 
been in use for almost a decade and seems to have never failed for any 
extended period in that time. We are grateful for this long support of 
the SpamAssassin community and hope it can continue. Please let us know 
whether and when you intend to resume mirroring service.


--
Bill Cole on behalf of the Apache SpamAssassin Project
billc...@apache.org


LAST CALL: Anyone have contact info for Wazir Shpoon and/or John Meyer?

2020-09-16 Thread Bill Cole
The sa-update mirror monitor has been showing the secnap.net mirror 
down for a few days, and MIRRORED.BY has no contact info except the 
names and the fact that Wazir's email bounced in 2017.


I will comment out that mirror ~Monday if no contact is made and the 
mirror remains down.


--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not For Hire (currently)



Anyone have contact info for Wazir Shpoon and/or John Meyer?

2020-09-14 Thread Bill Cole
The sa-update mirror monitor is showing the secnap.net mirror down, and 
MIRRORED.BY has no contact info except the names and the fact that 
Wazir's email bounced in 2017.




--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not For Hire (currently)

Forwarded message:


From: aut...@sa-vm.apache.org
To: sysadm...@spamassassin.apache.org
Subject: checkSAupdateMirrors.sh on sa-vm.apache.org - 1 mirror DOWN, 
0 mirrors STALE

Date: Mon, 14 Sep 2020 13:18:08 +0000 (UTC)

Fetching sa-update URLs from 
http://spamassassin.apache.org/updates/MIRRORED.BY


http://sa-update.dnswl.org/ (116.203.4.105): UP (CURRENT)

http://www.sa-update.pccc.com/ (69.171.29.42): UP (CURRENT)

http://sa-update.secnap.net/ (204.89.241.6): DOWN

http://sa-update.space-pro.be/ (176.28.55.20): UP (CURRENT)

http://sa-update.ena.com/ (96.4.1.5): UP (CURRENT)

http://sa-update.ena.com/ (96.5.1.5): UP (CURRENT)

http://sa-update.razx.cloud/ (172.67.74.101): UP (CURRENT)

http://sa-update.razx.cloud/ (104.26.3.18): UP (CURRENT)

http://sa-update.razx.cloud/ (104.26.2.18): UP (CURRENT)

http://sa-update.fossies.org/ (144.76.163.196): UP (CURRENT)

http://sa-update.verein-clean.net/ (148.251.212.58): UP (CURRENT)

http://sa-update.verein-clean.net/ (37.252.124.130): UP (CURRENT)

http://sa-update.verein-clean.net/ (148.251.29.131): UP (CURRENT)

http://sa-update.verein-clean.net/ (37.252.120.157): UP (CURRENT)

http://sa-update.spamassassin.org/ (64.142.56.146): UP (CURRENT)


Re: 3.4.5 Pre-Release 1 Built

2020-07-09 Thread Bill Cole

On 9 Jul 2020, at 14:14, Bill Cole wrote:


On 21 Jun 2020, at 22:12, Kevin A. McGrail wrote:


Hi All,

Working towards a 3.4.5 bug release and I have built a pre1 release.  I
have this running in production and believe it is safe to use.


Probably, but it seems to be missing a recent fix. Makefile.PL kicks 
out this warning:


Argument "1.20200513.1" isn't numeric in numeric ge (>=) at 
lib/Mail/SpamAssassin/Util/DependencyInfo.pm line 656,  line 1.


OK, that one was actually not yet fixed anywhere. It is now.
I've also fixed the same issue in t/dkim.t.
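
For anyone wondering why the warning appears: a string with two dots is not a number, so `>=` cannot compare it. Below is a minimal sketch of a component-wise comparison that avoids the warning. It is an illustration only and not necessarily how the fix was done; `compare_dotted` is a hypothetical name.

```perl
use strict;
use warnings;

# Compare two dotted version strings one component at a time.
# Returns -1, 0 or 1, in the manner of <=>. Missing components count as 0.
# Assumes every component is a plain non-negative integer.
sub compare_dotted {
    my ($left, $right) = @_;
    my @l   = split /\./, $left;
    my @r   = split /\./, $right;
    my $len = scalar(@l) > scalar(@r) ? scalar(@l) : scalar(@r);
    for my $i (0 .. $len - 1) {
        my $lv  = $i < @l ? $l[$i] : 0;
        my $rv  = $i < @r ? $r[$i] : 0;
        my $cmp = $lv <=> $rv;
        return $cmp if $cmp;
    }
    return 0;
}

print compare_dotted('1.20200513.1', '1.20190101'), "\n";  # 1
print compare_dotted('1.2',          '1.2.0'),      "\n";  # 0
```

Note that this treats each component as an integer, which is not the same as comparing decimal module versions (here 0.40 would sort after 0.4), so it only suits modules that really use dotted versions.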

After pushing all the related fixes onto the unpacked 3.4.5pre1 tree, 
'make test' succeeds on MacOS X 10.6 and 10.14 with Perl 5.28 
(MacPorts-based local build)


It is probably best to roll a pre2 package.

--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not For Hire (currently)


Re: 3.4.5 Pre-Release 1 Built

2020-07-09 Thread Bill Cole

On 21 Jun 2020, at 22:12, Kevin A. McGrail wrote:


Hi All,

Working towards a 3.4.5 bug release and I have built a pre1 release.  I
have this running in production and believe it is safe to use.


Probably, but it seems to be missing a recent fix. Makefile.PL kicks out 
this warning:


Argument "1.20200513.1" isn't numeric in numeric ge (>=) at 
lib/Mail/SpamAssassin/Util/DependencyInfo.pm line 656,  line 1.



--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not For Hire (currently)


Re: Default Whitelist Auth - How to edit/override locally?

2020-03-07 Thread Bill Cole

On 7 Mar 2020, at 9:38, Kevin A. McGrail wrote:


Morning All:

How do I:

A) override this setting:

def_whitelist_auth *@*.bridgestonegolf.com


An "unwhitelist_auth" directive should work. From perldoc 
Mail::SpamAssassin::Conf -


unwhitelist_auth u...@example.com
Used to override a "whitelist_auth" entry. The specified email 
address has to match exactly the address previously used in a 
"whitelist_auth" line.

e.g.

  unwhitelist_auth j...@example.com f...@example.com
  unwhitelist_auth *@example.com



B) remove it from the stock rules?


Get a committer to remove the line from trunk/rules/60_whitelist_auth.cf

Like in r1874953 :)


They have been spamming me and don't deserve a whitelist.


An entirely sufficient cause for removing a default whitelist entry.

--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not For Hire (currently)


Re: Refreshing the IRC channel

2020-02-26 Thread Bill Cole

On 26 Feb 2020, at 23:21, Kevin A. McGrail wrote:


Didn't even know we had a channel. 


It's documented in the wiki and has apparently been in existence for a 
decade.


I only poked my head in because I was getting an IRC client set up for 
my first ASF Members' Meeting. I hadn't had a working IRC client since 
Ircle broke and was abandoned ~2010 and hadn't been active on any 
channel in this century...



Who is in the channel with ops? 


No one visibly present has ops.


How do you tell?


/msg chanserv access #spamassassin list


Answer to that command is:

[00:44:41] access #spamassassin list
[00:44:42] -ChanServ- Entry Nickname/Host  Flags
[00:44:42] -ChanServ- ----- -------------- -----
[00:44:42] -ChanServ- 1     khopesh        +ARefiorstv [modified 10y 18w 2d ago]
[00:44:42] -ChanServ- 2     Herk           +ARefiorstv [modified 10y 18w 2d ago]
[00:44:42] -ChanServ- 3     Warren         +ARefiorstv [modified 9y 7w 6d ago]
[00:44:42] -ChanServ- 4     Darxus         +ARefiorstv [modified 8y 49w 2d ago]
[00:44:42] -ChanServ- 5     freenode-staff +AFRefiorstv [modified 7y 36w 4d ago]
[00:44:42] -ChanServ- 6     patdk-wk       +ARefiorstv [modified 7y 5w 6d ago]
[00:44:42] -ChanServ- ----- -------------- -----
[00:44:42] -ChanServ- End of #spamassassin FLAGS listing.




On 2/13/2020 9:46 AM, Bill Cole wrote:

#spamassassin has this "topic" message:

Latest Version: SpamAssassin 3.4.1 |
http://wiki.apache.org/spamassassin/ | Also on this network:
#mailscanner, #amavis, #postfix

Set by Darxus on July 27, 2015 at 5:08:47 PM EDT

Anyone reading this who can fix that and also expand/correct the
population of people with ops rights on the channel, please do so. I
would think that adding PMC members, particularly those who are
recently engaged with the project, would be good choices for ops.


--
Kevin A. McGrail
kmcgr...@apache.org

Member, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project
https://www.linkedin.com/in/kmcgrail - 703.798.0171



--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not For Hire (currently)


Refreshing the IRC channel

2020-02-13 Thread Bill Cole

#spamassassin has this "topic" message:

	Latest Version: SpamAssassin 3.4.1 | 
http://wiki.apache.org/spamassassin/ | Also on this network: 
#mailscanner, #amavis, #postfix


Set by Darxus on July 27, 2015 at 5:08:47 PM EDT

Anyone reading this who can fix that and also expand/correct the 
population of people with ops rights on the channel, please do so. I 
would think that adding PMC members, particularly those who are recently 
engaged with the project, would be good choices for ops.

--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Re: ANNOUNCE: Apache SpamAssassin 3.4.3 available

2019-12-12 Thread Bill Cole

On 12 Dec 2019, at 11:36, sebb wrote:


Please don't ever use HTML for announce mails.


One might as well say "Please don't ever top-post."

Kevin's announcement message was multipart/alternative with a text/plain 
part first. As superfluous as the text/html part was, this style of mail 
is the default format generated by the MUAs used by the vast majority of 
users.



They are more likely to be treated as spam -- as this one was


If you are using SpamAssassin and don't locally rescore HTML_MESSAGE or 
make it a sub-rule of a meta-rule with a significant score, that is 
simply not true. Using the default SA ruleset & scores, that message 
scored -6.0, i.e. definitely not spam.


If you are using some other spam detection tool which considers the mere 
existence of a text/html part in a multipart/alternative message to be a 
significant indicator of spam, that bug should be discussed with that 
broken tool's developer(s).


If you simply have made a personal decision to treat such mail as spam, 
as it is absolutely your right to decide, you should be reconciled by 
now to the fact that a lot of legitimate mail sent by people who will 
never switch to sending pure text/plain mail is misidentified by your 
chosen configuration.



-- and so may
be overlooked by the moderators.


This mailing list is not moderated.

--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Re: New Release Candidate: 3.4.3-rc6 Testers Needed

2019-11-20 Thread Bill Cole

+1 for this. Noting the problem at 'info' is good.


On 20 Nov 2019, at 13:41, Henrik K wrote:


On Wed, Nov 20, 2019 at 07:29:10PM +0100, Giovanni Bechis wrote:

anybody against this diff ?

 Giovanni

Index: lib/Mail/SpamAssassin/DnsResolver.pm
===================================================================
--- lib/Mail/SpamAssassin/DnsResolver.pm	(revision 1870052)
+++ lib/Mail/SpamAssassin/DnsResolver.pm	(working copy)
@@ -858,7 +858,7 @@
       if ($rcode eq 'REFUSED' || $id =~ m{^\d+/NO_QUESTION_IN_PACKET\z}) {
         # the failure was already reported above
       } else {
-        info("dns: no callback for id %s, ignored; packet: %s",
+        dbg("dns: no callback for id %s, ignored; packet: %s",
              $id,  $packet ? $packet->string : "undef" );
       }
       # report a likely matching query for diagnostic purposes


If you would check trunk it's already there.  Please use the same code for
uniformity.

info("dns: no callback for id $id, ignored, packet on next debug line");
# prevent filling normal logs with huge packet dumps
dbg("dns: %s", $packet ? $packet->string : "undef");
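
The split between a short info line and a debug-only packet dump can be tried out in isolation with the sketch below. The `info` and `dbg` subs here are local stand-ins that collect messages in memory (SpamAssassin's real ones live in Mail::SpamAssassin::Logger), and `report_no_callback`, the sample id and the sample packet text are hypothetical.

```perl
use strict;
use warnings;

# Stand-in logger: collects messages per level so the effect is visible.
my %LOG           = (info => [], dbg => []);
my $DEBUG_ENABLED = 0;

sub info {
    my ($fmt, @args) = @_;
    push @{ $LOG{info} }, sprintf($fmt, @args);
    return;
}

sub dbg {
    my ($fmt, @args) = @_;
    push @{ $LOG{dbg} }, sprintf($fmt, @args) if $DEBUG_ENABLED;
    return;
}

# Report an unmatched DNS reply: one short line at info level, and the
# potentially huge packet dump only at debug level.
sub report_no_callback {
    my ($id, $packet_text) = @_;
    info("dns: no callback for id %s, ignored, packet on next debug line", $id);
    dbg("dns: %s", defined $packet_text ? $packet_text : "undef");
    return;
}

report_no_callback('12345/IN/A/example.com', ";; HEADER ...\n" x 3);
print scalar(@{ $LOG{info} }), " info line(s), ",
      scalar(@{ $LOG{dbg} }),  " debug line(s)\n";   # 1 info line(s), 0 debug line(s)
```

With debugging off, the normal log gets exactly one short line per event and no packet dump.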



--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Re: Check_bayes sub in Bayes.pm

2019-06-17 Thread Bill Cole

On 17 Jun 2019, at 16:41, Shreyansh Shrivastava. wrote:

Check_bayes() is declared in Bayes.pm. An object of Bayes.pm is made in
init() sub of spamassassin.pm but check_bayes() isn't called there.

The 23_bayes.cf file has various configuration for check_bayes() having
different min,max defined. Also check_bayes() is passed to
register_eval_rule() of Conf.pm. I am not able to connect these dots to get
the entry point.


When the Mail::SpamAssassin object is initialized, it parses all of the 
.cf files and constructs a subroutine from each "rule" line loaded. See 
Mail::SpamAssassin::Conf::Parser, particularly add_test() and 
finish_parsing().



The rule lines in 23_bayes.cf like this:

body BAYES_00   eval:check_bayes('0.00', '0.01')
body BAYES_05   eval:check_bayes('0.01', '0.05')

are each translated into a test subroutine that calls check_bayes(). The 
pile of subroutines that is constructed from rules is called when 
Mail::SpamAssassin->check() is called, by way of 
Mail::SpamAssassin::PerMsgStatus->check() and 
Mail::SpamAssassin::PerMsgStatus->check_timed().
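
If it helps, the translation described above can be modelled in miniature. This is a deliberately simplified sketch, not the code in Mail::SpamAssassin::Conf::Parser: `compile_rules`, `run_rules`, the `%EVAL` table and the `bayes_prob` field are hypothetical, and the `check_bayes` here is only a stand-in that tests a probability against a range.

```perl
use strict;
use warnings;

# Stand-in for the real eval function: true when the message's Bayes
# probability falls in [min, max).
sub check_bayes {
    my ($msg, $min, $max) = @_;
    return $msg->{bayes_prob} >= $min && $msg->{bayes_prob} < $max;
}

# Registry of eval functions that rule lines are allowed to name.
my %EVAL = (check_bayes => \&check_bayes);

# Turn each "body NAME eval:func('a', 'b')" line into a closure that
# calls the registered function with the arguments from the rule line.
sub compile_rules {
    my (@lines) = @_;
    my %tests;
    for my $line (@lines) {
        next unless $line =~ /^\s*body\s+(\w+)\s+eval:(\w+)\((.*)\)\s*$/;
        my ($name, $func, $arglist) = ($1, $2, $3);
        my @args = $arglist =~ /'([^']*)'/g;
        my $code = $EVAL{$func} or next;
        $tests{$name} = sub {
            my ($msg) = @_;
            return $code->($msg, @args) ? 1 : 0;
        };
    }
    return \%tests;
}

# Run every compiled test against one message; return the names that hit.
sub run_rules {
    my ($tests, $msg) = @_;
    my @hits = grep { $tests->{$_}->($msg) } keys %$tests;
    @hits = sort @hits;
    return @hits;
}

my $tests = compile_rules(
    "body BAYES_00   eval:check_bayes('0.00', '0.01')",
    "body BAYES_05   eval:check_bayes('0.01', '0.05')",
);
print join(',', run_rules($tests, { bayes_prob => 0.03 })), "\n";  # BAYES_05
```

The point is only that check_bayes() is never called by name from the main code: the rule line supplies the function name and arguments, and the call happens when the compiled tests are run against a message.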


Does that help?


--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Re: The obviously different case of subtest debug flood

2019-06-07 Thread Bill Cole

On 7 Jun 2019, at 14:53, John Hardin wrote:


Now if the hits were duplicates, and we logged something like:

Jun  7 11:25:44.265 [1569] dbg: rules: ran body rule __LOWER_E ==> 
got hit: "e" (100)


...where we're not collapsing on solely the rule name, I'd accept 
that.


FWIW, __LOWER_E specifically is this:

body    __LOWER_E   /e/

So there's no issue of hits varying at all. Its evil twin 
__E_LIKE_LETTER will hit on anything that looks like an 'e' so it does 
have diverse hits.
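
A rough sketch of the collapsing John describes, keyed on rule name plus hit text and not on rule name alone. It is only an illustration of the idea: `collapse_hits` is a hypothetical helper, and the output format just imitates the debug line quoted above.

```perl
use strict;
use warnings;

# Collapse repeated (rule, hit text) pairs into one line each, with a
# count appended when the pair occurred more than once. Order of first
# appearance is kept.
sub collapse_hits {
    my (@hits) = @_;            # each element: [ rule name, matched text ]
    my (%count, @order);
    for my $hit (@hits) {
        my $key = join "\0", @$hit;
        push @order, $hit unless $count{$key}++;
    }
    return map {
        my ($rule, $text) = @$_;
        my $n    = $count{ join "\0", $rule, $text };
        my $line = qq{ran body rule $rule ==> got hit: "$text"};
        $n > 1 ? "$line ($n)" : $line;
    } @order;
}

my @lines = collapse_hits(
    (['__LOWER_E', 'e']) x 100,
    ['__E_LIKE_LETTER', 'e'],
    ['__E_LIKE_LETTER', 'E'],
);
print "$_\n" for @lines;
```

With this input the hundred identical __LOWER_E hits become one line ending in (100), while the two distinct __E_LIKE_LETTER hits stay on separate lines.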


--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Re: __E_LIKE_LETTER & __LOWER_E filling subtests debug

2019-05-31 Thread Bill Cole

On 31 May 2019, at 7:46, Kevin A. McGrail wrote:

Well there might be 6 rules like this but testing emails for lower case e
hits the maxhits on a lot of emails.


If anyone has insight into how I might measure a character occurrence 
ratio in messages less noisily, I'm eager to be enlightened.


--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Re: __E_LIKE_LETTER & __LOWER_E filling subtests debug

2019-05-31 Thread Bill Cole

On 30 May 2019, at 20:35, Kevin A. McGrail wrote:

I was curious if anyone noticed the debug output for subtests has gotten
insane:


It got a little discussion on users@ when I created those rules.

[...]

72_active.cf:    body    __LOWER_E        /e/
72_active.cf:    tflags  __LOWER_E        multiple maxhits=230

72_active.cf:    body    __E_LIKE_LETTER  //
72_active.cf:    tflags  __E_LIKE_LETTER  multiple maxhits=320


Assuming those maxhits are correct,


They are. In fact they were carefully tuned to catch the targeted 
extortion spam.



maybe we need something in the debug
output that says __E_LIKE_LETTER (number of hits if more than 1).


That would be a useful enhancement even without my flagrant log 
vandalism.


--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Re: svn commit: r1856009 - /spamassassin/trunk/rules/73_sandbox_manual_scores.cf

2019-03-21 Thread Bill Cole

I'm extremely uneasy with this.

We should not be manually scoring an essentially untested/unproven 
DNS-based list that is outside of project control at such a powerful 
level.


DNSBLs tend to decay into "list the world" mode eventually after they 
die, often far sooner than anyone intends or expects.


(FWIW: locally I'm scoring DKIMWL_* at 0.001 because I have no basis for 
believing it to be useful, yet.)


On 21 Mar 2019, at 16:16, p...@apache.org wrote:


Author: pds
Date: Thu Mar 21 20:16:30 2019
New Revision: 1856009

URL: http://svn.apache.org/viewvc?rev=1856009&view=rev
Log:
Adjust the score for DKIMWL

Modified:
spamassassin/trunk/rules/73_sandbox_manual_scores.cf

Modified: spamassassin/trunk/rules/73_sandbox_manual_scores.cf
URL: http://svn.apache.org/viewvc/spamassassin/trunk/rules/73_sandbox_manual_scores.cf?rev=1856009&r1=1856008&r2=1856009&view=diff
==============================================================================
--- spamassassin/trunk/rules/73_sandbox_manual_scores.cf (original)
+++ spamassassin/trunk/rules/73_sandbox_manual_scores.cf Thu Mar 21 20:16:30 2019
@@ -83,4 +83,5 @@ score FILL_THIS_FORM_LONG  2.00
 # Lots of hate; score as informative hammy, may override locally
 score RP_MATCHES_RCVD   -0.001

-
+# pds
+score DKIMWL_WL_HIGH    -7.5



--
Bill Cole


Re: sa-update redirecting to ruleqa???

2019-03-05 Thread Bill Cole

On 5 Mar 2019, at 12:33, John Hardin wrote:

Attempting to grab the latest published rules update (using a utility 
script I wrote) I get this:


  Latest revision: 1854751
  wget https://buildbot.spamassassin.org/updates/1854751.tar.gz ...


That host is not in the mirror list at 
http://spamassassin.apache.org/updates/MIRRORED.BY
Or 
https://svn.apache.org/repos/asf/spamassassin/site/updates/MIRRORED.BY



--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Available For Hire: https://linkedin.com/in/billcole


Re: [Bug 7692] New: mkrules regression tests do not pass after commit r1849441

2019-02-25 Thread Bill Cole

On 25 Feb 2019, at 14:56, Kevin A. McGrail wrote:


All, it's frustrating that commits are happening without proper make
tests.  It delays things greatly when trying to do releases that they
aren't in a passing state.  Please consider this.


Mea culpa. The change in r1849441 was to fix bug 7302, which caused 
quietly broken rule distribution. I couldn't imagine that the test suite 
would depend on build/mkrules building a known-bad ruleset without 
aborting.


Of course, we still want to make sure that build/mkrules doesn't 
knowingly emit a bad ruleset with nothing but a warning.


I'll test an alternative approach... If anyone has a brilliant idea on 
this, speak up!



 Forwarded Message 
Subject:[Bug 7692] New: mkrules regression tests do not pass after
commit r1849441
Date:   Mon, 25 Feb 2019 07:52:48 +
From:   bugzilla-dae...@spamassassin.apache.org
To: dev@spamassassin.apache.org



https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7692

Bug ID: 7692
Summary: mkrules regression tests do not pass after commit
r1849441
Product: Spamassassin
Version: unspecified
Hardware: PC
OS: OpenBSD
Status: NEW
Severity: normal
Priority: P2
Component: Regression Tests
Assignee: dev@spamassassin.apache.org
Reporter: giova...@paclan.it
Target Milestone: Undefined

After commit r1849441 t/mkrules.t fails with:

$ prove t/mkrules.t
t/mkrules.t .. 23/97
# Failed test at t/mkrules.t line 134.
cannot open log/log/mkrules_t/rules/70_sandbox.cf or log/mkrules_t/rules/70_sandbox.cf at t/SATest.pm line 706.
# Failed test at t/mkrules.t line 136.
Not found: manif_found = 70_sandbox.cf: WARNING: not listed in manifest file at t/mkrules.t line 138.
# Failed test at t/SATest.pm line 783.
Not found: lint_failed = LINT FAILED at t/mkrules.t line 138.
# Failed test at t/SATest.pm line 783.
# Failed test at t/mkrules.t line 138.
t/mkrules.t .. 85/97 # Looks like you failed 5 tests of 97.
t/mkrules.t .. Dubious, test returned 5 (wstat 1280, 0x500)
Failed 5/97 subtests

Test Summary Report
---
t/mkrules.t (Wstat: 1280 Tests: 97 Failed: 5)
  Failed tests: 23-24, 26-27, 30
  Non-zero exit status: 5
Files=1, Tests=97, 6 wallclock secs ( 0.05 usr 0.01 sys + 4.42 cusr 1.62 csys = 6.10 CPU)
Result: FAIL

Reverting the commit is a workaround.
Not sure if the problem is in mkrules or in regression tests.

--
You are receiving this mail because:
You are the assignee for the bug.


Re: Cron ~/svn/trunk/build/mkupdates/run_nightly | /usr/bin/tee /var/www/automc.spamassassin.org/mkupdates/mkupdates.txt

2018-12-27 Thread Bill Cole

On 25 Dec 2018, at 7:41, Kevin A. McGrail wrote:


Rules runs still failing lint...  snippet below

ERROR: LINT FAILED, suppressing output: rules/72_active.cf


I'm 99% sure that I've resolved this.

I looked around on sa-vm1 and found that ~automc/svn/trunk/rulesrc had 
not been updated since 2018-12-19 08:31 UTC, i.e. the first nightly run 
with the errant 'replace_tag' line outside of an ifplugin block. That 
failure triggered a bug in the run_nightly script: apparently if the 
'make' fails, the code that updates ~automc/svn/trunk/rulesrc from svn 
never runs, making the problem permanent until fixed manually on svn. 
This issue gets masked by the fact that different pieces of the 
masscheck/ruleqa complex keep copies of the rulesrc tree in 7 different 
places for various purposes, each of a different age.
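
The failure mode described here is an ordering problem: the refresh step sits behind a build step that can abort. A minimal Python sketch of one way to avoid that follows. It is not the run_nightly script; the function and callback names are hypothetical, and it only shows the idea of refreshing the checkout whether or not the build succeeds.

```python
def run_nightly(update_checkout, build):
    """Run the build, then refresh the checkout even if the build failed.

    Both arguments are hypothetical callables standing in for the real
    'make' step and the 'update rulesrc from svn' step.
    """
    try:
        build()
    finally:
        # Runs on success and on failure, so one bad build cannot
        # freeze the checkout at a broken revision.
        update_checkout()


calls = []

def failing_build():
    calls.append("build")
    raise RuntimeError("LINT FAILED")

try:
    run_nightly(lambda: calls.append("update"), failing_build)
except RuntimeError:
    pass  # the build error still propagates to the caller

print(calls)
```

The build error is still reported to the caller; only the refresh is made unconditional.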




--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Available For Hire: https://linkedin.com/in/billcole


Re: Cron ~/svn/trunk/build/mkupdates/run_nightly | /usr/bin/tee /var/www/automc.spamassassin.org/mkupdates/mkupdates.txt

2018-12-25 Thread Bill Cole

On 25 Dec 2018, at 7:41, Kevin A. McGrail wrote:


Rules runs still failing lint...  snippet below

ERROR: LINT FAILED, suppressing output: rules/72_active.cf

 ifplugin Mail::SpamAssassin::Plugin::FreeMail due to tflags nopublish
(tflags nopublish)
make: *** [build_rules] Error 2
+ exit 2
Makefile:1790: recipe for target 'build_rules' failed


This failure differs from last week's in that:

1. There is still in fact a non-empty lint-clearing 72_active.cf file 
being generated and distributed


2. I cannot reproduce this error on a clean checkout.

I think this is some sort of 'phantom rule' left over from before I 
fixed the problematic replace_tag last week. mkrules is still 
complaining about it, even though in my sandbox it has been fixed.


Will examine later.

--
Bill Cole


Re: Subtest __E_LIKE_LETTER and __LOWER_E listed many times in message header

2018-12-10 Thread Bill Cole

On 10 Dec 2018, at 1:56, Henrik K wrote:


On Mon, Dec 10, 2018 at 08:46:32AM +0200, Henrik Krohns wrote:

On Sun, Dec 09, 2018 at 01:06:01PM -0500, Bill Cole wrote:


To make this determination, the rules require the 'multiple' flag without
a cap on the number of matches which a 'maxhits' parameter would set.


Please don't do unlimited maxhits, it's terrible if a message accidentally or
intentionally contains thousands of e's.  The eval code runs all sorts of
crap for every hit, not to mention the mass of debug lines it potentially
creates.

If I read right, isn't it enough to set __LOWER_E maxhits=21 and
__E_LIKE_LETTER maxhits=211 for the clause to evaluate as true?

body            __LOWER_E         /e/i
tflags          __LOWER_E         multiple
replace_rules   __E_LIKE_LETTER
body            __E_LIKE_LETTER   //
tflags          __E_LIKE_LETTER   multiple
meta            MIXED_ES          ( __LOWER_E > 20 ) && ( __E_LIKE_LETTER > ( (__LOWER_E * 14 ) / 10) ) && ( __E_LIKE_LETTER > ( 10 * __LOWER_E ) )
describe        MIXED_ES          Too many es are not es


Also consider limiting __HAS_IMG_SRC, __HAS_HREF, 
__HAS_IMG_SRC_ONECASE,

__HAS_HREF_ONECASE


Done.


I would use non-greedy .*? in all those also

/^[^>].*

Done.

Thanks for the input!

--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Available For Hire: https://linkedin.com/in/billcole


Re: Subtest __E_LIKE_LETTER and __LOWER_E listed many times in message header

2018-12-10 Thread Bill Cole

On 10 Dec 2018, at 1:46, Henrik Krohns wrote:


On Sun, Dec 09, 2018 at 01:06:01PM -0500, Bill Cole wrote:


To make this determination, the rules require the 'multiple' flag without
a cap on the number of matches which a 'maxhits' parameter would set.


Please don't do unlimited maxhits, it's terrible if a message accidentally or
intentionally contains thousands of e's.  The eval code runs all sorts of
crap for every hit, not to mention the mass of debug lines it potentially
creates.


I recognize this as an issue, and I'm trying to think up alternative 
approaches. The ruleqa performance of this rule is puzzling.



If I read right, isn't it enough to set __LOWER_E maxhits=21 and
__E_LIKE_LETTER maxhits=211 for the clause to evaluate as true?


That would break the *correct* logic, which I just noticed was mangled 
by a typo in the revision I made yesterday to evade the 'possible divide 
by zero' mis-parse.


The goal is to identify messages where the ratio of all e-like 
characters (__E_LIKE_LETTER ) to simple Latin 'e' characters (__LOWER_E) 
is between 1.4 and 10. My reasoning for a range of ratios is that 
messages of any significant size will use one script predominantly, but 
perhaps not exclusively.


Consider a message with 200 U+0065 characters and 220 U+0435 characters: 
__LOWER_E = 200, __E_LIKE_LETTER  = 420. The ratio is 2.1, so this is a 
message which would match the intended logic. However, with your 
proposed maxhits limits: __LOWER_E = 21, __E_LIKE_LETTER  = 211 so the 
ratio is 10.05, no match.


Also consider a message with 200 U+0065 characters and 9 U+0435 
characters: __LOWER_E = 200, __E_LIKE_LETTER  = 209. The ratio is 1.045, 
so this is a message which would NOT match the intended logic. However, 
with your proposed maxhits limits: __LOWER_E = 21, __E_LIKE_LETTER  = 
209 so the ratio is 9.95, a match.
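
The two cases above can be checked mechanically. This is a minimal Python sketch of the intended MIXED_ES logic, not SpamAssassin code. The function name `mixed_es` and its cap parameters are hypothetical, and 'maxhits' is modelled simply as min(count, cap), which is an assumption about how the cap shows up in the meta.

```python
def mixed_es(lower_e, e_like, max_lower=None, max_like=None):
    """Evaluate the intended MIXED_ES logic on hit counts.

    lower_e and e_like stand in for the __LOWER_E and __E_LIKE_LETTER
    hit counts; max_lower and max_like model 'maxhits' as a plain cap.
    """
    if max_lower is not None:
        lower_e = min(lower_e, max_lower)
    if max_like is not None:
        e_like = min(e_like, max_like)
    return (
        lower_e > 20
        and e_like > (lower_e * 14) / 10   # ratio above 1.4
        and e_like < 10 * lower_e          # ratio below 10
    )


# 200 x U+0065 plus 220 x U+0435: true ratio 2.1
print(mixed_es(200, 420), mixed_es(200, 420, 21, 211))

# 200 x U+0065 plus 9 x U+0435: true ratio 1.045
print(mixed_es(200, 209), mixed_es(200, 209, 21, 211))
```

In both cases the capped call gives the opposite answer from the uncapped one, which is the problem with caps as low as 21 and 211.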


Finding a fine-tuned pair of maxhits values is hard, particularly since 
I don't have a good corpus of the target spam or of ham that 
*apparently* (according to ruleqa stats) is being matched by the current 
rule in some corpora. I've set maxhits at 250 and 400 for now on the 
principle that the spam I'm really targeting has less than half of 
those.




body            __LOWER_E         /e/i
tflags          __LOWER_E         multiple
replace_rules   __E_LIKE_LETTER
body            __E_LIKE_LETTER   //
tflags          __E_LIKE_LETTER   multiple
meta            MIXED_ES          ( __LOWER_E > 20 ) && ( __E_LIKE_LETTER > ( (__LOWER_E * 14 ) / 10) ) && ( __E_LIKE_LETTER > ( 10 * __LOWER_E ) )


This is now fixed:

meta  MIXED_ES  ( __LOWER_E > 20 ) && ( __E_LIKE_LETTER > ( (__LOWER_E * 14 ) / 10) ) && ( __E_LIKE_LETTER < ( 10 * __LOWER_E ) )



--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Available For Hire: https://linkedin.com/in/billcole


Re: svn commit: r1845736 - /spamassassin/branches/3.4/lib/Mail/SpamAssassin/Plugin/URIDNSBL.pm

2018-11-04 Thread Bill Cole

On 4 Nov 2018, at 8:36, h...@apache.org wrote:


Author: hege
Date: Sun Nov  4 13:36:22 2018
New Revision: 1845736

URL: http://svn.apache.org/viewvc?rev=1845736&view=rev
Log:
Skip duplicate lookups


I'm uneasy about adding a de facto DNS caching change in 3.4.3. (And 
generally about apps caching name lookups, but that's a larger 
discussion...)


Are we absolutely sure that the pms object isn't EVER going to be reused 
for another message? e.g. in tools like MIMEDefang that spawn 
long-running workers to handle multiple messages?


Does the duplicate lookup really matter if the system resolver should 
have the result in a cache that presumably honors TTLs, making replies 
very fast for queries that should be cached?




Modified:
spamassassin/branches/3.4/lib/Mail/SpamAssassin/Plugin/URIDNSBL.pm

Modified: 
spamassassin/branches/3.4/lib/Mail/SpamAssassin/Plugin/URIDNSBL.pm
URL: 
http://svn.apache.org/viewvc/spamassassin/branches/3.4/lib/Mail/SpamAssassin/Plugin/URIDNSBL.pm?rev=1845736&r1=1845735&r2=1845736&view=diff

==
--- spamassassin/branches/3.4/lib/Mail/SpamAssassin/Plugin/URIDNSBL.pm (original)
+++ spamassassin/branches/3.4/lib/Mail/SpamAssassin/Plugin/URIDNSBL.pm Sun Nov  4 13:36:22 2018

@@ -1059,6 +1059,10 @@ sub lookup_dnsbl_for_ip {
 sub lookup_single_dnsbl {
   my ($self, $pms, $obj, $rulename, $lookupstr, $dnsbl, $qtype) = @_;

+  my $qkey = "$rulename:$lookupstr:$dnsbl:$qtype";
+  return if exists $pms->{uridnsbl_seen_lookups}{$qkey};
+  $pms->{uridnsbl_seen_lookups}{$qkey} = 1;
+
   my $key = "DNSBL:" . $lookupstr . ':' . $dnsbl;
   my $ent = {
 key => $key, zone => $dnsbl, obj => $obj, type => 'URI-DNSBL',
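
To make the reuse question concrete, here is a small Python sketch of the same "seen lookups" idea. It is not the plugin's Perl: the `PerMsgStatus` class, the `sent` list and the rule and zone names are all hypothetical stand-ins. It shows that the de-duplication is only safe as long as each message gets a fresh state object.

```python
class PerMsgStatus:
    """Hypothetical stand-in for a per-message scan state object."""

    def __init__(self):
        self.seen_lookups = set()


def lookup_single_dnsbl(pms, rulename, lookupstr, dnsbl, qtype, sent):
    """Record a lookup unless an identical one was already made on pms."""
    qkey = f"{rulename}:{lookupstr}:{dnsbl}:{qtype}"
    if qkey in pms.seen_lookups:
        return False           # duplicate: skipped
    pms.seen_lookups.add(qkey)
    sent.append(qkey)          # stands in for sending the DNS query
    return True


args = ("EXAMPLE_RULE", "example.com", "dnsbl.example.net", "A")
sent = []

pms = PerMsgStatus()
lookup_single_dnsbl(pms, *args, sent)
lookup_single_dnsbl(pms, *args, sent)   # same message: correctly skipped
print(len(sent))

# Second message: a reused object suppresses the lookup, a fresh one does not.
reused = lookup_single_dnsbl(pms, *args, sent)
fresh = lookup_single_dnsbl(PerMsgStatus(), *args, sent)
print(reused, fresh)
```

If a long-running worker ever reused the state object across messages, the second message would silently lose its lookup, which is the case the question is about.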



--
Bill Cole


Re: Evasion with Unicode format characters

2018-10-30 Thread Bill Cole

On 30 Oct 2018, at 7:07, Cedric Knight wrote:


I'd be grateful for advice as to whether there's merit in filing these
concerns as one or more issues on Bugzilla, or for relevant 
background.


I do not believe the codebase is the place to address these issues, 
which are addressable in carefully created rules. Because your approach 
would hide useful data patterns from rules, it is exactly the wrong way 
to go about "solving" a problem with a novel flavor of spam. As John & 
Kevin have noted, they have worked on the specific case of the extortion 
spams in publicly available rules. I also have an ancient bundle of 
rules that I've been adjusting for the modern world and existence 
outside of my idiosyncratic environment (where severe FPs are 
evaded/mitigated) which is promising and will be public in some way 
soon.


Also, a change this substantial in the core behavior of SA would be almost 
certain to NOT get into 3.4.3, which will be out soon and is likely to 
be dominant in production systems for some time despite the (coming 
soon) 4.0 release. If this were done in code rather than in rules, it 
would never be usable for sites not ready or able to go to 4.0.


--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Available For Hire: https://linkedin.com/in/billcole


PROPOSED CHANGE FOR REVIEW: Bug 2555 documentation of new functionality

2018-09-04 Thread Bill Cole
The code for the "%x" token implementing Exim-style domain defaulting 
for virtual user config directories was committed last week. This change 
is solely to the embedded pod documentation in both 3.4 and trunk:



Index: 3.4/spamd/spamd.raw
===
--- 3.4/spamd/spamd.raw (revision 1839977)
+++ 3.4/spamd/spamd.raw (working copy)
@@ -3450,6 +3450,10 @@
 words, if the username is an email address, this is the part after the C<@>
 sign.
 
+=item %x -- replaced with the full name of the current user, as sent by spamc.
+If the resulting config directory does not exist, replace with the domain part
+to use a domain-wide default.
+
 =item %% -- replaced with a single percent sign (%).

 =back
Index: trunk/spamd/spamd.raw
===
--- trunk/spamd/spamd.raw   (revision 1839832)
+++ trunk/spamd/spamd.raw   (working copy)
@@ -3450,6 +3450,10 @@
 words, if the username is an email address, this is the part after the C<@>
 sign.
 
+=item %x -- replaced with the full name of the current user, as sent by spamc.
+If the resulting config directory does not exist, replace with the domain part
+to use a domain-wide default.
+
 =item %% -- replaced with a single percent sign (%).

 =back
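
For reviewers who want the documented fallback spelled out, here is a minimal Python sketch of the behavior the new pod text describes. It is not the spamd implementation: the function name `expand_x` and the directory layout are hypothetical, and only the %x token is handled.

```python
import os
import tempfile


def expand_x(template, username):
    """Expand %x: the full username if that directory exists, else the domain."""
    candidate = template.replace("%x", username)
    if os.path.isdir(candidate) or "@" not in username:
        return candidate
    domain = username.split("@", 1)[1]
    return template.replace("%x", domain)


with tempfile.TemporaryDirectory() as root:
    # alice has her own config directory; bob only has the domain default.
    os.makedirs(os.path.join(root, "alice@example.com"))
    os.makedirs(os.path.join(root, "example.com"))
    template = os.path.join(root, "%x")
    print(os.path.basename(expand_x(template, "alice@example.com")))
    print(os.path.basename(expand_x(template, "bob@example.com")))
```

A user with a per-user directory keeps it, and every other user in the domain falls through to the domain-wide directory.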



--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Currently Seeking Steadier Work: https://linkedin.com/in/billcole


Re: weekly masscheck blew up

2018-09-01 Thread Bill Cole

On 1 Sep 2018, at 7:31 (-0400), Axb wrote:


My weekly masscheck blew up a while ago..

Bareword found where operator expected at 
/data/masscheckwork/weekly_mass_check/masses/../lib/Mail/SpamAssassin/Util.pm 
line 770, near "s/\t//gr"
Bareword found where operator expected at 
/data/masscheckwork/weekly_mass_check/masses/../lib/Mail/SpamAssassin/Util.pm 
line 778, near "s/\t//gr"
syntax error at 
/data/masscheckwork/weekly_mass_check/masses/../lib/Mail/SpamAssassin/Util.pm 
line 770, near "s/\t//gr"
syntax error at 
/data/masscheckwork/weekly_mass_check/masses/../lib/Mail/SpamAssassin/Util.pm 
line 778, near "s/\t//gr"


anybody else?
Axb


You are probably the only person submitting masschecks running <5.14.

Fixed by removing 'r' modifier, which was introduced in Perl 5.14 and 
wasn't needed anyway.




--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Currently Seeking Steadier Work: https://linkedin.com/in/billcole


Re: [Bug 7596] Update distribution to satisfy new Apache policies, including replacing SHA-1 checksums with SHA-256 or SHA-512

2018-08-24 Thread Bill Cole

On 24 Aug 2018, at 19:24, John Hardin wrote:


How far back does the SHA256 support go?


Not far:

$ svn log sa-update.raw  |grep -A1 -B3 SHA256


r1826916 | billcole | 2018-03-15 23:15:19 -0400 (Thu, 15 Mar 2018) | 1 line

added optional support for SHA256 in addition to or instead of SHA1 validation





Would 3.3.x or 3.4.0 be broken by dropping the SHA1 sigs?


Yes. 3.4.1 as well.
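
As a rough illustration of what "in addition to or instead of" means for a client, here is a minimal Python sketch of checksum validation that accepts either or both digests. It is not sa-update's code: the function name `verify_update` and the policy of requiring at least one digest are assumptions made for the example.

```python
import hashlib


def verify_update(data, sha1_hex=None, sha256_hex=None):
    """Return True if every supplied checksum matches the data.

    At least one checksum must be supplied. Which ones are required is a
    policy choice in this sketch, not a description of sa-update.
    """
    if sha1_hex is None and sha256_hex is None:
        return False
    if sha1_hex is not None and hashlib.sha1(data).hexdigest() != sha1_hex.lower():
        return False
    if sha256_hex is not None and hashlib.sha256(data).hexdigest() != sha256_hex.lower():
        return False
    return True


payload = b"example rules tarball"
sha256 = hashlib.sha256(payload).hexdigest()

print(verify_update(payload, sha256_hex=sha256))  # SHA256-only publication
print(verify_update(payload))                     # client given no usable digest
```

A client that only knows how to fetch and check SHA1 is in the position of the second call once the SHA1 files stop being published: it has nothing it can validate against.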


Re: RCVD_NUMERIC_HELO rule removal breaks lint

2018-08-22 Thread Bill Cole

On 22 Aug 2018, at 13:34, John Hardin wrote:


Giovanni:

The removal of RCVD_NUMERIC_HELO has broken the build (for me at 
least):


Checking anything
ok 1
1..2
HELO_MISC_IP depends on RCVD_NUMERIC_HELO which is nonexistent
HELO_MISC_IP depends on RCVD_NUMERIC_HELO which is nonexistent
HELO_MISC_IP depends on RCVD_NUMERIC_HELO which is nonexistent
HELO_MISC_IP depends on RCVD_NUMERIC_HELO which is nonexistent
ok 1
not ok 2
#   Failed test at t/basic_meta.t line 93.
# Looks like you failed 1 test of 2.


Are you still working on cleaning up the side-effects of that change?


If you rebuild the rules from a clean base, it will work. Relevant 
commits made this morning.


In trunk:

rm rules/*
svn update
make build_rules

And if you're working in the 3.4 branch, you should splice in rules, 
rulesrc, and rules-extra from trunk via symlinks or separate checkouts 
from trunk.


Re: svn commit: r1838588 - /spamassassin/branches/3.4/t/dnsbl.t

2018-08-21 Thread Bill Cole

On 21 Aug 2018, at 19:22 (-0400), kmcgr...@apache.org wrote:


Author: kmcgrail
Date: Tue Aug 21 23:22:01 2018
New Revision: 1838588

URL: http://svn.apache.org/viewvc?rev=1838588&view=rev
Log:
Reminder not to leave -D

Modified:
spamassassin/branches/3.4/t/dnsbl.t

Modified: spamassassin/branches/3.4/t/dnsbl.t
URL: 
http://svn.apache.org/viewvc/spamassassin/branches/3.4/t/dnsbl.t?rev=1838588&r1=1838587&r2=1838588&view=diff

==
--- spamassassin/branches/3.4/t/dnsbl.t (original)
+++ spamassassin/branches/3.4/t/dnsbl.t Tue Aug 21 23:22:01 2018
@@ -160,5 +160,6 @@ describe DNSBL_SB_MISS  DNSBL SenderBase
 tflags DNSBL_SB_MISS   net
 ");

-sarun ("-D -t < data/spam/dnsbl.eml 2>&1", \_run_cb);
+#note: don't leave -D here, it causes spurious passes
+sarun ("-t < data/spam/dnsbl.eml 2>&1", \_run_cb);
 ok_all_patterns();


This change is bad. It causes the test to fail spuriously and it shuts 
off the possibility of one of the antipatterns ever matching, since the 
match is against a debug message.


My earlier small change to the patterns in this test relied on the -D 
output, as does the core of one of the antipatterns. The former could be 
fixed, the latter not so much. This change and its "reminder" antecedent 
should be reversed.
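
The antipattern point can be shown with a toy checker. This Python sketch is not the SATest.pm harness: `check_output`, the rule name and the debug text are all invented for illustration. It only demonstrates that an antipattern written against debug output cannot fire once debug output is turned off.

```python
def check_output(output, patterns, anti_patterns):
    """Return (missing, unexpected).

    missing: required patterns that were not found in the output.
    unexpected: anti-patterns that were found in the output.
    """
    missing = [p for p in patterns if p not in output]
    unexpected = [a for a in anti_patterns if a in output]
    return missing, unexpected


# Invented sample output; neither line is real SpamAssassin output.
normal = "X-Spam-Status: Yes, tests=EXAMPLE_RULE\n"
debug = "dbg: dns: hypothetical debug line\n"

anti = ["hypothetical debug line"]   # only ever appears in debug output

print(check_output(normal, ["EXAMPLE_RULE"], anti))          # run without -D
print(check_output(normal + debug, ["EXAMPLE_RULE"], anti))  # run with -D
```

Without the debug lines the antipattern list is effectively dead code: the check passes no matter what happened.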





--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Currently Seeking Steadier Work: https://linkedin.com/in/billcole


Re: [GitHub] spamassassin pull request #2: Fixed Spelling.

2018-07-29 Thread Bill Cole
On 29 Jul 2018, at 20:56 (-0400), Kevin A. McGrail wrote:

> I moderated this through.  Any one know about this github for SA?

I am aware of its existence and also of its description:

   Read-only mirror of Apache SpamAssassin. Submit patches to
   https://bz.apache.org/SpamAssassin/. Do not send pull requests
   http://spamassassin.apache.org




>  Forwarded Message 
> Subject:  [GitHub] spamassassin pull request #2: Fixed Spelling.
> Date: Sun, 29 Jul 2018 22:39:40 + (UTC)
> From: jimmycasey 
> Reply-To: dev@spamassassin.apache.org
> To:   dev@spamassassin.apache.org
>
>
>
> GitHub user jimmycasey opened a pull request:
>
> https://github.com/apache/spamassassin/pull/2
>
> Fixed Spelling.
>
>
>
> You can merge this pull request into a Git repository by running:
>
> $ git pull https://github.com/jimmycasey/spamassassin trunk
>
> Alternatively you can review and apply these changes as the patch at:
>
> https://github.com/apache/spamassassin/pull/2.patch
>
> To close this pull request, make a commit to your master/trunk branch
> with (at least) the following in the commit message:
>
> This closes #2
>
> 
> commit b76d18fea774f467a224ce99de9862a225dcc51b
> Author: Jimmy Casey 
> Date:   2018-07-29T21:47:46Z
>
> Fixed Spelling.
>
> 
>
>
> ---



-- 
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Currently Seeking Steadier Work: https://linkedin.com/in/billcole


Re: IMPORTANT: Issue with get_uri_list in PMS in 3.4 blocking work on 3.4.2 release

2018-06-26 Thread Bill Cole

On 26 Jun 2018, at 12:01, Kevin A. McGrail wrote:

Just an update that I was using the wrong prove with a custom compiled
perl so a newer perl did fix the issue.

Also, Giovanni's hint about the issue led to the patch that was causing the
issue which I proved with backporting to 3.4.1

Anyway, much more info at
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7591 as we are working the issue!


I have attached a test script to the bug and committed it to both 3.4 
and trunk. Because the bug is somewhere deep in Perl itself and 
dependent on an unknown combination of factors, it uses 3 different 
mechanisms to count URIs in 6 different simple messages.


Re: IMPORTANT: Issue with get_uri_list in PMS in 3.4 blocking work on 3.4.2 release

2018-06-25 Thread Bill Cole
e 
spamassassin script:


[root@centosvm trunk]# /usr/bin/perl5.16.3 -T ./spamassassin -D all,uri 
<3urlsplus 2>&1 |grep 'host[123]\.'
Jun 25 18:22:05.572 [67809] dbg: uri: parsed uri found of type parsed, 
http://host3.example.com
Jun 25 18:22:05.572 [67809] dbg: uri: cleaned parsed uri, 
http://host3.example.com
Jun 25 18:22:05.572 [67809] dbg: uri: parsed host host3.example.com, 
domain example.com
Jun 25 18:22:05.572 [67809] dbg: uri: parsed uri found of type parsed, 
http://host2.example.com
Jun 25 18:22:05.572 [67809] dbg: uri: cleaned parsed uri, 
http://host2.example.com
Jun 25 18:22:05.572 [67809] dbg: uri: parsed host host2.example.com, 
domain example.com
Jun 25 18:22:05.572 [67809] dbg: uridnsbl: domain example.com in skip 
list, host host3.example.com
Jun 25 18:22:05.572 [67809] dbg: uridnsbl: domain example.com in skip 
list, host host2.example.com
Jun 25 18:22:05.703 [67809] dbg: rules: ran uri rule __LOCAL_PP_NONPPURL 
==> got hit: "http://host3.example.com"
Jun 25 18:22:05.931 [67809] dbg: uri: parsed uri found of type parsed, 
http://host3.example.com
Jun 25 18:22:05.931 [67809] dbg: uri: cleaned parsed uri, 
http://host3.example.com
Jun 25 18:22:05.931 [67809] dbg: uri: parsed host host3.example.com, 
domain example.com
Jun 25 18:22:05.931 [67809] dbg: uri: parsed uri found of type parsed, 
http://host2.example.com
Jun 25 18:22:05.931 [67809] dbg: uri: cleaned parsed uri, 
http://host2.example.com
Jun 25 18:22:05.931 [67809] dbg: uri: parsed host host2.example.com, 
domain example.com
Jun 25 18:22:05.931 [67809] dbg: uri: parsed uri found of type parsed, 
http://host1.example.com
Jun 25 18:22:05.931 [67809] dbg: uri: cleaned parsed uri, 
http://host1.example.com
Jun 25 18:22:05.931 [67809] dbg: uri: parsed host host1.example.com, 
domain example.com

 an url: http://host1.example.com
 an url: http://host2.example.com
 an url: http://host3.example.com
[root@centosvm trunk]# /usr/local/bin/perl5.18.4 -T ./spamassassin -D 
all,uri <3urlsplus 2>&1 |grep 'host[123]\.'
Jun 25 18:22:15.708 [67813] dbg: uri: parsed uri found of type parsed, 
http://host2.example.com
Jun 25 18:22:15.708 [67813] dbg: uri: cleaned parsed uri, 
http://host2.example.com
Jun 25 18:22:15.708 [67813] dbg: uri: parsed host host2.example.com, 
domain example.com
Jun 25 18:22:15.708 [67813] dbg: uri: parsed uri found of type parsed, 
http://host3.example.com
Jun 25 18:22:15.708 [67813] dbg: uri: cleaned parsed uri, 
http://host3.example.com
Jun 25 18:22:15.708 [67813] dbg: uri: parsed host host3.example.com, 
domain example.com
Jun 25 18:22:15.708 [67813] dbg: uri: parsed uri found of type parsed, 
http://host1.example.com
Jun 25 18:22:15.708 [67813] dbg: uri: cleaned parsed uri, 
http://host1.example.com
Jun 25 18:22:15.708 [67813] dbg: uri: parsed host host1.example.com, 
domain example.com
Jun 25 18:22:15.708 [67813] dbg: uridnsbl: domain example.com in skip 
list, host host3.example.com
Jun 25 18:22:15.708 [67813] dbg: uridnsbl: domain example.com in skip 
list, host host2.example.com
Jun 25 18:22:15.708 [67813] dbg: uridnsbl: domain example.com in skip 
list, host host1.example.com
Jun 25 18:22:15.852 [67813] dbg: rules: ran uri rule __LOCAL_PP_NONPPURL 
==> got hit: "http://host3.example.com"
Jun 25 18:22:16.045 [67813] dbg: uri: parsed uri found of type parsed, 
http://host3.example.com
Jun 25 18:22:16.045 [67813] dbg: uri: cleaned parsed uri, 
http://host3.example.com
Jun 25 18:22:16.045 [67813] dbg: uri: parsed host host3.example.com, 
domain example.com
Jun 25 18:22:16.045 [67813] dbg: uri: parsed uri found of type parsed, 
http://host2.example.com
Jun 25 18:22:16.045 [67813] dbg: uri: cleaned parsed uri, 
http://host2.example.com
Jun 25 18:22:16.045 [67813] dbg: uri: parsed host host2.example.com, 
domain example.com
Jun 25 18:22:16.045 [67813] dbg: uri: parsed uri found of type parsed, 
http://host1.example.com
Jun 25 18:22:16.045 [67813] dbg: uri: cleaned parsed uri, 
http://host1.example.com
Jun 25 18:22:16.045 [67813] dbg: uri: parsed host host1.example.com, 
domain example.com

 an url: http://host1.example.com
 an url: http://host2.example.com
 an url: http://host3.example.com


I'm very much convinced this is something very strangely broken in Red 
Hat's bespoke variant of 5.16.3.



--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Currently Seeking Steadier Work: https://linkedin.com/in/billcole


Re: IMPORTANT: Issue with get_uri_list in PMS in 3.4 blocking work on 3.4.2 release

2018-06-25 Thread Bill Cole

On 25 Jun 2018, at 17:43 (-0400), Kevin A. McGrail wrote:

Bill, did you see the note from Giovanni? :" reverting r1823205 makes 
the

regression test work on 3.4 and trunk for me.
tested on CentOS6 (perl 5.10.1), can anybody confirm ?"


Yes. I have not tried to confirm yet. That seems very odd, since 
r1823205 was just a file shuffle in 3.4. Reverting r1823205 in trunk 
does nothing.


Is there an actual bug report for this that I'm just not finding?

Has anyone reproduced the error with any version of Perl other than Red 
Hat's 5.16.3?


--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Currently Seeking Steadier Work: https://linkedin.com/in/billcole


Re: IMPORTANT: Issue with get_uri_list in PMS in 3.4 blocking work on 3.4.2 release

2018-06-25 Thread Bill Cole

On 25 Jun 2018, at 17:42 (-0400), Bill Cole wrote:

I have been unable to reproduce the problem on any of the following 
OS/Perl combinations:


Ubuntu Trusty, Perl 5.18.2-2ubuntu1.6
Ubuntu Xenial, Perl 5.22.1-9ubuntu0.5
MacOS X 10.6.8, Perl 5.26.2 (MacPorts build)
MacOS X 10.6.8, Perl v5.10.0 (Apple build)
MacOS X 10.11.6, Perl 5.26.2 (MacPorts build)
CentOS 7.5.1804, Perl 5.18.4 (Perl.org source, default build config)
CentOS 7.5.1804, Perl 5.28.0 (Perl.org source, default build config)


Also:

Fedora Core 27, Perl 5.26.2-406.fc27

--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Currently Seeking Steadier Work: https://linkedin.com/in/billcole


Re: IMPORTANT: Issue with get_uri_list in PMS in 3.4 blocking work on 3.4.2 release

2018-06-25 Thread Bill Cole

On 25 Jun 2018, at 9:54, Kevin A. McGrail wrote:

Just to be clear, others concur that with 3 uris or less, it works, 4 
or

more it fails. It's inconsistent and exists in trunk as well.


Yes. Also, it does not matter whether the URIs are pure ASCII or not, 
which implies that finding this in the idn_dots.t test was somewhat 
accidental, as was the fact that "use utf8" masked it. I have reverted 
my commit to the 3.4 branch with that change, since it was not solving 
the root bug.


Annoyingly, running the test script or the spamassassin script with the 
Perl debugger also masked the error.



It's inconsistent depending on the platform as well.


I can only replicate it on CentOS 7 with the (Red Hat) stock build of 
Perl 5.16.3. I'd guess that RHEL7 behaves identically but do not have a 
RHEL7 system to test. I have tested on other platforms (see below.)




I am not sure if it is a Perl bug or an SA bug or something we are 
doing

wrong but it is a blocker.


It's a Red Hat bug IMHO.

I have been unable to reproduce the problem on any of the following 
OS/Perl combinations:


Ubuntu Trusty, Perl 5.18.2-2ubuntu1.6
Ubuntu Xenial, Perl 5.22.1-9ubuntu0.5
MacOS X 10.6.8, Perl 5.26.2 (MacPorts build)
MacOS X 10.6.8, Perl v5.10.0 (Apple build)
MacOS X 10.11.6, Perl 5.26.2 (MacPorts build)
CentOS 7.5.1804, Perl 5.18.4 (Perl.org source, default build config)
CentOS 7.5.1804, Perl 5.28.0 (Perl.org source, default build config)

I had a *TRANSIENT* problem on Ubuntu Trusty at first but it was so 
weird (inconsistent between tests!) that I wiped the working directory 
and rebuilt from a fresh checkout, after which I could not get a 
failure.




Re: IMPORTANT: Issue with get_uri_list in PMS in 3.4 blocking work on 3.4.2 release

2018-06-23 Thread Bill Cole

On 22 Jun 2018, at 14:29 (-0400), Kevin A. McGrail wrote:


Hi All,

3.4 is not passing tests for me with the idn_dots.t and it appears to 
point
to a problem in P:M:S::get_uri_list.   I'm bleary from looking at this 
for

three days.  Can someone take a look at this?

If you modify the t/idn_dots to print the uri list from the generated
message in the test, it fails in 3.4 but passes in Trunk and in the 
3.4.1
release.  See below for output but basically there is a missing URI 
which

is why the test correctly fails.


I have made the test work by adding "use utf8" to the test script. This 
is just avoiding the underlying subtle bug.


The breakage is only seen (so far) on the RedHat perl 5.16.3 packaged 
for EL7 and derivatives. I believe that 5.16.x was the last major 
release to NOT work in UTF-8 by default without "use utf8" explicitly 
used. I have replicated the incorrect parse with the spamassassin script 
and a message with all-ascii URLs, so the problem is somewhere in the 
spectacularly complicated RE that extracts URIs from the cooked text 
array inside PerMsgStatus->get_uri_detail_list. Making matters worse, if 
I run either t/idn_dots.t or spamassassin with the Perl debugger (-d) 
the parse works.


Anyone who is still using an even older Perl could assist simply by 
confirming that the 3.4 branch from SVN fails subtest 4 of t/idn_dots.t 
if you remove or comment out the "use utf8" line I added to that file 
today.


It would be interesting to see if the problem would be solved by adding 
"use utf8" to every .pm that had a "use bytes" declaration before 2017. 
This is a bit of a shotgun approach but simpler than hunting for the 
specific issue. I'd try it myself, but I'm basically on my last 
stealable minute for the weekend already.


--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Currently Seeking Steadier Work: https://linkedin.com/in/billcole


Re: Re Building SpamAssassin

2018-05-12 Thread Bill Cole

On 12 May 2018, at 21:53 (-0400), Saahil Sirowa wrote:


How can I build SpamAssassin on my local machine?


https://wiki.apache.org/spamassassin/SingleUserUnixInstall isn't a bad 
description, except that it references an old version and is extremely 
verbose and covers more than you are likely to need.


The simpler version:

svn co https://svn.apache.org/repos/asf/spamassassin/trunk
cd trunk
perl Makefile.PL
make
make test
make install

You can substitute "trunk" with whichever source branch or tag you are 
interested in working on. The most relevant ones are 'trunk' (which will 
be 4.0.0,) 'branches/3.4' (which will ultimately be 3.4.2,) and 
'tags/spamassassin_release_3_4_1' (the current release version, which 
has some bad bugs but is a static reference point if that's what you 
need...)





--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Currently Seeking Steady Work: https://linkedin.com/in/billcole


Re: +#!/usr/bin/env perl -w

2018-05-10 Thread Bill Cole
On 10 May 2018, at 17:06, Todd Rinaldo wrote:

>> On May 10, 2018, at 1:51 PM, Dave Jones  wrote:
[...]
>>
>> I think the safest way to handle this is to leave the script as it was a 
>> couple of days ago and on those systems that have perl in a different place, 
>> launch it with the full path to perl with the script as an arg:
>>
>> /usr/local/bin/perl masscheck.pl
>
> ExtUtils::MakeMaker should switch any #! it finds with the perl that invoked 
> Makefile.PL as it installs it. The catch is that the #! has to match 
> /^#!\S+perl/

More precisely...

The Makefile(s) generated by ExtUtils::MakeMaker will include targets that 
switch any #! it finds in targets defined in Makefile.PL with the path of the 
perl that ran Makefile.PL OR was specified as an option when running 
Makefile.PL. The #! line has to match /^#!\S+perl/

Nothing under t/, xt/, or masses/ is a make target defined in Makefile.PL. They 
are all untouched by ExtUtils::MakeMaker or make because they have no target 
definitions in Makefile.PL.
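
The regex matters here. A short Python sketch of the matching rule quoted above shows why the env-style line is left alone. It models only the /^#!\S+perl/ pattern as stated in this thread; `fix_shebang` is a hypothetical helper, and the real MakeMaker logic may do more than this.

```python
import re

# The pattern quoted in this thread: "#!" followed by non-space
# characters that end in "perl".
SHEBANG_RE = re.compile(r"^#!\S+perl")


def fix_shebang(first_line, perl_path):
    """Rewrite the interpreter path only if the line matches the pattern."""
    if SHEBANG_RE.match(first_line):
        # A function replacement avoids backslash escapes in perl_path
        # being interpreted by re.sub.
        return SHEBANG_RE.sub(lambda m: "#!" + perl_path, first_line, count=1)
    return first_line


print(fix_shebang("#!/usr/bin/perl -w", "/usr/local/bin/perl"))
print(fix_shebang("#!/usr/bin/env perl -w", "/usr/local/bin/perl"))
```

The first line is rewritten because "/usr/bin/perl" is a single run of non-space characters ending in "perl". The second is not, because the space between "env" and "perl" breaks the \S+ run.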




Re: +#!/usr/bin/env perl -w

2018-05-10 Thread Bill Cole

On 10 May 2018, at 3:21 (-0400), Axb wrote:


Why is this needed?


It is not possible to rely on /usr/bin/perl existing or being the 
"right" perl on some platforms. Most obviously, Perl was removed long 
ago from the FreeBSD base and the symlink at /usr/bin/perl was removed 
from the perl5 port as of v5.20. On MacOS, /usr/bin/perl is Apple's 
build of Perl, most recently v5.18, and historically it has been risky 
business to make significant enhancements to that Perl world, so the 
Perl installed for the purposes of SA might be in /usr/local, 
/opt/local, /sw, or in a user homedir.


The shebang trick using /usr/bin/env is the simplest way to address the 
issue of there being multiple interpreter worlds on a system for 
different bespoke purposes. If $PATH is set correctly I don't see how it 
is breaking on CentOS, although I guess it may be a consequence of 
local::lib usage.
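
As a rough illustration of why $PATH matters for the env form, here is a minimal Python sketch of a PATH-ordered interpreter lookup. It is an approximation using shutil.which on a POSIX system, not what /usr/bin/env does internally; the helper names make_fake_perl and resolve_interpreter and the temporary directories are hypothetical.

```python
import os
import shutil
import stat
import tempfile


def make_fake_perl(directory: str) -> str:
    """Create an executable placeholder named 'perl' in directory."""
    path = os.path.join(directory, "perl")
    with open(path, "w") as fh:
        fh.write("#!/bin/sh\nexit 0\n")
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path


def resolve_interpreter(name: str, search_path: str):
    """Return the first executable called name on search_path, or None."""
    return shutil.which(name, path=search_path)


with tempfile.TemporaryDirectory() as system_bin, \
        tempfile.TemporaryDirectory() as local_bin:
    system_perl = make_fake_perl(system_bin)
    local_perl = make_fake_perl(local_bin)

    # The directory order in PATH decides which perl the lookup finds.
    local_first = os.pathsep.join([local_bin, system_bin])
    system_first = os.pathsep.join([system_bin, local_bin])

    print(resolve_interpreter("perl", local_first) == local_perl)
    print(resolve_interpreter("perl", system_first) == system_perl)
```

With a fixed #!/usr/bin/perl there is no such lookup at all, which is the problem on systems where the wanted perl lives elsewhere.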


The less simple but more rigorous fix for the underlying issue is to 
handle all scripts in the distribution in the same manner as the main SA 
scripts: a .raw version in the source that gets run through 
build/preprocessor with a suitable set of arguments. It is not clear to 
me why we have a substantial collection of scripts (including the test 
scripts) with a fixed interpreter path despite having a tool whose 
functions include making that a build-time configurable. There are even 
scripts which apparently are only ever run on sa-vm1 that have this 
shebang line: #!/local/perl586/bin/perl




Re: +#!/usr/bin/env perl -w

2018-05-10 Thread Bill Cole
On 10 May 2018, at 8:07 (-0400), Kevin A. McGrail wrote:

> Agreed that is a mistake.  I am on the road.  Who made the change?

Me.

I've reverted it.

Details coming...

-- 
Bill Cole


Re: svn commit: r1828937 - /spamassassin/trunk/rules/60_whitelist_auth.cf

2018-04-11 Thread Bill Cole

On 11 Apr 2018, at 17:50 (-0400), Dave Jones wrote:


> On 04/11/2018 04:29 PM, billc...@apache.org wrote:
>> Author: billcole
>> Date: Wed Apr 11 21:29:08 2018
>> New Revision: 1828937
>>
>> URL: http://svn.apache.org/viewvc?rev=1828937&view=rev
>> Log:
>> Google Forms has generated spam, befouling the google.com reputation
>>
>> Modified:
>>     spamassassin/trunk/rules/60_whitelist_auth.cf
>>
>> Modified: spamassassin/trunk/rules/60_whitelist_auth.cf
>> URL: http://svn.apache.org/viewvc/spamassassin/trunk/rules/60_whitelist_auth.cf?rev=1828937&r1=1828936&r2=1828937&view=diff
>> ==
>> --- spamassassin/trunk/rules/60_whitelist_auth.cf (original)
>> +++ spamassassin/trunk/rules/60_whitelist_auth.cf Wed Apr 11 21:29:08 2018
>> @@ -80,7 +80,6 @@ def_whitelist_auth *@visadpsmessage.com
>>  def_whitelist_auth *@*.pinterest.com
>>  def_whitelist_auth *@indeed.com
>>  def_whitelist_auth *@*.hyatt.com
>> -def_whitelist_auth *@*.google.com
>>  def_whitelist_auth *@*.sears.com
>>  def_whitelist_auth *@*.jcpenney.com
>>  def_whitelist_auth *@*.landsend.com
>
> Do you have an example email of this?


Discussed on the Users list today. A mostly-Thai form with an internal 
Hotmail address.


> If we report this to Google and they handle it properly, it doesn't 
> mean that we need to remove this entry unless there is a major problem 
> with trust.


I disagree. Handling complaints (which Google mostly doesn't in any 
case) is entirely inadequate to justify trusting mail sent by users they 
don't actually know with an active backend that has a track record of 
abuse. Google Docs has become a phishing platform and we should not be 
telling people to trust it by default.



> A single email occurrence is not enough to remove them.


I don't have copies of the similar-sender garbage I've been rejecting 
because it has been aimed at bogus local addresses.


> Besides, this *@*.google.com shouldn't be that common under a 
> subdomain of google.com.  It's not *@google.com which would be a 
> higher risk.


No, *@google.com is still apparently only Google corporate mail. The 
only spam I've ever seen from such addresses is stupid recruiter tricks.
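
To show the difference between the two globs discussed here, a minimal Python sketch follows. It approximates the glob matching with fnmatch, which is an assumption: SpamAssassin does its own glob handling in Perl, and fnmatch differs in details such as bracket expressions. The addresses are hypothetical.

```python
from fnmatch import fnmatchcase


def glob_matches(address: str, pattern: str) -> bool:
    """Case-insensitive shell-style glob match of an address."""
    return fnmatchcase(address.lower(), pattern.lower())


SUBDOMAIN_GLOB = "*@*.google.com"  # the entry removed in r1828937
BARE_GLOB = "*@google.com"         # the glob mentioned for comparison

# Hypothetical addresses, for illustration only.
for addr in ["someone@docs.google.com", "someone@google.com"]:
    print(addr,
          glob_matches(addr, SUBDOMAIN_GLOB),
          glob_matches(addr, BARE_GLOB))
```

The subdomain glob covers every host under google.com but not the bare domain, so removing it leaves nothing in this file matching the corporate addresses either.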





Wiki edit access please

2018-02-02 Thread Bill Cole

username: BillCole

Why: I keep finding wrongness/staleness that I want to fix.

If there's a way for committers to enable this for ourselves, I'm too 
clueless to find it...





whitelist_* in default ruleset considered harmful (was Re: Extending the entries in 60_whitelist_spf.cf)

2017-11-30 Thread Bill Cole
TL;DR: These need to be def_whitelist_auth NOT whitelist_auth as you 
have been committing them. See the earlier exchange between myself and 
RW, who had assumed this was only about def_whitelist_auth entries.


Precisely because most users will never bother managing a large number 
of local rules, using whitelist_auth with its -100 score prevents the 
rest of SA (and prudent local rules) from mitigating the whitelisting 
effect in the event of a compromise or change in behavior of a 'trusted' 
sender. While the documentation doesn't say so explicitly, the 
implication in the Mail::SpamAssassin::Conf descriptions of the 
def_whitelist_* directives is that the default whitelist entries all use 
those less powerful versions. That was true until this week. I think 
changing back to that practice is imperative, albeit not enough of an 
emergency to fix unilaterally without discussion here.
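
A small worked example of the mitigation argument, as a Python sketch. The -100 figure is from the paragraph above and -7.5 is the USER_IN_DEF_SPF_WL score quoted elsewhere in this thread; 5.0 is SpamAssassin's default required_score. The 20-point total for other rule hits is a made-up number, and is_spam is a hypothetical helper, not SA code.

```python
REQUIRED_SCORE = 5.0  # SpamAssassin's default required_score


def is_spam(rule_points: float, whitelist_points: float) -> bool:
    """Apply the threshold to the sum of rule hits and whitelist score."""
    return rule_points + whitelist_points >= REQUIRED_SCORE


# Hypothetical: a compromised 'trusted' sender whose message hits
# other rules worth 20 points in total.
other_hits = 20.0

print(is_spam(other_hits, -100.0))  # whitelist_auth: False
print(is_spam(other_hits, -7.5))    # def_whitelist_* style: True
```

With -100 no realistic set of other rules can pull the message back over the threshold, while the def_ score still lets strong evidence win.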


On 26 Nov 2017, at 12:04 (-0500), Dave Jones wrote:

> The current 60_whitelist_spf.cf is 11 years old.  What does everyone 
> think about starting a 60_whitelist_auth.cf and extending this list to 
> known good senders like *@alertsp.chase.com and 
> *@email.dropboxmail.com?
>
> My SA platform has very good results with thousands of whitelist_auth 
> entries but 98% of the SA users are not going to know to create/manage 
> these entries themselves.  Combined with other rules this also helps 
> with spoofing legit senders like the IRS, Bank of America, etc.  I am 
> not suggesting we put thousands of entries in the new 
> 60_whitelist_auth.cf but the common, high-profile, large senders that 
> often get spoofed.
>
> The current list of def_whitelist_from_spf entries is very beneficial 
> and should be extended now that SPF and DKIM are widely deployed and 
> are being taken seriously by the major mail hosting providers like 
> Google.
>
> Thanks,
>
> Dave






Re: Extending the entries in 60_whitelist_spf.cf

2017-11-28 Thread Bill Cole

On 27 Nov 2017, at 10:22 (-0500), RW wrote:


> On Sun, 26 Nov 2017 23:54:12 -0500
> Bill Cole wrote:
>
>> Any whitelisting in the default ruleset should carry MUCH lower
>> weight than local explicit whitelisting ... NO sender should get a
>> default -100 just because we (SA maintainers) think they generally
>> mean well.
>
> This isn't new functionality, there are already such default
> whitelisting entries based on
>
> def_whitelist_from_rcvd
> def_whitelist_from_spf
> def_whitelist_from_dkim

I'm entirely aware of that...

> The proposal is to add extra entries based on def_whitelist_auth, which
> is a shorthand for separate def_whitelist_from_spf and
> def_whitelist_from_dkim entries.


Well, the actual *COMMIT TO TRUNK* 
(http://svn.apache.org/viewvc?rev=1816394&view=rev) uses whitelist_auth 
for 6 entities, which IMHO is a terrible idea for the reasons I noted in 
my prior message.


Also terrible: whitelisting facebookmail.com, which really should be in 
the freemail domains list. Looks like dropboxemail.com is fine (legit 
use is DB->user mail) but email.dropbox.com belongs in the freemail 
domains as well.



> The current entries are a bit incoherent. The scores are:
>
> score USER_IN_DEF_WHITELIST -15.000  (from def_whitelist_from_rcvd)
> score USER_IN_DEF_SPF_WL    -7.500
> score USER_IN_DEF_DKIM_WL   -7.500
>
> which suggests that a lot of overlap is expected on the latter two. But
> the great majority of address globs are only for dkim.
>
> I think a case can be made for transferring most of the score to a
> single metarule.


I agree.


> And, personally, I think -15 is a bit too much.


I mostly agree. Fooling def_whitelist_from_rcvd (given the actual list) 
is likely a harder target than finding a permissively-typo'd SPF record 
or cracking an account in one of the many domains in the other two, so 
I've got no problem with it being as strong as both of them combined. 
However, in my experience it is really rare for FP's to score more than 
8 (absent grossly over-scored local rules) so maybe cutting each of 
those in half would make sense.
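
The arithmetic behind the halving suggestion can be sketched in Python. The three scores are the ones quoted above and 5.0 is SpamAssassin's default required_score; the 8-point false positive is the rough upper bound mentioned in the paragraph, and the helpers halved and final_score are hypothetical.

```python
REQUIRED_SCORE = 5.0  # SpamAssassin's default required_score

# Scores as quoted in this thread.
CURRENT = {
    "USER_IN_DEF_WHITELIST": -15.0,
    "USER_IN_DEF_SPF_WL": -7.5,
    "USER_IN_DEF_DKIM_WL": -7.5,
}


def halved(scores: dict) -> dict:
    """Return a copy of the score table with every score cut in half."""
    return {name: value / 2 for name, value in scores.items()}


def final_score(base: float, hits: list, scores: dict) -> float:
    """Add the scores of the whitelist rules that hit to a base score."""
    return base + sum(scores[name] for name in hits)


fp_base = 8.0  # a false positive at the high end described above

for label, table in (("current", CURRENT), ("halved", halved(CURRENT))):
    single = final_score(fp_base, ["USER_IN_DEF_DKIM_WL"], table)
    print(label, single, single < REQUIRED_SCORE)
```

Under these assumptions a single DKIM hit still rescues an 8-point false positive after halving (8 - 3.75 = 4.25), and SPF plus DKIM together equal the def_whitelist_from_rcvd score in both tables.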





Re: Extending the entries in 60_whitelist_spf.cf

2017-11-26 Thread Bill Cole

On 26 Nov 2017, at 20:00 (-0500), John Hardin wrote:


> On Sun, 26 Nov 2017, Axb wrote:
>
>> On 11/26/2017 06:04 PM, Dave Jones wrote:
>>> The current 60_whitelist_spf.cf is 11 years old.  What does everyone think
>>> about starting a 60_whitelist_auth.cf and extending this list to known
>>> good senders like *@alertsp.chase.com and *@email.dropboxmail.com?
>>>
>>> My SA platform has very good results with thousands of whitelist_auth
>>> entries but 98% of the SA users are not going to know to create/manage
>>> these entries themselves.  Combined with other rules this also helps with
>>> spoofing legit senders like the IRS, Bank of America, etc.  I am not
>>> suggesting we put thousands of entries in the new 60_whitelist_auth.cf but
>>> the common, high-profile, large senders that often get spoofed.
>>>
>>> The current list of def_whitelist_from_spf entries is very beneficial and
>>> should be extended now that SPF and DKIM are widely deployed and are being
>>> taken seriously by the major mail hosting providers like Google.
>>
>> +1
>>
>> Pls remember the "ifplugin" :)
>
> +1 as well.


Conditional +1 from me...

Any whitelisting in the default ruleset should carry MUCH lower weight 
than local explicit whitelisting. See scoring for USER_IN_* rules as a 
template. Frankly, half of the 6 entities in today's commit have spammed 
me personally AND hit pure spamtraps multiple times over the space of 
years. While I will stipulate that they mostly send legit mail to people 
who want it, I also know with absolute certainty that they also send 
mail to people who they have no business mailing and repeatedly to 
addresses that no legitimate sender would try sending to more than once. 
NO sender should get a default -100 just because we (SA maintainers) 
think they generally mean well.


--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Currently Seeking Steady Work: https://linkedin.com/in/billcole


Re: warn: Use of uninitialized value $4 in concatenation (.) or string at Mail/SpamAssassin/Plugin/URIDNSBL.pm line 1042.

2017-06-08 Thread Bill Cole

On 7 Jun 2017, at 22:52, Philip Prindeville wrote:


> I’m still seeing this now.
>
> And yes, for spamhaus.  And yes, $ip is being passed in as ‘(‘.  It should
> be possible to get a stack trace and figure out why it’s being passed in
> as that value...
>
> Are we any closer to having a fix for this?


This bug has patches which have been integrated in svn:
   https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7231

This is my analysis of the root cause in a duplicate bug:
   https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7339#c2



> I’m leaving a file here:
>
> http://www.redfish-solutions.com/downloads/spamhaus.eml
>
> where I can reproduce the issue with:
>
> % spamassassin -D check -t spamhaus.eml


This sample does not produce the error for me using a patched SA 3.4.1.


Re: PROPOSAL: Be stricter about creating Bugzilla issue before committing

2017-04-15 Thread Bill Cole

On 14 Apr 2017, at 22:40, Sidney Markowitz wrote:

> I propose that we make a practice of 1) Create a Bugzilla issue for anything
> we commit to source files or to any other files that are part of the build
> process; 2) Include the text "bug " somewhere in the commit message.
>
> Comments, votes?


+1
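
A convention like this is easy to check mechanically. Below is a minimal Python sketch of such a check; it is purely illustrative and not an existing hook in the SpamAssassin repository. The proposal as archived only says the text "bug " should appear, so requiring a number after it is an assumption, and the sample commit messages are hypothetical.

```python
import re

# Assumption: a Bugzilla reference looks like "bug" followed by digits.
BUG_REF_RE = re.compile(r"\bbug\s+\d+\b", re.IGNORECASE)


def references_bug(commit_message: str) -> bool:
    """Return True if the message contains something like 'bug 1234'."""
    return BUG_REF_RE.search(commit_message) is not None


# Hypothetical commit messages, for illustration only.
for message in [
    "bug 7231: example fix",
    "Example cleanup (Bug 7339)",
    "debugging output removed",
    "fix bug in parser",
]:
    print(f"{message!r}: {references_bug(message)}")
```

The word boundary keeps "debugging" from counting, and a bare "bug" with no number is rejected under the stated assumption.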


Re: M

2016-10-30 Thread Bill Cole

On 30 Oct 2016, at 4:53, Steve Sargent wrote:


> Does anyone run SA on an iMac running OSX?


Yes, but that's really a better topic for the 
us...@spamassassin.apache.org list rather than here on the developers 
list.



> If so can you offer any tips on how to do it please?


Via the MIMEDefang milter, attached to Postfix. Note that this is 
probably not the class of answer you were actually looking for and even 
if it is, it is not the one that most people using an iMac as a mail 
server would give. Since I suspect you are looking for a way to use SA 
client-side, you are far more likely to find someone who does that on 
the users' mailing list. Note that SA is mostly used on mail servers 
rather than on client machines, but since it's a suite of Perl modules 
and scripts rather than a single monolithic program, it's possible in 
principle to integrate it with any mail client that has a plugin 
mechanism. Whether anyone has actually done that is a question I cannot 
answer.


(NOTE REPLY-TO HEADER)


Re: Cpan build of 3.4.1 fails on sa-compile tests (Debian)

2016-10-10 Thread Bill Cole

On 10 Oct 2016, at 13:25, Jari Fredriksson wrote:


> Jari Fredriksson wrote on 10.10.2016 20:03:
>
>> Jari Fredriksson wrote on 10.10.2016 19:18:
>>
>>> Says:
>>>
>>> t/sa_compile.t  1/? # Failed test 1 in t/sa_compile.t at line 149
>>> Not found: FOO =  check: tests=FOO  at t/sa_compile.t line 150.
>>> # Failed test 2 in t/SATest.pm at line 755
>>> '/root/.cpan/build/Mail-SpamAssassin-3.4.1-2/t/log/d.sa_compile/inst.basic/foo//local/bin/sa-compile --keep-tmps' failed: DIED, signal 127 () at t/SATest.pm line 991.
>>>
>>> What could this mean?


That SA bugs 7005, 7181, and 7188 remain an unresolved tangle, which is 
partly my fault...


>> Appears to mean that I have to "force install" and then try to 
>> sa-compile. That way I get the missing dependencies all right.


That might work, IF this was a simple dependencies problem. It is not.

For Perl modules, a better approach than doing a "force install" in the 
CPAN shell and seeing how the installed module breaks is this:


 o conf make_arg "TEST_VERBOSE=1"
 test 

That gives you much more detail from the bare 'make test' that CPAN runs 
for you.


>> There might be a place for a bug ticket for the build system, but 
>> can't say if it's cpan or spamassassin...


This specific problem is a match for SA bug 7181, but I don't fully 
understand it in context. It could perhaps be Debian-specific, if the 
Debian package does something special with the file layout that the test 
isn't prepared for.


The t/sa_compile.t test seems mis-designed to me, since it doesn't test 
the current SA build but rather an entirely new build that it creates 
from scratch. Quoting Kevin from bug 7181, it is "a bit odder than all 
the other tests and has issues with prefix issues." I suspect the prefix 
issues are ALL of its issues, since the only way to fail test 1 in this 
particular way is if the special spamassassin script built just for the 
sa_compile.t tests fails to find the special rules file containing one 
rule that is created just for the sa_compile.t tests.


In short: the spamassassin script which was built by sa_compile.t isn't 
looking for rules where sa_compile.t is putting them.


This is broken, but it is entirely breakage in sa_compile.t, which does 
not use the configuration of the build which it supposedly is testing or 
any of the parts of SpamAssassin influenced in any way by that config.



>> I'm going to get this installed now.
>
> Not really. Bizarre results from sa-compile after force install:
>
> http://pastebin.ca/3727286


From the top:

each on reference is experimental at 
/usr/local/share/perl/5.20.2/Mail/SpamAssassin/Plugin/URILocalBL.pm 
line 353,  line 717.
keys on reference is experimental at 
/usr/local/share/perl/5.20.2/Mail/SpamAssassin/Plugin/URILocalBL.pm 
line 377,  line 717.
keys on reference is experimental at 
/usr/local/share/perl/5.20.2/Mail/SpamAssassin/Plugin/URILocalBL.pm 
line 406,  line 717.


These warn against the use of an experimental feature (removed in Perl 
5.24.0) and that usage was removed by this: 
https://svn.apache.org/viewvc?view=revision&revision=1684653


plugin: failed to parse plugin (from @INC): Can't locate object 
method "lib_version" via package "Geo::IP" at 
/usr/local/share/perl/5.20.2/Mail/SpamAssassin/Plugin/URILocalBL.pm 
line 117,  line 717.


That implies that you need to install the MaxMind "Legacy" GeoIP 
library. https://github.com/maxmind/geoip-api-c is the upstream, dunno 
if there's a Debian package.


The many warnings about Bug 6558 are harmless. They warn of a real (if a 
bit unusual) potential issue if you happen to be compiling rules that 
you're planning to move in compiled form to a system with an old Perl 
and SA. Once a possibly-bad rule has been compiled once on a system, 
sa-compile will never warn on that rule again unless and until it is 
changed, since sa-compile maintains a cache of compiled rules.


Re: Rule updates are too old - 2016-10-10

2016-10-10 Thread Bill Cole

On 10 Oct 2016, at 5:00 -0400, dar...@chaosreigns.com wrote:


> SpamAssassin version 3.3.1 has not had a rule update since 2016-10-09.
>
> 20161009:  Spam and ham are above threshold of 150,000:
> http://ruleqa.spamassassin.org/?daterev=20161009
>
> 20161009:  Spam: 192155, Ham: 192458
>
> The spam and ham counts on which this script alerts are from
> http://ruleqa.spamassassin.org/?daterev=20161009
> Click "(source details)" (it's tiny and low contrast).
> It's from the second and third columns of the line that ends with
> "(all messages)"
>
> The source to this script is
> http://www.chaosreigns.com/sa/update-version-mon.pl
>
> It looks like both the weekly and nightly masschecks need to have sufficient
> corpora in order for an update to be generated.


I'm a bit new here so maybe I'm missing something, but aren't these 
notifications bogus? It seems so, because I also see these:


On 9 Oct 2016, at 22:37 -0400, spamassassin_r...@apache.org wrote:

> Author: spamassassin_role
> Date: Mon Oct 10 02:37:50 2016
> New Revision: 1764017
>
> URL: http://svn.apache.org/viewvc?rev=1764017&view=rev
> Log:
> updated scores for revision 1763946 active rules added since last mass-check


On 10 Oct 2016, at 4:51 -0400, spamassassin_r...@apache.org wrote:

> Author: spamassassin_role
> Date: Mon Oct 10 08:51:39 2016
> New Revision: 1764032
>
> URL: http://svn.apache.org/viewvc?rev=1764032&view=rev
> Log:
> promotions validated
>
> Added:
>     spamassassin/tags/sa-update_3.4.2_20161010085037/
>       - copied from r1764031, spamassassin/trunk/



And it seems we've had such daily updates for all of October, yet I see 
4 notifications from this script. What am I missing?
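
For reference, the corpus check that the notification describes can be sketched in a few lines of Python using the numbers it quotes. This is only a guess at the comparison involved; I have not read update-version-mon.pl, and corpus_sufficient is a hypothetical name.

```python
THRESHOLD = 150_000  # threshold quoted in the notification


def corpus_sufficient(spam: int, ham: int, threshold: int = THRESHOLD) -> bool:
    """Return True if both corpora are above the threshold."""
    return spam > threshold and ham > threshold


# Counts quoted for 20161009.
print(corpus_sufficient(192155, 192458))  # True
```

By the quoted numbers the corpora were large enough on that date, which matches the sa-update tag commit above and makes the "too old" alert look inconsistent.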