On 22-05-2023 15:53, Bill Cole wrote:
On 2023-05-21 at 07:02:39 UTC-0400 (Sun, 21 May 2023 13:02:39 +0200)
Tom Hendrikx <[email protected]>
is rumored to have said:

Hi,

For the last few years I have been contributing nightly masscheck data from my personal MTA setup. This has resulted in a rather small dataset compared to other contributors, but it seemed useful to me, mainly because of the non-English ham corpus (I have no idea whether that is a valid assumption, though).

THANK YOU!

One of the things that worries me most about SA is that we don't have a robust and diverse community of masscheck contributors. I don't have great ideas to fix that, but I am always grateful for the people who have put in the effort for the community.

I always feel a bit icky about the big userbase that depends on (projects like) SpamAssassin, and the small amount of contributions or peer review to code or rules. This is what I can contribute to improve that situation, and I am happy to actually do so.


A few weeks ago I moved to a new MTA setup, where I no longer perform spam/virus scanning myself; an upstream provider does this, and all mail (including all spam and virus content) is delivered with appropriate headers.

I have no problem spending some time on setting up a new masscheck job that uses the new corpus and tuning it to ignore the upstream filter result headers etc., but I'd rather not invest the time if you think that such a feed is not beneficial to the ruleqa process.

I'd be happy to hear your thoughts.

I think we need as large and as diverse a collection of masscheck contributors as we can get. I am reluctant to ask you to add work to what I presume is a project to reduce your email efforts, but I hope you will continue to submit your results.


My professional interest in email security diminished years ago after switching jobs, but my personal interest has always stayed alive. The amount of time it takes to properly maintain a fully self-managed email setup is the reason that I'm switching to an external provider (for instance, the old setup was running Ubuntu 16.04 with the distro-provided SpamAssassin, version 3.4.2).

I'm still self-hosting though (on a new server), so I have full control over what to do with all messages. I have no problem with setting up mass-checks again with that dataset: corpus sorting is part of my daily routine, and running mass-checks is pretty effortless once correctly set up.

I'll make an attempt at a new mass-check routine shortly.

Kind regards,
Tom
