Hi,
For the last few years I have been contributing nightly masscheck data from
my personal MTA setup. This has resulted in a rather small dataset
compared to other contributors, but it seemed useful to me, mainly
because of the non-English ham corpus (I have no idea whether that is a
valid assumption, though).
A few weeks ago I moved to a new MTA setup, where I no longer perform
spam/virus scanning myself. An upstream provider does this instead,
and all mail (including all spam and virus content) is delivered with
appropriate headers.
I have no problem spending some time on setting up a new masscheck job
that uses the new corpus and tuning it to ignore the upstream filter
result headers etc., but I'd rather not invest the time if you think that
such a feed would not be beneficial to the ruleqa process.
I'd be happy to hear your thoughts.
PS 1: There are also some spamtraps that don't use the upstream service,
but the corpus they contribute is quite small.
PS 2: The masscheck data contributed over the last few weeks is not based
on messages delivered through this upstream provider; only the existing
corpus from the old setup was used.
Kind regards,
Tom