Bill Cole <sausers-20150...@billmail.scconsult.com> writes:

> On 2022-03-04 at 09:18:08 UTC-0500 (Fri, 04 Mar 2022 09:18:08 -0500)
> Greg Troxel <g...@lexort.com>
> is rumored to have said:
>
>> Greg Troxel <g...@lexort.com> writes:
>>
>>> With stock scores, sendgrid gets
>>>
>>>   2.1 URIBL_GREY    Contains an URL listed in the URIBL greylist
>>>                     [URIs: sendgrid.net]
>>>   1.5 KAM_SENDGRID  Sendgrid being exploited by scammers
>>>
>>> and I find 3.6 a bit much.
(sorry, URIBL_GREY is only 1.1, so that's 2.6 between them)

> Note that those are quasi-independent rules. URIBL looks at all of the
> URIs in a message. KAM_SENDGRID only hits mail transferred through
> Sendgrid where the From header and envelope sender addresses are in
> unrelated domains.

I meant only that for this particular sender, both rules hit.

> I may be wrong, but I do not believe that all Sendgrid ham will hit
> either of those rules, although much surely will hit both. The KAM
> rules don't go through QA that would reveal their overlap/independence
> as the stock rules do, so there's not a good way that I can check.

I am unclear on whether KAM_SENDGRID is supposed to hit legitimate mail
from Sendgrid; it does hit this particular class of ham. It sounds like
you think at least some Sendgrid ham will hit it. Return-Path: seems to
match __KAM_SENDGRID1A, Received looks like it matches __KAM_SENDGRID2,
and the From: is from the government office's domain.

>>> But maybe 72% of what sendgrid sends is
>>> spam?  (Knowing the spam % is actually a serious question.)
>>
>> sorry, didn't quite get back to stock for that test, so I think it's
>> only 1.1+1.5=2.6, so tuned for 52% spam...
>
> FWIW, that is NOT how the math works for score determination. Even for
> the stock rules which get programmatically adjusted as a set, that's
> not a "tuning" target that would be useful or even have a calculable
> solution.

Sorry, I do know that, but what I was trying to get at, and did so
badly, was that if a rule has a score of 2.5, then I would expect a
fairly large fraction of the messages that trigger it to be spam.
Otherwise, I would expect the tuning algorithms to reduce that score.

> The rule score tuning doesn't really pay any attention to aggregate
> score values except in >/< relation to the threshold. If 100% of a
> sender's mail is ham that just happens to score 4.2, that's great. If
> it is 100% spam, all scoring 5.2, that's also great.
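As a rough illustration of the heuristic Bill describes (mail relayed
through Sendgrid where the From: header domain and the envelope-sender
domain are unrelated), here is a sketch in Python. The function names
and the exact "unrelated domains" test are my own illustration, not the
actual KAM sub-rules:

```python
# Illustrative sketch, NOT the real KAM_SENDGRID rule: flag mail that
# came via sendgrid.net and whose From: domain is unrelated to the
# envelope-sender (Return-Path) domain.

def domain_of(addr: str) -> str:
    """Crude domain extraction from an email address."""
    return addr.rsplit("@", 1)[-1].lower().strip(">")

def looks_like_kam_sendgrid(return_path: str, from_hdr: str,
                            received: str) -> bool:
    via_sendgrid = "sendgrid.net" in received.lower()
    envelope_dom = domain_of(return_path)
    from_dom = domain_of(from_hdr)
    # "Unrelated domains": neither domain is a suffix of the other.
    # (A real rule would respect label boundaries; this is a sketch.)
    unrelated = not (envelope_dom.endswith(from_dom)
                     or from_dom.endswith(envelope_dom))
    return via_sendgrid and unrelated

# Envelope sender in sendgrid.net, From: in an unrelated domain:
print(looks_like_kam_sendgrid(
    "<bounces@sendgrid.net>",
    "someone@victim.example",
    "from o1.sendgrid.net (o1.sendgrid.net [1.2.3.4])"))  # True
```

By this reading, ham where the Return-Path stays within (a subdomain
of) the From: domain would not hit, which is why the government-office
case above is worth puzzling over.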
> If it is a 50/50
> mix that SA scores perfectly at either 4.2 or 5.2, that would be
> astoundingly good. Message scores do NOT have a score distribution
> that can be approximated by any combination of statistically useful
> distributions which could support the sort of score arithmetic you are
> positing.

I see your point, but it would be interesting to see the per-rule %spam
data (relative to some background ham/spam a priori rate), perhaps as a
scatter plot against score.

Also, given how things are, if ham scored 4.2, it would take very
little (a single 1-point rule, or two 0.5-point rules, triggering or
not) to push it over. So while 4.2 is a good score for ham in the
metrics, it is not, in my view, a good score for a ham message when
viewed over the ensemble of other things that are likely to happen.

All I'm really trying to say is that ham getting 2.5 from one rule
moves it halfway to the threshold, where it gets marked as spam if the
rest of the rules contribute >= 2.5.

> I wish Justin had originally made the base score -5 and the threshold
> 0. It's 20 years too late to fix that, but it would have made it
> easier for people to avoid wrong mathematical assumptions about the
> value of the aggregate score of a message.

I do know how scores are determined for the base ruleset (and above you
said the KAM scores aren't determined that way, I think). And I know
it's against doctrine, but I find that the odds of spam change from
near 0 at -2 to near 1 at >= 4. Just above about 2, it's roughly 50%,
and the relationship is not linear. Because of that I treat a 3
differently from a < 1, putting messages scoring around 3 in a
maybe-spam folder that is not allowed to show up on my phone. I know
that's not how SA's "was this message scored correctly" is defined, but
I find this sort of sorting very useful.

The message in question did actually get to 5.0. I've tweaked scores,
both up and down, so I know that doesn't technically count.
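For what it's worth, the three-way sort I described could be sketched
like this (the 3.0 band is my personal choice; only the 5.0 threshold
is SpamAssassin's default required_score):

```python
# Sketch of a three-way sort by SpamAssassin score. The 3.0 cutoff for
# the "maybe-spam" band is a personal setting, not an SA default.

def sort_by_score(score: float) -> str:
    if score >= 5.0:   # SA's default required_score: treat as spam
        return "spam"
    if score >= 3.0:   # maybe-spam band, kept off the phone
        return "maybe-spam"
    return "inbox"

print(sort_by_score(0.5))  # inbox
print(sort_by_score(3.6))  # maybe-spam
print(sort_by_score(5.0))  # spam
```

This deliberately ignores SA's own binary ham/spam verdict and just
uses the score as a rough (nonlinear) confidence signal.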