Re: Score 0.001

2024-05-10 Thread Bowie Bailey

On 5/10/2024 2:57 AM, Thomas Barth wrote:

On 2024-05-10 06:19, Reindl Harald (privat) wrote:

On 10.05.24 at 00:05, Thomas Barth wrote:

On 2024-05-09 21:41, Loren Wilton wrote:
Low-score tests are neither spam nor ham signs by themselves. They 
can be used in metas in conjunction with other indicators to help 
determine ham or spam. A zero value indicates that a rule didn't 
hit and the sign is not present. A small score indicates that the 
rule did hit, so the sign it is detecting is present.


0.001 seems to be the default lowest value. Is it possible to change 
it to 0.01 or 0.1?


what do you not understand in meta tests?
it's irrelevant if it's 0.001 or 0.01

these rules are used in combination with other rules

HTML_MESSAGE or SPF_HELO_NONE alone don't mean anything and so it 
must not score higher - it makes only sense combined with other rules


Most of the messages I receive only have a few hits because they 
hardly differ from a regular e-mail. That's why I want to assign a 
higher value to the individual tests. I don't know how many of the 
possible tests with a value of only 0.001 exist. With this value, 
theoretically 1000 different tests would have to be positive in order 
to achieve a total value of 1. Therefore, it is not irrelevant whether 
I have a minimum value of 0.001 or 0.1. I would even go further and 
say if there are more than 10 tests with a positive value: Spam! 
Either the strike level is reached or there are more than 10 tests 
with a positive value. So now I repeat my question: is it possible to 
increase the minimum value to 0.1 by default?


Going through this thread, I note that a few people have said "they are 
used in metas", but no one has actually given an example of how that works.


The rules with the low scores are not intended to contribute to the spam 
score for the email.  They only have a defined score at all because if 
the score is 0, SA will not run the rule.


It works like this:

Rule A has a score of 0.001
Rule B has a score of 0.001
Rule C is a meta that matches if both A and B match, and has a score of 5

It doesn't matter how small the scores are for rule A and B.  The only 
thing that matters is the score for rule C.  If only A matches, then it 
adds 0.001 to the score and the email is not spam.  If only B matches, 
then you get the same result.  But if they both match, then you get a 
score of 5.002.  The entire point of the 0.001 score is that you could 
match 100 of these rules and not affect the spam score.
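
To make the arithmetic above concrete, here is a minimal Python sketch of the A/B/C example. It is only a toy model of how the scores add up, not how SA evaluates rules internally; the names A, B and C and the helper total_score are just the placeholders from this example.

```python
# Toy model of the A/B/C example: names and scores are the placeholders
# used in the explanation above, not real SpamAssassin rules.
SCORES = {"A": 0.001, "B": 0.001, "C": 5.0}


def total_score(hits):
    """Sum the scores of the rules that hit.

    C is modelled as a meta rule that fires only when A and B both hit.
    """
    fired = set(hits)
    if {"A", "B"} <= fired:
        fired.add("C")
    return round(sum(SCORES[name] for name in fired), 3)


for hits in (["A"], ["B"], ["A", "B"]):
    print(hits, total_score(hits))
```

Only the last case reaches 5.002; either rule on its own contributes 0.001.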


These rules are generally things like, "the email has HTML", "there is 
an SPF check", "there is a google drive link", etc.  On their own, they 
do not mean anything, but metas can combine these low-scored rules into 
meaningful patterns that are then given larger scores.


--
Bowie


Re: Score 0.001

2024-05-10 Thread Bill Cole
On 2024-05-10 at 14:15:56 UTC-0400 (Fri, 10 May 2024 14:15:56 -0400)
Bill Cole 
is rumored to have said:

> On 2024-05-09 at 18:19:14 UTC-0400 (Thu, 9 May 2024 15:19:14 -0700)
> jdow 
> is rumored to have said:
>
>> On 20240509 15:05:46, Thomas Barth wrote:
>>> On 2024-05-09 21:41, Loren Wilton wrote:
>>>> Low-score tests are neither spam nor ham signs by themselves. They can be 
>>>> used in metas in conjunction with other indicators to help determine ham 
>>>> or spam. A zero value indicates that a rule didn't hit and the sign is not 
>>>> present. A small score indicates that the rule did hit, so the sign it is 
>>>> detecting is present.
>>>
>>> 0.001 seems to be the default lowest value. Is it possible to change it to 
>>> 0.01 or 0.1?
>
> Sure. It's just a number.

Clarifying: you can change any score yourself on your own system locally if you 
like, but to make no rule ever score 0.001 you'd need to fix the scores for all 
low-score rules every time that you run sa-update. As John Hardin says, we will 
not be changing the default to 0.1 in the rules distribution; that would be too 
significant a value. I also think that there is value in having matched rules 
showing up in the long form (folded header) of the SA report with "0.0" if they 
are intended to have no direct impact on the ham/spam decision.
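
For anyone who wants to do this locally anyway, a rough Python sketch of the bookkeeping might look like the following. It assumes the rule text uses plain "score NAME value [value value value]" lines; the function name local_overrides and the 0.1 floor are made up for illustration, and the output would have to be regenerated after every sa-update run.

```python
import re

# Matches lines such as "score HTML_MESSAGE 0.001" or a four-value score
# line, ignoring a trailing "# comment".
SCORE_LINE = re.compile(r"^\s*score\s+(\S+)\s+(.+?)\s*(?:#.*)?$")


def local_overrides(rules_text, floor=0.1, threshold=0.001):
    """Return 'score' lines that raise tiny positive scores to `floor`.

    Only rules whose listed scores are all positive and no larger than
    `threshold` are touched. Both parameters are illustrative assumptions.
    """
    overrides = []
    for line in rules_text.splitlines():
        match = SCORE_LINE.match(line)
        if not match:
            continue
        name, raw_values = match.groups()
        try:
            values = [float(value) for value in raw_values.split()]
        except ValueError:
            continue  # not a plain numeric score line, leave it alone
        if values and all(0 < value <= threshold for value in values):
            overrides.append(f"score {name} {floor}")
    return overrides
```

The returned lines could be pasted into a local configuration file, which is the "fix the scores yourself" step described above.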


-- 
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Fwd: Re: Rule: "1.0 R_DCD 90% of .com. is spam"

2024-05-10 Thread Benny Pedersen

oh dear, when does he stop?

 Original message 
Subject: Re: Rule: "1.0 R_DCD 90% of .com. is spam"
Date: 2024-05-10 20:17
From: "Reindl Harald (gmail)" 
To: Benny Pedersen 

On 10.05.24 at 20:14, Benny Pedersen wrote:

On 2024-05-10 18:46, Matus UHLAR - fantomas wrote:

On 10.05.24 15:36, Rupert Gallagher wrote:
The ikea mail was received through ... 
mta-numbers.ikea.com.sparkpostmail.com and is a request for feedback.


The SA rule says ...

header R_DCD Received =~ /\.com\./

I still do not know where the rule comes from, DCD may actually mean 
dot-com-dot, and perhaps it is true that they are mostly spam.


where is the rule stored? what file?


On May 10, 2024, 17:18, Rupert Gallagher wrote:
I only have stock and KAM, and it is definitely not a custom rule of 
mine.


grep -r '\.com./' /var/lib/spamassassin/4.00/

seems some good dot.com rules everywhere


and what has this to do with the other idiot?
go and eat shit you dumb list spammer


Re: Score 0.001

2024-05-10 Thread Bill Cole
On 2024-05-10 at 11:00:45 UTC-0400 (Fri, 10 May 2024 08:00:45 -0700 (PDT))
John Hardin 
is rumored to have said:

> Note that poorly-performing rules may get a score that looks informational, 
> but that may change over time based on the corpora.

IOW: rules that in themselves are not good enough performers to get included in 
the daily active list will still be pulled into the active list with a trivial 
score if derivative meta rules which are good enough for real scores depend on 
them.

-- 
Bill Cole


Re: Score 0.001

2024-05-10 Thread Bill Cole
On 2024-05-09 at 18:19:14 UTC-0400 (Thu, 9 May 2024 15:19:14 -0700)
jdow 
is rumored to have said:

> On 20240509 15:05:46, Thomas Barth wrote:
>> On 2024-05-09 21:41, Loren Wilton wrote:
>>> Low-score tests are neither spam nor ham signs by themselves. They can be 
>>> used in metas in conjunction with other indicators to help determine ham or 
>>> spam. A zero value indicates that a rule didn't hit and the sign is not 
>>> present. A small score indicates that the rule did hit, so the sign it is 
>>> detecting is present.
>>
>> 0.001 seems to be the default lowest value. Is it possible to change it to 
>> 0.01 or 0.1?

Sure. It's just a number.

> 1) This cyberunit is unwarrantedly curious, why does this matter to you?
>
> 2) Probably not as  it may be related to how perl handles numbers.

Not so much. SA has no need for high-precision floating-point math so there is 
nothing special about 0.001 or 0.0001 or any other small number.

The reason for such low scores is to assure that the rule is checked, even if 
no other rule depends on it. Such rules usually are a component in multiple 
other meta rules that have more significant scores, but are not significantly 
spam or ham signs on their own.

-- 
Bill Cole


Re: Rule: "1.0 R_DCD 90% of .com. is spam"

2024-05-10 Thread Benny Pedersen

On 2024-05-10 18:46, Matus UHLAR - fantomas wrote:

On 10.05.24 15:36, Rupert Gallagher wrote:
The ikea mail was received through ... 
mta-numbers.ikea.com.sparkpostmail.com and is a request for feedback.


The SA rule says ...

header R_DCD Received =~ /\.com\./

I still do not know where the rule comes from, DCD may actually mean 
dot-com-dot, and perhaps it is true that they are mostly spam.


where is the rule stored? what file?


On May 10, 2024, 17:18, Rupert Gallagher wrote:
I only have stock and KAM, and it is definitely not a custom rule of 
mine.


grep -r '\.com./' /var/lib/spamassassin/4.00/

seems some good dot.com rules everywhere




Re: Rule: "1.0 R_DCD 90% of .com. is spam"

2024-05-10 Thread Bill Cole
On 2024-05-10 at 11:08:53 UTC-0400 (Fri, 10 May 2024 15:08:53 +)
Rupert Gallagher 
is rumored to have said:

> R_DCD

That string does not occur anywhere in the SpamAssassin distribution, neither 
in the code nor in the rules, *including* the rules that are not currently 
performing well enough to be in the active list.

If your system generated that hit, it is one of your own local rules. If it 
came from elsewhere, ask them.
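
One way to track down where a local rule is defined is to search the rule files for its name. Below is a small Python sketch of that search; the helper find_rule is hypothetical, and which directories hold rule files on a given system is an assumption you would need to check for your own install.

```python
import os
import re


def find_rule(rule_name, rule_dirs):
    """Yield (path, line_number, line) for lines in *.cf files that
    mention rule_name as a whole word.

    rule_dirs is whatever list of directories your installation reads
    rules from; directories that do not exist are silently skipped.
    """
    pattern = re.compile(r"\b" + re.escape(rule_name) + r"\b")
    for rule_dir in rule_dirs:
        for root, _dirs, files in os.walk(rule_dir):
            for filename in sorted(files):
                if not filename.endswith(".cf"):
                    continue
                path = os.path.join(root, filename)
                with open(path, encoding="utf-8", errors="replace") as handle:
                    for number, line in enumerate(handle, start=1):
                        if pattern.search(line):
                            yield path, number, line.rstrip("\n")
```

For example, list(find_rule("R_DCD", ["/var/lib/spamassassin"])) would search the update directory mentioned elsewhere in this thread; a site configuration directory would be passed the same way.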



-- 
Bill Cole


Re: Whitelist rules should never pass on SPF fail

2024-05-10 Thread Bill Cole
On 2024-05-09 at 17:21:07 UTC-0400 (Fri, 10 May 2024 07:21:07 +1000)
Noel Butler 
is rumored to have said:

> So what? domain owners state hard fail it SHOULD be hard failed, irrespective 
> of if YOU think you know better than THEM or not, if we hardfail we accept 
> the risks that come with it.

In principle, that is fine (as a demonstration of why some principles are 
pointless and do more harm than good...)

In practice, there is an ordering of whose wishes I prioritize on the 
receiving systems I work with. If my customer wants to receive the mail and the 
individual generating the mail is not generating that desire fraudulently, I 
don't care much about what the domain owner says. I do not work for the domain 
owners of the world and I am not obligated to enforce their usage rules on 
their users. Obviously I take their input seriously when trying to detect fraud 
but I've seen too many cases of "-all" being used with incomplete or obsolete 
lists of "permitted" hosts to accept that they know all of the places their 
mail gets generated.

I've also given up all hope that the few places still doing transparent 
forwarding will ever adopt SRS or any other mechanism to avoid SPF breakage. 
There is no ROI in trying to fix such cases 
individually but users still want their college email addresses to work decades 
after graduating and some colleges have pandered to them. So have some 
professional orgs.


-- 
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Re: Rule: "1.0 R_DCD 90% of .com. is spam"

2024-05-10 Thread Matus UHLAR - fantomas

On 10.05.24 15:36, Rupert Gallagher wrote:

The ikea mail was received through ... mta-numbers.ikea.com.sparkpostmail.com 
and is a request for feedback.

The SA rule says ...

header R_DCD Received =~ /\.com\./

I still do not know where the rule comes from, DCD may actually mean 
dot-com-dot, and perhaps it is true that they are mostly spam.


where is the rule stored? what file?


On May 10, 2024, 17:18, Rupert Gallagher wrote:

I only have stock and KAM, and it is definitely not a custom rule of mine.



--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Spam is for losers who can't get business any other way.


Re: Rule: "1.0 R_DCD 90% of .com. is spam"

2024-05-10 Thread Rupert Gallagher
Ahhh

The ikea mail was received through ... mta-numbers.ikea.com.sparkpostmail.com 
and is a request for feedback.

The SA rule says ...

header R_DCD Received =~ /\.com\./

I still do not know where the rule comes from, DCD may actually mean 
dot-com-dot, and perhaps it is true that they are mostly spam.
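
To see what that pattern does and does not match, here is a quick Python check. It only tests the bare regex against hostnames as strings (hits_r_dcd is a made-up helper); the real rule is applied by SA to the Received headers, and the example.com names are placeholders.

```python
import re

# The pattern from the rule above: a literal ".com." anywhere in the text.
R_DCD = re.compile(r"\.com\.")


def hits_r_dcd(received_value):
    """Return True if the text contains ".com." followed by more text."""
    return R_DCD.search(received_value) is not None


print(hits_r_dcd("mta-numbers.ikea.com.sparkpostmail.com"))  # True
print(hits_r_dcd("mail.example.com"))                        # False
print(hits_r_dcd("mx.example.com.au"))                       # True
```

So the rule fires whenever ".com" is followed by another label, which covers the sparkpostmail relay name but also ordinary country-code domains.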
 Original Message 
On May 10, 2024, 17:18, Rupert Gallagher wrote:

> I only have stock and KAM, and it is definitely not a custom rule of mine.
>
>  Original Message 
> On May 10, 2024, 17:11, Matus UHLAR - fantomas wrote:
>
>> On 10.05.24 15:08, Rupert Gallagher wrote:
>>> My local evidence does not support the general claim that 90% of .com is
>>> spam.
>>>
>>> I just received a mail from informat...@info.email.ikea.com marked as
>>> spam, with positive R_DCD. The rule did not trigger on mail from other
>>> .com addresses.
>>>
>>> I do not know what R_DCD means, and search indexes do not help. Short of
>>> reading the source code, does anybody know what R_DCD means?
>>
>> I have no idea. where did you get this rule from?
>> I don't see it in stock rules
>>
>> --
>> Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
>> Warning: I wish NOT to receive e-mail advertising to this address.
>> Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
>> There's a long-standing bug relating to the x86 architecture that
>> allows you to install Windows.   -- Matthew D. Fuller

Re: Rule: "1.0 R_DCD 90% of .com. is spam"

2024-05-10 Thread Rupert Gallagher
I only have stock and KAM, and it is definitely not a custom rule of mine.

 Original Message 
On May 10, 2024, 17:11, Matus UHLAR - fantomas wrote:

> On 10.05.24 15:08, Rupert Gallagher wrote:
>> My local evidence does not support the general claim that 90% of .com is
>> spam.
>>
>> I just received a mail from informat...@info.email.ikea.com marked as
>> spam, with positive R_DCD. The rule did not trigger on mail from other
>> .com addresses.
>>
>> I do not know what R_DCD means, and search indexes do not help. Short of
>> reading the source code, does anybody know what R_DCD means?
>
> I have no idea. where did you get this rule from?
> I don't see it in stock rules
>
> --
> Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
> Warning: I wish NOT to receive e-mail advertising to this address.
> Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
> There's a long-standing bug relating to the x86 architecture that
> allows you to install Windows.   -- Matthew D. Fuller

Re: Rule: "1.0 R_DCD 90% of .com. is spam"

2024-05-10 Thread Matus UHLAR - fantomas

On 10.05.24 15:08, Rupert Gallagher wrote:

My local evidence does not support the general claim that 90% of .com is spam.

I just received a mail from informat...@info.email.ikea.com marked as spam, 
with positive R_DCD. The rule did not trigger on mail from other .com addresses.

I do not know what R_DCD means, and search indexes do not help. Short of 
reading the source code, does anybody know what R_DCD means?


I have no idea. where did you get this rule from?
I don't see it in stock rules


--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
There's a long-standing bug relating to the x86 architecture that
allows you to install Windows.   -- Matthew D. Fuller


Rule: "1.0 R_DCD 90% of .com. is spam"

2024-05-10 Thread Rupert Gallagher
My local evidence does not support the general claim that 90% of .com is spam.

I just received a mail from informat...@info.email.ikea.com marked as spam, 
with positive R_DCD. The rule did not trigger on mail from other .com addresses.

I do not know what R_DCD means, and search indexes do not help. Short of 
reading the source code, does anybody know what R_DCD means?

Re: Score 0.001

2024-05-10 Thread John Hardin

On Fri, 10 May 2024, Thomas Barth wrote:

So now I repeat my question: is it possible to increase the minimum 
value to 0.1 by default?


Not really.

The score for a rule is either a fixed value assigned by the rule 
developer or a dynamic value calculated by masscheck nightly. There isn't 
a "macro" for informational scores that would affect them all at once; 
each informational rule would have to be updated individually.


And they are considered *informational* - they should not by themselves 
contribute to the ham/spam score, so a request to globally change the 
informational score from 0.0001 or 0.001 to 0.1 would not be approved.


For example, there is a rule that matches large monetary quantities in 
multiple formats and languages. That rule is used in combination with 
other rules to look for spam signs. It's scored as informational simply to 
expose the fact that the message has content like that, but by itself it 
doesn't indicate hammy or spammy content - the message could be a 419 
spam, or it could be a news article about the deficit.


Note that poorly-performing rules may get a score that looks 
informational, but that may change over time based on the corpora.



--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  You do not examine legislation in the light of the benefits it
  will convey if properly administered, but in the light of the
  wrongs it would do and the harms it would cause if improperly
  administered.  -- Lyndon B. Johnson
---
 4 days until the 76th anniversary of Israel's independence


Re: Score 0.001

2024-05-10 Thread jdow

On 20240509 23:57:12, Thomas Barth wrote:

On 2024-05-10 06:19, Reindl Harald (privat) wrote:

On 10.05.24 at 00:05, Thomas Barth wrote:

On 2024-05-09 21:41, Loren Wilton wrote:
Low-score tests are neither spam nor ham signs by themselves. They can be 
used in metas in conjunction with other indicators to help determine ham or 
spam. A zero value indicates that a rule didn't hit and the sign is not 
present. A small score indicates that the rule did hit, so the sign it is 
detecting is present.


0.001 seems to be the default lowest value. Is it possible to change it to 
0.01 or 0.1?


what do you not understand in meta tests?
it's irrelevant if it's 0.001 or 0.01

these rules are used in combination with other rules

HTML_MESSAGE or SPF_HELO_NONE alone don't mean anything and so it must not 
score higher - it makes only sense combined with other rules


Most of the messages I receive only have a few hits because they hardly differ 
from a regular e-mail. That's why I want to assign a higher value to the 
individual tests. I don't know how many of the possible tests with a value of 
only 0.001 exist. With this value, theoretically 1000 different tests would 
have to be positive in order to achieve a total value of 1. Therefore, it is 
not irrelevant whether I have a minimum value of 0.001 or 0.1. I would even go 
further and say if there are more than 10 tests with a positive value: Spam! 
Either the strike level is reached or there are more than 10 tests with a 
positive value. So now I repeat my question: is it possible to increase the 
minimum value to 0.1 by default?


Values like .001 are generally used for meta rules. I use meta rules chiefly to 
combine multiple relatively benign tests into something significant when they 
all appear at once. (Or variations on that theme.) The tiny minimum score 
ensures the rule gets processed and reported but contributes minimally to 
the overall score. If you want a minimum score of 0.004, 0.01, 100, or whatever, 
then use a "score" line associated with your rule. That saves you from the 
fruits of being too lazy to give a real score to a rule that is important to get 
just right.


{o.o}


Re: Score 0.001

2024-05-10 Thread Matus UHLAR - fantomas

On 09.05.24 20:41, Thomas Barth wrote:
I don't understand why there are so many checks where the meaningless 
value of 0.001 is assigned.


Those rules may currently be in testing.
They may also be informative, e.g. DMARC_MISSING or SPF_PASS.
Rules with a score of 0 are not run, so using 0 is not possible in these cases.

Those rules may have different scores with different rulesets 
(bayes/non-bayes, network/non-network).

And they can be used in metas, e.g:

score HTML_MESSAGE 0.001
meta OBFUSCATING_COMMENT   ((__OBFUSCATING_COMMENT_A && HTML_MESSAGE) || 
(__OBFUSCATING_COMMENT_B && MIME_HTML_ONLY)) && !__ISO_2022_JP_DELIM
score OBFUSCATING_COMMENT 0.000 0.000 0.001 0.723
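
As a sketch of how a four-value score line like the one above is read: to my understanding the four positions correspond to (no Bayes, no network), (no Bayes, network), (Bayes, no network) and (Bayes, network), in that order. The Python helper effective_score below is made up for illustration and is not SA code.

```python
def effective_score(scores, bayes_enabled, network_enabled):
    """Pick the score that applies for the current configuration.

    `scores` is the list of numbers from a "score" line: either a single
    value used everywhere, or four values ordered as
    [no bayes/no net, no bayes/net, bayes/no net, bayes/net].
    """
    if len(scores) == 1:
        return scores[0]
    if len(scores) != 4:
        raise ValueError("expected one or four score values")
    index = (2 if bayes_enabled else 0) + (1 if network_enabled else 0)
    return scores[index]


# Values copied from the OBFUSCATING_COMMENT score line above.
OBFUSCATING_COMMENT = [0.000, 0.000, 0.001, 0.723]
print(effective_score(OBFUSCATING_COMMENT, bayes_enabled=True, network_enabled=True))
print(effective_score(OBFUSCATING_COMMENT, bayes_enabled=True, network_enabled=False))
```

With Bayes and network tests both enabled the meta scores 0.723; with Bayes only it falls back to the informational 0.001.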

The total score could be much higher. Do I 
have to define all the checks myself with a desired value?


you can redefine values if you wish, but you should be careful about it.


X-Spam-Status: No, score=3.999 tagged_above=2 required=6.31
   tests=[DMARC_MISSING=0.001, FSL_BULK_SIG=0.001,
   HTML_IMAGE_RATIO_02=0.001, HTML_MESSAGE=0.001, PYZOR_CHECK=1.985,
   RELAYCOUNTRY_BAD=2, SPF_HELO_NONE=0.001, SPF_PASS=-0.001,
   T_TVD_MIME_EPI=0.01]

or

X-Spam-Status: Yes, score=7.281 tagged_above=2 required=6.31
   tests=[DMARC_MISSING=0.001, FSL_BULK_SIG=0.001,
   HTML_FONT_LOW_CONTRAST=0.001, HTML_IMAGE_ONLY_24=1.282,
   HTML_IMAGE_RATIO_02=0.001, HTML_MESSAGE=0.001, MIXED_HREF_CASE=1.999,
   PYZOR_CHECK=1.985, RELAYCOUNTRY_BAD=2, SPF_HELO_NONE=0.001,
   SPF_PASS=-0.001, T_TVD_MIME_EPI=0.01]
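
To show how little the 0.001 rules contribute in a report like these, here is a small Python sketch that parses the tests=[...] list from the first header and adds it up. The helper parse_tests and the 0.001 cut-off for "informational" are my own assumptions for the illustration.

```python
import re

# First X-Spam-Status header quoted above, re-wrapped for readability.
HEADER = """X-Spam-Status: No, score=3.999 tagged_above=2 required=6.31
   tests=[DMARC_MISSING=0.001, FSL_BULK_SIG=0.001, HTML_IMAGE_RATIO_02=0.001,
   HTML_MESSAGE=0.001, PYZOR_CHECK=1.985, RELAYCOUNTRY_BAD=2,
   SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_TVD_MIME_EPI=0.01]"""


def parse_tests(header):
    """Return {rule_name: score} from the tests=[...] part of the header."""
    match = re.search(r"tests=\[(.*?)\]", header, flags=re.DOTALL)
    if not match:
        return {}
    tests = {}
    for item in match.group(1).split(","):
        name, _, value = item.strip().partition("=")
        if name and value:
            tests[name] = float(value)
    return tests


tests = parse_tests(HEADER)
informational = {name: s for name, s in tests.items() if abs(s) <= 0.001}

print(round(sum(tests.values()), 3))          # 3.999
print(len(informational))                     # 6
print(round(sum(informational.values()), 3))  # 0.004
```

Six of the nine hits together add 0.004; nearly all of the 3.999 comes from PYZOR_CHECK and RELAYCOUNTRY_BAD.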



--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Chernobyl was an Windows 95 beta test site.