Re: CHAOS: v1.2.2: Of Documentation

2021-07-23 Thread Martin Gregorie
On Fri, 2021-07-23 at 19:49 +1000, Noel Butler wrote:
> I've still yet to see a list post explaining what this thing does,
> so no, he has not answered all questions about it. The most common-sense
> thing of all time is that if you advertise your wares, you at least tell
> people WTF it does; you don't send them to some web site to find out
> (which, as some posters have indicated, apparently does not even tell
> you).
> 

Yes, that is the same problem I have.

I understand that CHAOS generates rules and has fancy ways of setting
their scores but I've yet to understand:

- why it was developed in the first place, i.e. what problem(s) does it
  solve that manually written rules fail to address?

- what are its design principles?

- what do its generated rules do that can't be done with manually
  written rules?

- how, if at all, does it test the rules it writes and what does it do
  with rules that either don't work as intended or hit ham instead of
  spam? 

- does it accept human input about what is spam and what is ham and if
  so, how is this input provided, maintained, and stored for future
  reference? 

  IOW: 
  - is it working entirely from messages found in the incoming mail
    stream?
  - what about the outbound mail stream?
  - does it use mail archives or spam collections to test the rules it
    generates?

Martin




Re: CHAOS: v1.2.2: Of Documentation

2021-07-23 Thread Henrik K
On Fri, Jul 23, 2021 at 08:16:56AM +0300, Henrik K wrote:
> 
> > 2C) The initial release of CHAOS.pm did all kinds of scoring.  One of the
> > knocks I have about SpamAssassin is that it does not maintain counts of
> > hits.  My complaints about this go all the way back to 2010.  Counts and
> > Amounts.  SA is great with Amounts.  It sucks with Counts.  To the SA
> > Development crew's credit, somewhere along the way, tflags were added to
> > allow that functionality in a very primitive fashion.  Many people are
> > happy with that.  I'm just not one of them.
> > ...
> > I read somewhere, while looking at META rules that SA internally builds an
> > array of the rules hit.  That way, as rules hit, METAs are then
> > appropriately updated.  Gee, an array.  Maybe we could add a count to that
> > array if the user wishes to?  I think that it is a lot of development; not
> > so much the actual process of doing it, but updating all the User handling
> > thereof.  Alas, It is what it is *SIGH*
> 
> There's zero actual information here.  What exactly are you finding hard to
> "count"?

Looking at the emoji code for example, you are doing all sorts of funny
stuff like creating dynamic rules with count names

"The rulename, JR_SUBJ_EMOJIS or  is appended with an
"_$count" whose score is 0.01.  Example: YOUR_RULENAME_3.  The rule's
description will reflect the number of Emojis found."

This is not really how SA is supposed to be used (even though it's
possible).  It's just complex and confusing.

The normal way is to call the eval function multiple times with the
parameters you want to check; there are many examples in the stock rules:

body HTML_OBFUSCATE_05_10  eval:html_range('obfuscation_ratio','.05','.1')
body HTML_OBFUSCATE_10_20  eval:html_range('obfuscation_ratio','.1','.2')
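
As a rough illustration of the difference, here is a plain Python sketch
(not SpamAssassin or CHAOS code; the function names, the LOCAL_EMOJIS
prefix, and the range boundaries are all made up for this example, and the
boundary handling is the sketch's own choice).  The first function mimics
the "_$count" naming quoted above, which yields a different rule name for
every count.  The second mimics a small, fixed set of range rules that a
user can score individually.

```python
def per_count_rule(base, count):
    """Mimic the quoted "_$count" scheme: one rule name per distinct count."""
    return f"{base}_{count}"


def range_rule(base, count, ranges):
    """Return the name of the fixed range rule that contains count.

    ranges is a list of (low, high) pairs; in this sketch low is inclusive
    and high is exclusive.  Returns None when no range matches.
    """
    for low, high in ranges:
        if low <= count < high:
            return f"{base}_{low:02d}_{high:02d}"
    return None


# Hypothetical boundaries, chosen only for the example.
RANGES = [(1, 5), (5, 10), (10, 20)]

if __name__ == "__main__":
    for n in (3, 7, 12):
        print(per_count_rule("JR_SUBJ_EMOJIS", n),
              range_rule("LOCAL_EMOJIS", n, RANGES))
```

With the first scheme the set of possible rule names is open-ended; with
the second it is known in advance, so each name can be given its own score
line.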



Re: CHAOS: v1.2.2: Of Documentation

2021-07-23 Thread Noel Butler

On 23/07/2021 18:01, Simon Wilson wrote:


- Message from Jared Hall  -
Date: Fri, 23 Jul 2021 00:07:52 -0400
From: Jared Hall 
Subject: CHAOS: v1.2.2: Of Documentation
To: users@spamassassin.apache.org

Simon Wilson wrote:

could you, please, finally, describe what this module does,
here to the list and/or to the wiki?

the description there is too hard to understand, especially at the
beginning, and I couldn't force myself to understand it (multiple times).

Maybe you should start with the easy parts and follow with the more
complicated functionality, because I feel the description starts with
the latter.


I'm guessing from the silence in response that this will remain a 
mystery.


Simon.

___
Simon Wilson
M: 0400 12 11 16


Reads perfectly well to me.  I guess to be compatible with any other
plugin, I must delete all documentation entirely :)

No - but perhaps a start would be to *really* listen when people ask
questions demonstrating you are not as good as you think you are at
writing things which make sense to people other than yourself.


Seriously, every single rule that this module can generate is  listed.  
That's a good start, comparatively.


I answer, and have answered, all questions regarding this module.


Again no. Perhaps not all mailing list emails make it through the 
module...


I've still yet to see a list post explaining what this thing does,
so no, he has not answered all questions about it. The most common-sense
thing of all time is that if you advertise your wares, you at least tell
people WTF it does; you don't send them to some web site to find out
(which, as some posters have indicated, apparently does not even tell
you).


I won't comment on the rest of his trash talk. Based on his useless
smart-arse replies, I don't care what this thing does; we won't be
touching it due to his childish, pathetic attitude. For all we know it's
malware.


--
Regards,
Noel Butler


Re: CHAOS: v1.2.2: Of Documentation

2021-07-23 Thread Simon Wilson

- Message from Jared Hall  -
   Date: Fri, 23 Jul 2021 00:07:52 -0400
   From: Jared Hall 
Subject: CHAOS: v1.2.2: Of Documentation
 To: users@spamassassin.apache.org



Simon Wilson wrote:

could you, please, finally, describe what this module does,
here to the list and/or to the wiki?

the description there is too hard to understand, especially at the
beginning, and I couldn't force myself to understand it (multiple times).

Maybe you should start with the easy parts and follow with the more
complicated functionality, because I feel the description starts with
the latter.



I'm guessing from the silence in response that this will remain a mystery.

Simon.

___
Simon Wilson
M: 0400 12 11 16


Reads perfectly well to me.  I guess to be compatible with any other  
plugin, I must delete all documentation entirely :)


No - but perhaps a start would be to *really* listen when people ask  
questions demonstrating you are not as good as you think you are at  
writing things which make sense to people other than yourself.




Seriously, every single rule that this module can generate is  
listed.  That's a good start, comparatively.


I answer, and have answered, all questions regarding this module.


Again no. Perhaps not all mailing list emails make it through the module...

Open-ended questions, or questions that are vague and ambiguous, are  
ignored.  For instance, "Maybe you should start with easy parts"?  
OK, what's easy?  I'm reminded of an old Star Trek episode where Dr.  
McCoy is reattaching Spock's brain.  "It's so easy.  A child can do  
it", he muses.  Questions have value.  Statements less so.


Like that one?



This module has some unique stuff that CANNOT be done in a pure  
SpamAssassin environment.  It also has stuff that can be replicated  
using standard rules.


1) The module, if installed and using the config file as is, does no  
harm at all.  It will merely generate rules based upon what it  
finds.  These are all scored at the low rate of 0.01.  It's up to  
the user to decide what to do with them.  They can wrap up a generated  
rule in a meta rule.  Example:


meta   JR_HATES_BEENTHERE   (JR_X_BEENTHERE)
score  JR_HATES_BEENTHERE   8.0

2) Via a configuration file option, "chaos_mode", the module can be  
set to automatically score its rules.


chaos_mode AutoISP

It will still run along with existing files, cranking out higher  
scores for those rules marked with an asterisk.  That is still  
probably acceptable for most people.  But it can cause problems. The  
popular KAM ruleset scores SendGrid Emails with a high value. Mine  
is split into two different values that are scored differently.   
While they are both lower than KAM's, combined, I see that as a  
potential problem.  I have no knowledge of what somebody's rules are  
at any given moment.  Caveat Emptor.  There I go again with the  
Latin :)


2A) What values do I set for these rules?  As a percentage of  
another configuration file option, "chaos_tag":


chaos_tag 7

Per the example above JR_X_BEENTHERE is a rule that is Auto-Scored.  
If you lower the chaos_tag value, the score for this rule would be  
reduced.  If I increase the chaos_tag value, the score produced by  
this rule is raised.


2B) The AutoISP mode, as is, should be fine for anybody running  a  
spam tag level of 8 to 12.


2C) The initial release of CHAOS.pm did all kinds of scoring.  One  
of the knocks I have about SpamAssassin is that it does not maintain  
counts of hits.  My complaints about this go all the way back to  
2010.  Counts and Amounts.  SA is great with Amounts.  It sucks with  
Counts.  To the SA Development crew's credit, somewhere along the  
way, tflags were added to allow that functionality in a very  
primitive fashion.  Many people are happy with that.  I'm just not  
one of them.


I read somewhere, while looking at META rules that SA internally  
builds an array of the rules hit.  That way, as rules hit, METAs are  
then appropriately updated.  Gee, an array.  Maybe we could add a  
count to that array if the user wishes to?  I think that it is a lot  
of development; not so much the actual process of doing it, but  
updating all the User handling thereof.  Alas, It is what it is *SIGH*


2D) One thing about running AutoISP mode is that you can change a  
Rule's name in the configuration file and no matter what, you'll  
get the Rulename that's hard-coded into the program.  When an Eval  
plugin function is called, SA passes the rule name to the plugin.  
Most plugins just ignore it, and simply return a Hit/Miss value for  
the Rulename.  I ignore that completely.


2E) When I first released CHAOS, all it did was Automatic Scoring.  
And I used all kinds of fancy algorithms, even logarithmic, to  
demonstrate that.  That was pointless, as many pointed out at the  
time.  I don't do that stuff anymore.


2F) Still, as is, AutoISP will still work great f

Re: CHAOS: v1.2.2: Of Documentation

2021-07-22 Thread Henrik K
On Fri, Jul 23, 2021 at 12:07:52AM -0400, Jared Hall wrote:
> 
> 1) The module, if installed and using the config file as is, does no harm at
> all.  It will merely generate rules based upon what it finds.  These are all
> scored at the low rate of 0.01.  It's up to the user to decide what to do with
> them.  They can wrap up a generated rule in a meta rule.  Example:
> 
> meta   JR_HATES_BEENTHERE   (JR_X_BEENTHERE)
> score JR_HATES_BEENTHERE   8.0

While I guess it's not illegal to whip up rules on the fly, it's awkward and
inflexible for the users.

> 2C) The initial release of CHAOS.pm did all kinds of scoring.  One of the
> knocks I have about SpamAssassin is that it does not maintain counts of hits. 
> My complaints about this go all the way back to 2010.  Counts and Amounts.  SA
> is great with Amounts.  It sucks with Counts.  To the SA Development crew's
> credit, somewhere along the way, tflags were added to allow that functionality
> in a very primitive fashion.  Many people are happy with that.  I'm just not
> one of them.
> ...
> I read somewhere, while looking at META rules that SA internally builds an
> array of the rules hit.  That way, as rules hit, METAs are then appropriately
> updated.  Gee, an array.  Maybe we could add a count to that array if the user
> wishes to?  I think that it is a lot of development; not so much the actual
> process of doing it, but updating all the User handling thereof.  Alas, It is
> what it is *SIGH*

There's zero actual information here.  What exactly are you finding hard to
"count"?



Re: CHAOS: v1.2.2: Of Documentation

2021-07-22 Thread Charles Sprickman
What would the elevator pitch be for this?

> On Jul 23, 2021, at 12:07 AM, Jared Hall  wrote:
> 
> Simon Wilson wrote:
>>> could you, please, finally, describe what this module does,
>>> here to the list and/or to the wiki?
>>> 
>>> the description there is too hard to understand, especially at the beginning,
>>> and I couldn't force myself to understand it (multiple times).
>>> 
>>> Maybe you should start with the easy parts and follow with the more
>>> complicated functionality, because I feel the description starts with
>>> the latter.
>> 
>> I'm guessing from the silence in response that this will remain a mystery.
>> 
>> Simon.
>> 
>> ___
>> Simon Wilson
>> M: 0400 12 11 16
> 
> Reads perfectly well to me.  I guess to be compatible with any other plugin, 
> I must delete all documentation entirely :)  
> 
> Seriously, every single rule that this module can generate is listed.  That's 
> a good start, comparatively.
> 
> I answer, and have answered, all questions regarding this module.  Open-ended 
> questions, or questions that are vague and ambiguous, are ignored.  For 
> instance, "Maybe you should start with easy parts"?  OK, what's easy?  I'm 
> reminded of an old Star Trek episode where Dr. McCoy is reattaching Spock's 
> brain.  "It's so easy.  A child can do it", he muses.  Questions have value.  
> Statements less so.
> 
> This module has some unique stuff that CANNOT be done in a pure SpamAssassin 
> environment.  It also has stuff that can be replicated using standard rules.  
> 
> 1) The module, if installed and using the config file as is, does no harm at 
> all.  It will merely generate rules based upon what it finds.  These are all 
> scored at the low rate of 0.01.  It's up to the user to decide what to do with 
> them.  They can wrap up a generated rule in a meta rule.  Example:
> 
> meta   JR_HATES_BEENTHERE   (JR_X_BEENTHERE)
> score JR_HATES_BEENTHERE   8.0
> 
> 2) Via a configuration file option, "chaos_mode", the module can be set to 
> automatically score its rules.
> 
> chaos_mode AutoISP
> 
> It will still run along with existing files, cranking out higher scores for 
> those rules marked with an asterisk.  That is still probably acceptable for 
> most people.  But it can cause problems.  The popular KAM ruleset scores 
> SendGrid Emails with a high value.  Mine is split into two different values 
> that are scored differently.  While they are both lower than KAM's, combined, 
> I see that as a potential problem.  I have no knowledge of what somebody's 
> rules are at any given moment.  Caveat Emptor.  There I go again with the 
> Latin :)
> 
> 2A) What values do I set for these rules?  As a percentage of another 
> configuration file option, "chaos_tag":
> 
> chaos_tag 7
> 
> Per the example above JR_X_BEENTHERE is a rule that is Auto-Scored.  If you 
> lower the chaos_tag value, the score for this rule would be reduced.  If I 
> increase the chaos_tag value, the score produced by this rule is raised.
> 
> 2B) The AutoISP mode, as is, should be fine for anybody running  a spam tag 
> level of 8 to 12.  
> 
> 2C) The initial release of CHAOS.pm did all kinds of scoring.  One of the 
> knocks I have about SpamAssassin is that it does not maintain counts of hits. 
>  My complaints about this go all the way back to 2010.  Counts and Amounts.  
> SA is great with Amounts.  It sucks with Counts.  To the SA Development 
> crew's credit, somewhere along the way, tflags were added to allow that 
> functionality in a very primitive fashion.  Many people are happy with that.  
> I'm just not one of them.
> 
> I read somewhere, while looking at META rules that SA internally builds an 
> array of the rules hit.  That way, as rules hit, METAs are then appropriately 
> updated.  Gee, an array.  Maybe we could add a count to that array if the 
> user wishes to?  I think that it is a lot of development; not so much the 
> actual process of doing it, but updating all the User handling thereof.  
> Alas, It is what it is *SIGH*
> 
> 2D) One thing about running AutoISP mode is that you can change a Rule's name 
> in the configuration file and no matter what, you'll get the Rulename that's 
> hard-coded into the program.  When an Eval plugin function is called, SA 
> passes the rule name to the plugin.  Most plugins just ignore it, and simply 
> return a Hit/Miss value for the Rulename.  I ignore that completely.
> 
> 2E) When I first released CHAOS, all it did was Automatic Scoring.  And I 
> used all kinds of fancy algorithms, even logarithmic, to demonstrate that.  
> That was pointless, as many pointed out at the time.  I don't do that stuff 
> anymore.
> 
> 2F) Still, as is, AutoISP will still work great for most people. 
> 
> 3) As the first release of CHAOS was about as successful as the Hindenburg, I 
> added the concept of Manual scoring.  This works in the same fashion as most 
> people are accustomed to.  This is set in the configuration file:
> 
> chaos_mode Manual
> 
> There 

CHAOS: v1.2.2: Of Documentation

2021-07-22 Thread Jared Hall

Simon Wilson wrote:

could you, please, finally, describe what this module does,
here to the list and/or to the wiki?

the description there is too hard to understand, especially at the
beginning, and I couldn't force myself to understand it (multiple times).

Maybe you should start with the easy parts and follow with the more
complicated functionality, because I feel the description starts with
the latter.



I'm guessing from the silence in response that this will remain a mystery.

Simon.

___
Simon Wilson
M: 0400 12 11 16


Reads perfectly well to me.  I guess to be compatible with any other 
plugin, I must delete all documentation entirely :)


Seriously, every single rule that this module can generate is listed.  
That's a good start, comparatively.


I answer, and have answered, all questions regarding this module. 
Open-ended questions, or questions that are vague and ambiguous, are 
ignored.  For instance, "Maybe you should start with easy parts"? OK, 
what's easy?  I'm reminded of an old Star Trek episode where Dr. McCoy 
is reattaching Spock's brain.  "It's so easy.  A child can do it", he 
muses.  Questions have value.  Statements less so.


This module has some unique stuff that CANNOT be done in a pure 
SpamAssassin environment.  It also has stuff that can be replicated 
using standard rules.


1) The module, if installed and using the config file as is, does no 
harm at all.  It will merely generate rules based upon what it finds.  
These are all scored at the low rate of 0.01.  It's up to the user to 
decide what to do with them.  They can wrap up a generated rule in a meta 
rule.  Example:


meta   JR_HATES_BEENTHERE   (JR_X_BEENTHERE)
score  JR_HATES_BEENTHERE   8.0

2) Via a configuration file option, "chaos_mode", the module can be set 
to automatically score its rules.


chaos_mode AutoISP

It will still run along with existing files, cranking out higher scores 
for those rules marked with an asterisk.  That is still probably 
acceptable for most people.  But it can cause problems. The popular KAM 
ruleset scores SendGrid Emails with a high value. Mine is split into two 
different values that are scored differently.  While they are both lower 
than KAM's, combined, I see that as a potential problem.  I have no 
knowledge of what somebody's rules are at any given moment.  Caveat 
Emptor.  There I go again with the Latin :)


2A) What values do I set for these rules?  As a percentage of another 
configuration file option, "chaos_tag":


chaos_tag 7

Per the example above JR_X_BEENTHERE is a rule that is Auto-Scored. If 
you lower the chaos_tag value, the score for this rule would be 
reduced.  If I increase the chaos_tag value, the score produced by this 
rule is raised.
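
As a rough illustration of the "percentage of chaos_tag" idea, here is a
small Python sketch.  It is not code from CHAOS.pm: the function name is
made up, and the 50 percent figure for JR_X_BEENTHERE is invented purely
for the example, since the actual percentages are not given in this post.

```python
def auto_score(chaos_tag, percent):
    """Return a rule score computed as a percentage of the chaos_tag value."""
    return round(chaos_tag * percent / 100.0, 2)


# Hypothetical percentage; the value CHAOS really uses is not stated here.
BEENTHERE_PERCENT = 50

if __name__ == "__main__":
    # A lower chaos_tag gives a lower score, a higher chaos_tag a higher one.
    for tag in (5, 7, 10):
        print(tag, auto_score(tag, BEENTHERE_PERCENT))
```

The point is only the direction of the effect: the rule's share stays
fixed, and the score moves with chaos_tag.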


2B) The AutoISP mode, as is, should be fine for anybody running  a spam 
tag level of 8 to 12.


2C) The initial release of CHAOS.pm did all kinds of scoring.  One of 
the knocks I have about SpamAssassin is that it does not maintain counts 
of hits.  My complaints about this go all the way back to 2010.  Counts 
and Amounts.  SA is great with Amounts.  It sucks with Counts.  To the 
SA Development crew's credit, somewhere along the way, tflags were added 
to allow that functionality in a very primitive fashion.  Many people 
are happy with that.  I'm just not one of them.


I read somewhere, while looking at META rules that SA internally builds 
an array of the rules hit.  That way, as rules hit, METAs are then 
appropriately updated.  Gee, an array.  Maybe we could add a count to 
that array if the user wishes to?  I think that it is a lot of 
development; not so much the actual process of doing it, but updating 
all the User handling thereof.  Alas, It is what it is *SIGH*
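
To spell out what "adding a count" would mean, here is a toy Python sketch.
It does not reflect SpamAssassin internals: the function name, the event
list, and the rule name LOCAL_WORD are assumptions made up for this
example.  It records how many times each rule hit rather than only whether
it hit.  The optional cap is loosely in the spirit of the maxhits option
that can accompany the "multiple" tflag.

```python
from collections import Counter


def count_hits(hit_events, maxhits=None):
    """Count how often each rule name appears in hit_events.

    If maxhits is given, stop counting a rule once it reaches that cap.
    """
    counts = Counter()
    for rule in hit_events:
        if maxhits is None or counts[rule] < maxhits:
            counts[rule] += 1
    return counts


if __name__ == "__main__":
    events = ["JR_X_BEENTHERE", "LOCAL_WORD", "LOCAL_WORD", "LOCAL_WORD"]
    print(sorted(set(events)))                  # hit/miss view: which rules fired
    print(dict(count_hits(events)))             # count view: how often each fired
    print(dict(count_hits(events, maxhits=2)))  # count view with a cap of 2
```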


2D) One thing about running AutoISP mode is that you can change a Rule's 
name in the configuration file and no matter what, you'll get the 
Rulename that's hard-coded into the program.  When an Eval plugin 
function is called, SA passes the rule name to the plugin. Most plugins 
just ignore it, and simply return a Hit/Miss value for the Rulename.  I 
ignore that completely.


2E) When I first released CHAOS, all it did was Automatic Scoring. And I 
used all kinds of fancy algorithms, even logarithmic, to demonstrate 
that.  That was pointless, as many pointed out at the time.  I don't do 
that stuff anymore.


2F) Still, as is, AutoISP will still work great for most people.

3) As the first release of CHAOS was about as successful as the 
Hindenburg, I added the concept of Manual scoring.  This works in the 
same fashion as most people are accustomed to.  This is set in the 
configuration file:


chaos_mode Manual

There are currently two exceptions in Manual mode.  I don't allow 
changing Rulenames for the mailer_check() and id_attachments() Eval 
functions.  The reason is that these Evals can produce a lot of Rule 
outputs.



OK, are you still with me?  If not, just implement Step 1) above.

4) Regarding overall development,