[Mailman-Developers] Re: ARC user options

2022-09-13 Thread Stephen J. Turnbull
Alessandro Vesely writes:

 > It would also be possible to link DB tables,

No, it's not.  It's all one row (IIRC).

 > or to define triggers that replicate insert/ update/ delete on a
 > number of tables/ fields.  

Which is exactly the complexity I don't want in Mailman if we can
avoid it.  Keep it to flat tables in approximately normal form.

 > The question is how much more insight than average list keeping
 > would be needed to do it.

"Too much." :-)  The simple "message per subscriber + per-subscription
'munge' flag" is just so much better.  For example, there are at least
three conceivable values: no munge, munge all, munge p=reject (and I
think Mailman actually implements munge p>=quarantine as well!)  This
gets very tedious in the twin-list implementation.
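
As a rough illustration of the per-subscription flag (a sketch only, not
Mailman's actual code; the names `MungePolicy` and `should_munge` are
hypothetical, and the author's published DMARC policy is assumed to have
been looked up already as one of "none", "quarantine" or "reject"):

```python
from enum import Enum


class MungePolicy(Enum):
    """Hypothetical per-subscription From-munging preference."""
    NONE = "none"                         # never munge
    ALL = "all"                           # munge every post
    REJECT = "reject"                     # munge only p=reject authors
    QUARANTINE_OR_STRICTER = "quarantine" # munge p=quarantine and p=reject


# Relative strictness of the DMARC p= values.
_STRICTNESS = {"none": 0, "quarantine": 1, "reject": 2}


def should_munge(policy: MungePolicy, author_dmarc_policy: str) -> bool:
    """Decide whether this subscriber's copy gets From munging."""
    level = _STRICTNESS.get(author_dmarc_policy, 0)  # unknown -> treat as none
    if policy is MungePolicy.NONE:
        return False
    if policy is MungePolicy.ALL:
        return True
    if policy is MungePolicy.REJECT:
        return level >= _STRICTNESS["reject"]
    return level >= _STRICTNESS["quarantine"]
```

With a flag like this, each value is one more case in one function; in the
twin-list design each value would need another list.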

 > Would this approach make sense with Mailman 2?

The umbrella + twin lists approach is perfectly possible with Mailman
2, but the admin has to implement it themselves.  We are not going to
implement it or release it.

 > If the site policy is to accept posts from subscribers, it needs to
 > inspect the union of sub-lists subscriber sets.  How could that be
 > accomplished?

There's a "sibling list" feature for exactly this purpose.
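
Conceptually, the check amounts to a set union.  A minimal sketch,
assuming plain sets of addresses (this is not the Mailman API; `may_post`
and the addresses in the usage note are hypothetical):

```python
def may_post(sender, own_members, sibling_member_sets):
    """Accept a post if the sender subscribes to the list or any sibling."""
    allowed = set(own_members)
    for members in sibling_member_sets:
        allowed |= set(members)
    # Compare case-insensitively; a simplification for illustration.
    return sender.lower() in {addr.lower() for addr in allowed}
```

For example, `may_post("ann@a.example", set(), [{"ann@a.example"},
{"bob@b.example"}])` accepts the post even though the umbrella list itself
has no individual subscribers.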

 > > 4.  List-A-munge gets From munging for all posts, List-A and
 > >  List-A-nomunge never get From munging.  (In theory List-A-munge
 > >  could do munging only for p=reject posters, but always doing it
 > >  probably makes it easier for subscribers to maintain their
 > >  filters.)
 > 
 > How difficult is that to set up?

The cost of setting up one list, times three.

 > IMHO, it becomes overly complicated.

So don't do it.  Others will, however -- isn't that what
"decentralization" is all about? -- and some of them are quite good at
it.  Why not take advantage of that to make at least some mail flows
cleaner and more useful?

 > DMARC was thought so that From: bank.example can hardly be faked.

Yes, and that's still true.

 > Allowing fuzzy overrides is much like getting back to content
 > analysis.

Fuzzy overrides are not "allowed".  They *happened*.  Gmail did it
from the get-go.  RFC 8617 is recognition that it does happen, and a
protocol that purports to improve the accuracy of overrides.

 > I'd mark as trusted only a few domains, based on personal knowledge.

Have you analyzed your mail flows to see if there seem to be frequent
messages with multiple DKIM breaks?  AFAICS, in practice *you* as an
individual will need to trust your mailing lists because that's the
only place signatures are going to be invalidated, and you can demand
everybody else has to pass DMARC.  This is no different from before,
except replay attacks via mailing lists are going to be harder.  For
large sites with many users with diverse mail flows, the benefits of
both ARC and reputation systems are much larger.

 > It is not the cost.  To have a global knowledge of the Internet you need to 
 > have a user base that is statistically relevant with respect to the global 
 > population.  That is, you have to be Google, or Microsoft, or
 > Yahoo!, ...

That's not true.  Bayesian filters work well for almost everyone.

 > > If anything, it's the opposite for the mailing list community, because it 
 > > makes it easier for an independent list host to construct and maintain its 
 > > reputation, and should get it better treatment from those with reputation 
 > > systems.
 > 
 > Yes.  However, I think that a list that experiments with non-munging will be 
 > whitelisted sooner by small, personal sites who trust it than by large orgs 
 > computing its reputation.

They're *already* whitelisted by the small personal sites who trust
it.  The critical question is how fast are the large orgs going to
learn to trust small ARC participants, because it's exactly those
large orgs that are the root of all evil^Wer, most nondelivery
problems that we small sites experience.

ARC and DMARC are *not* targeted at *us*, though if the large orgs use
them effectively we will benefit.  They're for large sites with
hundreds of thousands (and sometimes billions) of users who are
targeted by ransomware hackers and national espionage agencies.

 > > This would require the MLM site to maintain a separate site.
 > 
 > It can be a dummy subdomain, a few lines in a zone file.  I'll change
 > that line to "from a (sub)domain having p=reject", to have it more
 > apparent.

Yeah, I was in a hurry.  Thing is, there are a lot of inexperienced
folks out there who would just send mail from "bi...@whitehouse.gov"
or something like that. :-)

 > Hm... a list SHOULD reject posts arriving with an ARC chain, valid or not. 
 > Shouldn't it?  I see no reasons to post indirectly (except for internal 
 > list-to-list flows which don't need ARC seals).

This is *Internet mail*.  "NO REASON" is its slogan!  But here's my
personal use case: I use a Japanese telco as my home ISP but use my
server at my employer as smarthost.  My employer (research university)
doesn't know or care, but they do 

[Mailman-Developers] Re: ARC user options

2022-09-13 Thread Alessandro Vesely

On Tue 13/Sep/2022 10:14:12 +0200 Stephen J. Turnbull wrote:

 > Alessandro Vesely writes:

 > > > Maintaining synchronization of configurations of two lists will be
 > > > tedious for the admin, or involve relatively complicated coding if
 > > > we arrange to automatically mirror configuration changes.


 > > Couldn't symlink most stuff?


 > I don't think there's anything to symlink.  In Mailman 3 all of this 
 > configuration information is in an RDBMS like PostgreSQL, and routing 
 > of posts and modification of messages (both bodies and headers).



It would also be possible to link DB tables, or to define triggers that 
replicate insert/ update/ delete on a number of tables/ fields.  The question 
is how much more insight than average list keeping would be needed to do it.


Would this approach make sense with Mailman 2?



 > > I'm not clear how that would work.  Would you expand?


 > 1.  lis...@example.com has two subscribers:
 >     list-a-mu...@example.com
 >     list-a-nomu...@example.com
 >     List-A-[no]munge accepts subscriptions according to site and list
 >     policy.
 > 2.  List-A is configured not to allow other subscribers under any
 >     circumstances.  List-A-[no]munge accept subscribers under the site
 >     and list policy.
 > 3.  List-A-[no]munge refuse all posts, and advertise List-A as the
 >     destination for posts.  List-A accepts posts according to site and
 >     list policy.



If the site policy is to accept posts from subscribers, it needs to inspect the 
union of sub-lists subscriber sets.  How could that be accomplished?




 > 4.  List-A-munge gets From munging for all posts, List-A and
 >     List-A-nomunge never get From munging.  (In theory List-A-munge
 >     could do munging only for p=reject posters, but always doing it
 >     probably makes it easier for subscribers to maintain their
 >     filters.)



How difficult is that to set up?

I saw some lists deploying a home-brewed From: munging tool.  In that case they 
can control it directly.



 > > > https://datatracker.ietf.org/doc/html/draft-ietf-dmarc-arc-usage-09 
 > > > would be the natural home but it's expired, so it doesn't do any harm 
 > > > to have it in your draft.


 > > What I dislike of that document is its considering the availability
 > > of a global reputation system as a widespread feature of all mail
 > > servers,


 > 90% of the email users on the Internet are served by organizations 
 > that can afford comprehensive and reasonably accurate reputation 
 > databases and update algorithms.  (Whether they do bother with 
 > accuracy is another question.)  So I think it's reasonable to ask "how 
 > does a reputation database affect this feature" several times.



IMHO, it becomes overly complicated.  Domain-based reputation is already fuzzy, 
and for giant organizations it becomes meaningless, because they're just too big 
to block.  Now, start gaming all possible combinations of domains.  Replaying a 
modified message is all too easy, and ARC can be ambiguous about who modified 
what in a message.  Yes, you could feed a neural network with that.  Will it be 
reliable?



 > For the rest of us, there are less sophisticated but still useful shared 
 > reputation databases (ie, the RBLs), and local databases such as 
 > SpamBayes can be useful.



DMARC was thought so that From: bank.example can hardly be faked.  Allowing 
fuzzy overrides is much like getting back to content analysis.  I'd mark as 
trusted only a few domains, based on personal knowledge.



 > > while only the known giants actually have one.  In that respect, 
 > > ARC is a centripetal protocol, which is why I've been opposing it 
 > > until this attempt.


 > Everything is centripetal, because the only way we really know how to 
 > scale networks while maintaining discoverability is hierarchically. 
 > All reasonably decentralized networks have a (usually very expensive) 
 > centralized system at their foundation.  I don't see ARC as being 
 > particularly biased toward centralization, just because powerful 
 > reputation systems are expensive.



It is not the cost.  To have a global knowledge of the Internet you need to 
have a user base that is statistically relevant with respect to the global 
population.  That is, you have to be Google, or Microsoft, or Yahoo!, ...



 > If anything, it's the opposite for the mailing list community, because it 
 > makes it easier for an independent list host to construct and maintain its 
 > reputation, and should get it better treatment from those with reputation 
 > systems.


Yes.  However, I think that a list that experiments with non-munging will be 
whitelisted sooner by small, personal sites who trust it than by large orgs 
computing its reputation.




 > > 3.  The no-munging method
 > [...]
 > 
 > > Before allowing subscription to a non-munging list, a MLM MAY test
 > > that a recipient effectively receives its messages by sending a test
 > > message with a broken signature from a domain having p=reject.


 > This would require the MLM site to maintain a separate site.



It can be a dummy subdomain, a few lines in a zone file.  I'll change that line 
to "from a (sub)domain having 

[Mailman-Developers] Re: ARC user options

2022-09-13 Thread Stephen J. Turnbull
First let me make clear that (1) I do have influence on Mailman's
position here but (2) I am not authoritative and (3) Mailman has no
position yet.  I'm discussing this and that and we'll see where my
position and eventually Mailman's come out.  So anything I say may be
wrong (always check my logic ;-) and I may change my mind. :-)

Alessandro Vesely writes:

 > It is the MLM as a whole which has to conform, if it wishes to participate. 
 > Not the mailing list software.

If you mean the decision is list by list, conformance doesn't mean
much -- the subscribers still need to learn the rules list by list,
most of them won't know what "RFC " conformance means, and other
sites interacting with such a site will need to check the conformance
of lists individually.

On the other hand, if you mean site-wide, if it were implemented in
Mailman (and other software), conformance would be much more likely
and much more likely to be site-wide.

 > I push ARC as the authentication method because that was the major
 > objection to using Author: (the "simple" method in the old version.)

Yes, I agree, authentication is important, and ARC provides validation
of the right data for some purposes.  I'm not sure it "does what you
want", but I do "want what it does".

 > > Maintaining synchronization of configurations of two lists will be tedious 
 > > for the admin, or involve relatively complicated coding if we arrange to 
 > > automatically mirror configuration changes.
 > 
 > Couldn't symlink most stuff?

I don't think there's anything to symlink.  In Mailman 3 all of this
configuration information is in an RDBMS like PostgreSQL, and routing
of posts and modification of messages (both bodies and headers).

 > I'm not clear how that would work.  Would you expand?

1.  lis...@example.com has two subscribers:
    list-a-mu...@example.com
    list-a-nomu...@example.com
    List-A-[no]munge accepts subscriptions according to site and list
    policy.
2.  List-A is configured not to allow other subscribers under any
    circumstances.  List-A-[no]munge accept subscribers under the site
    and list policy.
3.  List-A-[no]munge refuse all posts, and advertise List-A as the
    destination for posts.  List-A accepts posts according to site and
    list policy.
4.  List-A-munge gets From munging for all posts, List-A and
    List-A-nomunge never get From munging.  (In theory List-A-munge
    could do munging only for p=reject posters, but always doing it
    probably makes it easier for subscribers to maintain their
    filters.)
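
The four steps above can be modeled in a few lines.  This is only a toy
sketch of the message flow, not how Mailman routes mail: `ListModel` and
`fan_out` are hypothetical names, the addresses used with them are made
up, and "munging" is reduced to replacing the author address with the
umbrella list's posting address.

```python
from dataclasses import dataclass, field


@dataclass
class ListModel:
    """Toy stand-in for a mailing list (hypothetical, not a Mailman class)."""
    address: str
    members: set = field(default_factory=set)
    munge_from: bool = False  # step 4: only the "munge" twin sets this


def fan_out(author, umbrella, twins):
    """Return {recipient: From address seen} for one post to the umbrella.

    Steps 1-2: the umbrella's only members are the twin lists.
    Step 3: posts enter through the umbrella, never through a twin.
    Step 4: the munging twin rewrites From; the other leaves it alone.
    """
    deliveries = {}
    for twin in twins:
        if twin.address not in umbrella.members:
            continue  # not subscribed to the umbrella, gets nothing
        shown_from = umbrella.address if twin.munge_from else author
        for member in sorted(twin.members):
            deliveries[member] = shown_from
    return deliveries
```

Every setting that is not about munging still has to be kept identical on
both twins by hand, which is the synchronization burden mentioned earlier.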

 > > https://datatracker.ietf.org/doc/html/draft-ietf-dmarc-arc-usage-09 
 > > would be the natural home but it's expired, so it doesn't do any harm 
 > > to have it in your draft.
 > 
 > What I dislike of that document is its considering the availability
 > of a global reputation system as a widespread feature of all mail
 > servers,

90% of the email users on the Internet are served by organizations
that can afford comprehensive and reasonably accurate reputation
databases and update algorithms.  (Whether they do bother with
accuracy is another question.)  So I think it's reasonable to ask "how
does a reputation database affect this feature" several times.  For
the rest of us, there are less sophisticated but still useful shared
reputation databases (ie, the RBLs), and local databases such as
SpamBayes can be useful.

 > while only the known giants actually have one.  In that respect,
 > ARC is a centripetal protocol, which is why I've been opposing it
 > until this attempt.

Everything is centripetal, because the only way we really know how to
scale networks while maintaining discoverability is hierarchically.
All reasonably decentralized networks have a (usually very expensive)
centralized system at their foundation.  I don't see ARC as being
particularly biased toward centralization, just because powerful
reputation systems are expensive.  If anything, it's the opposite for
the mailing list community, because it makes it easier for an
independent list host to construct and maintain its reputation, and
should get it better treatment from those with reputation systems.

 > Should I add that it's out of scope to speculate how users can
 > convince their mailbox provider to trust/ whitelist a given MLM?

I think that's always out of scope.  It doesn't hurt to add it, but
technically it's out of scope for an RFC.

 > 3.  The no-munging method
[...]
 > *  Have an umbrella list with two subscribers, the twin lists.  The
 >    twin lists would be configured to refuse subscriptions and posts
 >    from non-members.

Whether to refuse posts from non-members is independent of the
no-munging protocol.

 > Before allowing subscription to a non-munging list, a MLM MAY test
 > that a recipient effectively receives its messages by sending a test
 > message with a broken signature from a domain having p=reject.

This would require the MLM site to maintain a separate site.
Otherwise