Re: Adding SpamAssassin Headers to IETF mail

2003-12-18 Thread Harald Tveit Alvestrand
Dean,

--On 17. desember 2003 16:01 -0500 Dean Anderson [EMAIL PROTECTED] wrote:

 This is ridiculous.  The IETF is not getting a lot of spam, so adding
 SpamAssassin headers is a solution in need of a problem.

the reason you don't see a lot of spam on IETF lists is because it's sent 
to the list administrators, and they filter it by hand.

The chief beneficiaries of automatic spam detection and deletion in the 
current IETF setup are the list administrators.







SA / Spam. Facts.

2003-12-18 Thread Brett Thorson
These are the facts.

On Wednesday 17 December 2003 16:01, Dean Anderson wrote:
 This is ridiculous.  The IETF is not getting a lot of spam, so adding
 SpamAssassin headers is a solution in need of a problem.

"A lot" is a subjective term.  Also, unless you are sniffing the traffic into 
our network, how would you know how much spam our MX receives?

A rough approximation is that 1/3 of the mail into the IETF MX is spam.  
Estimate based on a small sample.  If a more accurate number is needed, 
please submit to the tracking system for prioritizing in the queue of IETF 
things to do.

Some spam we already filter out without spam assassin.
For example...
CC'ing mail to ietf-announce (as two of your posts did) gets caught in our 
spam filter because it is not appropriate on that mailing list.

 [EMAIL PROTECTED] wrote:
   ...this implementation is to allow the IETF community to get used
   to having these headers in the messages, and allow us to make any
   changes to the filtering rules.

 The above seems like a thinly veiled attempt to make SpamAssassin headers 
 a defacto standard supported by the IETF, without going through the 
 standards process.

It may seem that way to you, but in reality it isn't.  It was just me deciding 
to use it because it worked well with exim, was quick to set up, seemed to 
perform the task well, didn't need a lot of human intervention, and could be 
tuned.  Oh, and it's free, so the IETF could afford it.

Mr. Anderson continued
 Obviously, if the goal is to standardize these headers, then a standard
 can be produced and put through the standards process.

The goal is to reduce spam, and reduce the human intervention needed to reduce 
spam.  

These are the facts.

--Brett



Re: Adding SpamAssassin Headers to IETF mail

2003-12-18 Thread Pekka Savola
On Wed, 17 Dec 2003, Harald Tveit Alvestrand wrote:
 --On 17. desember 2003 16:01 -0500 Dean Anderson [EMAIL PROTECTED] wrote:
  This is ridiculous.  The IETF is not getting a lot of spam, so adding
  SpamAssassin headers is a solution in need of a problem.
 
 the reason you don't see a lot of spam on IETF lists is because it's sent 
 to the list administrators, and they filter it by hand.
 
 The chief beneficiaries of automatic spam detection and deletion in the 
 current IETF setup is the list administrators.

.. which do not/cannot use SpamAssassin to filter the bounces(?)

-- 
Pekka Savola                 "You each name yourselves king, yet the
Netcore Oy                    kingdom bleeds."
Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings




Re: SA / Spam. Facts.

2003-12-18 Thread Jari Arkko
Brett Thorson wrote:

 The goal is to reduce spam, and reduce the human intervention needed to reduce 
 spam.  

Right. I support the secretariat's efforts to reduce spam and
associated management effort on the IETF lists. Personally, I
have had good experience with SpamAssassin, so to me the technical
arrangement looks quite reasonable.

As for the rest, let's all remember that there may be no
fast, perfect, and inexpensive solutions. It may not be
reasonable to require that the false positive rate is zero.
We certainly can't replace spam detecting tools with
e-mail signatures and have it operational tomorrow. And
if the IETF has unused cash, there may be better uses for
it than paying for spam detection software.
--Jari




Re: Adding [ietf] considered harmful

2003-12-18 Thread John Stracke
Mark Allman wrote:

 A tag in the subject line is clearly overdue.  But, if we're going to do
 it, let's do it right.  Please use [IETF] not [ietf] because it's
 more befitting of a proper acronym.

Just what we need, a mailing list that SHOUTS.

(Then again, for this list, maybe it constitutes fair warning...)

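--
/=====================================================\
|John Stracke      |[EMAIL PROTECTED]                 |
|Principal Engineer|http://www.centive.com            |
|Centive           |My opinions are my own.           |
|=====================================================|
|Music is not a noun, it's a verb. --John Perry Barlow|
\=====================================================/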




Re: Adding SpamAssassin Headers to IETF mail

2003-12-18 Thread John Leslie
Harald Tveit Alvestrand [EMAIL PROTECTED] wrote:
 
 the reason you don't see a lot of spam on IETF lists is because it's
 sent to the list administrators, and they filter it by hand.

   Clearly, this cannot continue (unless we come up with some way to
pay people to perform this service).

 The chief beneficiaries of automatic spam detection and deletion in the 
 current IETF setup is the list administrators.

   I am really in no position to criticize the use of SpamAssassin.
I started using it for my personal account just before I left for
IETF-58, and have little hope of turning it off. (It flags as spam
roughly 4,000 emails per week.)

   But I think we should stop short of endorsing it.

   It is, frankly, wrong to propagate to the list any email which we
consider to be likely spam. We should instead come up with a way to
verify/authenticate/intuit/whatever that it is an individually-written
message considered to be on-topic by some person we have no reason to
distrust.

   SpamAssassin is a technical marvel -- and I suspect it could be
useful as a sorting tool to distinguish messages which deserve to be
distributed immediately vs. messages which need further verification.

   But that further verification should be done _before_ anything is
distributed to the list. If the SpamAssassin filtering were applied
_during_ the SMTP session to ietf.org and a descriptive error (with
URL) was returned (rather than 250 - OK), then we would have done
everything we reasonably could to notify an honest sender that we
needed further verification.
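
A minimal sketch of the SMTP-time decision described above, in Python. The threshold, the reply wording, and the URL are hypothetical placeholders, and the score is assumed to come from whatever content-assessment tool is in use; nothing here is taken from the actual ietf.org setup.

```python
# Hypothetical sketch: choose the SMTP reply while the session is still open,
# so an honest sender learns right away that further verification is needed.

SPAM_THRESHOLD = 5.0  # assumed cut-off, not a value taken from this thread
HELP_URL = "https://example.org/list-posting-help"  # placeholder URL


def smtp_reply(score: float, threshold: float = SPAM_THRESHOLD) -> tuple[int, str]:
    """Return (reply code, text) for a message with the given content score."""
    if score < threshold:
        return 250, "OK"
    # A 5xx reply is a permanent failure, which the sending MTA reports
    # back to the author together with the descriptive text.
    return 550, f"Message needs further verification; see {HELP_URL}"
```

Rejecting during the session puts the job of notifying the author on the sending side, so the list never has to generate its own bounce to a possibly forged address.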

   (And, of course, any other content-processing tool could be used
instead of SpamAssassin -- indeed I'm not sure any useful purpose
is served by publishing which particular content-assessment tool we
use.)

   If we can't process during the SMTP session, then -- as a short-
term stopgap -- it is reasonable to flag messages for some automated
processing before distributing to the list.

   (None of this is to criticize anyone who runs SpamAssassin at
their own site to apply more rigorous rules -- I'm probably doing so
myself, even if unintentionally.)

   What I do wish to call into question is the wisdom of passing the
SpamAssassin headers to the list. I believe it creates the potential
for confusion as to what is or is not a legitimate message.

--
John Leslie [EMAIL PROTECTED]



Re: Adding SpamAssassin Headers to IETF mail

2003-12-18 Thread Keith Moore
 the reason you don't see a lot of spam on IETF lists is because it's 
 sent to the list administrators, and they filter it by hand.

 The chief beneficiaries of automatic spam detection and deletion in 
 the current IETF setup is the list administrators.

I'm one of those list administrators and I can attest that having spam 
flood the review queues of the mailing lists is a huge problem.  It's 
not terribly unusual for the review queue of some lists to get so large 
that  you can't download and resubmit Mailman's review page without 
crashing the web browser (and I've tried several different browsers on 
different platforms).

but despite first-hand experience with the problem I'm still worried 
about using SpamAssassin - I've seen it block too many legitimate 
messages.
 




Re: Adding SpamAssassin Headers to IETF mail

2003-12-18 Thread Harald Tveit Alvestrand
Keith,

the reason the secretariat is doing this in stages is exactly because we 
want to see how big the false-positive issue is.

I currently personally use SpamAssassin 2.60 with Bayesian filtering and 
close-to-default rules; it seems to run at a very low rate of false 
positives.

--On 18. desember 2003 09:40 -0500 Keith Moore [EMAIL PROTECTED] wrote:

  the reason you don't see a lot of spam on IETF lists is because it's
  sent to the list administrators, and they filter it by hand.

  The chief beneficiaries of automatic spam detection and deletion in
  the current IETF setup is the list administrators.

 I'm one of those list administrators and I can attest that having spam
 flood the review queues of the mailing lists is a huge problem.  It's not
 terribly unusual for the review queue of some lists to get so large that
 you can't download and resubmit Mailman's review page without crashing
 the web browser (and I've tried several different browsers on different
 platforms).

 but despite first-hand experience with the problem I'm still worried
 about using SpamAssassin - I've seen it block too many legitimate
 messages.









Never-ending arguments about mailing lists considered harmful (was: Re: Adding [ietf] considered harmful)

2003-12-18 Thread John C Klensin
Keith and others,

While...

(1) I agree that this (and any SpamAssassin or other
header-insertion or filtering) would, ideally, better be
done as a per-subscriber optional feature, and

(2) I recognize that, if for some reason (unfathomable
to me, but there is no accounting for taste), people
encapsulate messages in message/rfc822 body parts and
then sign them (or archive hashes of messages including
the headers), any modification of the encapsulated
message would wreak havoc, and

(3) I've got an MUA (and an MTA) that are capable of
filtering on Return-path and/or List-* and/or recipient
(including subaddress) fields,

there are three things about this discussion that bother me...

(i) A number of efforts within the community have pointed to the 
advantages of having more routine work done in a routine and 
automated way by the secretariat.   Since the secretariat is 
operating with very tight resources (something else that has 
been in enough documents and presentations that I assume/hope 
everyone knows), it is in _our_ advantage to let them automate 
anything they can sensibly automate without causing _severe_ 
problems.  Conversely, asking for things that might take large 
amounts of time and energy (such as per-user setting of tag 
fields or application of spam filtering), is, IMO, pretty lousy 
prioritization.

(ii) Even with powerful filtering and organizing tools, some of 
us prefer (as a matter of taste) to not have, e.g., one folder 
or color per mailing list or other correspondent.  For us, a 
subject line indicator of source makes it easier to organize 
things cognitively.  Is it a big deal one way or the other?  Not 
for me at least; I can't speak for others.  But it is helpful to 
some of us, regardless of what the MTA or MUA may or may not be able to 
do.  And that makes me (at least) a little intolerant of people 
starting religious wars that, themselves, consume large amounts 
of (human as well as network) bandwidth, if only because...

(iii) I am, personally, getting concerned that the IETF is 
approaching the point where we are more concerned about process 
and administration than we are about doing high-quality design 
and engineering and getting high-quality results out.  I don't 
think we are there yet, and I think the trends in that direction 
are still reversible, but I take

* the relative amount of energy the community seems
willing to spend discussing two, essentially trivial,
changes to mailing list management, or

* the fine details (rather than broad issues) of a
process WG charter, or

* heated arguments about proposals for which most of the
people actively participating in the discussions have
clearly not read the relevant documents, or

* IESG being willing to tie up Proposed Standards (or
even lower-maturity documents) in order to make sure
that all of the grammatical and procedural niceties are
adhered to, or

* probably several other things that belong on that list...

as symptoms of serious and deep problems with our priorities and 
how we do business.

For the record, before I'm quoted out of context (as I probably 
will be anyway), our copying procedures from SDOs that have 
become much more procedure-bound, so much so that they often 
appear to no longer care about quality or adoption or 
interoperability of standards as long as the many procedural 
rules are followed to the letter and they can report getting 
more standards out one year than in the previous one would not, 
IMO, be a good idea ... indeed, it would be closer to the height 
of stupidity.

To make a distinction that may be useful before you (or someone 
else) replies, if you (or someone else) wants to get on a tear 
about NATs, I may or may not agree with you, and I may or may 
not believe that the flaming the topic tends to generate will 
result in any real progress or changes in behavior, but at least 
I'm sure the issue is important to the future of the Internet. 
Can you say the same for whether the Secretariat and its mailing 
list machinery adds (or does not add) a few headers to a message 
or a few characters to a subject line ... assuming they don't 
_break_ conforming software used in a rational way (e.g., with 
the robustness principle in mind)?   And, if the answer is no, 
is there any hope of increasing the ratio of meaningful 
technical standards work to this sort of debate around here?

regards,
   john
--On Thursday, 18 December, 2003 09:58 -0500 Keith Moore 
[EMAIL PROTECTED] wrote:

sarchasm
Maybe we should also rewrite the From header field so that
people with dysfunctional MUAs won't have trouble replying to
the list?
Maybe we should also rewrite the Reply-to field so that it
doesn't matter when people get confused about the difference
between reply to author and reply all?

Re: Adding SpamAssassin Headers to IETF mail

2003-12-18 Thread Dean Anderson
On Thu, 18 Dec 2003, Keith Moore wrote:

 I'm one of those list administrators and I can attest that having spam 
 flood the review queues of the mailing lists is a huge problem.  

Ahh. Mail from non-subscribers that has to be reviewed.  SpamBayes or
other content filters would be a far better approach to this problem, and
they don't have the feature of revenge.

Also, sorting out pre-existing subject lines, pre-existing message-id's,
that have been seen already on the list is probably useful.

 It's not terribly unusual for the review queue of some lists to get so
 large that you can't download and resubmit Mailman's review page without
 crashing the web browser (and I've tried several different browsers on
 different platforms).

A better, faster user interface could be useful.  I think you can have
Mailman messages sent to an IMAP store for approval, where you can sort
them into different folders based on certain criteria (like replies), and
use faster user interfaces to forward them to the list.

It is strange that it crashes your web browser. I've used web browsers
(Netscape and IE) with a Call Accounting system, which shows Call Detail
Record pages with 20,000 records per page, and IE and Netscape can load
it.

 but despite first-hand experience with the problem I'm still worried 
 about using SpamAssassin - I've seen it block too many legitimate 
 messages.

Mostly, this is due to the revenge oriented blacklists that it uses.  

--Dean






Re: Adding SpamAssassin Headers to IETF mail

2003-12-18 Thread Leif Johansson
Harald Tveit Alvestrand wrote:
| Keith,
|
| the reason the secretariat is doing this in stages is exactly because we
| want to see how big the false-positive issue is.
|
| I currently personally use SpamAssassin 2.60 with Bayesian filtering and
| close-to-default rules; it seems to run at a very low rate of false
| positives.
|
In my experience lots of false positives from spamassassin+bayesian
are the result of the user making false assumptions about the linearity
of the spamassassin point-scale.

Best regards, leifj



Re: Adding [ietf] considered harmful

2003-12-18 Thread Keith Moore

 From lines and Reply-to and whatever are headers that are meant to
 be processed by computers.  So, you can say all you want about how
 dumb MUAs do or do not process these (and how intermediate mail
 servers should keep their mits off).  Now, humans use these lines,
 too.  So, call them dual use.
 
 The subject line, on the other hand, is just for people.  

Book titles are for people, too.  Does that mean that it's okay for a 
bookseller or library to change the titles on books, in order to help
the consumer identify where they came from?

I'm a bit surprised at the frequency at which people who claim to be
networking protocol engineers fail to appreciate the benefits of clean
separation-of-function and layering.



Hashing spam

2003-12-18 Thread escom
I am working on an approach to block spam using a database of hash (MD5)
strings of spam email:
1) Report a verified spam to the database server on the web
2) The mail client checks incoming mail, generates a hash string, sends it
to the server to check whether it is present, and if so blocks the email
3) Download a hot list to block directly on the machine

i don't know if it's a good or bad idea.
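
The three steps above can be sketched roughly as follows in Python. This is only an illustration under assumptions: the in-memory set stands in for the web database and the downloaded hot list, and the function names are made up for this example.

```python
import hashlib

# Hypothetical in-memory stand-in for the shared database of spam hashes
# (and for the locally downloaded hot list in step 3).
known_spam_hashes: set[str] = set()


def message_hash(raw_message: bytes) -> str:
    """MD5 hex digest of the raw message bytes, as in the proposal."""
    return hashlib.md5(raw_message).hexdigest()


def report_spam(raw_message: bytes) -> None:
    """Step 1: record a verified spam message in the database."""
    known_spam_hashes.add(message_hash(raw_message))


def should_block(raw_message: bytes) -> bool:
    """Steps 2 and 3: block the message if its hash is already known."""
    return message_hash(raw_message) in known_spam_hashes
```

Because the digest covers the exact bytes, a message that differs by even one character produces a different hash and is not blocked by this scheme.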

--giuseppe




Re: Hashing spam

2003-12-18 Thread Joe Abley
On 18 Dec 2003, at 13:10, escom wrote:

 I work on an approach to block spam with a database of hash (md5) string of
 spam email:
 1) Reporting a verified spam to the database server on the web
 2) the mail client check incoming mail, generate a hash string send to and
 verify the presence on the server, is yes block email.
 3) download a hot list to block directly on the machine

 i don't know if it's a good or bad idea.

http://www.rhyolite.com/anti-spam/dcc/




Re: Hashing spam

2003-12-18 Thread Vernon Schryver
 From: escom [EMAIL PROTECTED]

 I work on an approach to block spam with a database of hash (md5) string of
 spam email:
 1) Reporting a verified spam to the database server on the web
 2) the mail client check incoming mail, generate a hash string send to and
 verify the presence on the server, is yes block email.
 3) download a hot list to block directly on the machine

 i don't know if it's a good or bad idea.

The several existing implementations of something like that idea suggest
that some people think it is a reasonable idea.  I think it is useful
but has limitations.
See http://www.google.com/search?q=vipul+razor and http://cloudmark.com
for one (set of?) implementation(s).
See http://www.dcc-servers.net/ for another.  The DCC is often used
with SpamAssassin.  Refusing mail that has been seen anywhere else
(i.e. with non-zero DCC target counts) seems like a perfect fit for a
mailing list, but I'm probably biased and so won't suggest that.


I vote no if someone is taking a vote about trashing the Subject
headers.  Access to this mailing list should be more, not less difficult.
Anyone who cannot figure out how to sort mail from this list based on
its existing headers and rewrite Subject headers or anything else to
taste is not really interested in the nominal purpose of the list.

If it were practical, it would be good to require subscribers or at
least contributors to prove their interest by showing they can fetch,
compile, install, configure, and operate a simple TCP application such
as an SMTP server.  Anyone who lacks sufficient interest to do (or
already have done) something like that is unlikely to have anything
interesting to say to this list, except to other go-ers.  Such a test
might reduce the number of people who are interested only in non-technical
issues such as the administrative work of the IETF, the nasty evil
power grabbing U.N., ICANN, or legacy internet engineers widening the
digital divide, or whatever else concerns the people who are prompting
the continued statements of the painfully obvious.  (For some reason
perhaps related to procmail, I'm not seeing the questions that prompt
the obvious answers.  It would be swell if those offering the answers
would desist.)

Anyone who cannot find a usable POP3 or SMTP server with which to
subscribe to this list would certainly be better served and better
serve the rest of us by using the web pages of its archive.
See http://www.ietf.org/mail-archive/ietf/Current/maillist.html


Vernon Schryver[EMAIL PROTECTED]



Re: Adding [ietf] considered harmful

2003-12-18 Thread John Kristoff
On Thu, 18 Dec 2003 13:07:24 -0500
Keith Moore [EMAIL PROTECTED] wrote:

 I'm a bit surprised at the frequency at which people who claim to be
 networking protocol engineers fail to appreciate the benefits of clean
 separation-of-function and layering.

Hopefully the drawbacks are appreciated also.  Quoting Rich Seifert,
"Layering makes a good servant, but a poor master.  Use layering to
organize the way you THINK about networks, but don't let it restrict how
you DESIGN networks."  If I recall correctly, David Clark used to say
something very similar to this in a protocol workshop class at Interop
a while back.

John



Re: Tag, You're It!

2003-12-18 Thread Stephen Sprunk
Thus spake John Stracke [EMAIL PROTECTED]
 Modifying the Subject: line is a Bad Thing; it invalidates digital
 signatures.  We're never going to get widespread use of signed email as
 long as we have pieces of mail infrastructure munging messages to make
 signatures useless.

Signed email already gets mangled by the ietf mail servers (AFAICT), so
what's one more bad idea in the mix?  /advocate class=devil

I can't believe this topic is even being debated.  Filtering has been a
standard feature of every MUA I've used for over 10 years, including my
current PDA and webmail systems.  IMHO, the problems (listed by others) with
this proposal grossly outweigh the complaints of a couple people who refuse
to use a modern MUA or can't figure out how to configure said MUA to filter
on the Sender header.
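
For illustration, filtering on the Sender header instead of a Subject tag might look like this in Python. The address below is a hypothetical placeholder; the real Sender value used by the list would have to be checked in an actual message.

```python
from email import message_from_string

# Hypothetical value; look at a real list message to find the actual Sender.
LIST_SENDER = "owner-ietf@example.org"


def is_list_mail(raw_message: str, list_sender: str = LIST_SENDER) -> bool:
    """True if the Sender header names the list, regardless of the Subject."""
    msg = message_from_string(raw_message)
    sender = msg.get("Sender", "")
    return list_sender.lower() in sender.lower()
```

A rule like this keeps working whether or not the Subject line carries a tag, which is the point being argued here.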

S

Stephen Sprunk         "God does not play dice."  --Albert Einstein
CCIE #3723         "God is an inveterate gambler, and He throws the
K5SSS        dice at every possible opportunity." --Stephen Hawking




Re: Hashing spam

2003-12-18 Thread John Stracke
escom wrote:

I work on an approach to block spam with a database of hash (md5) string of
spam email:
1) Reporting a verified spam to the database server on the web
2) the mail client check incoming mail, generate a hash string send to and
verify the presence on the server, is yes block email.
3) download a hot list to block directly on the machine
 

It's been done, and the spammers have already evolved to get around it: 
they randomize the messages so that the hashes don't match.

--
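/===================================================\
|John Stracke      |[EMAIL PROTECTED]               |
|Principal Engineer|http://www.centive.com          |
|Centive           |My opinions are my own.         |
|===================================================|
|No, no, that's *not* a boat, that's Queen Victoria.|
\===================================================/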




Re: Hashing spam

2003-12-18 Thread Vernon Schryver
 From: John Stracke [EMAIL PROTECTED]

 I work on an approach to block spam with a database of hash (md5) string of
 spam email:

 ...
 It's been done, and the spammers have already evolved to get around it: 
 they randomize the messages so that the hashes don't match.

Unless you mean naive and simplistic hashes, that is an overstatement.
As long as you want to accept mail from strangers, no spam filter can
perfectly predict whether copies of the next message from a stranger
are being sent to 30,000,000 of your intimate friends, but the various
hashing filters do some good work.
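
To illustrate the difference between a naive hash and a more tolerant one, here is a minimal sketch in Python. The normalization used (lowercasing and keeping only letters) is an assumption chosen for this example; it is not how the DCC or Razor actually compute their checksums.

```python
import hashlib
import re


def exact_hash(text: str) -> str:
    """Naive approach: hash the message text exactly as received."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def normalized_hash(text: str) -> str:
    """Hash after lowercasing and collapsing every non-letter run to one space."""
    letters_only = re.sub(r"[^a-z]+", " ", text.lower()).strip()
    return hashlib.md5(letters_only.encode("utf-8")).hexdigest()


# Two copies of the same pitch with randomized digits and spacing.
copy_a = "Cheap loans today!!! ref 48213"
copy_b = "Cheap  loans today!  ref 90077"

same_exact = exact_hash(copy_a) == exact_hash(copy_b)
same_normalized = normalized_hash(copy_a) == normalized_hash(copy_b)
```

Here `same_exact` is false and `same_normalized` is true. Random words made of letters would still defeat this particular normalization, so it shows the idea only, not a robust filter.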

An estimate of the effectiveness of a large scale filter can be obtained
from what it sees as the spam ratio.  If it claims that 60% of all
mail is spam but the real ratio is 70%, then it must be 85% effective.
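
The arithmetic behind that estimate, as a small Python sketch. It assumes the filter has no false positives, so everything it flags is real spam; the function name is made up for this example.

```python
def estimated_effectiveness(claimed_ratio: float, real_ratio: float) -> float:
    """Fraction of actual spam caught, assuming no false positives.

    claimed_ratio: share of all mail the filter flags as spam
    real_ratio:    share of all mail that really is spam
    """
    return claimed_ratio / real_ratio


# 60% flagged out of 70% real spam is 6/7, i.e. roughly 85% caught.
effectiveness = estimated_effectiveness(0.60, 0.70)
```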

 

Concerning false positives for this mailing list--it would be wise to
define what mail is legitimate.  In many places, you must accept at
least 99.9% of all even remotely legitimate mail.  However, this context
is different.  Here a boolean good/spam is simplistic and wrong.
Instead we have a spectrum:
  1. on-topic messages from subscribers
  2. on-topic messages from non-subscribers
  3. noise from subscribers
  4. noise from non-subscribers
  5. pure spam such as advertisements for loan sharks

In this list, only #1 is clearly good. It is good to avoid rejecting
#2, but there is surely no harm in sometimes delaying #2.  If the
senders of any rejected or false positive #2 received an informative
non-delivery report so that they could retransmit, what would be the harm?

SpamAssassin is reported to be better than 60% accurate.  #2 is surely
rare compared to #1.  Thus, as long as SpamAssassin white-lists all
subscribers, there would be no harm in the occasional rejection of #2.
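
A rough sketch of that policy in Python. The action names, and the choice to leave unflagged non-subscriber mail to whatever handling already exists, are assumptions made for this example.

```python
from enum import Enum


class Action(Enum):
    DELIVER = "deliver to the list"
    REJECT_WITH_NOTICE = "reject with an informative non-delivery report"
    EXISTING_HANDLING = "leave to the list's existing handling"


def list_policy(sender: str, subscribers: set[str], flagged_as_spam: bool) -> Action:
    """White-list subscribers; bounce flagged mail from everyone else."""
    if sender.lower() in subscribers:
        # Categories 1 and 3: subscribers are never filtered.
        return Action.DELIVER
    if flagged_as_spam:
        # Categories 4 and 5, plus the occasional false positive from 2,
        # who can retransmit after reading the notice.
        return Action.REJECT_WITH_NOTICE
    return Action.EXISTING_HANDLING
```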


Vernon Schryver[EMAIL PROTECTED]



Re: Tag, You're It!

2003-12-18 Thread Doug Royer


Stephen Sprunk wrote:

Thus spake John Stracke [EMAIL PROTECTED]
 

Modifying the Subject: line is a Bad Thing; it invalidates digital
signatures.  We're never going to get widespread use of signed email as
long as we have pieces of mail infrastructure munging messages to make
signatures useless.
   

Signed email already gets mangled by the ietf mail servers (AFAICT), so
what's one more bad idea in the mix?  /advocate class=devil
 

Mine seems to make it. This one is (at least was) signed - I hope :-)

--

Doug Royer |   http://INET-Consulting.com
---|-
[EMAIL PROTECTED] | Office: (208)520-4044
http://Royer.com/People/Doug   |Fax: (866)594-8574
  |   Cell: (208)520-4044
  We Do Standards - You Need Standards






Re: Never-ending arguments about mailing lists considered harmful (was: Re: Adding [ietf] considered harmful)

2003-12-18 Thread Keith Moore
John,

Trying to make this response a brief one, and hopefully the last message
I need to write on this topic for a while.

1) While I generally support reducing secretariat workload when
possible, I don't think it follows that it's to our advantage to let
them automate anything they can sensibly automate without causing severe
problems,  particularly without taking due care in how it is done. 
We've had quite a few problems already with lists being subject to
arbitrary censorship, and many of spamassassin's criteria have no sound
justification.

I should at this point re-iterate that so far nothing harmful has been 
done, and it does look like there's some attempt at due care.  I hope
that publicizing this issue will encourage more due care.

2) I have given several reasons for objecting to adding [xxx] to message
headers, ranging from theoretical/academic arguments about
separation-of-function and layering to statements of personal experience
that this very practice causes problems with reading mail on small
displays, with searching, etc.  These are not absolutes but merely
factors that people should consider rather than immediately assuming
that subject munging is a good idea.

3) It's gotten to the point that almost any argument about a technical
subtlety on the IETF list gets labelled a religious war.  I suspect this
is partly because we're straining to articulate the justification for
our positions (so they look somewhat like religious arguments even when 
there's an underlying technical basis for them), but that's inherent
in the fact that these subjects are subtle.  

I remember a time when we valued the exchange that helped to illuminate
these subtleties and give justification for our positions, and when we
did not think that this level of exchange was inappropriate or an
excessive consumption of bandwidth.  I'm not sure what has changed, but
I hope it's not the case that we can no longer try to understand subtle
effects of technical decisions - because I believe our inability to do
that has caused the quality of our output to suffer tremendously.

4) I see the [xxx] labelling as a design issue.  Even if we claim we're 
only designing for ourselves, it's still a concern because to me the
casual attitude toward adding [xxx] reflects a lack of understanding of
fundamental network protocol design principles.   I see the spamassassin
filtering as a process issue, but one that affects our ability to
produce good designs, because I've seen several occasions where
valuable input from outsiders was discarded for arbitrary reasons and
the design suffered for it.



John, I know you well enough to know that 

- You've seen more than a few problems with header munging yourself, 
and with munging of protocols by intermediaries in general;
- You are more aware than most that the Internet is a diverse community
with widely varying needs and capabilities and that it is becoming 
more diverse all the time;
- You know enough about protocol design to appreciate the value of
separation of layers in general, and of separation of function between 
user agent and transport in particular; and
- You know enough about information storage and retrieval systems to
appreciate the value in keeping data models clean.

So I don't think I need to convince you of these things.  If I'm talking
to you specifically, I try to frame my statements with knowledge of your
experience and depth in mind. When I make statements like the above on
the IETF mailing list, I'm doing so for the benefit of people who don't
seem to understand these things (regardless of who is in the To field),
and part of my reason for doing so is to try to remedy that situation in
a small way.

Any good design is necessarily a compromise.  It might be that there are
cases where, _after_ considering the various factors, that adding [xxx]
is a reasonable compromise, particularly for a list that operates only
for a year or two - one can argue that UA capabilities won't change much
while the list is in use.  However such compromises are _not_ justified
by statements of the form it works for me, therefore it is good for
everyone -- particularly when the Internet is so diverse and when
there's a tendency for these practices to become entrenched.

It does seem like we often get bogged down in arguments between people
of widely varying depths, or between people of very different kinds of
expertise.  In the first case there is no basis for compromise because
the person who is out of his depth doesn't understand the need for
compromise or the basis that makes the compromise reasonable.  In the
second case compromise is difficult because there is little or no common
ground.  I'm not sure how to resolve either kind of impasse in a
reasonable fashion other than by discussion, though this does sometimes
get tedious. Yes, I'd like to find a better way.

At any rate, it seems difficult to get a compromise before it is clear
that people understand the issues associated with a 

Re: Adding SpamAssassin Headers to IETF mail

2003-12-18 Thread Jake Nelson
Dean Anderson wrote:
 Mostly, this is due to the revenge oriented blacklists that it uses.

You are aware that's it's trivial to disable all the blacklist testing in
the config, aren't you? SpamAssassin is extremely configurable.

-- Jake Nelson




layering and separation of function

2003-12-18 Thread Keith Moore
 On Thu, 18 Dec 2003 13:07:24 -0500
 Keith Moore [EMAIL PROTECTED] wrote:
 
  I'm a bit surprised at the frequency at which people who claim to be
  networking protocol engineers fail to appreciate the benefits of
  clean separation-of-function and layering.
 
 Hopefully the drawbacks are appreciated also.  Quoting Rich Seifert,
 Layering makes a good servant, but a poor master.  Use layering to
 organize the way you THINK about networks, but don't let it restrict
 how you DESIGN networks.  If I recall correctly, David Clark used to
 say something very similar to this in a protocol workshop class at
 Interop awhile back.

there's clearly a limit to how much layering is desirable, and it's
often desirable to have a way to bypass layering in corner cases.
I'm convinced that IPv4 worked better because it _didn't_ separate 
location and identity, than it would have otherwise, because the cost
of the extra mapping layer would have been prohibitive for most of
IPv4's history (and may still be prohibitive, but we're getting
closer).

but failure to have clean interfaces and separation of function 

- makes the whole system more complex and less reliable (because 
components can't rely on other components functioning as advertised -
e.g. NATs try to second-guess apps and apps try to second-guess NATs)
and 

- makes the system less adaptable to meet unanticipated needs (because
assumptions about how things work are no longer isolated in certain
parts of the system but they permeate the entire system - meaning that
the entire system has to evolve rather than evolving it one piece at a
time)



Re: Adding [ietf] considered harmful

2003-12-18 Thread Mark Allman

  The subject line, on the other hand, is just for people.  
 
 Book titles are for people, too.  Does that mean that it's okay for a
 bookseller or library to change the titles on books, in order to help
 the consumer identify where they came from?

Um, my library slaps a helpful identification tag on the spine of every
book to help me find it.  Your analogy, man ...

allman





Re: Adding [ietf] considered harmful

2003-12-18 Thread Mark Allman

Keith-

 Putting [foo] in the subject header is just another example of this
 trend.  Sure, it might be useful to people with dysfunctional MUAs,
 and there are a lot of those people out there. There were once a lot
 of people whose MUAs couldn't do reply all, too.

This is just wrong.

From lines and Reply-to and whatever are headers that are meant to
be processed by computers.  So, you can say all you want about how dumb
MUAs do or do not process these (and how intermediate mail servers
should keep their mitts off).  Now, humans use these lines, too.  So,
call them dual use.

The subject line, on the other hand, is just for people.  Sure we can
make programs and filters grok them to classify mail if there is some
standard format (e.g., i-d actions).  But, fundamentally subject lines
are for humans, not computers.  So, comparing subject line munging to
reply-to munging seems to me to be pretty much apples and oranges.

You might read the above as supporting your point that we should not add
[ietf] to subject lines because subject lines are not for computers
(or dysfunctional MUAs) to process.  However, I think the correct
interpretation is that it is OK for the mail server to add these tags
**and** they may aid the entities that the subject line is actually for
in the first place (humans).  Hence, they are fine.

allman


(I cannot actually believe I am sending a non-snide comment in this
thread.  Someone should slap me.  I read through the whole thread last
night.  Every message was dumberer than the previous one (probably
including this one!).  I was literally laughing out loud.  I cannot
believe we are even having such a dumbass debate.  But, it was like a
wreck on the highway and I could not stop rubber-necking.  If we have
this much trouble about 6 characters in the subject line then we might
as well forget that problem statement thingy.  Really.)





Re: Hashing spam

2003-12-18 Thread Keith Moore
The problem with this analysis is that it assigns greater value to 
contributions from subscribers than to contributions from 
non-subscribers.  But often the failure to accept clues from 
outsiders causes working groups to do harm - and filtering messages 
in the #2 category increases this tendency.  The occasional rejection 
of #2 messages can be very harmful.

On Dec 18, 2003, at 3:01 PM, Vernon Schryver wrote:

  1. on-topic messages from subscribers
  2. on-topic messages from non-subscribers
  3. noise from subscribers
  4. noise from non-subscribers
  5. pure spam such as advertisements for loan sharks
In this list, only #1 is clearly good. It is good to avoid rejecting
#2, but there is surely no harm in sometimes delaying #2.  If the
senders of any rejected or false positive #2 received an informative
non-delivery report so that they could retransmit, what would be the 
harm?

SpamAssassin is reported to be better than 60% accurate.  #2 is surely
rare compared to #1.  Thus, as long as SpamAssassin white-lists all
subscribers, there would be no harm in the occasional rejection of #2.




What eMail is legitimate

2003-12-18 Thread John Leslie
Vernon Schryver [EMAIL PROTECTED] wrote:
 
 Concerning false positives for this mailing list--it would be wise to
 define what mail is legitimate.  In many places, you must accept at
 least 99.9% of all even remotely legitimate mail.  However, this context
 is different.  Here a boolean good/spam is simplistic and wrong.
 Instead we have a spectrum:
 
 1. on-topic messages from subscribers
 2. on-topic messages from non-subscribers
 3. noise from subscribers
 4. noise from non-subscribers
 5. pure spam such as advertisements for loan sharks

   Agreed that these categories exist. Alas, we cannot necessarily tell
them apart. :^(

 In this list, only #1 is clearly good.

   I'd greatly prefer to avoid flame-wars about how much difference
there is between #1 and #2...

   Personally, I consider the question pointless because we don't have
any dependable way to tell them apart. Please realize how trivially
easy it is to harvest poster addresses from archives and forge those
as From addresses.

 It is good to avoid rejecting #2, but there is surely no harm in
 sometimes delaying #2.

   I do not agree that there is surely no harm. (But I'd _really_
rather not argue that question.)

 If the senders of any rejected or false positive #2 received an
 informative non-delivery report so that they could retransmit, what
 would be the harm?

   I _won't_ discuss the possible harm...

   But Vernon's point that a prompt non-delivery report minimizes the
possible harm is an excellent one.

 SpamAssassin is reported to be better than 60% accurate.  #2 is surely
 rare compared to #1.  Thus, as long as SpamAssassin white-lists all
 subscribers, there would be no harm in the occasional rejection of #2.

   This is where I must disagree. Whitelisting something as easily
forged as the From address is simply wrong -- and if it is a published
rule, we're sure to see spammers forging whitelisted From addresses
as their standard operating practice.

   If, OTOH, Vernon would like to whitelist the combination of From
address and IP address of the sending SMTP server, that could be a
very worthwhile practice, virtually immune to spammer forging.
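
A minimal sketch of the combination check suggested above, in Python.
The addresses, the WHITELIST set and the is_whitelisted() helper are all
hypothetical names made up for illustration; nothing here is an existing
SpamAssassin or list-server feature.

```python
# Hypothetical whitelist keyed on (From address, sending server IP).
# Every address below is a made-up documentation example.
WHITELIST = {
    ("alice@example.org", "192.0.2.10"),
    ("bob@example.net", "198.51.100.7"),
}


def is_whitelisted(from_addr: str, peer_ip: str) -> bool:
    """Return True only if this exact (sender, server IP) pair is known."""
    return (from_addr.strip().lower(), peer_ip) in WHITELIST


# A From address harvested from the archives but sent through an
# unknown relay does not match, because the IP half of the pair differs.
print(is_whitelisted("alice@example.org", "192.0.2.10"))    # known pair
print(is_whitelisted("alice@example.org", "203.0.113.99"))  # forged From
```

The point of keying on the pair is that forging the From header alone
is no longer enough to pass the check.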

--
John Leslie [EMAIL PROTECTED]



Re: What eMail is legitimate

2003-12-18 Thread Vernon Schryver
 From: John Leslie [EMAIL PROTECTED]

 ...
 This is where I must disagree. Whitelisting something as easily
 forged as the From address is simply wrong -- and if it is a published
 rule, we're sure to see spammers forging whitelisted From addresses
 as their standard operating practice.

As is true of many theories about what spammers do or will do, practice
differs from (simplistic) theory.  In the real world, whitelisting by
sender works fine and is not abused often enough to matter.  Whether
it works today because it is rarely used is a secondary issue good for
no more than trying to predict the future.

Yes, I know that spammers often forge source addresses.  I get more
than my fair share of demands from lusers that I unsubscribe them from
this or that stream of porn or other offensive spam.  Nevertheless,
such problems are trivial in this context.

That reasoning involves a second error common to IETF talk about spam
and mailing list noise.  It is the academic pretense that all failures
are of equal gravity and completely unacceptable.  In this case, the
failure mode that supposedly makes whitelisting by sender unacceptable
is merely leaking a little spam.


 If, OTOH, Vernon would like to whitelist the combination of From
 address and IP address of the sending SMTP server, that could be a
 very worthwhile practice, virtually immune to spammer forging.

If you mean manual whitelisting, that sounds good in theory, but fails
in practice.  I've experience with various sorts of whitelisting,
because the DCC depends on whitelists to distinguish solicited from
unsolicited bulk mail.  Whitelisting by IP address fails in practice
because so much bulk mail comes from so many different and changing
SMTP clients.  For an example at the small end of the spectrum of
bulk mail sources, I've had to regularly change the whitelisting
for IETF mailings.  Bigger legitimate bulk mailers often have too
many SMTP clients for outsiders to count, not to mention manually
whitelist.  You must find other ways to whitelist them.

However, whitelisting bulk mail by IP address is trivial compared to
whitelisting private mail by IP address.  I use greylisting (see
http://www.dcc-servers.net/dcc/greylist.html ) which can be described
as automated whitelisting by the triple (sender,sender-IP-address,target).
It works well, but only because it is automated and it uses 4yz soft
failures.  Many ISPs start sending a single message from one IP address
and switch to another after a few minutes--lather and repeat for up
to half a dozen different IP addresses for a single message.  It would
be hopeless to try to manually whitelist the IP addresses used by
customers of such ISPs.  The ISPs that do this sort of thing are among
the largest.
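
The greylisting idea described above can be sketched roughly as follows.
This is an illustration only, not the DCC implementation: the Greylist
class, the 300-second delay and the choice of 451 as the 4yz reply are
assumptions made for the example.

```python
import time

# Illustrative value only; real deployments choose their own delay.
MIN_DELAY = 300  # seconds a new triple must wait before being accepted


class Greylist:
    """Automated whitelisting by the triple (sender, sender IP, target)."""

    def __init__(self, min_delay=MIN_DELAY, clock=time.time):
        self.min_delay = min_delay
        self.clock = clock
        self.first_seen = {}  # triple -> time of the first delivery attempt

    def check(self, sender, sender_ip, target):
        """Return an SMTP-style reply code for one delivery attempt."""
        triple = (sender.lower(), sender_ip, target.lower())
        now = self.clock()
        first = self.first_seen.setdefault(triple, now)
        if now - first >= self.min_delay:
            return 250  # the triple has waited long enough: accept
        return 451      # 4yz soft failure: a real SMTP client retries later
```

In this simplified form a retry from a different IP address counts as
a brand-new triple and starts its own waiting period, which is why the
ISPs that rotate addresses for a single message are awkward to handle.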


Vernon Schryver    [EMAIL PROTECTED]



Re: Spam

2003-12-18 Thread Dr. Jeffrey Race
On Wed, 17 Dec 2003 23:10:43 -0500, Bill Cunningham wrote:
 Now that the federal government has taken some steps in regulating spam,
 does that mean that a technical need as the IETF would look for, isn't
 needed?  Maybe the Spam should be forgotten about.

Bill, has the CMOS backup battery failed in your workstation?  It is
December 17, not April 1 :)

Jeffrey Race




Re: Hashing spam

2003-12-18 Thread kent
On Thu, Dec 18, 2003 at 03:39:58PM -0500, Keith Moore wrote:
 The problem with this analysis is that it assigns greater value to 
 contributions from subscribers than to contributions from 
 non-subscribers.  But often the failure to accept clues from 
 outsiders causes working groups to do harm

I don't believe this is true, for any normal definition of often.  
Occasionally might be believable.

  - and filtering messages 
 in the #2 category increases this tendency.

One could just as easily argue that such filtering would decrease the
tendency, because people would modify their behavior to subscribe to
groups they cared about.  Also, one could just as easily argue that
working groups are just as likely to be harmed by distracting comments
from outsiders... 

 The occasional rejection 
 of #2 messages can be very harmful.

Seems more likely to me that the amount of harm would be lost in the
normal noise of ietf processes.

Regards
Kent

 On Dec 18, 2003, at 3:01 PM, Vernon Schryver wrote:
 
   1. on-topic messages from subscribers
   2. on-topic messages from non-subscribers
   3. noise from subscribers
   4. noise from non-subscribers
   5. pure spam such as advertisements for loan sharks
 
 In this list, only #1 is clearly good. It is good to avoid rejecting
 #2, but there is surely no harm in sometimes delaying #2.  If the
 senders of any rejected or false positive #2 received an informative
 non-delivery report so that they could retransmit, what would be the 
 harm?
 
 SpamAssassin is reported to be better than 60% accurate.  #2 is surely
 rare compared to #1.  Thus, as long as SpamAssassin white-lists all
 subscribers, there would be no harm in the occasional rejection of #2.

-- 
Kent Crispin                               "Be good, and you will be
[EMAIL PROTECTED],[EMAIL PROTECTED]         lonesome."
p: +1 310 823 9358  f: +1 310 823 8649               -- Mark Twain
SIP: [EMAIL PROTECTED]




Re: Hashing spam

2003-12-18 Thread Keith Moore
  But often the failure to accept clues from
  outsiders causes working groups to do harm

 I don't believe this is true, for any normal definition of often.
 Occasionally might be believable.

if I look at why working groups do harm, the failure to accept clues 
from outsiders does seem to crop up often.  Of course, this is my 
assessment (others might read the situation differently) and I can only 
make this statement about the groups I've actually looked at, which is 
a small and nonrandom sample.

 One could just as easily argue that such filtering would decrease the
 tendency, because people would modify their behavior to subscribe to
 groups they cared about.

You're incorrectly assuming that people with clues have the time to 
subscribe to and follow every single group that might do something 
harmful.

 Also, one could just as easily argue that
 working groups are just as likely to be harmed by distracting comments
 from outsiders...

You could argue that.  I haven't found it to be the case.

  The occasional rejection
  of #2 messages can be very harmful.

 Seems more likely to me that the amount of harm would be lost in the
 normal noise of ietf processes.

Some noise is more harmful than others.  Some WGs have more potential 
to do harm than others, and those are the very WGs that need outside 
input.




Dec03: Update on administration restructuring

2003-12-18 Thread Leslie Daigle
In following up the discussion of the IAB Advisory Committee
output, on December 1:
http://www.ietf.org/mail-archive/ietf-announce/Current/msg27463.html

I noted that I would endeavour to post monthly updates on progress,
around mid-month.  It's only been 2 weeks, but I thought it
was important to get the update process rolling.
There have been a few comments on the AdvComm document itself,
draft-iab-advcomm-00.txt, but it seems largely ready to consider
finished.  We're going through it to check for final consistency
and editorial fixes, with a view to having a new version out
shortly.
In terms of follow-on, Harald and I have started to have more
formative discussions --  we've met with folks from ISOC and
from CNRI to begin the discussions of what we might do, moving
forward, to address the concerns laid out in the AdvComm document.
There's nothing conclusive to report there, yet.
Leslie.

--

---
Reality:
Yours to discover.
   -- ThinkingCat
Leslie Daigle
[EMAIL PROTECTED]
---





Re: What eMail is legitimate

2003-12-18 Thread John Leslie
Vernon Schryver [EMAIL PROTECTED] wrote:
 From: John Leslie [EMAIL PROTECTED]
 
 This is where I must disagree. Whitelisting something as easily
 forged as the From address is simply wrong -- and if it is a published
 rule, we're sure to see spammers forging whitelisted From addresses
 as their standard operating practice.
 
 As is true of many theories about what spammers do or will do, practice
 differs from (simplistic) theory. 

   You're welcome to build your organization on the assumption that
spammers will continue doing the same thing they do today -- I choose
to design for what they're likely to do next month...

 In the real world, whitelisting by sender works fine and is not abused
 often enough to matter. 

   Seems to be true today...

 Whether it works today because it is rarely used is a secondary issue
 good for no more than trying to predict the future.

   I draw a distinction between predicting the future and planning for
the future. Planning for the future requires being ready for things
which may not be the most likely outcome.

   Thus, I'm not betting on what spammers will do next month. I'm hoping 
to be prepared for a number of different scenarios.

 Yes, I know that spammers often forge source addresses.  I get more
 than my fair share of demands from lusers that I unsubscribe them from
 this or that stream of porn or other offensive spam.

   Good to know you're aware of this.

 Nevertheless, such problems are trivial in this context.

   Today, maybe...

 That reasoning involves a second error common to IETF talk about spam
 and mailing list noise.  It is the academic pretense that all failures
 are of equal gravity and completely unacceptable. 

   I really don't know who you're talking about here. Certainly I have
said nothing remotely like that.

 In this case, the failure mode that supposedly makes whitelisting by
 sender unacceptable is merely leaking a little spam.

   I work on a simple-minded principle: if you want more of something,
arrange to reward it. I choose not to reward spammers for forging
_obvious_ From addresses.

 If, OTOH, Vernon would like to whitelist the combination of From
 address and IP address of the sending SMTP server, that could be a
 very worthwhile practice, virtually immune to spammer forging.
 
 If you mean manual whitelisting,

   No, I don't.

 that sounds good in theory, but fails in practice.  I've experience
 with various sorts of whitelisting, because the DCC depends on
 whitelists to distinguish solicited from unsolicited bulk mail. 
 Whitelisting by IP address fails in practice because so much bulk
 mail comes from so many different and changing SMTP clients. 

   I'd actually enjoy discussing this issue.

   However what I was discussing was whitelisting the _combination_
of From address and the IP address that sender normally sends from.
Until senders figure out how to forge the IP addresses of sending
SMTP servers, this should make the whitelist pretty safe.

 For an example at the small end of the spectrum of bulk mail sources,
 I've had to regularly change the whitelisting for IETF mailings. 
 Bigger legitimate bulk mailers often have too many SMTP clients for
 outsiders to count, not to mention manually whitelist. 

   I seriously doubt we need to worry about IETF contributors who
send through more than a few dozen IP addresses. Even if we do, it's
still easily automated. I see no scaling problems even with 1,000
different IP addresses a contributor sends through.
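
One way such automation might look, as a rough Python sketch.  The
PairWhitelist class and the evaluator callback are hypothetical; the
evaluator stands in for whatever spam check (automatic or manual) is
applied to combinations that have not been seen before.

```python
class PairWhitelist:
    """Learns (From address, server IP) combinations automatically."""

    def __init__(self, evaluator):
        self.evaluator = evaluator  # callable(message) -> True if spam
        self.known = {}             # From address -> set of server IPs

    def accept(self, from_addr, peer_ip, message):
        """Return True to deliver, False to return an SMTP error."""
        key = from_addr.lower()
        if peer_ip in self.known.get(key, ()):
            return True             # combination already whitelisted
        if self.evaluator(message):
            return False            # evaluated as spam: reject it
        # Evaluated as valid: remember the combination for next time.
        self.known.setdefault(key, set()).add(peer_ip)
        return True
```

Each sender maps to a set of addresses, so a contributor who sends
through many servers only costs one set entry per server, and only
combinations that passed evaluation are ever stored.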

 You must find other ways to whitelist them.

   Perhaps, if you're doing it manually...

 However, whitelisting bulk mail by IP address is trivial compared
 to whitelisting private mail by IP address.  I use greylisting (see
 http://www.dcc-servers.net/dcc/greylist.html ) which can be described
 as automated whitelisting by the triple (sender,sender-IP-address,
 target).

   An interesting concept. I haven't tried it -- though I have used
something a bit close -- imposing 50% packet loss on IP ranges which
seem to contain many open relays. I find that legitimate email gets
through, while spam is significantly slowed. (Obviously I don't
consider this a solution, just a stop-gap.)

 It works well, but only because it is automated and it uses 4yz
 soft failures. Many ISPs start sending a single message from one
 IP address and switch to another after a few minutes--lather and
 repeat for up to half a dozen different IP addresses for a single
 message.

   Certainly an ill-suited tactic. (I can't help thinking we'd do
well to write up an info RFC about stupid SMTP tricks...)

 It would be hopeless to try to manually whitelist the IP addresses
 used by customers of such ISPs. 

   Not really, the way I'm thinking about it. Any combination not
already evaluated goes to a spam-evaluator (whether automatic or
manual): if evaluated as spam, an error is returned, if evaluated
as valid, the combination is whitelisted. I see no reason for any
manual process, with the possible 

Re: Hashing spam

2003-12-18 Thread Keith Moore
 It just strikes me as highly unlikely that a WG would ever change
 course because of what would look like random comments from outsiders
 -- it's not consistent with the dynamics of a WG, or with human nature.

and that just might be one of our biggest problems, in a nutshell.




Re: Adding [ietf] considered harmful

2003-12-18 Thread Valdis . Kletnieks
On Thu, 18 Dec 2003 13:19:29 EST, Mark Allman said:

 Um, my library slaps a helpful identification tag on the spine of every
 book to help me find it.  Your analogy, man ...

A quick sampling of 15 books from our local public library shows that:

a) All 15 have spine tags for shelving and barcodes for check in/out.

b) The exact location of neither tag is standardized - the height of the spine
tag is variable and attempts to not obstruct the author/title originally
printed.  The barcode is *usually* placed on the back in such a way as to avoid
obstructing text, but on 2 books is on the *front* because less information got
overlaid that way.

Obviously, the library is telling us to try to not munge existing information by
sticking stuff in the Subject: line. :)

