Re: How Not To Filter Spam

2004-02-19 Thread Iljitsch van Beijnum
On 19-feb-04, at 1:18, Robert G. Brown wrote:

If a message comes in incorrectly addressed, yes, it will bounce.  It
should, shouldn't it?
Yes, but only by rejecting the message immediately during the SMTP 
session. Accepting the message, then realizing it can't be delivered and 
sending a bounce message back to the email address listed in the From 
header, is NOT the right thing to do, for reasons that should be apparent 
by now.
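
A minimal sketch of the distinction, assuming a hypothetical table of
valid recipients; the names below are illustrative and not any real
MTA's API. The decision is made while the sending host is still
connected, so no bounce ever has to be mailed to a possibly forged
address:

```python
# Hypothetical recipient table; a real MTA would consult its user database.
VALID_RECIPIENTS = {"alice@example.org", "bob@example.org"}


def rcpt_to_reply(recipient: str) -> str:
    """Return the SMTP reply to send in response to a RCPT TO command."""
    if recipient.lower() in VALID_RECIPIENTS:
        return "250 2.1.5 OK"
    # Refusing here leaves the undeliverable message with the sending
    # host, so this server never generates a bounce of its own.
    return "550 5.1.1 User unknown"
```

If the sending host is a legitimate relay, it is the one that reports
the failure to its own user.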

I actually think that the SpamAssassin/procmail combination above is
nearly ideal on the MUA side,
It is not, because:

1. Bandwidth is used up by spam (which is fortunately usually not that 
big) and worms (which tend to be much bigger)
2. A lot of processing time is used on your system(s)
3. Either:
3a. Legitimate senders who are tagged as spam are blackholed and are 
unaware that you don't read their message
3b. You must manually sort through all messages that are flagged as spam




How TO Filter Spam

2004-02-19 Thread Robert G. Brown
On Thu, 19 Feb 2004, Iljitsch van Beijnum wrote:

  I actually think that the SpamAssassin/procmail combination above is
  nearly ideal on the MUA side,
 
 It is not, because:
 
 1. Bandwidth is used up by spam (which is fortunately usually not that 
 big) and worms (which tend to be much bigger)

Close to a MB/day (on average) for me personally -- HTML is not compact.
Bouncing these (at the MTA or beyond) requires examining them
and applying tests to the entire message, generally, so bouncing at the
MTA saves no wasted bandwidth -- it close to doubles it.  

More than doubles it if the bounce generates further autogenerated mail
as it caroms off of bogus return addresses.  Note that server/network
load is likely dominated by latency, not bandwidth -- most email
messages are on the order of a single packet or two of data with ethernet
MTUs, but negotiating a transaction requires several rounds of small
packets.

Not enormous, agreed (although that is partly a matter of the size of
your organization and number of users and their internet profile to
spammers) but a significant fraction of all mail and not trivial.

 2. A lot of processing time is used on your system(s)

This is the same for MTA-side and MUA-side processing as well.  In fact
it might be the same core tool used in the two cases.  Processing power
is linked to how sophisticated a filter you apply, and a stupid filter
is a cure FAR worse than the disease.  Either way you have to receive
the entire message at least to memory and apply the filter.

The processing power required, BTW, while non-trivial, is easily within
the reach of modern CPUs for up to perhaps a thousand users per server
(I don't really know how much beyond -- we're not close to a boundary
here with hundreds of users).  A fully userspace solution even permits it
to be distributed to the users' processors and offloaded altogether
from the mail server.

The one thing I can think of that an MTA-side bounce without any sort of
spooling or significant logging of the rejected messages saves relative
to user side sorting is disk spool, and this is really at the option of
the user, who can spool rejects into /dev/null.

 3. Either:
 3a. Legitimate senders who are tagged as spam are blackholed and are 
 unaware that you don't read their message
 3b. You must manually sort through all messages that are flagged as spam

Instead you have legitimate readers who may be unaware that a message to
them was bounced or blacklisted, you have one-size-fits-all message
sorting, you have the aforementioned doubling of bandwidth consumption,
and you have the possibility of your site being used as a reflector in a
DDoS attack.  And you (the user) CAN'T sort through all messages that
are flagged as spam because they aren't spooled, they are rejected!

In MUA-side sorting with no bounce you control the sorting process more
or less completely and can set rejection thresholds high enough that
your false positive rate is (in YOUR judgement, not mine, the IETF's, or
your local system's administrator) negligible and then neglect them.  At
5, SA's false positive rate is maybe one message in 100 days, from my
spot checks of it, and that rate actually decreases in time as they
refine the tests.  Its false negative rate is high -- a few percent of
the spam it sees makes it through -- but so does more or less all of the
spam-like legitimate mail I get including mail from vendors and vendor
quotes.  The spam that does make it through is generally stealth spam
(the kind with relatively short messages with lots of random words
intended to confuse word count filters, few or no graphics, and no
embedded HTML) and is easily rejected.  Weekly catalog newsletters from
e.g. Musician's Friend and a local bookstore make it through (I'm
subscribed to their lists but have NOT whitelisted their sites) and act
as coal mine canaries for other desired spam-like traffic.
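
The threshold trade-off described above can be sketched as follows. This
is only an illustration: the score is assumed to have been computed
already by the filter, the folder names are hypothetical, and treating a
score at or above the threshold as spam is an assumption of the sketch.

```python
from typing import Dict, Iterable, List, Tuple


def choose_folder(score: float, threshold: float = 5.0) -> str:
    """File a message as spam when its score reaches the threshold."""
    return "spam" if score >= threshold else "inbox"


def presort(messages: Iterable[Tuple[str, float]],
            threshold: float = 5.0) -> Dict[str, List[str]]:
    """Split (subject, score) pairs into an inbox and a spam folder."""
    folders: Dict[str, List[str]] = {"inbox": [], "spam": []}
    for subject, score in messages:
        folders[choose_folder(score, threshold)].append(subject)
    return folders
```

Lowering the threshold moves more borderline mail into the spam folder,
which is the case where that folder has to be scanned by hand.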

Those of my friends who use a lower spam threshold (one that lets almost
NO spam through to their primary spool at the expense of a larger false
positive rate) do manually sort their presorted spam folder.  However,
BECAUSE the folder is presorted, all they functionally have to do is
scan the subject/from lines and use their internal whitelists to pick
out the false positives, click a button or two to delete the rest en
masse, and move on.  As in it takes them maybe a minute or two to
process hundreds of rejects with very high reliability and without
having to look at message contents at all in almost all cases.  If you
like, they prefer a two stage sort with human judgement used before
final rejection but using a computer to do 80% of the work (winnowing
all the spam out of their mainline mail spool, where they DO read each
message and think about it one at a time).  I'm talking systems people,
mostly, who have a very low false positive threshold and who DON'T want
to explain to their user base (which might include their employer, for
example) why a message sent to 

Re: 59th IETF - FINAL AGENDA

2004-02-19 Thread Lars Eggert
Hi,

FYI, I've made an ical version of the agenda available at 
http://www.icalx.com/public/larse/IETF-59.ics.

Apple iCal users can directly subscribe at 
webcal://www.icalx.com/public/larse/IETF-59.ics. I hear this may work 
with Mozilla as well, but I have no firsthand experience with that.

(A perl script periodically updates the ical version based on the ASCII 
agenda at http://ietf.org/meetings/agenda_59.txt. No guarantees for 
accuracy!)

Lars
--
Lars Eggert NEC Network Laboratories




Re: How TO Filter Spam

2004-02-19 Thread Dean Anderson
All quite sensible.

What I do on my personal mailbox is:

1) Refile all mailing lists and well-known correspondents.
2) Select all of the remaining mail not to dean@ and not from mail
delivery and give it a once over for non-spam messages. These would be
wildcards from certain domains or spam.  Refile this in a spam sort folder.
3) Refile all bounced mail according to bounce reason.  I need
to separate spam bounces to incoming undeliverable addresses from real
bounces that may need attention.
4) Refile all the spam to dean@.
5) What remains will be personal messages to dean@.

6) At my leisure, I read the rest of the email lists and other 
mail.

Steps one through four can usually be done in about 15 to 20 minutes
without interruption on about 1500 messages per day.
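
The order of steps one through five could be sketched roughly as below.
Everything here is hypothetical: the message fields, the folder names,
and the placeholder address stand in for judgements that are made by
hand in the procedure above.

```python
from dataclasses import dataclass

MY_ADDRESS = "me@example.org"  # placeholder, not a real mailbox


@dataclass
class Message:
    recipient: str
    is_list_or_known: bool = False  # step 1: list or well-known correspondent
    is_bounce: bool = False         # step 3: came from a mail delivery system
    bounce_reason: str = ""
    looks_like_spam: bool = False   # step 4: the reader's own judgement


def refile(msg: Message) -> str:
    """Return the folder a message lands in, applying the steps in order."""
    if msg.is_list_or_known:
        return "lists-and-known"
    if msg.recipient != MY_ADDRESS and not msg.is_bounce:
        return "spam-sort"          # step 2, after a once over by eye
    if msg.is_bounce:
        return "bounces/" + (msg.bounce_reason or "unknown")
    if msg.looks_like_spam:
        return "spam"
    return "personal"               # step 5: whatever remains
```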


Re: How Not To Filter Spam

2004-02-19 Thread Ed Gerck


Vernon Schryver wrote:
 
 If the envelope sender was forged as is common in spam, universal in
 worms, and practically nonexistent in legitimate mail, then your bounce
 will afflict a third party's mailbox.  My mailbox receives enough worm
 bounces to make me say it is an awfully bad thing.

Yes. However, if your mailbox could automatically handle confirmation
requests based on messages that were actually sent by you (in much
the same way that NAT boxes work -- you only get a reply to a request 
you send), then you would not be bothered by the C-R traffic at all. 
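
One way to picture the NAT analogy is a mailbox that remembers what it
has sent and only answers confirmation requests that refer back to one
of those messages. The class below is a hypothetical sketch, and it
assumes the challenge carries the Message-ID of the message it is
challenging:

```python
from typing import Optional, Set


class OutboundTracker:
    """Remember the Message-IDs of mail this mailbox actually sent."""

    def __init__(self) -> None:
        self._sent_ids: Set[str] = set()

    def record_sent(self, message_id: str) -> None:
        self._sent_ids.add(message_id)

    def should_answer_challenge(self, in_reply_to: Optional[str]) -> bool:
        # Like a NAT box, only let a "reply" through if it matches a
        # request that originated here; everything else is ignored.
        return in_reply_to is not None and in_reply_to in self._sent_ids
```

A challenge provoked by a forged sender would refer to a Message-ID the
tracker never recorded, so it would be dropped without reaching the user.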


 The only fix is to have your external MX servers know all valid
 addresses and so reject junk before it can be accepted and later
 need to be bounced.  That fix is often impractical or impolitic.

Yes, also because all valid addresses is a dynamic list.
 
 No, SPF, RMX, TOES, etc. etc. etc. cannot fix this problem unless you
 assume frictions (deployment resistance and delays) do not exist
 or you discard SMTP design goals including transporting messages among
 complete strangers.

Messages among complete strangers is a necessary feature, IMO, but  
shouldn't it behave in cyberspace as we learned to do it in the 
social space? Trust is earned. When a complete stranger calls me, 
I usually ask who or what introduced me to him before I start any 
conversation. If the complete stranger has no satisfactory answer, 
I ask him to take me off his database and not call again.

 People who know each other's crypto keys are not strangers.

It is possible for my MUA to automatically provide a complete stranger 
with my PK if I receive an email from him. The barrier to have my 
crypto keys does not have to be any higher than the barrier to have 
my email address.

 If you could someday trust organizations to vouch for strangers and
 not sell spam-for-a-day certs to Ralsky/Richter/co, then today you
 could trust the same outfits to not sell spam-for-a-day/week/years IP
 bandwidth accounts.

Yes. TTPs cannot be trusted per se. The answer is not PKI as we know it.



Re: How Not To Filter Spam

2004-02-19 Thread Vernon Schryver
 From: Ed Gerck [EMAIL PROTECTED]

 Yes. However, if your mailbox could automatically handle confirmation
 requests based on messages that were actually sent by you (in much
 the same way that NAT boxes work -- you only get a reply to a request 
 you send), then you would not be bothered by the C-R traffic at all. 

As long as you are wishing for things with no prospect of reality
in the foreseeable future, why not wish for long jail terms for the
ROKSO 200?

Automatic C-R handling in MUAs would solve the spam problem much
as NAT boxes have solved the address shortage and routing table
size problems, by creating other problems that are worse in the
long run.  For example, C-R handling in MUAs would do nothing for
the problems C-R systems have with mail that is not simplistic
messages between individuals.

Someone recently wrote that challenge/response systems would be practical
if there were a way for C-R systems to identify and not challenge
mailing list traffic.  That made me choke, because all spam is mailing
list traffic.  Perhaps what was intended was making C-R systems recognize
solicited mailing list traffic.  If your C-R system could do that,
there would be no need for any challenging or responding.  You would
challenge neither non-bulk nor solicited bulk mail, and would simply
reject all unsolicited bulk or spam mailing list traffic.


 Messages among complete strangers is a necessary feature, IMO, but  
 shouldn't it behave in cyberspace as we learned to do it in the 
 social space? Trust is earned. When a complete stranger calls me, 
 I usually ask who or what introduced me to him before I start any 
 conversation. If the complete stranger has no satisfactory answer, 
 I ask him to take me off his database and not call again.

If that's good enough for you, then you already have it.  The start
of a phone call from a stranger corresponds to the initial mail
message.  The asking to be added to a DNC list corresponds to adding
an entry to your email blacklist.

You probably want PKI magic that will tell your MTA or MUA whether
substantially identical copies of an incoming message from a complete
stranger will soon be sent to 30,000,000 of your intimate friends.
That magic would happen before you do the equivalent of answering
a phone call from a stranger.

If you are among those who configure their telephones to reject calls
with caller-ID values not in a whitelist, then you can configure your
email system to do the same with IP addresses.  That will eliminate
essentially all spam.  It also eliminates messages from strangers.
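
As a rough sketch of what such a configuration amounts to, assuming a
hypothetical whitelist made up of documentation-range addresses:

```python
import ipaddress

# Hypothetical whitelist; the networks are reserved documentation ranges.
WHITELIST = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.7/32"),
]


def accept_connection(client_ip: str) -> bool:
    """Accept mail only from SMTP clients whose address is whitelisted."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in WHITELIST)
```

Any client outside the list is refused, whether it is a spammer or a
stranger with something legitimate to say.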


  People who know each other's crypto keys are not strangers.

 It is possible for my MUA to automatically provide a complete stranger 
 with my PK if I receive an email from him. The barrier to have my 
 crypto keys does not have to be any higher than the barrier to have 
 my email address.

If a complete stranger is the sender of an incoming message, then
crypto keys are irrelevant to determining the message is unsolicited
bulk.  If the sender of spam is not a stranger, then you made a mistake
in handling keys.

The PGP mantra that a good key does not imply that the sender or the
message is good applies here.


Vernon Schryver[EMAIL PROTECTED]



Re: How Not To Filter Spam

2004-02-19 Thread Ed Gerck


Vernon Schryver wrote:
 
 If a complete stranger is the sender of an incoming message, then
 crypto keys are irrelevant to determining the message is unsolicited
 bulk.  

No. In PGP, for example, I accept a key based on who signed it and
when. If I can trust the signer(s), I may use a key from a stranger.

 The PGP mantra that a good key does not imply that the sender or the
 message is good applies here.

Define good key and you'll define what the key is good for.



Re: How TO Filter Spam

2004-02-19 Thread Doug Royer


Iljitsch van Beijnum wrote:

If you reject the message during the SMTP session you don't need to 
generate a bounce message, the other side will do this. So the 
bandwidth waste is the same in both cases.
Not only that, bulk spammers (hacked or not) keep it in their queue and 
not yours when it is not delivered. They might retry later, however they 
do that anyway.

--

Doug Royer |   http://INET-Consulting.com
---|-
[EMAIL PROTECTED] | Office: (208)520-4044
http://Royer.com/People/Doug   | Fax:(866)594-8574
  | Cell:   (208)520-4044
 We Do Standards - You Need Standards






Re: How Not To Filter Spam

2004-02-19 Thread Vernon Schryver
 From: Ed Gerck [EMAIL PROTECTED]

  If a complete stranger is the sender of an incoming message, then
  crypto keys are irrelevant to determining the message is unsolicited
  bulk.  

 No. In PGP, for example, I accept a key based on who signed it and
 when. If I can trust the signer(s), I may use a key from a stranger.

That sounds like the old "authentication solves spam" hope.  It was
wrong before SMTP-AUTH and it is still wrong.  If the sender is a
stranger, then by the definition of stranger you can know nothing
more than that the key works.  You cannot know whether the stranger
is one of Alan Ralsky's myriad of aliases delivering spam.


  The PGP mantra that a good key does not imply that the sender or the
  message is good applies here.

 Define good key and you'll define what the key is good for.

The ancient PGP mantra refers to keys that work, as in the results
of decoding using the indicated public keys yield a valid message.
The key can be good, but a good key tells you nothing more than that
the sender of the message knows the corresponding private key. 

Would you trust every PGP key from the IETF key signings to guarantee
that a message is not spam?  Some IETF participants have been unashamed
senders of unsolicited bulk commercial advertisements.  The person I'm
thinking of objected to his entry in my blacklist by insisting that
although he had sent the triggering message, it was not spam because
he had not sent more than one copy per mailbox.  He might have since
changed his definition and stopped sending unsolicited bulk mail, but
it would be silly to think everyone who gets a PGP key signed at an
IETF key signing party is someone from whom you want to receive mail.

Given who will pay certifiers, the IETF key signings are far less
bad guarantors of non-spam than commercial certifiers.  Consider
privacy policy certifiers and see one of the several versions of
http://enterprise-security-today.newsfactor.com/story.xhtml?story_title=Online_Privacy_Policies_Misleading

] An analysis of Web sites carrying those seals found that the
] companies running them ask for more personal information -- and
] protect it less -- than sites that have no seals.


Vernon Schryver[EMAIL PROTECTED]



Re: How Not To Filter Spam

2004-02-19 Thread Ed Gerck


Vernon Schryver wrote:
 
  From: Ed Gerck [EMAIL PROTECTED]
 
   If a complete stranger is the sender of an incoming message, then
   crypto keys are irrelevant to determining the message is unsolicited
   bulk.
 
  No. In PGP, for example, I accept a key based on who signed it and
  when. If I can trust the signer(s), I may use a key from a stranger.
 
 That sounds like the old authentication solves spam hope.  It was
 wrong before SMTP-AUTH and it is still wrong.  If the sender is a
 stranger, then by the definition of stranger you can know nothing
 more than that the key works. 

It seems that you're not a PGP user. A signed PGP key has more useful 
information than just the key value. PGP keys can and should be signed 
by the key-holder and by one or more introducer(s). If you can trust 
those signer(s) as introducer(s), you may use a key from a stranger.  
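
The acceptance rule described here can be sketched in a few lines. This
is an illustration only, not the API of any PGP implementation; the key
identifiers and the "needed" parameter are hypothetical:

```python
from typing import Iterable


def key_is_acceptable(key_id: str,
                      signers: Iterable[str],
                      trusted_introducers: Iterable[str],
                      needed: int = 1) -> bool:
    """Accept a stranger's key if it is self-signed and enough trusted
    introducers have also signed it."""
    signer_set = set(signers)
    self_signed = key_id in signer_set
    # A key's own signature does not count as an introduction.
    vouching = (signer_set & set(trusted_introducers)) - {key_id}
    return self_signed and len(vouching) >= needed
```

Note that this only says whether the key is accepted as belonging to its
holder; it says nothing about what that holder chooses to send.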

BTW, this has nothing to do with "authentication solves spam". Spam is a 
complex problem that can only be solved by an array of measures where, 
IMO, PK encryption is more useful than PK signatures.

   The PGP mantra that a good key does not imply that the sender or the
   message is good applies here.
 
  Define good key and you'll define what the key is good for.
 
 The ancient PGP mantra refers to keys that work, as in the results
 of decoding using the indicated public keys yield a valid message.

No, this is not how PGP keys should be accepted and considered good.
Of course, since the rules of PGP are user-centric, you may define
whatever you want as good keys.



is there any other alt. hotel for IETF

2004-02-19 Thread James Seng
I know I shouldn't wait until the last minute to book, but...

Is there any hotel other than the Lotte which charges a reasonable rate?

-James Seng



Re: is there any other alt. hotel for IETF

2004-02-19 Thread Dave Crocker
James,

JS Is there any hotel other than the Lotte which charges a reasonable rate?

I'll be staying at:

Best Western New Seoul Hotel
#29-1, 1-Ga, Tyaepyeong-No, Jung-Gu
Seoul, 100-101, Korea (South)
Phone: 82 2 735 8800
Fax: 82 2 735 6927

I'm told it is near the IETF venue.

The online comments about it sound reasonable, although it has some
quirks, such as windowless rooms. On the other hand, those rooms are the
quiet ones.

d/
--
 Dave Crocker dcrocker-at-brandenburg-dot-com
 Brandenburg InternetWorking www.brandenburg.com
 Sunnyvale, CA  USA tel:+1.408.246.8253