New BOF in Application Area (Internet information retrieval infrastructure)

2004-02-17 Thread wang liang
There has been some discussion of Internet information retrieval services in
work groups of the IRTF. This issue will now be discussed in a BOF in the
Application Area.

As one of the most important services of the Internet, information retrieval
is still far from what we expect of it: precise, comprehensive and fresh
information. This problem may become more serious with the rapid development
of the Internet. We need to pay more attention to this issue, and this BOF is
intended for that purpose. We hope you can participate.

The date of this BOF will be determined soon.

The content of BOF:

Internet Information Retrieval Infrastructure (iiri)
====================================================
CHAIRS: Guo Yiping ([EMAIL PROTECTED])
Wang Liang ([EMAIL PROTECTED])

Co-Chair: Andrew Newton([EMAIL PROTECTED])

What is the main purpose of the Internet? Information retrieval and exchange.
But what is the most important criterion for judging a network? Perhaps
communication speed. Yet ordinary users do not find that search on a GB
Internet is any better than search on MB networks. The great progress of the
Internet has not brought a comparable improvement in its main service.
The Internet is now drifting further and further from its original aim, to
be a source of knowledge for humanity, and is turning into a jumbled sea of
information. Many experts are mainly concerned with the physical Internet,
but we now need to pay more attention to the information Internet.
Information retrieval may be the most important service of the Internet, but
there is still no work group dedicated to it. Current commercial search
engines face bottlenecks in coverage and recency, and their service is far
from what we expect. They are merely web page search systems, yet information
is also available from many other resources, such as specialized databases,
FTP search engines, P2P, etc. A work group on Internet information retrieval
systems is therefore necessary. We have proposed a basic information
retrieval framework, DRIS (Domain Resource Integration System), for this
issue. Any related topic may be discussed in this group.

AGENDA:
Draft agenda for the BOF:
-------------------------
History of IETF work in this area                                    15 min
Introduction to problem space and DRIS                               15 min
Internet Information retrieval infrastructure and digital library    15 min
draft-liang-irpdl-03.txt                                             15 min
IPv6 and information retrieval system                                15 min
Discussion                                                           remaining time

Description of this work group (DRIS):
With the rapid increase in the number of web pages, the coverage of search
engines will become poorer and their update intervals much longer. If the
current search engine architecture remains in use, finding precise and
comprehensive information will become impossible. The problem will be more
serious once IPv6 is widely deployed in communication networks. The problem
that "too much information means no information" may become a disaster as
information explodes. To solve it, the Internet needs an efficient
information management system.
In this group, the Domain Resource Integrated System (DRIS) will be proposed.
DRIS is a distributed information retrieval system that will build an
information retrieval infrastructure for the Internet and can also be
regarded as a kind of Internet information management system.
DRIS is a hierarchical distributed search system comprising three kinds of
information retrieval system: conventional database systems, distributed
search systems and metadata harvest systems. We will first define the basic
search systems and then define DRIS as a whole.
Specific work items are:
1 Standard distributed search system. It defines a platform-independent
search interface and a collection description standard for heterogeneous
information resources. An I-D describing an information retrieval protocol
for digital resources has been proposed.
2 Standard metadata harvest system. A protocol based on available open
standards such as OAI will be proposed. It will define a standard metadata
format that is compatible with most database systems.
3 Standard public web page search system.
4 DRIS. This item will define DRIS as a whole, including its overall
architecture, the relations between different nodes, etc.
5 DRIS and IPv6. Cooperation with the IPv6 WG will be proposed. IPv6 will be
the most distinctive feature of the next-generation Internet. IPv6 is still
being improved, and any technology that benefits the Internet could be added
to it. Since searching is the main service for most Internet users, and this
service is unsatisfactory on the current Internet, why not take this
requirement into account when building the new Internet? For example, in
IPv6 every data flow is assigned a priority, so the Internet could guarantee
a high priority to DRIS data flows. The relationship between DRIS and IPv6
therefore needs some consideration.

Mailing list
[EMAIL PROTECTED]

Archive and general 

WG Review: Routing Area Working Group (rtgwg)

2004-02-17 Thread The IESG
A new IETF working group has been proposed in the Routing Area. The IESG has not made 
any determination as yet. The following description was submitted, and is provided for 
informational purposes only. Please send your comments to the IESG mailing list 
([EMAIL PROTECTED]) by February 19.


Routing Area Working Group (rtgwg)
----------------------------------

 Current Status: Proposed Working Group

 Description of Working Group:

 The Routing area receives occasional proposals for the development and
 publication of RFCs dealing with routing topics, but for which the
 required work does not rise to the level where a new working group is
 justified, yet the topic does not fit with an existing working group,
 and a single BOF would not provide the time to ensure a mature
 proposal. The rtgwg will serve as the forum for developing these types
 of proposals.

 The rtgwg mailing list will be used to discuss the proposals as they
 arise. The working group will meet if there are one or more active
 proposals that require discussion.

 The working group milestones will be updated as needed to reflect the
 proposals currently being worked on and the target dates for their
 completion. New milestones will be first reviewed by the IESG. The
 working group will be on-going as long as the ADs believe it serves a
 useful purpose.




Re: New BOF in Application Area (Internet information retrieval infrastructure)

2004-02-17 Thread Ted Hardie
At 6:38 PM +0800 02/17/2004, wang liang wrote:
There has been some discussion of Internet information retrieval services in
work groups of the IRTF. This issue will now be discussed in a BOF in the
Application Area.
As one of the most important services of the Internet, information retrieval
is still far from what we expect of it: precise, comprehensive and fresh
information. This problem may become more serious with the rapid development
of the Internet. We need to pay more attention to this issue, and this BOF is
intended for that purpose. We hope you can participate.
The date of this BOF will be determined soon.

The timing and full agenda are available at:

http://www.ietf.org/ietf/04mar/iiri.txt

Note that the chairs are different from that listed in Wang Liang's
recent note; Guo Yiping and Andy Newton will chair the meeting,
since Wang Liang will be a major presenter.  John Klensin will
be presenting the history of IETF work in the area, but it would be
useful if those present were familiar with the previous work
in METAD and FIND as well as the citations given in the BoF
agenda.
regards,
Ted Hardie



Re: covert channel and noise -- was Re: proposal ...

2004-02-17 Thread John Leslie
Vernon Schryver [EMAIL PROTECTED] wrote:
 
 I know of many millions of spam that are filtered during the DATA command
 every day, and I don't claim to know about any really big sites.
 
 The only problems are:
   - local administrative choices that keep bastion SMTP servers ignorant
   of per-user filter preferences

   This is a feature, not a problem. If the end user wants a filtering
process individualized that much, s/he should choose to use a SMTP
server which agrees to do so.

   - filtering at the DATA command requires either (1) rejecting for
  all or no targets or (2) accepting for all targets and silently
  discarding the message for those targets that want it filtered.

   Alternatively, the receiving SMTP server could reject any multiply-
addressed email.

   Is it actually that unreasonable to apply the most-restrictive
filtering rules in the case of multiply-addressed email?

   (Silently discarding _is_ a bad idea, when done by the SMTP server
itself. IMHO, it's better to mark for later discard -- which actually
could be done in such a way as to mark only for those recipients who
requested the more restrictive filtering.)

 In theory the second problem could be fixed if the DATA command could
 accept a vector of 250-OK/4yz-try-later/5yz-fatal responses, one for
 each target named with a Rcpt_To command.  In practice the spam problem
 will be solved one way or another long before such a protocol change
 would be sufficiently widely deployed to matter.

   Agreed: that radical a change in SMTP wouldn't percolate through
quickly enough.
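
The options discussed above can be sketched in a few lines. This is only an illustration, not an SMTP implementation: `Verdict`, `combine_for_data` and the policy names are hypothetical, and the per-recipient verdicts are assumed to come from some filter that is not shown. It shows how a server that may give only one answer to DATA could collapse per-recipient filter results.

```python
from enum import Enum


class Verdict(Enum):
    ACCEPT = "250"
    REJECT = "550"


def combine_for_data(verdicts, policy="most_restrictive"):
    """Collapse per-recipient verdicts into the single reply DATA allows.

    verdicts: dict mapping recipient address -> Verdict (hypothetical input).
    Returns (reply, marked), where `marked` lists the recipients whose copy
    should be marked for later discard rather than silently dropped.
    """
    wants_reject = sorted(r for r, v in verdicts.items() if v is Verdict.REJECT)
    if policy == "most_restrictive":
        # Reject for everyone if any recipient's filter says reject.
        reply = Verdict.REJECT if wants_reject else Verdict.ACCEPT
        return reply, []
    if policy == "accept_and_mark":
        # Accept for all; mark only the copies of the stricter recipients.
        return Verdict.ACCEPT, wants_reject
    raise ValueError(f"unknown policy: {policy}")
```

With one accepting and one rejecting recipient, the first policy rejects the whole message during the SMTP transaction, while the second accepts it and marks only the rejecting recipient's copy.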

--
John Leslie [EMAIL PROTECTED]



Re: New BOF in Application Area (Internet information retrieval infrastructure)

2004-02-17 Thread Ted Hardie
At 8:55 AM -0800 02/17/2004, Ted Hardie wrote:
It would be useful if those present were familiar with the previous work
in METAD and FIND as well as the citations given in the BoF
agenda.

By the way, the old FIND charter and documents are listed here:

http://www.ietf.org./html.charters/OLD/find-charter.html

and METAD mailing list archives are at:

http://www.usrlocalsrc.org/bunyip/metad.archive/

regards,
Ted Hardie


Re: covert channel and noise -- was Re: proposal ...

2004-02-17 Thread Vernon Schryver
 From: John Leslie 

 ...
- local administrative choices that keep bastion SMTP servers ignorant
of per-user filter preferences

This is a feature, not a problem. If the end user wants a filtering
 process individualized that much, s/he should choose to use a SMTP
 server which agrees to do so.

That is a feature only if the user accepts the consequences of discarding
mail without generating bounces, including not informing senders of false
positives.  Bounces from internal spam filters (either in MUAs or MTAs
inside organizations) are a major source of unsolicited bulk mail or spam.


- filtering at the DATA command requires either (1) rejecting for
   all or no targets or (2) accepting for all targets and silently
   discarding the message for those targets that want it filtered.

Alternatively, the receiving SMTP server could reject any multiply-
 addressed email.

People running SMTP servers that handle 100K or more msgs/day have
been uniformly horrified when I've suggested that.  I don't really
understand why, but I have given up on the idea.



(Silently discarding _is_ a bad idea, when done by the SMTP server
 itself. IMHO, it's better to mark for later discard -- which actually
 could be done in such a way as to mark only for those recipients who
 requested the more restrictive filtering.)

A better position is that everything should be logged, particularly
including discarded mail, and in that case, enough of bodies to allow
targets to identify senders and the nature of the discarded messages.
Of course, one should assume users won't normally look at those logs.
Spam you read is not filtered, but at most categorized and stigmatized.


Vernon Schryver[EMAIL PROTECTED]



Re: covert channel and noise -- was Re: proposal ...

2004-02-17 Thread Vernon Schryver
 From: Robert G. Brown 

 ...
 Or, mark for later accept/reject decisioning AFTER the SMTP server per
 se, in the filter pipeline between the server and the mail spool of the
 addressee.  Spam assassin does the right thing already (and this is
 exactly what it does).

***NO***!  Except when run as a milter or otherwise during the SMTP
transaction, SpamAssassin does the WRONG thing.  As run almost everywhere,
after the SMTP transaction, SpamAssassin can only either silently discard
spam or generate new spam by sending bounces to innocent people.


  A better position is that everything should be logged, particularly
  including discarded mail, ...

 Logging a message you reject is nearly a waste of time.  

Based on my experience, people running ISPs or other large mail system
strongly disagree with your position.  Besides, I intentionally wrote
about logging ***discarded*** mail.

Many institutions do turn off logging of greylisted messages, reduce
the default per-message logging limit of 32K Bytes, or delete log files
far sooner than the 14-day default in the DCC source.  Still, logging
is seen as vital to be able to answer questions about which messages
were filtered and why, including being able to say that message was
never sent or substantially identical copies of that message were
sent to 310 other users here and 433,797 users elsewhere; it was spam.
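
A minimal sketch of that kind of per-message log record, assuming a 32K byte limit as mentioned above. The function name `make_log_entry` and the field names are hypothetical and are not taken from the DCC source.

```python
LOG_LIMIT = 32 * 1024  # per-message limit in bytes, the 32K figure quoted above


def make_log_entry(env_from, recipients, body, reason, limit=LOG_LIMIT):
    """Build a log record for a discarded message.

    Keeps at most `limit` bytes of the body, enough for a target to identify
    the sender and the nature of the message without storing all of it.
    """
    return {
        "env_from": env_from,
        "recipients": list(recipients),
        "reason": reason,
        "body": body[:limit],
        "truncated": len(body) > limit,
    }
```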


  In order to
 recover the message (as you note, nobody ever looks at the logs, which
 are VERY LARGE for a busy mailer and beyond human capacity to scan),

I said nothing about humans scanning everything.  Besides, giving
users the sense that they can see what's happening with spam filtering
on their mail and control it is a requirement for getting users to
accept filtering.

 ..
 This is where, and why, I take issue with filtering and discarding at
 the level of the SMTP server, unless the accept/reject decision can be
 made with 100% precision (no false positives, no false negatives, and it
 may not be good even then because MY idea of the correct basis for the
 decision may not be the same as YOURS).

What you describe is a broken version of what I advocate, if you
consistently look at your personal log of rejected mail.  Your version
is broken because a reject decision after the SMTP transaction must
at least sometimes result in sending spam to innocent people.


 ...
 It's not that filtering based on non-header-linked aspects of content is
 or isn't a good idea in some cases.  It is that it has no business being
 in the specification of TCP.  ...

 pure chance have a byte sequence like SEX that caused it to be rejected
 ..

Nice straw man.  I've never heard anyone with a taint of technical
clues talk about looking for SEX in raw TCP segments.

 ...
 For nearly all filtering programs, it is too easy to create a message
 that is filtered but shouldn't be.  ...

You evidently lack experience with the filters used by commercial
institutions.  Commercial SMTP servers cannot tolerate false positives
(legitimate rejected/total legitimate) of more than 0.01%, and even
that's pushing it.  The design requirement for filtering mail on which
money depends is that false positives must not be much worse than the
underlying SMTP error rate due to problems such as full disks and
broken DNS servers--not to mention mail recipients too quick to delete.

A lot of current talk about false positives is self-serving nonsense
from such as the Direct Marketing Association.  Manual spam filtering
also has false positives.  A human suffering a common spam load of 100
spam/day has trouble not deleting legitimate mail.  My filters are
rejecting about 300 spam/day sent in my direction, 12227 in the last
40 days.  Mechanical filtering even with a significant false positive
rate can reduce the overall false positive rate.
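
The arithmetic behind those figures, as a short sketch. The 12227 and 40 are from the message above; the volume of 100,000 legitimate messages is a hypothetical number chosen only to show what a 0.01% bound means in practice.

```python
def false_positive_rate(legit_rejected, legit_total):
    """False positives as defined above: legitimate rejected / total legitimate."""
    return legit_rejected / legit_total


# Figures from the message: 12227 spam rejected over the last 40 days.
spam_per_day = 12227 / 40

# Hypothetical volume: out of 100,000 legitimate messages, a 0.01% bound
# allows at most 10 to be wrongly rejected.
rate = false_positive_rate(10, 100_000)
within_bound = rate <= 0.0001
```

Here `spam_per_day` comes to a little over 305, which matches the "about 300 spam/day" quoted above.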


 ...
 SMTP was designed to permit reasonably RELIABLE (simple) transport of

It would be good to skip the networking 101 tutorials.  Those of us who
don't know all of that about TCP, SMTP, CSMA/CD, etc., often thanks to
decades of personal experience, should apply elsewhere to learn the basics.
It may be hard to imagine how old farts like me see such tutorials, but
please try.  I've been receiving email as vjs since 1968.


 It seems to me to be highly unacceptable to attempt to insert
 content-based accept/reject decisioning in at this PROTOCOL level in the
 delivery process. 

That use of level confuses me.  It does not seem to conform to ISO
OSI architecture.

 ...
 reliable transport mechanism for important messages.  Filtering it for
 me according to ANY CONTENT-BASED RULESET risks discarding at least some
 messages that are not correctly classified when they are rejected.
 Important messages can be lost.  Bad things can result.  Who is
 responsible when this occurs?  Who do I get to sue?

You don't get to sue anyone, because a reasonably designed system
lets you choose to do all of your spam 

Educational Sessions in Seoul

2004-02-17 Thread Margaret Wasserman
Hi All,

The EDU Team is offering several training sessions on Sunday afternoon in Seoul
that are open to ALL IETF ATTENDEES.  These sessions include:
- Newcomers Training (in both English and Korean)
- Editors Training
- Introductory WG Chairs Training
- Security Tutorial
Details about these sessions can be found below.

Please note that you do not need to be a current Editor or WG Chair to attend
those sessions; they are all open to everyone!  So, show up early
and learn more about how to work effectively within the IETF.
Margaret



Sunday, February 29, 2004
=========================

1300-1400  Newcomer's Training in English -- Location??  (Spencer Dawkins)

   An introduction to the IETF for new or recent IETF
   attendees.  Covers the IETF document processes, the
   structure of the IETF, and tips for new attendees to
   be successful in the IETF environment.

1400-1500  Newcomer's Training in Korean -- Location??  (ChangJoon Kim)

   [Include description on agenda in Korean, if possible.]

   An introduction to the IETF for new or recent IETF
   attendees.  Covers the IETF document processes, the
   structure of the IETF, and tips for new attendees to
   be successful in the IETF environment.

1300-1500  Editor's Training -- Location??  (Avri Doria)

   Training for current or aspiring IETF document
   editors.  Covers the roles and responsibilities
   of a document editor, and includes advice on
   producing a high-quality IETF specification.

1300-1500  Intro WG Chairs Training -- Location??  (Margaret Wasserman)

   Introductory training for new or aspiring WG
   chairs.  Covers the role and responsibilities of
   a WG chair, and includes advice on how to run a
   WG that is fair, open and productive.  This class
   is open to all IETF attendees.

1500-1700  Security Tutorial -- Location??  (Radia Perlman)

   All IETF attendees need to be aware of the security
   implications of our design choices.  This session
   offers a basic primer in protocol security, as well
   as advice on how to write the Security Considerations
   sections required for all IETF documents.



How Not To Filter Spam

2004-02-17 Thread Vernon Schryver
The enclosed example of how not to filter spam is offered for those
who might want to preemptively add accuspam.com or downloadfast.com
to their blacklists.

It is also a classic example of what is wrong with the MUA filtering
tactics Robert Brown advocates.


I certainly did not try to contact anyone at 3dize.com.  A few readers
of this mailing list might recall that Shelby H. Moore III and I don't
see eye to eye.
I do not know what the From: header of [EMAIL PROTECTED] is about.
Perhaps it was forged.  Or perhaps someone at that address wired a
subscription to this mailing list through an unusually braindead
challenge/response spam filter at DownloadFAST.com.  If so, the
Secretariat should blacklist accuspam.com and/or downloadfast.com from
subscriptions to IETF mailing lists.  (It would need to be unusually
braindead to send me the challenge instead of the envelope sender.)


Vernon Schryver[EMAIL PROTECTED]


 From [EMAIL PROTECTED]  Tue Feb 17 19:26:14 2004
 Received: from www.DownloadFAST.com (www.downloadfast.com [65.61.155.11])
   by calcite.rhyolite.com (8.12.11/8.12.11) with ESMTP id i1I2QDiM048994
   for [EMAIL PROTECTED] env-from [EMAIL PROTECTED];
   Tue, 17 Feb 2004 19:26:13 -0700 (MST)
 Received: from www.downloadfast.com (localhost.downloadfast.com [127.0.0.1])
   by www.DownloadFAST.com (8.12.10/8.12.10) with ESMTP id i1I1hIQf002492
   for [EMAIL PROTECTED]; Tue, 17 Feb 2004 19:43:18 -0600 (CST)
 Received: (from [EMAIL PROTECTED])
   by www.downloadfast.com (8.12.10/8.12.6/Submit) id i1I1hITA002491;
   Tue, 17 Feb 2004 19:43:18 -0600 (CST)
 Date: Tue, 17 Feb 2004 19:43:18 -0600 (CST)
 Message-Id: [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]
 Subject: Re: Re: covert channel and noise -- was Re: proposal ...
 From: [EMAIL PROTECTED]
 Reply-To: [EMAIL PROTECTED]

 You attempted to contact us via email, and our anti-spam system is returning your 
 email below.

 Please kindly resend your email below, using our Contact Form on our web page:

 http://accuspam.com?kM,vbrJT

 After using our Contact Form only once, your future emails will not be returned.

 If you do not use our Contact Form to re-send your email below, then we can not read 
 it.

 __
 Powered by http://AccuSpam.com. Signup instantly for FREE!


 Your Returned Message:
 From: Robert G. Brown 

  ...
  Or, mark for later accept/reject decisioning AFTER the SMTP server per
  se, in the filter pipeline between the server and the mail spool of the
  addressee.  Spam assassin does the right thing already (and this is
  exactly what it does).

 ***NO***!  Except when run as a milter or otherwise during the SMTP
 ...



Re: How Not To Filter Spam

2004-02-17 Thread william(at)elan.net

On Tue, 17 Feb 2004, Vernon Schryver wrote:

 It is also a classic example of what is wrong with the MUA filtering

You certainly don't assume that there is nothing wrong with the filtering
system you use and that others may try to duplicate. Otherwise, how would
you explain that you have Elan and completewhois.com listed as filtered
on your site? Do you honestly believe we ever sent you any spam? Or maybe
you're making certain assumptions about the envelope From or normal From:
headers, and complaining when others make similar assumptions?
 
-- 
William Leibzon
Elan Networks
[EMAIL PROTECTED]




Re: How Not To Filter Spam

2004-02-17 Thread Vernon Schryver
 From: william(at)elan.net 

  It is also a classic example of what is wrong with the MUA filtering

 You certainly don't assume that there is nothing wrong with the filtering
 system you use and that others may try to duplicate. Otherwise, how would
 you explain that you have Elan and completewhois.com listed as filtered
 on your site? Do you honestly believe we ever sent you any spam? Or maybe
 you're making certain assumptions about the envelope From or normal From:
 headers, and complaining when others make similar assumptions?

Mail from Elan and completewhois.com is unwelcome at rhyolite.com in
part because of a message that said:

] Elan.Net Internet
] T.1 T.3 Frame Relay
] If you need more information about us or are interested in network services 
] (managed hosting, collocation, dedicated servers, t1, t3), please send email to 
[EMAIL PROTECTED] 
] 
] For More info 
] http://www.elan.net
] [EMAIL PROTECTED]

There are additional, independent, sufficient reasons for that listing
that we do not need to explore.  If you will read my web pages, you'll
see that my list of unwelcome domains is not only about senders of
unsolicited bulk email.

An advantage of a vanity or other tiny domain is that it can use
blacklists that would have intolerable false positive rates at other
or larger outfits but that have 0.000% local false positive rates.
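
For illustration, a domain-based check of this kind might look like the sketch below. The function names and the example domains are hypothetical; nothing here reflects how the list at rhyolite.com is actually implemented.

```python
def sender_domain(address):
    """Return the lower-cased domain part of an address like user@example.com."""
    return address.rsplit("@", 1)[-1].lower()


def is_unwelcome(address, blacklist):
    """True if the sender's domain, or any parent domain, is on the blacklist."""
    labels = sender_domain(address).split(".")
    # Check example.com for mail.example.com, but never a bare TLD like "com".
    candidates = {".".join(labels[i:]) for i in range(len(labels) - 1)}
    return bool(candidates & set(blacklist))
```

A list like this costs nothing to evaluate, which is part of its appeal for a tiny domain. Whether its local false positive rate really is zero depends entirely on who legitimately writes to that domain.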


Vernon Schryver[EMAIL PROTECTED]



Re: How Not To Filter Spam

2004-02-17 Thread william(at)elan.net
On Tue, 17 Feb 2004, Vernon Schryver wrote:

  From: william(at)elan.net 
 
   It is also a classic example of what is wrong with the MUA filtering
 
  You certainly don't assume that there is nothing wrong with the filtering
  system you use and that others may try to duplicate. Otherwise, how would
  you explain that you have Elan and completewhois.com listed as filtered
  on your site? Do you honestly believe we ever sent you any spam? Or maybe
  you're making certain assumptions about the envelope From or normal From:
  headers, and complaining when others make similar assumptions?
 
 Mail from Elan and completewhois.com is unwelcome at rhyolite.com in
 part because of a message that said:

You might want to post headers that show it being sent from some open proxy
in the Pacific and showing use of email accounts that were never there.
 
 ] Elan.Net Internet
 ] T.1 T.3 Frame Relay
 ] If you need more information about us or are interested in network services 
 ] (managed hosting, collocation, dedicated servers, t1, t3), please send email to 
 [EMAIL PROTECTED] 
 ] 
 ] For More info 
 ] http://www.elan.net
 ] [EMAIL PROTECTED]
 
 There are additional, independent, sufficient reasons for that listing
 that we do not need to explore.

Except that we did not send that message, and the anti-spam community was
informed it was a joe-job (95% guessed as much on their own). Considering
you knew who I am, I suspect you knew all this as well, and if you did
not, you might have noticed a warning about this on the website.

 An advantage of a vanity or other tiny domain is that it can use
 blacklists that would have intolerable false positive rates at other
 or larger outfits but that have 0.000% local false positive rates.

That is unlikely considering what I know about your list.

Unless you make a good effort to research what you list, you probably have
as many problems as some of the most aggressive IP-based filtering sites
(like SPEWS): you might make a number of good hits but would have a number
of misses as well. Besides, as has been shown many times, for spammers it's
a lot easier and less expensive to get a new domain than to get a new IP
block, and many do, setting up hundreds of new domains every month. As such,
the value of domain-based filtering is very minimal indeed.

-- 
William Leibzon
Elan Networks
[EMAIL PROTECTED]