New BOF in Application Area (Internet information retrieval infrastructure)
There has been some discussion of Internet information retrieval services in working groups of the IRTF. This issue will now be discussed in a BOF of the Application Area. Although information retrieval is one of the most important services of the Internet, the current service still falls far short of what we expect: precise, comprehensive and fresh information. This problem may become more serious with the rapid development of the Internet. We need to pay more attention to this issue, and this BOF is intended for exactly that. We hope you can participate. The date of the BOF will be determined soon.

The content of the BOF:

Internet Information Retrieval Infrastructure (iiri)

CHAIRS: Guo Yiping ([EMAIL PROTECTED])
        Wang Liang ([EMAIL PROTECTED])
Co-Chair: Andrew Newton ([EMAIL PROTECTED])

What is the main purpose of the Internet? Information retrieval and exchange. But what is the most important criterion for judging a network? Perhaps communication speed. Yet ordinary users cannot feel that search service on a GB Internet is any better than on MB networks. The great progress of the Internet has not brought a great improvement in its main service. The Internet is now drifting further and further from its original aim, a knowledge source for humanity, and turning into a jumbled sea of information. Many experts are mainly concerned with the physical Internet, but we now need to pay more attention to the information Internet.

Information retrieval may be the most important service of the Internet, but there is still no dedicated working group for it. Current commercial search engines face many bottlenecks in coverage and recency, and their service falls far short of our expectations. They are only web page search systems. We can also get information from many other resources, such as specialized databases, FTP search engines, P2P, etc. So a working group for an Internet information retrieval system is very necessary. We have proposed a basic information retrieval framework, DRIS (Domain Resource Integration System), for this issue.
Any related topic may be discussed in this group.

AGENDA: Draft agenda for the BOF:

  History of IETF work in this area                                   15 min
  Introduction to problem space and DRIS                              15 min
  Internet information retrieval infrastructure and digital library   15 min
  draft-liang-irpdl-03.txt                                            15 min
  IPv6 and information retrieval system                               15 min
  Discussion                                                          remaining time

Description of this work group (DRIS):

With the rapid increase in the number of web pages, the coverage of search engines will become poorer and the update interval much longer. If the current search engine architecture remains in use, finding precise and comprehensive information will become an impossible mission. This problem will be more serious when IPv6 technology is widely deployed in communication networks. The problem that "too much information means no information" may become a disaster as information explodes. To solve this problem, there should be an efficient information management system for the Internet.

In this group, the Domain Resource Integrated System (DRIS) will be proposed. DRIS is a distributed information retrieval system that will build the information retrieval infrastructure for the Internet; it can also be regarded as a kind of Internet information management system. DRIS is a hierarchical distributed search system comprising three kinds of information retrieval system: a conventional database system, a distributed search system and a metadata harvest system. We will first define the basic search systems and then define the entire DRIS.

Specific work items are:

1 Standard distributed search system. It defines a platform-independent search interface and a collection description standard for heterogeneous information resources. An I-D on an information retrieval protocol for digital resources has been proposed.

2 Standard metadata harvest system. A protocol based on an available open standard such as OAI will be proposed.
It will define a standard metadata format that is compatible with most database systems.

3 Standard public web page search system.

4 DRIS. It will define the entire DRIS, including its overall architecture, the relations between different nodes, etc.

5 DRIS and IPv6. Cooperation with the IPv6 WG will be proposed. IPv6 will be the most distinctive feature of the next-generation Internet. IPv6 is still being improved, and any technology that can benefit the Internet can be added to the IPv6 system. Since search is the main service for most Internet users, and this service is not satisfactory in the current Internet, why not take this requirement into account when building the new Internet? For example, in IPv6 all kinds of data flows are assigned a priority, so the Internet could guarantee a high priority to the data flows of DRIS. The relation between DRIS and IPv6 therefore needs some consideration.

Mailing list [EMAIL PROTECTED] Archive and general
WG Review: Routing Area Working Group (rtgwg)
A new IETF working group has been proposed in the Routing Area. The IESG has not made any determination as yet. The following description was submitted, and is provided for informational purposes only. Please send your comments to the IESG mailing list ([EMAIL PROTECTED]) by February 19.

Routing Area Working Group (rtgwg)
----------------------------------

Current Status: Proposed Working Group

Description of Working Group:

The Routing area receives occasional proposals for the development and publication of RFCs dealing with routing topics, but for which the required work does not rise to the level where a new working group is justified, yet the topic does not fit with an existing working group, and a single BOF would not provide the time to ensure a mature proposal. The rtgwg will serve as the forum for developing these types of proposals.

The rtgwg mailing list will be used to discuss the proposals as they arise. The working group will meet if there are one or more active proposals that require discussion.

The working group milestones will be updated as needed to reflect the proposals currently being worked on and the target dates for their completion. New milestones will be first reviewed by the IESG. The working group will be on-going as long as the ADs believe it serves a useful purpose.
Re: New BOF in Application Area (Internet information retrieval infrastructure)
At 6:38 PM +0800 02/17/2004, wang liang wrote:
> There has been some discussion of Internet information retrieval services in working groups of the IRTF. This issue will now be discussed in a BOF of the Application Area. Although information retrieval is one of the most important services of the Internet, the current service still falls far short of what we expect: precise, comprehensive and fresh information. This problem may become more serious with the rapid development of the Internet. We need to pay more attention to this issue, and this BOF is intended for exactly that. We hope you can participate. The date of the BOF will be determined soon.

The timing and full agenda are available at:

http://www.ietf.org/ietf/04mar/iiri.txt

Note that the chairs are different from those listed in Wang Liang's recent note; Guo Yiping and Andy Newton will chair the meeting, since Wang Liang will be a major presenter. John Klensin will be presenting the history of IETF work in the area, but it would be useful if those present were familiar with the previous work in METAD and FIND as well as the citations given in the BoF agenda.

regards,
Ted Hardie
Re: covert channel and noise -- was Re: proposal ...
Vernon Schryver [EMAIL PROTECTED] wrote:
> I know of many millions of spam that are filtered during the DATA command every day, and I don't claim to know about any really big sites. The only problems are:
>
> - local administrative choices that keep bastion SMTP servers ignorant of per-user filter preferences

This is a feature, not a problem. If the end user wants a filtering process individualized that much, s/he should choose to use an SMTP server which agrees to do so.

> - filtering at the DATA command requires either (1) rejecting for all or no targets or (2) accepting for all targets and silently discarding the message for those targets that want it filtered.

Alternatively, the receiving SMTP server could reject any multiply-addressed email. Is it actually that unreasonable to apply the most-restrictive filtering rules in the case of multiply-addressed email?

(Silently discarding _is_ a bad idea, when done by the SMTP server itself. IMHO, it's better to mark for later discard -- which actually could be done in such a way as to mark only for those recipients who requested the more restrictive filtering.)

> In theory the second problem could be fixed if the DATA command could accept a vector of 250-OK/4yz-try-later/5yz-fatal responses, one for each target named with a Rcpt_To command. In practice the spam problem will be solved one way or another long before such a protocol change would be sufficiently widely deployed to matter.

Agreed: that radical a change in SMTP wouldn't percolate through quickly enough.

-- John Leslie [EMAIL PROTECTED]
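The two behaviours discussed in this message, a single reply to DATA for the whole transaction versus the imagined vector of per-recipient replies, can be contrasted in a short Python sketch. It is only an illustration under stated assumptions: the function names, the reply strings, the example addresses and the keyword-based filters are all hypothetical, and nothing here is an existing SMTP extension or a real filter.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a per-user filter:
# True means "this user wants the message refused".
UserFilter = Callable[[str], bool]

REJECT = "550 rejected by recipient policy"  # assumed reply text
ACCEPT = "250 OK"


def single_reply(rcpts: List[str], filters: Dict[str, UserFilter], body: str) -> str:
    """Today's DATA: one reply covers every recipient.

    Applies the most-restrictive rule: refuse for everyone if any
    recipient's filter objects to the message.
    """
    if any(filters[r](body) for r in rcpts):
        return REJECT
    return ACCEPT


def reply_vector(rcpts: List[str], filters: Dict[str, UserFilter],
                 body: str) -> List[Tuple[str, str]]:
    """Imagined extension: one reply per RCPT TO target, in order."""
    return [(r, REJECT if filters[r](body) else ACCEPT) for r in rcpts]


# Two hypothetical users with different preferences.
filters = {
    "strict@example.org": lambda body: "viagra" in body.lower(),
    "lax@example.org": lambda body: False,
}
rcpts = ["strict@example.org", "lax@example.org"]
msg = "Cheap VIAGRA here"

print(single_reply(rcpts, filters, msg))
for rcpt, reply in reply_vector(rcpts, filters, msg):
    print(rcpt, reply)
```

With a single reply, the lax user loses the message because the strict user's rule wins; with a reply vector, the sender learns exactly which targets refused it, so no silent discard and no later bounce is needed.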
Re: New BOF in Application Area (Internet information retrieval infrastructure)
At 8:55 AM -0800 02/17/2004, Ted Hardie wrote:
> It would be useful if those present were familiar with the previous work in METAD and FIND as well as the citations given in the BoF agenda.

By the way, the old FIND charter and documents are listed here:

http://www.ietf.org./html.charters/OLD/find-charter.html

and METAD mailing list archives are at:

http://www.usrlocalsrc.org/bunyip/metad.archive/

regards,
Ted Hardie
Re: covert channel and noise -- was Re: proposal ...
From: John Leslie

> > ...
> > - local administrative choices that keep bastion SMTP servers ignorant of per-user filter preferences
>
> This is a feature, not a problem. If the end user wants a filtering process individualized that much, s/he should choose to use an SMTP server which agrees to do so.

That is a feature only if the user accepts the consequences of discarding mail without generating bounces, including not informing senders of false positives. Bounces from internal spam filters (either in MUAs or MTAs inside organizations) are a major source of unsolicited bulk mail or spam.

> > - filtering at the DATA command requires either (1) rejecting for all or no targets or (2) accepting for all targets and silently discarding the message for those targets that want it filtered.
>
> Alternatively, the receiving SMTP server could reject any multiply-addressed email.

People running SMTP servers that handle 100K or more msgs/day have been uniformly horrified when I've suggested that. I don't really understand why, but I have given up on the idea.

> (Silently discarding _is_ a bad idea, when done by the SMTP server itself. IMHO, it's better to mark for later discard -- which actually could be done in such a way as to mark only for those recipients who requested the more restrictive filtering.)

A better position is that everything should be logged, particularly including discarded mail, and in that case, enough of the bodies to allow targets to identify senders and the nature of the discarded messages. Of course, one should assume users won't normally look at those logs. Spam you read is not filtered, but at most categorized and stigmatized.

Vernon Schryver [EMAIL PROTECTED]
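The kind of per-recipient log entry argued for above, one that keeps enough of a discarded message for its target to identify the sender and the nature of the mail, could look roughly like the following Python sketch. Everything in it is an assumption made for illustration: the JSON layout, the field names and the 4096-byte excerpt cap are hypothetical and are not taken from any filter mentioned in this thread.

```python
import json
from typing import Dict

BODY_LIMIT = 4096  # assumed cap, in bytes, on how much of the body is kept


def discard_record(env_from: str, rcpt: str, headers: Dict[str, str],
                   body: bytes, reason: str, limit: int = BODY_LIMIT) -> str:
    """Build one JSON log line for a message discarded on behalf of `rcpt`."""
    kept = body[:limit]
    return json.dumps({
        "env_from": env_from,                  # envelope sender (MAIL FROM)
        "rcpt": rcpt,                          # the target the discard applies to
        "from": headers.get("From", ""),       # header sender, may be forged
        "subject": headers.get("Subject", ""),
        "reason": reason,                      # why the filter acted
        "body_bytes": len(body),               # full size of the original body
        "body_truncated": len(body) > limit,
        "body_excerpt": kept.decode("utf-8", errors="replace"),
    })


line = discard_record(
    "bulk@example.com", "user@example.org",
    {"From": "Offers <bulk@example.com>", "Subject": "Act now"},
    b"Buy now! " * 1000, "bulk checksum match",
)
print(len(line) < 5000)
```

Keeping the record per recipient matches the point that each target should be able to see what was filtered from their own mail, even if most users never look.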
Re: covert channel and noise -- was Re: proposal ...
From: Robert G. Brown

> ... Or, mark for later accept/reject decisioning AFTER the SMTP server per se, in the filter pipeline between the server and the mail spool of the addressee. SpamAssassin does the right thing already (and this is exactly what it does).

***NO***! Except when run as a milter or otherwise during the SMTP transaction, SpamAssassin does the WRONG thing. As run almost everywhere, after the SMTP transaction, SpamAssassin can only either silently discard spam or generate new spam by sending bounces to innocent people.

> > A better position is that everything should be logged, particularly including discarded mail, ...
>
> Logging a message you reject is nearly a waste of time.

Based on my experience, people running ISPs or other large mail systems strongly disagree with your position. Besides, I intentionally wrote about logging ***discarded*** mail. Many institutions do turn off logging of greylisted messages, reduce the default per-message logging limit of 32K Bytes, or delete log files far sooner than the 14 default in the DCC source. Still, logging is seen as vital to be able to answer questions about which messages were filtered and why, including being able to say "that message was never sent" or "substantially identical copies of that message were sent to 310 other users here and 433,797 users elsewhere; it was spam."

> In order to recover the message (as you note, nobody ever looks at the logs, which are VERY LARGE for a busy mailer and beyond human capacity to scan),

I said nothing about humans scanning everything. Besides, giving users the sense that they can see what's happening with spam filtering on their mail and control it is a requirement for getting users to accept filtering.

..
> This is where, and why, I take issue with filtering and discarding at the level of the SMTP server, unless the accept/reject decision can be made with 100% precision (no false positives, no false negatives, and it may not be good even then because MY idea of the correct basis for the decision may not be the same as YOURS).

What you describe is a broken version of what I advocate, if you consistently look at your personal log of rejected mail. Your version is broken because a reject decision after the SMTP transaction must at least sometimes result in sending spam to innocent people.

> ... It's not that filtering based on non-header-linked aspects of content is or isn't a good idea in some cases. It is that it has no business being in the specification of TCP. ... pure chance have a byte sequence like SEX that caused it to be rejected ..

Nice straw man. I've never heard anyone with a taint of technical clues talk about looking for SEX in raw TCP segments.

> ... For nearly all filtering programs, it is too easy to create a message that is filtered but shouldn't be. ...

You evidently lack experience with the filters used by commercial institutions. Commercial SMTP servers cannot tolerate false positives (legitimate rejected/total legitimate) of more than 0.01%, and even that's pushing it. The design requirement for filtering mail on which money depends is that false positives must not be much worse than the underlying SMTP error rate due to problems such as full disks and broken DNS servers--not to mention mail recipients too quick to delete.

A lot of current talk about false positives is self-serving nonsense from such as the Direct Marketing Association. Manual spam filtering also has false positives. A human suffering a common spam load of 100 spam/day has trouble not deleting legitimate mail. My filters are rejecting about 300 spam/day sent in my direction, 12227 in the last 40 days.
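The figures quoted above can be checked with a few lines of Python. This is only a worked-arithmetic sketch: the function name and the "per 10,000" parameter are hypothetical, and the counts passed to it are made-up illustrations, not measurements from any real server. The only numbers taken from the message are the 0.01% tolerance, the 12227 rejections and the 40 days.

```python
def within_tolerance(legit_rejected: int, legit_total: int,
                     max_per_10k: int = 1) -> bool:
    """True if the false positive rate (legitimate rejected / total
    legitimate) is at most max_per_10k in 10,000, i.e. 0.01% by default.

    Integer arithmetic avoids floating-point edge cases at the boundary.
    """
    if legit_total <= 0:
        raise ValueError("need at least one legitimate message")
    return legit_rejected * 10_000 <= max_per_10k * legit_total


# 12227 rejections over 40 days is a little over 300 per day.
spam_per_day = 12227 / 40
print(f"{spam_per_day:.1f} spam/day")   # 305.7 spam/day

# Illustrative counts only: 1 lost message in 10,000 meets the 0.01%
# bound, 2 in 10,000 does not.
print(within_tolerance(1, 10_000))      # True
print(within_tolerance(2, 10_000))      # False
```

Put differently, a 0.01% bound allows roughly one wrongly rejected message for every 10,000 legitimate ones.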
Mechanical filtering even with a significant false positive rate can reduce the overall false positive rate.

> ... SMTP was designed to permit reasonably RELIABLE (simple) transport of

It would be good to skip the networking 101 tutorials. Those of us who don't know all of that about TCP, SMTP, CSMA/CD, etc. often thanks to decades of personal experience should apply elsewhere to learn the basics. It may be hard to imagine how old farts like me see such tutorials, but please try. I've been receiving email as vjs since 1968.

> It seems to me to be highly unacceptable to attempt to insert content-based accept/reject decisioning in at this PROTOCOL level in the delivery process.

That use of level confuses me. It does not seem to conform to ISO OSI architecture.

> ... reliable transport mechanism for important messages. Filtering it for me according to ANY CONTENT-BASED RULESET risks discarding at least some messages that are not correctly classified when they are rejected. Important messages can be lost. Bad things can result. Who is responsible when this occurs? Who do I get to sue?

You don't get to sue anyone, because a reasonably designed system lets you choose to do all of your spam
Educational Sessions in Seoul
Hi All,

The EDU Team is offering several training sessions on Sunday afternoon in Seoul that are open to ALL IETF ATTENDEES. These sessions include:

- Newcomers Training (in both English and Korean)
- Editors Training
- Introductory WG Chairs Training
- Security Tutorial

Details about these sessions can be found below. Please note that you do not need to be a current Editor or WG Chair to attend those sessions, these sessions are all open to everyone! So, show up early and learn more about how to work effectively within the IETF.

Margaret

Sunday, February 29, 2004
=========================

1300-1400 Newcomer's Training in English -- Location?? (Spencer Dawkins)

An introduction to the IETF for new or recent IETF attendees. Covers the IETF document processes, the structure of the IETF, and tips for new attendees to be successful in the IETF environment.

1400-1500 Newcomer's Training in Korean -- Location?? (ChangJoon Kim)

[Include description on agenda in Korean, if possible.]

An introduction to the IETF for new or recent IETF attendees. Covers the IETF document processes, the structure of the IETF, and tips for new attendees to be successful in the IETF environment.

1300-1500 Editor's Training -- Location?? (Avri Doria)

Training for current or aspiring IETF document editors. Covers the roles and responsibilities of a document editor, and includes advice on producing a high-quality IETF specification.

1300-1500 Intro WG Chairs Training -- Location?? (Margaret Wasserman)

Introductory training for new or aspiring WG chairs. Covers the role and responsibilities of a WG chair, and includes advice on how to run a WG that is fair, open and productive. This class is open to all IETF attendees.

1500-1700 Security Tutorial -- Location?? (Radia Perlman)

All IETF attendees need to be aware of the security implications of our design choices. This session offers a basic primer in protocol security, as well as advice on how to write the Security Considerations sections required for all IETF documents.
How Not To Filter Spam
The enclosed example of how not to filter spam is offered for those who might want to preemptively add accuspam.com or downloadfast.com to their blacklists. It is also a classic example of what is wrong with the MUA filtering tactics Robert Brown advocates.

I certainly did not try to contact anyone at 3dize.com. A few readers of this mailing list might recall that Shelby H. Moore III and I don't see eye to eye.

I do not know what the From: header of [EMAIL PROTECTED] is about. Perhaps it was forged. Or perhaps someone at that address wired a subscription to this mailing list through an unusually braindead challenge/response spam filter at DownloadFAST.com. If so, the Secretariat should blacklist accuspam.com and/or downloadfast.com from subscriptions to IETF mailing lists. (It would need to be unusually braindead to send me the challenge instead of the envelope sender.)

Vernon Schryver [EMAIL PROTECTED]

From [EMAIL PROTECTED] Tue Feb 17 19:26:14 2004
Received: from www.DownloadFAST.com (www.downloadfast.com [65.61.155.11])
        by calcite.rhyolite.com (8.12.11/8.12.11) with ESMTP id i1I2QDiM048994
        for [EMAIL PROTECTED] env-from [EMAIL PROTECTED];
        Tue, 17 Feb 2004 19:26:13 -0700 (MST)
Received: from www.downloadfast.com (localhost.downloadfast.com [127.0.0.1])
        by www.DownloadFAST.com (8.12.10/8.12.10) with ESMTP id i1I1hIQf002492
        for [EMAIL PROTECTED]; Tue, 17 Feb 2004 19:43:18 -0600 (CST)
Received: (from [EMAIL PROTECTED])
        by www.downloadfast.com (8.12.10/8.12.6/Submit) id i1I1hITA002491;
        Tue, 17 Feb 2004 19:43:18 -0600 (CST)
Date: Tue, 17 Feb 2004 19:43:18 -0600 (CST)
Message-Id: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: Re: Re: covert channel and noise -- was Re: proposal ...
From: [EMAIL PROTECTED]
Reply-To: [EMAIL PROTECTED]

You attempted to contact us via email, and our anti-spam system is returning your email below.
Please kindly resend your email below, using our Contact Form on our web page:

http://accuspam.com?kM,vbrJT

After using our Contact Form only once, your future emails will not be returned. If you do not use our Contact Form to re-send your email below, then we can not read it.

__
Powered by http://AccuSpam.com. Signup instantly for FREE!

Your Returned Message:

From: Robert G. Brown

> ... Or, mark for later accept/reject decisioning AFTER the SMTP server per se, in the filter pipeline between the server and the mail spool of the addressee. Spam assassin does the right thing already (and this is exactly what it does).

***NO***! Except when run as a milter or otherwise during the SMTP ...
Re: How Not To Filter Spam
On Tue, 17 Feb 2004, Vernon Schryver wrote:
> It is also a classic example of what is wrong with the MUA filtering

You certainly don't assume that there is nothing wrong with the filtering system you use and others may try to duplicate as well. Otherwise how would you explain that you have Elan and completewhois.com listed as filtered on your site? Do you honestly believe we ever sent you any SPAM? Or maybe you're making certain assumptions about envelope from or normal From: headers and complaining when others are making similar assumptions?

-- William Leibzon
Elan Networks
[EMAIL PROTECTED]
Re: How Not To Filter Spam
From: william(at)elan.net

> > It is also a classic example of what is wrong with the MUA filtering
>
> You certainly don't assume that there is nothing wrong with the filtering system you use and others may try to duplicate as well. Otherwise how would you explain that you have Elan and completewhois.com listed as filtered on your site? Do you honestly believe we ever sent you any SPAM? Or maybe you're making certain assumptions about envelope from or normal From: headers and complaining when others are making similar assumptions?

Mail from Elan and completewhois.com is unwelcome at rhyolite.com in part because of a message that said:

] Elan.Net Internet
] T.1 T.3 Frame Relay
] If you need more information about us or are interested in network services
] (managed hosting, collocation, dedicated servers, t1, t3), please send email to [EMAIL PROTECTED]
]
] For More info
] http://www.elan.net
] [EMAIL PROTECTED]

There are additional, independent, sufficient reasons for that listing that we do not need to explore. If you will read my web pages, you'll see that my list of unwelcome domains is not only about senders of unsolicited bulk email.

An advantage of a vanity or other tiny domain is that it can use blacklists that would have intolerable false positive rates at other or larger outfits but that have 0.000% local false positive rates.

Vernon Schryver [EMAIL PROTECTED]
Re: How Not To Filter Spam
On Tue, 17 Feb 2004, Vernon Schryver wrote:
> From: william(at)elan.net
>
> > > It is also a classic example of what is wrong with the MUA filtering
> >
> > You certainly don't assume that there is nothing wrong with the filtering system you use and others may try to duplicate as well. Otherwise how would you explain that you have Elan and completewhois.com listed as filtered on your site? Do you honestly believe we ever sent you any SPAM? Or maybe you're making certain assumptions about envelope from or normal From: headers and complaining when others are making similar assumptions?
>
> Mail from Elan and completewhois.com is unwelcome at rhyolite.com in part because of a message that said:

You might want to post headers that show it being sent from some open proxy in the Pacific and showing use of email accounts that were never there.

> ] Elan.Net Internet
> ] T.1 T.3 Frame Relay
> ] If you need more information about us or are interested in network services
> ] (managed hosting, collocation, dedicated servers, t1, t3), please send email to [EMAIL PROTECTED]
> ]
> ] For More info
> ] http://www.elan.net
> ] [EMAIL PROTECTED]
>
> There are additional, independent, sufficient reasons for that listing that we do not need to explore.

Except that we did not send that message, and the anti-spam community was informed it was a joe-job (95% guessed as much on their own). Considering you knew who I am, I suspect you knew all this as well, and if you did not, you might have noticed a warning about this on the website.

> An advantage of a vanity or other tiny domain is that it can use blacklists that would have intolerable false positive rates at other or larger outfits but that have 0.000% local false positive rates.

That is unlikely considering what I know about your list. Unless you make a good effort to research everything, you probably have as many problems as some of the most aggressive IP-based filtering sites (like spews); you might make a number of good hits but would have a number of misses as well.
Besides that, as has been shown many times, for spammers it's a lot easier and less expensive to get a new domain than to get a new IP block, and many do, setting up hundreds of new domains on a monthly basis. As such the value of domain-based filtering is very minimal indeed.

-- William Leibzon
Elan Networks
[EMAIL PROTECTED]