In message <[EMAIL PROTECTED]>, "John R Levine" <[EMAIL PROTECTED]> wrote:
>> > For purposes unrelated to mailing list administration, I have
>> > developed a C language program whose purpose is to automatically
>> > differentiate automated response e-mail messages (e.g. bounce
>> > messages) from other types of e-mail messages.
>>
>> Bounces have the unique quality of have a null return envelope.
>> Filter on that and you'll have no problem. (
>
>In theory, you are entirely correct.  In practice, the variety of broken
>bounces staggers the imagination, and you need to catch all sorts of
>garbage and do complex pattern matches to try and figure out what
>happened.

I would agree with what John has said, except that I would remove the
word `complex'.  Nothing I'm doing in my automatic response
recognizer/differentiator is particularly complex.  I'm not even using
regular expressions.  Just plain old verbatim text matching against the
Subject: line and the first several lines of body text.

The only complex part is the painstaking work of analyzing hundreds of
thousands of bounce and non-bounce mail messages and coming up with the
list of `stigmata' strings (strings found only in bounce and
autoresponse messages) to search for.
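
To make the idea concrete, here is a rough C sketch of that kind of
verbatim matching.  It is not the actual program, whose source is not
shown in this thread.  Everything named in it is an assumption made for
illustration: the function looks_automated(), the handful of sample
stigmata strings, and the five-line body limit are all hypothetical.
The null envelope sender check from the quoted suggestion is included
only as a cheap first test.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sample strings; a real list would come from studying
 * a large corpus of bounce and non-bounce messages. */
static const char *const subject_stigmata[] = {
    "Returned mail",
    "Undeliverable",
    "Delivery Status Notification",
    "Out of Office",
};

static const char *const body_stigmata[] = {
    "This is an automatically generated",
    "could not be delivered",
    "permanent fatal errors",
};

/* Assumed limit for "the first several lines of body text". */
#define BODY_LINES_TO_SCAN 5

#define COUNT_OF(a) (sizeof(a) / sizeof((a)[0]))

/* Return a pointer just past the first max_lines lines of text
 * (or to the terminating NUL if the text is shorter). */
static const char *prefix_end(const char *text, int max_lines)
{
    const char *p = text;
    int lines = 0;

    while (*p != '\0' && lines < max_lines) {
        if (*p == '\n')
            lines++;
        p++;
    }
    return p;
}

/* Plain case-sensitive substring search, no regular expressions.
 * A hit counts only if it lies entirely before end. */
static int matches_any(const char *text, const char *end,
                       const char *const *list, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++) {
        const char *hit = strstr(text, list[i]);
        if (hit != NULL && hit + strlen(list[i]) <= end)
            return 1;
    }
    return 0;
}

/* Return 1 if the message looks like a bounce or autoresponse. */
int looks_automated(const char *envelope_sender,
                    const char *subject, const char *body)
{
    /* Well-formed bounces carry a null envelope sender. */
    if (envelope_sender != NULL &&
        (envelope_sender[0] == '\0' || strcmp(envelope_sender, "<>") == 0))
        return 1;

    if (matches_any(subject, subject + strlen(subject),
                    subject_stigmata, COUNT_OF(subject_stigmata)))
        return 1;

    return matches_any(body, prefix_end(body, BODY_LINES_TO_SCAN),
                       body_stigmata, COUNT_OF(body_stigmata));
}
```

A caller would pass the envelope sender, the Subject: text, and the
body as NUL-terminated strings, e.g.
looks_automated("user@example.com", "Returned mail: see transcript", "").
The sketch matches case-sensitively to keep it short; whether to fold
case is a separate design choice.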
