On Tue, Sep 24, 2002 at 09:40:08PM -0500, The Fool wrote:
> > From: Erik Reuter <[EMAIL PROTECTED]>
> > You're joking, right? You CANNOT KNOW THAT THEY ARE DUPLICATES. They
> > look like the start of a new message.
>
> Only in the case where the first message has no body, ie no message
> header like statements, in which case the second message would show up as
> the body of the first message.

No. Below is an mbox file containing what was meant to be 2 messages, where one message quotes, in its entirety, a third message. The correct parsing would come up with 2 emails, with the 1st email ending with "...message #1." But without some standard (such as mangling From to >From), what the parser will come up with is 3 messages, with the 1st email ending with "...body text."

-----------------
From X Y
Header1: akdjf
Header2: kjdfj
Subject: This is message #1

This is message #1's body text.

From A B
Header1: adkjfk
Subject: This is a message nested into another message

This is some body text of the nested message.
This is more nested message body text.

This is the last line of body text of message #1.

From Q F
Header1: kjdklfj
Header2: kjdfkjf
Subject: This is message #2

This is the body text of message #2
---------------

> I just said it wasn't 100% and a kludge.

And you claimed that it was possible to do something that is impossible. You made a number of incorrect statements and nonsensical claims (such as that you could write a better parser in 1 minute). In order to write a parser, you need to know all the available message syntaxes that your parser might encounter, and you obviously lack the background and knowledge of the possibilities. You could be the best programmer in the world, but if your program doesn't solve the right problem, then it isn't worth much. You need to listen (read) and think before you type if you want to be productive.

> You can't tell what the first message header will be, nor the last.
> Which is why you always design things that specify size, always. In any
> case, this kludge would produce something that would show both messages,
> whether they ended up nested or not.

It would split one message, incorrectly, into two. And the size is not specified in the original mbox format, so you can't use the size and maintain backwards compatibility.

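To make the mbox example above concrete, here is a minimal sketch in Python of the kind of naive parser under discussion. It is not taken from any real mail client: the names split_mbox, quote_from and build_mbox are hypothetical helpers made up for illustration, and the only rule assumed is "a line starting with 'From ' begins a new message".

```python
import re

# The nested (quoted) message from the example above.
NESTED = """\
From A B
Header1: adkjfk
Subject: This is a message nested into another message

This is some body text of the nested message.
This is more nested message body text."""

BODY_1 = ("This is message #1's body text.\n\n" + NESTED +
          "\n\nThis is the last line of body text of message #1.")
BODY_2 = "This is the body text of message #2"


def quote_from(body):
    """Hypothetical helper: mangle body lines starting with 'From ' to '>From '."""
    return re.sub(r"^From ", ">From ", body, flags=re.M)


def build_mbox(mangle):
    """Assemble the two-message mbox, optionally mangling message #1's body."""
    body_1 = quote_from(BODY_1) if mangle else BODY_1
    return (
        "From X Y\nHeader1: akdjf\nHeader2: kjdfj\n"
        "Subject: This is message #1\n\n" + body_1 + "\n\n"
        "From Q F\nHeader1: kjdklfj\nHeader2: kjdfkjf\n"
        "Subject: This is message #2\n\n" + BODY_2 + "\n"
    )


def split_mbox(text):
    """Naive split: every line starting with 'From ' begins a new message."""
    messages = []
    for line in text.splitlines():
        if line.startswith("From "):
            messages.append([])
        if messages:
            messages[-1].append(line)
    return ["\n".join(m).strip() for m in messages]


raw = split_mbox(build_mbox(mangle=False))
safe = split_mbox(build_mbox(mangle=True))
print(len(raw), "messages without mangling")
print(len(safe), "messages with >From mangling")
```

Without mangling, the split produces three messages and the first one stops at "...body text."; with the nested "From " line mangled to ">From ", it produces two and the first one ends with "...message #1." Nothing in the unmangled text lets the parser tell the nested message apart from a real one.
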
> > > Also you would assume that when you started getting duplicate headers
> > > that it would be a possible message start.
> >
> > How can you ignore AND assume it is a message start? Have you been
> > drinking?
>
> Message BODY start.

There is no standard delimiter for message body start nor for message body end.

-- 
"Erik Reuter" <[EMAIL PROTECTED]> http://www.erikreuter.net/
_______________________________________________
http://www.mccmedia.com/mailman/listinfo/brin-l
