At 19:41 18-3-02 -0800, Nick Arnett wrote:

> > Of course, this sort of thing could also be done on the historical data
> > available, but I think that would be far more boring than trying to do it
> > with current data....
>
>Hey, that's my software you're calling boring!  But seriously, the only way
>to test and identify potentially interesting patterns is to run algorithms
>against historical data.
>
>Thoughts and suggestions are welcome, especially right now, since I'm
>designing data structures that may be a real pain to modify later.

Well, now that you mention it... How are you going to deal with the
following situations:

1. When people who receive the Digest reply to a post, their messages
   often do not have the subject header "Re: <subject>" but the subject
   header "Re: Brin-L Digest ####". Although the headers differ, the posts
   are essentially part of the same thread.

2. When scanning for replies, you may look at subject headers that start
   with "Re:". However, people who receive the Digest and actually bother
   to change the subject header will often simply copy & paste the header
   from the original post. This results in subject headers that do not
   start with "Re:" and are therefore not recognisable as replies. Instead,
   the scanning software will interpret such posts as the first message in
   a new thread.

3. What happens when someone starts a new thread and uses a subject header
   that has been used before? Let's say that someone starts a thread
   "Uplift Universe" at one point in time, and a year later someone starts
   a new thread with the exact same header. Given the time between the two
   threads, they are clearly separate threads; but will your program
   recognise them as such?

4. The abbreviation "Re:" appears in four different forms:
   "Re: <subject>", "RE: <subject>" (with a capital E), and both versions
   also without a blank between the colon and the title. The solution to
   this particular problem should be obvious...

Jeroen "hey, you asked for it" van Baardwijk

_________________________________________________________________________
Wonderful World of Brin-L Website: http://www.Brin-L.com
Tom's Photo Gallery: http://tom.vanbaardwijk.com
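
For what it's worth, here is a minimal sketch of how the four situations listed above might be handled in Python. Everything in it is an assumption made for illustration, not a description of Nick's software: the names `normalize_subject`, `is_digest_reply` and `Threader` are hypothetical, the digest subject pattern is guessed from the "Re: Brin-L Digest ####" form, the 60-day `MAX_GAP` cut-off is arbitrary, and posts are assumed to arrive in chronological order.

```python
import re
from datetime import datetime, timedelta

# One or more leading "Re:" prefixes, any capitalisation, with or without
# a blank after the colon (situation 4).
RE_PREFIX = re.compile(r"^(?:\s*re:\s*)+", re.IGNORECASE)

# Subject of a reply to the Digest (situation 1). The exact format is assumed;
# the pattern is lowercase because it is matched against the normalised subject.
DIGEST_SUBJECT = re.compile(r"^brin-l digest\b")

# Assumed cut-off: a subject reused after this much silence starts a new
# thread (situation 3).
MAX_GAP = timedelta(days=60)


def normalize_subject(subject: str) -> str:
    """Strip reply prefixes, collapse whitespace and lowercase the subject."""
    stripped = RE_PREFIX.sub("", subject)
    return " ".join(stripped.split()).lower()


def is_digest_reply(subject: str) -> bool:
    """True when the subject only names a Digest issue, not a thread."""
    return bool(DIGEST_SUBJECT.match(normalize_subject(subject)))


class Threader:
    """Groups posts into threads by normalised subject and posting time."""

    def __init__(self):
        # normalised subject -> (thread id, time of the latest post)
        self._latest = {}
        self._next_id = 1

    def assign(self, subject: str, sent: datetime):
        """Return a thread id, or None for a Digest reply.

        A Digest reply cannot be threaded from its subject alone; the
        caller would have to inspect the quoted text in the body instead.
        """
        if is_digest_reply(subject):
            return None
        # Matching on the normalised subject also catches copy-and-pasted
        # subjects that lack "Re:" (situation 2).
        key = normalize_subject(subject)
        entry = self._latest.get(key)
        if entry is not None and sent - entry[1] <= MAX_GAP:
            thread_id = entry[0]
        else:
            thread_id = self._next_id
            self._next_id += 1
        self._latest[key] = (thread_id, sent)
        return thread_id
```

With this sketch, "Uplift Universe" and "RE:Uplift Universe" posted a day apart end up in the same thread, while a second "Uplift Universe" a year later gets a new thread id. Matching on the In-Reply-To and References headers, where the mail client supplies them, would be more reliable than subject matching, but Digest replies typically lack a usable reference to the original post.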
