On 12-Mar-05 09:29, Hui Zhou wrote:
> Reading my own mail (this one that I just sent :)) and I realize that
> simple token treatment definitely won't work well enough to mark or sort
> my post as interesting (How shameless :). It may work for
> categorization of regular notifications and alerts, but for a general
> chatting list, something more needs to be taken into account. Maybe the
> length of the original post? or proportion of quotes against reply? or
> average length of sentences?

I think the hard part is really coming up with the heuristics that do the sorting. Beyond that, it's just a matter of separating those heuristics into classes that each do part of the sort. I personally find it harder to come up with regexes that generically match non-spam mail, because I seem to think more in terms of what I don't want. Maybe you can take a similar approach, with a hierarchy from "least want to read" to "most want to read".

You may even want to look at something like MIMEDefang, which gives you access via Perl to many different message qualities: number of recipients, time it was sent, envelope From:, etc. That may give you more options in developing the heuristics, and then you can just use it to add a custom header which procmail will then use for its sorting job.

Sounds like an interesting project anyway.

~Jason

-- 
http://linuxfromscratch.org/mailman/listinfo/lfs-chat
FAQ: http://www.linuxfromscratch.org/faq/
Unsubscribe: See the above information page
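
For illustration only, here is a rough sketch of how the heuristics floated in this thread (length of the reply, proportion of quoted lines, average sentence length) might be computed and turned into a custom header for procmail to sort on. It is written in Python with the standard email module rather than Perl/MIMEDefang, and everything specific in it is an assumption: the header name X-Interest, the labels skip/skim/read, and all the thresholds are made up for the example and would need tuning against real list traffic.

```python
import re
from email import message_from_string
from email.message import Message


def body_lines(msg: Message) -> list[str]:
    """Return the lines of the first text/plain part (empty list if none)."""
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True) or b""
            charset = part.get_content_charset() or "utf-8"
            return payload.decode(charset, errors="replace").splitlines()
    return []


def features(lines: list[str]) -> dict[str, float]:
    """Compute the three heuristics suggested in the thread."""
    lines = [l for l in lines if l.strip()]            # ignore blank lines
    quoted = [l for l in lines if l.lstrip().startswith(">")]
    own = [l for l in lines if not l.lstrip().startswith(">")]
    own_text = " ".join(own)
    words = own_text.split()
    # Naive sentence split: punctuation followed by whitespace.
    sentences = [s for s in re.split(r"[.!?]+\s+", own_text) if s.strip()]
    return {
        "quote_ratio": len(quoted) / len(lines) if lines else 0.0,
        "own_words": len(words),
        "avg_sentence_words": len(words) / len(sentences) if sentences else 0.0,
    }


# Rules ordered from "least want to read" to "most want to read";
# the first rule that matches wins. Thresholds are hypothetical.
RULES = [
    ("skip", lambda f: f["own_words"] < 5 or f["quote_ratio"] > 0.8),
    ("skim", lambda f: f["avg_sentence_words"] < 6),
    ("read", lambda f: True),
]


def classify(f: dict[str, float]) -> str:
    for label, matches in RULES:
        if matches(f):
            return label
    return "read"


def tag(raw: str) -> str:
    """Return the raw message with a (hypothetical) X-Interest header added."""
    msg = message_from_string(raw)
    del msg["X-Interest"]  # drop any existing value; no error if absent
    msg["X-Interest"] = classify(features(body_lines(msg)))
    return msg.as_string()
```

A filter like this would sit in front of the delivery recipes, and a procmail condition matching "^X-Interest: skip" could then file the message into a low-priority folder. Keeping each rule as a separate entry in an ordered list mirrors the idea of a hierarchy of classes that each do part of the sort, so new heuristics can be added without touching the others.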
