Folks, I'm working on an implementation of RFC 5256 email threading, designed so that it could fit as a submodule in the "email" package, if such a thing were ever seen to be useful.
I'd like to ask "the wisdom of the crowd" what they think an appropriate interface to such a thing would be. The basic operation is that you create a collection (type C) of email threads (type T) by passing a set of messages (type M) to the constructor.

* Should M be required to be "email.message.Message", or perhaps some less restrictive type, say "ThreadableMessageAPI"? All that's strictly required is the ability to retrieve the Message-ID, Subject, Date, References, and In-Reply-To fields.

* What operations should be possible on C? Some that come to mind:

  * retrieve_thread (M or message-id) => T
  * add_message (M) => T
  * add_messages (set of M) => None
  * remove_message (M or message-id) => T (or None) ?

* What's the interface for T? It's a tree with possible dummy nodes, so a tuple of messages plus nested tuples would do it. What should the nodes in the tree be? Normalized (see RFC 5256) Message-IDs? email.message.Message instances?

* For large sets of threads (millions of messages) a persistence mechanism would be useful. Should there be a standard interface to such a mechanism, perhaps as class methods on C? If so, what should it look like? Should the implementation contain a default persistent subclass of C, based on sqlite3? What side-effects would persistence requirements have on the other design considerations? For instance, would you have to save the entire text of a message for each node? Just the headers? Just some of the headers? Just the Message-ID?

Have at it! Advise away!

Bill
_______________________________________________
Email-SIG mailing list
Email-SIG@python.org
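
To make the questions above concrete, here is one rough sketch of what C, T, and M could look like. It is only an illustration, not the RFC 5256 algorithm: it links each message to the last Message-ID in its References (or In-Reply-To) header and does no subject grouping, date sorting, or Message-ID normalization. The names ThreadableMessage, ThreadCollection, and make are hypothetical, and T is assumed to be a nested tuple of (message-id, children).

```python
import re
from email.message import Message
from typing import Any, Iterable, Optional, Protocol


class ThreadableMessage(Protocol):
    """Hypothetical minimal M: only header lookup is required."""

    def get(self, name: str, failobj: Any = None) -> Any: ...


_MSGID = re.compile(r"<[^<>\s]+>")


def _ids(value: Optional[str]) -> list:
    """Extract <msg-id> tokens from a header value (no normalization)."""
    return _MSGID.findall(value or "")


class ThreadCollection:
    """Hypothetical C: an in-memory collection of threads."""

    def __init__(self, messages: Iterable[ThreadableMessage] = ()):
        self._messages = {}   # message-id -> message
        self._parent = {}     # message-id -> parent message-id
        self._children = {}   # message-id -> list of child message-ids
        self.add_messages(messages)

    @staticmethod
    def _key(key):
        # Accept either a message-id string or a message object.
        return key if isinstance(key, str) else _ids(key.get("Message-ID"))[0]

    def _ancestors(self, mid):
        while mid in self._parent:
            mid = self._parent[mid]
            yield mid

    def _root(self, mid):
        for mid in self._ancestors(mid):
            pass
        return mid

    def _subtree(self, mid):
        # T: (message-id, (child subtrees...)); ids with no stored
        # message act as dummy nodes.
        kids = self._children.get(mid, ())
        return (mid, tuple(self._subtree(c) for c in kids))

    def retrieve_thread(self, key):
        return self._subtree(self._root(self._key(key)))

    def add_message(self, msg):
        mid = self._key(msg)
        old = self._parent.pop(mid, None)   # re-adding replaces the old link
        if old is not None:
            self._children[old].remove(mid)
        self._messages[mid] = msg
        refs = _ids(msg.get("References")) or _ids(msg.get("In-Reply-To"))[:1]
        parent = refs[-1] if refs else None
        # Skip links that would make a message its own ancestor.
        if parent and parent != mid and mid not in self._ancestors(parent):
            self._parent[mid] = parent
            self._children.setdefault(parent, []).append(mid)
        return self.retrieve_thread(mid)

    def add_messages(self, messages):
        for msg in messages:
            self.add_message(msg)

    def remove_message(self, key):
        mid = self._key(key)
        self._messages.pop(mid, None)
        root = self._root(mid)
        if not self._children.get(mid):     # leaf: drop the node entirely
            parent = self._parent.pop(mid, None)
            if parent is None:
                return None                 # nothing left of this thread
            self._children[parent].remove(mid)
        return self._subtree(root)          # otherwise it stays as a dummy


def make(mid, refs=None):
    """Hypothetical helper that builds a bare message for the example."""
    m = Message()
    m["Message-ID"] = mid
    if refs:
        m["References"] = refs
    return m


threads = ThreadCollection(
    [make("<a@x>"), make("<b@x>", "<a@x>"), make("<c@x>", "<a@x> <b@x>")]
)
thread = threads.retrieve_thread("<c@x>")
# thread == ("<a@x>", (("<b@x>", (("<c@x>", ()),)),))
```

Using Message-IDs as the tree nodes keeps T cheap to store, which matters for the persistence question, but callers then need a second lookup to get from a node back to the message itself.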