Meeow miaou*

We spoke on IRC about the archiver the other day, and I said that I would present my thoughts about it here. So here they are (beware, this might be long).
First, I think we should think about the structure/architecture of things. We have a number of components which need to be archive-aware. Without being exhaustive, I'm thinking about:
- the archiver itself (which presents the archives, ie: mails and threads)
- the NNTP bits, which should be able to return emails and/or threads
- the stats module, which wants to give information to the user about the health of the list itself (emails/month, last threads, biggest threads...)
- archives retrieval (we probably want to give the user a way to download the archives since the creation of the list/the last year/month)

All of these components need to be aware of the archives. We agreed that the core does not want to know about them. So we have several solutions:
- each module becomes an "archiver" with respect to core, meaning each module has its own way of storing the archives (and possibly its own system to do so)
- we create an archive-core module which manages the archives and provides an API to access, modify and extend them.

Of course, we prefer the second solution :)

So we would have the following architecture:

mm-core (handles the lists themselves)
  --send emails to archivers-->
archive-core (stores the emails and exposes them through an API)
  -->
archivers/stats/NNTP

The questions are then:
- how do we store the emails?
- how do we expose the API?
- how do we make it easy to extend? (ie: the stats module wants to read the db, but probably also to store information in it)

Having played with mongodb (HK relies on it atm), I quite like the possibilities it gives us. We can easily store the emails in it and query them, and since it is a NoSQL database system, extending it is also easy. On the other hand, having archive-core rely on the same system as the core itself would be nicer from a sysadmin pov. I have not tried to upload archives to an RDBMS and test its speed, but for mongodb the results of the tests are presented at [1].
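To make the discussion a bit more concrete, here is a minimal sketch of what the archive-core API could look like. Everything in it is hypothetical: the class name ArchiveCore, the methods store/get_email/get_thread/annotate and the record layout are made up for illustration, and plain dicts stand in for whatever backend (mongodb or an RDBMS) we end up choosing. Threading is reduced to following In-Reply-To, and each message is assumed to carry Message-ID and Date headers.

```python
from collections import defaultdict
from email import message_from_string
from email.utils import parsedate_to_datetime


class ArchiveCore:
    """Hypothetical in-memory stand-in for the proposed archive-core module."""

    def __init__(self):
        # (list name, Message-ID) -> stored record
        self._messages = {}
        # (list name, thread id) -> Message-IDs in arrival order
        self._threads = defaultdict(list)

    def store(self, list_name, raw_email):
        """Entry point for mm-core: archive one message posted to a list."""
        msg = message_from_string(raw_email)
        msg_id = msg["Message-ID"]
        # A reply joins the thread of its parent; anything else starts a thread.
        parent = self._messages.get((list_name, msg["In-Reply-To"]))
        thread_id = parent["thread_id"] if parent else msg_id
        self._messages[(list_name, msg_id)] = {
            "message_id": msg_id,
            "thread_id": thread_id,
            "sender": msg["From"],
            "subject": msg["Subject"],
            "date": parsedate_to_datetime(msg["Date"]),
            "body": msg.get_payload(),
            "extra": {},  # free-form space for other modules (stats, tags...)
        }
        self._threads[(list_name, thread_id)].append(msg_id)
        return msg_id

    def get_email(self, list_name, msg_id):
        """Read access for archivers/NNTP: one email."""
        return self._messages[(list_name, msg_id)]

    def get_thread(self, list_name, thread_id):
        """Read access for archivers/NNTP: all emails of one thread."""
        ids = self._threads[(list_name, thread_id)]
        return [self._messages[(list_name, i)] for i in ids]

    def annotate(self, list_name, msg_id, namespace, data):
        """Let a module attach its own data under its own namespace."""
        extra = self._messages[(list_name, msg_id)]["extra"]
        extra.setdefault(namespace, {}).update(data)
```

The "extra" dict is one possible answer to the extensibility question: each module writes under its own namespace, so the stats module can store its information without touching what the archiver stores. With a schemaless store this comes for free; with an RDBMS it would probably mean a separate key/value table.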
The challenge will be speed, and designing an API which allows each component to do its work. I think it would be nice if we could reach some kind of agreement before the GSoC starts (even if we change our mind later on), to be sure that if we get students their work doesn't overlap too much.

The second point I want to present is with respect to the archiver itself. At the moment we have HyperKitty (HK). The current version:
- exposes single emails
- exposes single threads
- presents the archives for one month or day
- allows searching the archives by sender, subject, content, or subject and content
- presents a summary of the recent activity on the list (including the evolution of the number of posts sent over the last month)

I think these are the basic functionality that we would like to see in an archiver. But HK aims at much more: the ultimate goal of HK is to provide a "forum-like" interface to the mailing lists. With it, HK would provide a number of social-web-like options, allowing users to "like" or "dislike" a post or a thread, to "+1" someone, and to tag the mails or assign them categories. These are all nice features but, imho, they go beyond what one would want from a basic archiver.

So what I would like to propose is to split HK and create a sub-project (MiniKitty?) which would provide this basic functionality. We would keep HyperKitty as a more extensive archiver and try to bring HK to its ultimate goal. This will need some more work and time, as we will have to make HK speak with core for authentication, find a way to send emails to core/the lists and of course add all the other features (tags, categories...).
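As a rough illustration of how small the basic feature set listed above is, here is a sketch of the month view, the search and the activity summary written as plain functions over a list of email records. The function names and the record keys ("sender", "subject", "content", "date") are assumptions made for this example only; this is not how HK currently implements them.

```python
from datetime import datetime


def emails_for_month(emails, year, month):
    """Return the emails posted during the given month, oldest first."""
    selected = [
        e for e in emails
        if (e["date"].year, e["date"].month) == (year, month)
    ]
    return sorted(selected, key=lambda e: e["date"])


def search(emails, term, fields=("subject", "content")):
    """Case-insensitive substring search over the chosen fields."""
    needle = term.lower()
    return [
        e for e in emails
        if any(needle in e[field].lower() for field in fields)
    ]


def posts_per_day(emails):
    """Count posts per calendar day, as input for an activity summary."""
    counts = {}
    for e in emails:
        day = e["date"].date()
        counts[day] = counts.get(day, 0) + 1
    return counts


# Made-up sample records, only here to show the expected shape.
EMAILS = [
    {"sender": "a@example.org", "subject": "Archiver design",
     "content": "Thoughts on storage", "date": datetime(2012, 3, 16, 10, 0)},
    {"sender": "b@example.org", "subject": "Re: Archiver design",
     "content": "What about NNTP?", "date": datetime(2012, 3, 16, 11, 0)},
    {"sender": "a@example.org", "subject": "GSoC ideas",
     "content": "Stats module", "date": datetime(2012, 4, 2, 9, 0)},
]
```

For instance, search(EMAILS, "a@example.org", fields=("sender",)) would cover the search by sender, and the default fields cover "subject and content". The social features (likes, tags, categories) need write access and authentication, which is why they would stay in HK proper.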
Comments welcome :)

Thanks,
Pierre

[1] http://blog.pingoured.fr/index.php?post/2012/03/16/Mailman-archives-and-mongodb

* Hi everyone
_______________________________________________
Mailman-Developers mailing list
Mailman-Developers@python.org
http://mail.python.org/mailman/listinfo/mailman-developers
Mailman FAQ: http://wiki.list.org/x/AgA3
Searchable Archives: http://www.mail-archive.com/mailman-developers%40python.org/
Unsubscribe: http://mail.python.org/mailman/options/mailman-developers/archive%40jab.org
Security Policy: http://wiki.list.org/x/QIA9