On May 6, 2005 at 02:06, East Coast Coder wrote:

> I should add that storage is not such a concern (as it is only an
> increase of 2x over the html archives anyway), but processing is...

Processing may become an issue if you use a single mhonarc archive for all messages. Mhonarc archives do not scale well as the number of messages increases, although I have had reports of users with a single archive of over 60,000 messages.

The common solution to this problem is to use period-based archives. For example, a mailing list may be split across multiple mhonarc archives, where each archive represents a well-known time period, like a month. Mharc uses this approach.

Unfortunately, such a scheme is not nice for threads that span multiple time periods. Mharc provides a link on message pages to other messages with the same subject, which does an automatic search across all periods. However, it is definitely not as usable as the thread navigation mhonarc provides, and it does not help if a thread has messages with different subjects.

mail-archive.com uses an alternative approach that the maintainers have dubbed "poor-man windowing". In this model, they have a single archive, but only have mhonarc remember the last X messages. Older messages are not in the .mhonarc.db file, but the html files are not deleted (KEEPONRMM resource).

There are limitations to this scheme:

* Index pages do not exist for older messages. For mail-archive, this has not been a big problem since $TSLICE$ is utilized to still allow for thread navigation at the message page level. Also, older messages are normally reached via searching (either via Google or via mail-archive's search) rather than through an index page. Since older messages do not have index pages, links to the index page from an old message go to a page that does not have the message listed.
* Archive edit operations are more complicated. This is minimized by utilizing CSS as much as possible (requiring careful planning of mhonarc resource settings early on). A custom script was required to edit old html message files that could not be easily modified via stylesheet changes.
* If a thread is very long and spans more than X messages, you will have a break in the thread on the index pages. However, via $TSLICE$, a reader could still navigate the complete thread via the message pages.

There have been discussions about overcoming these limitations with the existing mhonarc code base, but nothing has been implemented yet.

--ewh
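
As a rough illustration of the "poor-man windowing" model described above, here is a toy sketch in Python. It is not how mhonarc or mail-archive.com implement it: the WindowedArchive class, its method names, and the message IDs are all hypothetical, and the deque and dict are only stand-ins for the .mhonarc.db file and the html files on disk.

```python
from collections import deque


class WindowedArchive:
    """Toy model: one archive that only remembers the last `window` messages.

    Hypothetical illustration only; not mhonarc's actual data structures.
    """

    def __init__(self, window):
        # Stand-in for the message database: holds only the newest IDs.
        self.index = deque(maxlen=window)
        # Stand-in for the html files: never removed when a message
        # falls out of the index.
        self.pages = {}

    def add(self, msg_id, html):
        self.pages[msg_id] = html
        # Once the window is full, the oldest ID silently drops out.
        self.index.append(msg_id)

    def in_index(self, msg_id):
        return msg_id in self.index

    def has_page(self, msg_id):
        return msg_id in self.pages


archive = WindowedArchive(window=3)
for n in range(1, 6):
    archive.add(f"msg{n:05d}", f"<html>message {n}</html>")

print(list(archive.index))          # IDs still listed on index pages
print(archive.in_index("msg00001"))  # old message is no longer indexed
print(archive.has_page("msg00001"))  # but its page still exists
```

With a window of 3 and five messages added, the two oldest messages keep their pages but no longer appear in the index, which is the first limitation listed above: an index link from an old message leads to a page that does not list it.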
