East Coast Coder
Thu, 05 May 2005 23:13:51 -0700
Okay, I'm back to work on this (adding thread IDs). A few questions:

1) Which file/sub makes the initial determination of whether a message
is a response or a new thread?

2) For the threadid, I'd like to store it in MHonArc's own message.db.
That is, I'd like to be able to use the db so that, for any messageid,
we can retrieve the threadid. Are there simple routines to add a new
field to the db, and to retrieve it given a messageid? (If not, I can
always use another store, like SQLite or something, but I would rather
not do this unless needed.)

3) What is the best way to persist the counter? Again, can the
message.db do this? Are there any race conditions to be wary of? (It
doesn't seem like it, since MHonArc works in serial, and locks anyway,
but I want to double check.)

--ECC

On 4/21/05, Earl Hood <[EMAIL PROTECTED]> wrote:
> On April 19, 2005 at 19:37, East Coast Coder wrote:
>
> > > Have you looked at the $TSLICE$ resource variable? I.e. If you
> > > provide more info on what you are trying to do, there may already
> > > be features that do what you need.
> >
> > I'm experimenting with a format where, instead of rebuilding the
> > entire tree, new messages are output incrementally. I think this
> > would be necessary for using MHonArc for formats other than HTML
> > archives (RSS feeds, perhaps, or text messages). I'd like to be able
> > to process a mbox, take the new messages, and output them - but
> > identify each message with a unique id to its thread, so that the
> > final output device (aggregator, phone, whatever) can associate it.
>
> Mhonarc does not assume that it will process messages in the correct
> order. For example, a follow-up to a message may be processed before
> its reference(s). Also, mhonarc was initially coded in Perl 4,
> so leveraging complex data structures was not necessarily trivial.
> Therefore, mhonarc just recomputes threads after each update to
> an archive. There may be ways to optimize threading computation
> with the existing code base, but I have not bothered to look into it.
> Some of the main hash structures have been "Perl 5'ed", but in general,
> most of mhonarc's data structures are flat.
>
> As for your immediate need, you can write a wrapper program that
> invokes mhonarc for the main work and then does some post-processing
> to get what you need. With the minimal API facilities of mhonarc,
> you can determine which messages are new.
>
> You will need to maintain your own thread IDs. A simple map can
> be used to maintain the message to thread ID association. Thread ID
> generation can be a simple counter.
>
> You can check out the mha-preview program in the examples/ directory
> of the mhonarc distribution for an example of how to develop a
> wrapper program. Your wrapper will require some more knowledge of
> the internals of mhonarc that is not documented in the API appendix
> of the docs.
>
> You can examine the library mhinit.pl for a list of the internal
> data structures used by mhonarc. The ones under "Message information
> variables" will probably be of the most interest to you. You can
> even examine the .mhonarc.db file of an archive to get a clearer
> picture of how the various hashes are structured.
>
> Side Note: I like the idea of having thread IDs. Something to consider
> if/when mhonarc is rewritten, and maybe something possible with the
> existing code base. Mhonarc is an old program and various users are
> definitely hitting up against its limitations. Motivation is my main
> enemy in doing a complete re-implementation.
>
> --ewh
>
> ---------------------------------------------------------------------
> To sign-off this list, send email to [EMAIL PROTECTED] with the
> message text UNSUBSCRIBE MHONARC-DEV
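
For reference, the "simple map plus simple counter" idea from the quoted reply could look roughly like the sketch below. It is written in Python only to keep the illustration short; it is not MHonArc code and does not touch MHonArc's database. The class name `ThreadIdStore`, the JSON side file, and the `references` argument (the Message-IDs taken from a message's References/In-Reply-To headers) are all assumptions made for the example. It also assumes a single writer at a time, in line with the observation above that MHonArc runs serially and takes its own lock.

```python
import json
import os
import tempfile


class ThreadIdStore:
    """Hypothetical side store: Message-ID -> thread ID, plus a counter.

    Not part of MHonArc; a wrapper program would own this file.
    """

    def __init__(self, path):
        self.path = path
        self.next_id = 1      # the persisted counter
        self.by_msgid = {}    # Message-ID -> thread ID
        if os.path.exists(path):
            with open(path, encoding="utf-8") as fh:
                data = json.load(fh)
            self.next_id = data["next_id"]
            self.by_msgid = data["by_msgid"]

    def assign(self, msgid, references=()):
        """Return the thread ID for msgid, allocating one if needed."""
        if msgid in self.by_msgid:
            return self.by_msgid[msgid]
        # Reuse the thread of the first referenced message already known.
        for ref in references:
            if ref in self.by_msgid:
                tid = self.by_msgid[ref]
                break
        else:
            # No known reference: start a new thread.
            tid = self.next_id
            self.next_id += 1
        self.by_msgid[msgid] = tid
        # Also record references not seen yet, so a parent that is
        # processed after its follow-up still lands in the same thread.
        for ref in references:
            self.by_msgid.setdefault(ref, tid)
        return tid

    def save(self):
        """Write map and counter to a temp file, then rename over the old one."""
        dirname = os.path.dirname(os.path.abspath(self.path))
        fd, tmp = tempfile.mkstemp(dir=dirname)
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump({"next_id": self.next_id, "by_msgid": self.by_msgid}, fh)
        os.replace(tmp, self.path)
```

A wrapper would call `assign()` once per new message after the mhonarc run and `save()` at the end. The sketch never merges two thread IDs that later turn out to belong to one thread (for example, when a late message references both), so that case would need extra handling.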