Yeah, you are right. I find it useful to have a common "thread-id", but it is optional.
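
For what it's worth, here is a minimal sketch of the variant without the explicit thread_id, written so it can run outside CouchDB. The emit() stub, the sample docs and the counting loop are my own stand-ins, not CouchDB code. The loop only imitates, as far as I understand it, what a count reduce queried with group_level=1 would return for array keys.

```javascript
// Stand-in for CouchDB's emit(): collects rows so the map function runs in plain Node.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Simplified map function: the references chain plus the message's own id is the key.
function map(doc) {
  if (doc.type == "email") {
    var thread = [];
    if (doc.header.references) {
      thread = doc.header.references.split(" ");
    }
    thread.push(doc.header["message-id"]);
    emit(thread, doc.header.subject);
  }
}

// Hypothetical sample documents (made up for illustration, not real messages).
const docs = [
  { type: "email", header: { "message-id": "<a@example>", subject: "Threads" } },
  { type: "email", header: { "message-id": "<b@example>", references: "<a@example>", subject: "Re: Threads" } },
  { type: "email", header: { "message-id": "<c@example>", references: "<a@example> <b@example>", subject: "Re: Threads" } },
  { type: "email", header: { "message-id": "<x@example>", subject: "Other" } }
];
docs.forEach(map);

// Rough imitation of a count reduce with group_level=1:
// rows are grouped on the first element of the key, i.e. the thread root.
const countsByThread = {};
rows.forEach(function (row) {
  const root = row.key[0];
  countsByThread[root] = (countsByThread[root] || 0) + 1;
});
console.log(countsByThread);
```

Since the first element of the key is already the root message-id, grouping on it gives one row per thread, which is why the separate thread_id is only a convenience.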
//Thomas

2008/11/27 Chris Anderson <[EMAIL PROTECTED]>

> On Thu, Nov 27, 2008 at 6:06 AM, Thomas Kerpe
> <[EMAIL PROTECTED]> wrote:
> >
> > function(doc) {
> >   if (doc.type == "email") {
> >     // Ancestor message-ids, oldest first, followed by this message's own id.
> >     var thread = [];
> >     if (doc.header.references) {
> >       thread = doc.header.references.split(" ");
> >     }
> >     thread.push(doc.header['message-id']);
> >     // The first entry is the root of the thread.
> >     var thread_id = thread[0];
> >     emit([thread_id, thread], doc.header.subject);
> >   }
> > }
> >
> > //Thomas
> >
>
> I'd like to dig into this deeper. I'm not sure I understand what the
> second element in the emitted key is useful for. Could you just
> emit(thread, doc.header.subject) and skip the explicit thread_id?
> Group reduce should interact nicely with that.
>
> I think this problem is similar to that of storing document versions
> (like for a wiki). Each version must link to its parent, but also to
> the original version. I'm not familiar enough with email, but
> doc.header.references sounds like it can do the job.
>
> I suppose the analog in versioned docs would be keeping a list of the
> chain of parents in each new doc (and not just the head of the chain
> and immediate parent).
>
> --
> Chris Anderson
> http://jchris.mfdz.com