On Thu, Nov 27, 2008 at 6:06 AM, Thomas Kerpe <[EMAIL PROTECTED]> wrote:
>
> function(doc) {
>   if (doc.type == "email") {
>     thread = [];
>     if (doc.header.references){
>       thread = doc.header.references.split(" ");
>     }
>     thread.push(doc.header['message-id']);
>     thread_id = thread[0];
>     emit([thread_id, thread,], doc.header.subject);
>   }
> }
>
> //Thomas
>

I'd like to dig into this deeper. I'm not sure I understand what the second element in the emitted key is useful for. Could you just emit(thread, doc.header.subject) and skip the explicit thread_id? Since thread[0] is already the thread id, group reduce should interact nicely with that.

I think this problem is similar to that of storing document versions (like for a wiki). Each version must link to its parent, but also to the original version. I'm not familiar enough with email to be sure, but doc.header.references sounds like it can do the job. I suppose the analog in versioned docs would be keeping a list of the whole chain of parents in each new doc (and not just the head of the chain and the immediate parent).

--
Chris Anderson
http://jchris.mfdz.com
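
P.S. Here is a rough sketch of the emit(thread, doc.header.subject) idea, in case it helps. The emit() stub, the rows array, the name map, and the two sample documents are all made up so that it runs outside the view server; only the body of the map function follows the code quoted above. The counting loop at the end is just my guess at what a counting reduce queried with group_level=1 would report, not actual CouchDB output.

```javascript
// Stand-in for CouchDB's emit(), so the sketch runs outside the view server.
var rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function: key each message by its full reference chain.
function map(doc) {
  if (doc.type == "email") {
    var thread = [];
    if (doc.header.references) {
      thread = doc.header.references.split(" ");
    }
    thread.push(doc.header["message-id"]);
    // thread[0] is the root message, so no separate thread_id is emitted.
    emit(thread, doc.header.subject);
  }
}

// Hypothetical sample documents: a root message and one reply.
map({ type: "email",
      header: { "message-id": "<a@x>", subject: "Threads" } });
map({ type: "email",
      header: { "message-id": "<b@x>", references: "<a@x>",
                subject: "Re: Threads" } });

// Approximation of grouping on the first key element: rows per root id.
var counts = {};
rows.forEach(function (row) {
  counts[row.key[0]] = (counts[row.key[0]] || 0) + 1;
});
```

Both rows start with the root message id, so grouping on the first key element keeps a thread together while the rest of the key still orders messages by their chain of references.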