At least some of this mail data is public, but I'm not sure if the bad
threading is reproducible or not; I want to run a complete census
overnight before I reindex.

Even if the bug is non-deterministic, it probably lives in lib/add-message.cc.

I have a reproducible test for this bug now.


I still need to analyze the mails a bit more, but it looks like at least
one of the strange results is caused by multiple mail files sharing the
same message-id, but with different References headers (and no
In-Reply-To headers).
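
A census for this kind of duplicate could be sketched roughly as below.
This is only a minimal sketch using the Python standard library, not
anything taken from notmuch itself: the function name find_conflicts, the
helper normalize, and the assumption that every regular file under the
given directory is a single mail file are all hypothetical. It groups
files by Message-ID and reports the IDs whose copies disagree on
References or In-Reply-To.

```python
import os
import sys
from collections import defaultdict
from email.parser import BytesHeaderParser


def normalize(value):
    """Collapse whitespace so folded headers compare equal; None becomes ''."""
    if value is None:
        return ""
    return " ".join(str(value).split())


def find_conflicts(root, headers=("References", "In-Reply-To")):
    """Return {message-id: [(path, header_values), ...]} for every
    Message-ID whose files disagree on any of the given headers.

    Assumption: every regular file under root is one mail file.
    """
    by_id = defaultdict(list)
    parser = BytesHeaderParser()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                msg = parser.parse(fh)  # headers only, body is skipped
            mid = normalize(msg.get("Message-ID"))
            if not mid:
                continue  # no Message-ID, nothing to group on
            values = tuple(normalize(msg.get(h)) for h in headers)
            by_id[mid].append((path, values))

    conflicts = {}
    for mid, entries in by_id.items():
        # More than one distinct header tuple means the copies disagree.
        if len({values for _path, values in entries}) > 1:
            conflicts[mid] = entries
    return conflicts


if __name__ == "__main__":
    for mid, entries in sorted(find_conflicts(sys.argv[1]).items()):
        print(mid)
        for path, values in entries:
            print("   ", path, values)
```

Run with the mail directory as the only argument. Files that share a
Message-ID but differ only in header whitespace or line folding are not
reported, since the values are normalized before comparison.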

In my case, the In-Reply-To headers do seem to be present. I end up with two files per message: one from my inbox and one from the gmane archive that I pull in. All the messages from the gmane archive seem to have a rewritten 'In-Reply-To' header, but 'Message-Id' and 'References' are the same.

In the problematic email thread, all the other files/messages are assigned to a single thread; only one message is left out. The offending message has 3 references, compared to 1 or 2 for the rest, but I don't know if that's relevant here.

