On Mon, Feb 17, 2014 at 03:12:48PM -0500, Murtuza Mukadam wrote:

> We have linked peer review discussions on
> git@vger.kernel.org to their respective commits within the main
> git.git repository. You can view the linked reviews from 2012
> until present in the GitHub repo at:
> https://github.com/mmukadam/git/tree/review

Neat. We've experimented in the past with mapping commits back to
mailing list discussions.  Thomas (cc'd) has a script that creates
git-notes trees mapping commits to the relevant message-id, which can
then be found in the list archive.

To me, the interesting bits of such a project are:

  1. How do we decide which messages led to which commits? There is
     definitely some room for heuristics here, as patches are sometimes
     tweaked in transit, or come in multiple stages (e.g., the original
     patch, then somebody suggests a fixup on top). You might want to
     compare your work with the script from Thomas here:

       http://repo.or.cz/w/trackgit.git

  2. How do we store the mapping? I think git-notes are a natural fit
     here, but you don't seem to use them. Is there a reason?

  3. How do we present the emails to the user (including showing
     threads, letting them dig deeper, etc)?

     The existing solution has no support at all for 3. Personally, I
     keep my own git-list archive locally, so I can search it (by
     message-id or other features), dump the result into an mbox
     (optionally including the surrounding thread), and then view the
     result in mutt.
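
To make points 1 and 2 concrete, here is a minimal sketch of one possible
heuristic: match list mail to commits by normalized subject line, then build
a `git notes` command that records the message-ids. This is not how the
trackgit script works (I have not reproduced it here), and subject matching
alone will miss patches whose subject was reworded in transit. The function
names and the `refs/notes/mail` ref are made up for illustration.

```python
import re

# Leading "Re:" markers and bracketed tags such as "[PATCH v2 3/5]".
_PREFIX = re.compile(r"^\s*(?:(?:re|fwd?):\s*|\[[^\]]*\]\s*)+", re.IGNORECASE)


def normalize_subject(subject):
    """Strip reply markers and bracketed tags, collapse whitespace, lowercase."""
    stripped = _PREFIX.sub("", subject)
    return " ".join(stripped.split()).lower()


def match_commits(commits, mails):
    """Map commit id -> message-ids whose normalized subject is identical.

    commits: iterable of (commit_id, subject)
    mails:   iterable of (message_id, subject)
    Commits with no matching mail are left out of the result.
    """
    by_subject = {}
    for message_id, subject in mails:
        by_subject.setdefault(normalize_subject(subject), []).append(message_id)

    matches = {}
    for commit_id, subject in commits:
        key = normalize_subject(subject)
        if key in by_subject:
            matches[commit_id] = list(by_subject[key])
    return matches


def notes_add_argv(commit_id, message_ids, ref="refs/notes/mail"):
    """Build (but do not run) a `git notes add` command line.

    The ref name is a hypothetical choice; any notes ref would do.
    """
    body = "\n".join("Message-Id: " + m for m in message_ids)
    return ["git", "notes", "--ref=" + ref, "add", "-f", "-m", body, commit_id]
```

The argv list can be handed to `subprocess.run()` inside a repository, and
the note read back later with `git notes --ref=refs/notes/mail show <commit>`.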

Having had this solution for a while, my experience has been that I
don't use it that often. It's not that I don't refer to the archive to
see more backstory on a commit; I probably do that once a week or so.
But since I have a decent searchable archive, I tend to just do it "by
hand", searching for keywords from the commit message, and limiting by
date if necessary.

Going straight to the message by id might be a little faster, but I
often pick up stray bits in my search that were not part of the original
thread. E.g., somebody reports a bug, then 3 days later, somebody else
posts a patch (but does not do it as a reply to the bug). There's
nothing in the message headers or the commit mapping to say that those
two messages are related. But because a search of the relevant terms
finds both, and because the result is date-sorted, they end up near each
other and it's easy for me to peruse.
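
The "by hand" search above could be sketched roughly as follows: keep the
messages that contain every keyword, optionally limit by date, and sort the
result by date so that related messages land near each other. This is only an
illustration, not my actual setup; the function names are made up, and
`since`/`until` are assumed to be timezone-aware datetimes.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def _sent_at(msg):
    """Return the Date header as an aware datetime, or None if unusable."""
    try:
        when = parsedate_to_datetime(msg["Date"])
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        # "-0000" dates parse as naive; treat them as UTC for comparison.
        when = when.replace(tzinfo=timezone.utc)
    return when


def _body_text(msg):
    """Concatenate the text/plain parts of a message."""
    parts = []
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)
            if payload:
                parts.append(payload.decode("utf-8", errors="replace"))
    return "\n".join(parts)


def search(messages, keywords, since=None, until=None):
    """Return messages containing all keywords, oldest first.

    Messages without a parseable Date header are skipped.
    """
    wanted = [k.lower() for k in keywords]
    hits = []
    for msg in messages:
        when = _sent_at(msg)
        if when is None:
            continue
        if since is not None and when < since:
            continue
        if until is not None and when > until:
            continue
        haystack = (str(msg.get("Subject", "")) + "\n" + _body_text(msg)).lower()
        if all(k in haystack for k in wanted):
            hits.append((when, msg))
    hits.sort(key=lambda pair: pair[0])
    return [msg for _, msg in hits]
```

Anything that yields `email.message.Message` objects works as input, for
example `mailbox.mbox(path)` from the standard library.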

It would be interesting to apply some kind of clustering algorithm that
automatically determines the messages related to a commit, including
both the patch and any discussion leading up to it. I realize that
may be getting far afield of your original goals, but hey, you said you
wanted feedback. I can reach for the stars. :)
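
As a very rough starting point, and a much simpler stand-in for real
clustering, one could score each message against the commit message by word
overlap (Jaccard similarity) within a date window. Everything here is an
assumption for illustration: the function names, the 30-day window, and the
0.2 threshold are arbitrary.

```python
import re
from datetime import timedelta


def tokens(text):
    """Lowercased word set; words shorter than three characters are dropped."""
    return set(re.findall(r"[a-z0-9][a-z0-9_-]{2,}", text.lower()))


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def related(commit_text, commit_date, mails,
            window=timedelta(days=30), threshold=0.2):
    """Return message-ids ranked by similarity to the commit, best first.

    mails: iterable of (message_id, date, text). Dates must be comparable
    with commit_date (all naive or all timezone-aware).
    """
    target = tokens(commit_text)
    scored = []
    for message_id, date, text in mails:
        if abs(commit_date - date) > window:
            continue
        score = jaccard(target, tokens(text))
        if score >= threshold:
            scored.append((score, message_id))
    # Highest score first; ties broken by message-id for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [message_id for _, message_id in scored]
```

Unlike following message-ids, this would also pick up a bug report that was
never replied to by the patch, as long as the two share enough vocabulary.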

-Peff