On Thu, Apr 11, 2013 at 6:05 PM, Junio C Hamano <gits...@pobox.com> wrote:
> Felipe Contreras <felipe.contre...@gmail.com> writes:
>> And if you must, you might was well label them with "REMINDER", no,
>> wait, that's what "TODO" comments are for, where people can see them,
>> and not *forget* them.
> Yeah, good point.
Moreover, I think there's a clear double standard. Consider this commit:
Author: Antoine Pelisse <apeli...@gmail.com>
Date: Sat Mar 23 18:23:28 2013 +0100
combine-diff: coalesce lost lines optimally
This replaces the greedy implementation to coalesce lost lines by using
dynamic programming to find the Longest Common Subsequence.
The O(n²) time complexity is obviously bigger than previous
implementation but it can produce shorter diff results (and most likely
easier to read).
List of lost lines is now doubly-linked because we reverse-read it when
reading the direction matrix.
The commit message is 9 lines, and the diffstat 320 insertions(+), 64
deletions(-). Moreover, there are some important bits of information
on the mailing list that never made it to the commit message:
All p parents have the same n lines.
We will find LCS and provide a n lines (the same lines) new list in
O(n²), and then run it again in O(n²) with the next parent, etc.
It will end-up being O(pn²).
All p parents have no lines in common.
We will find LCS and provide a 2n new list in O(n²).
Then we run it again in O(2n x n), and again O(3n x n), etc, until
O(pn x n).
When we sum these all, we end-up with O(p² x n²)
Unfortunately on a commit that would remove A LOT of lines (10000)
from 7 parents, the times goes from 0.01s to 1.5s... I'm pretty sure
that scenario is quite uncommon though.
This is not mentioned in the commit message; on which situations this
implementation would be worst and why it's OK either way.
As you can see the last test is broken because the solution is not
optimal for more than two parents. It would probably require to extend
the dynamic programming to a k-dimension matrix (for k parents) but the
result would end-up being O(n^k) (when removing n consecutives lines
from p parents). I'm not sure there is any better solution known yet to
the k-LCS problem.
Implementing the dynamic solution with the k-dimension matrix would
probably require to re-hash the strings (I guess it's already done by
xdiff), as the number of string comparisons would increase.
The fact that the last test is broken is not mentioned at all.
Now let's compare to the final version of my patch which is 19 lines
40 insertions(+), 1 deletion(-). The ration of commit message lines
vs. code changed lines is 19/41(0.46) whereas Antoine's patch is
3/128(0.02), a difference of over 19 times. Granted, some single-line
changes do require a good chunk of explanation, but this is not one of
them; this single line patch doesn't even change the behavior of the
code, simply changes a silent error exit to a verbose error exit,
that's all. Antoine's patch has a lot more potential to trigger
And the chances that somebody would have to look at Antoine's patch is
quite high, especially since a failing test-case is introduced. The
chances that anybody would look at mine are very very low.
So either Antoine's commit message was fine, and so was mine, or it
was sorely lacking explanation.
To me, the reality is obvious: my patch didn't require such a big
commit message, the short version was fine, the only reason Jeff King
insisted on a longer version is because the patch came from me.
Antoine's patch might have benefited from a little more explanation,
but not every issue that was discussed in the mailing list was
necessary (in my patch virtually every issue discussed was added to
the commit message).
This is the definition of double standard.
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html