From: Jeff King <p...@peff.net>

Most of git's traversals are robust against minor breakages
in commit data. For example, "git log" will still output an
entry for a commit that has a broken encoding or missing
author, and will not abort the whole operation.

Shortlog, on the other hand, will die as soon as it sees a
commit without an author, meaning that a repository with
a broken commit cannot get any shortlog output at all.

Let's downgrade this fatal error to a warning, and continue
the operation.

We simply ignore the commit and do not count it in the total
(since we do not have any author under which to file it).
Alternatively, we could output some kind of "<empty>" record
to collect these bogus commits. It is probably not worth it,
though; we have already warned to stderr, so the user is
aware that such bogosities exist, and any placeholder we
came up with would either be syntactically invalid, or would
potentially conflict with real data.

Signed-off-by: Jeff King <p...@peff.net>
---
The real-world breakage I saw that caused this was not a missing author
line, but rather a commit that was in proper utf8, but whose encoding
header said "utf16". We can tweak the test to reproduce the exact case
by making the sed line:

  sed "/^\$/iencoding utf16" commit.tmp >broken.tmp

But I think that may be less portable, as systems with iconv that does
not understand "utf16" might leave the text intact. Simply removing the
author line, as the test below does, has the same effect.

 builtin/shortlog.c  |  6 ++++--
 t/t4201-shortlog.sh | 16 ++++++++++++++++
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/builtin/shortlog.c b/builtin/shortlog.c
index ae73d17..c226f76 100644
--- a/builtin/shortlog.c
+++ b/builtin/shortlog.c
@@ -127,9 +127,11 @@ void shortlog_add_commit(struct shortlog *log, struct 
commit *commit)
                        author = buffer + 7;
                buffer = eol;
        }
-       if (!author)
-               die(_("Missing author: %s"),
+       if (!author) {
+               warning(_("Missing author: %s"),
                    sha1_to_hex(commit->object.sha1));
+               return;
+       }
        if (log->user_format) {
                struct pretty_print_context ctx = {0};
                ctx.fmt = CMIT_FMT_USERFORMAT;
diff --git a/t/t4201-shortlog.sh b/t/t4201-shortlog.sh
index 5493500..4286699 100755
--- a/t/t4201-shortlog.sh
+++ b/t/t4201-shortlog.sh
@@ -172,4 +172,20 @@ test_cmp expect out'
        git shortlog HEAD~2.. > out &&
 test_cmp expect out'
 
+test_expect_success 'shortlog ignores commits with missing authors' '
+       git commit --allow-empty -m normal &&
+       git commit --allow-empty -m soon-to-be-broken &&
+       git cat-file commit HEAD >commit.tmp &&
+       sed "/^author/d" commit.tmp >broken.tmp &&
+       commit=$(git hash-object -w -t commit --stdin <broken.tmp) &&
+       git update-ref HEAD $commit &&
+       cat >expect <<-\EOF &&
+       A U Thor (1):
+             normal
+
+       EOF
+       git shortlog HEAD~2.. >actual &&
+       test_cmp expect actual
+'
+
 test_done
-- 
1.8.4.rc4.16.g228394f
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to