Re: [PATCH] checkpatch: Add test for commit id formatting style in commit log
Hi Joe, On Thu, Jul 3, 2014 at 12:00 AM, Joe Perches j...@perches.com wrote: Commit logs have various forms of commit id references. Try to standardize on a 12 character long lower case commit id along with a description of parentheses and the quoted subject line ie: commit 0123456789ab (commit description) Now this is in mainline, checkpatch starts complaining about my too long (40 chars) commit IDs in commit messages :-( 40 chars may be too long (but it's quick to copy-and-paste, as git show shows that by default), but 12 sounds a bit short, as that's only 48 bits. According to the Birthday Paradox (en.wikipedia.org/wiki/Birthday_problem), there's a probability of 50% of a collision if you use 48 bits IDs in a repository with ca. 16 milion (2^24) objects. A Linux kernel repository counts ca. 4 million objects, so we're getting close... So soon we'll get error: short SHA1 is ambiguous. BTW, is there actually an easy way to make git show show all options for an ambiguous SHA1? Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say programmer or something like that. -- Linus Torvalds -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] checkpatch: Add test for commit id formatting style in commit log
On Sun, 10 Aug 2014 14:28:01 -0700 Joe Perches j...@perches.com wrote: On Thu, Jul 3, 2014 at 12:00 AM, Joe Perches j...@perches.com wrote: Commit logs have various forms of commit id references. Try to standardize on a 12 character long lower case commit id along with a description of parentheses and the quoted subject line ie: commit 0123456789ab (commit description) Now this is in mainline, checkpatch starts complaining about my too long (40 chars) commit IDs in commit messages :-( 40 chars may be too long (but it's quick to copy-and-paste, as git show shows that by default), but 12 sounds a bit short, as that's only 48 bits. Right now, this test allows 12 to 16 byte length commit ids without emitting a warning. Andrew wanted this test, I don't care how long the commit id is in the commit log. Well, I mainly wanted to stop having to add commit description when people forget it. The length check was perhaps a bit anal. How about we make it 12 or more? -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] checkpatch: Add test for commit id formatting style in commit log
On Sun, 2014-08-10 at 23:08 +0200, Geert Uytterhoeven wrote: Hi Joe, Hi Geert. On Thu, Jul 3, 2014 at 12:00 AM, Joe Perches j...@perches.com wrote: Commit logs have various forms of commit id references. Try to standardize on a 12 character long lower case commit id along with a description of parentheses and the quoted subject line ie: commit 0123456789ab (commit description) Now this is in mainline, checkpatch starts complaining about my too long (40 chars) commit IDs in commit messages :-( 40 chars may be too long (but it's quick to copy-and-paste, as git show shows that by default), but 12 sounds a bit short, as that's only 48 bits. Right now, this test allows 12 to 16 byte length commit ids without emitting a warning. Andrew wanted this test, I don't care how long the commit id is in the commit log. According to the Birthday Paradox (en.wikiipedia.org/wiki/Birthday_problem), there's a probability of 50% of a collision if you use 48 bits IDs in a repository with ca. 16 milion (2^24) objects. A Linux kernel repository counts ca. 4 million objects, so we're getting close... So soon we'll get error: short SHA1 is ambiguous. BTW, is there actually an easy way to make git show show all options for an ambiguous SHA1? Not so far as I know, but I'm nothing like a git expert. The script I used before adding this to checkpatch was: $ cat format_commit.sh #!/bin/bash regex1=^error: short SHA1 $1 is ambiguous\. regex2=fatal: ambiguous argument '$1': unknown revision or path not in the working tree\. tmp=$(mktemp --tmpdir format_commit.X) git log --format='%H (%s)' -1 $1 $tmp 21 read line $tmp rm -f $tmp if [[ $line =~ $regex1 ]] ; then echo checking commits $1... git rev-list --remotes | grep -i ^$1 | while read line ; do git log --format='%H (%s)' -1 $line | echo commit $(cut -c 1-12,41-) done elif [[ $line =~ $regex2 ]] ; then echo No matching commit exit 1 else echo commit $(echo $line | cut -c1-12,41-) fi exit 0 $ so that using $ format_commit.sh 1234 looks at _all_ the commit references by using git rev-list then greps that output for the matches, but it is darn slow... $ time ./format_commit.sh 1234 checking commits 1234... commit 1234351cba95 (xfs: introduce xlog_copy_iovec) commit 1234471e2d11 (perf header: Fix numa topology printing) commit 1234f4bada54 (hwrng: Kconfig: remove dependency for atmel-rng driver) commit 12340313cf94 (MAINTAINERS: add new cgroup list to CC notice) commit 12346037a718 (UBIFS: dump more in the lprops debugging check) commit 12342c475f5d (iwlwifi: proper monitor support) commit 1234010684bb (Add notation that the Asus W5F laptop has a short cable instead of 80-wire.) commit 123411f2d0da ([CPUFREQ] dprintf format fixes in cpufreq/speedstep-centrino.c) real0m24.535s user0m21.668s sys 0m5.332s -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] checkpatch: Add test for commit id formatting style in commit log
On Sun, 2014-08-10 at 14:35 -0700, Andrew Morton wrote: On Sun, 10 Aug 2014 14:28:01 -0700 Joe Perches j...@perches.com wrote: On Thu, Jul 3, 2014 at 12:00 AM, Joe Perches j...@perches.com wrote: Commit logs have various forms of commit id references. Try to standardize on a 12 character long lower case commit id along with a description of parentheses and the quoted subject line ie: commit 0123456789ab (commit description) Now this is in mainline, checkpatch starts complaining about my too long (40 chars) commit IDs in commit messages :-( 40 chars may be too long (but it's quick to copy-and-paste, as git show shows that by default), but 12 sounds a bit short, as that's only 48 bits. Right now, this test allows 12 to 16 byte length commit ids without emitting a warning. Andrew wanted this test, I don't care how long the commit id is in the commit log. Well, I mainly wanted to stop having to add commit description when people forget it. The length check was perhaps a bit anal. How about we make it 12 or more? Fine by me, just change the 16 to 40 --- scripts/checkpatch.pl | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl index 31a731e..b385bcb 100755 --- a/scripts/checkpatch.pl +++ b/scripts/checkpatch.pl @@ -2133,7 +2133,7 @@ sub process { # Check for improperly formed commit descriptions if ($in_commit_log $line =~ /\bcommit\s+[0-9a-f]{5,}/i - $line !~ /\b[Cc]ommit [0-9a-f]{12,16} \(/) { + $line !~ /\b[Cc]ommit [0-9a-f]{12,40} \(/) { $line =~ /\b(c)ommit\s+([0-9a-f]{5,})/i; my $init_char = $1; my $orig_commit = lc($2); @@ -2141,7 +2141,7 @@ sub process { my $desc = 'commit description'; ($id, $desc) = git_commit_info($orig_commit, $id, $desc); ERROR(GIT_COMMIT_ID, - Please use 12 to 16 chars for the git commit ID like: '${init_char}ommit $id (\$desc\)'\n . $herecurr); + Please use 12 or more chars for the git commit ID like: '${init_char}ommit $id (\$desc\)'\n . $herecurr); } # Check for added, moved or deleted files -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html