Re: [PATCH] checkpatch: Add test for commit id formatting style in commit log

2014-08-10 Thread Geert Uytterhoeven
Hi Joe,

On Thu, Jul 3, 2014 at 12:00 AM, Joe Perches j...@perches.com wrote:
 Commit logs have various forms of commit id references.

 Try to standardize on a 12 character long lower case
 commit id along with a description of parentheses and
 the quoted subject line

 ie: commit 0123456789ab (commit description)

Now this is in mainline, checkpatch starts complaining about my too long
(40 chars) commit IDs in commit messages :-(

40 chars may be too long (but it's quick to copy-and-paste, as git show
shows that by default), but 12 sounds a bit short, as that's only 48 bits.

According to the Birthday Paradox (en.wikipedia.org/wiki/Birthday_problem),
there's a probability of 50% of a collision if you use 48 bits IDs in a
repository with ca. 16 milion (2^24) objects. A Linux kernel repository
counts ca. 4 million objects, so we're getting close...

So soon we'll get error: short SHA1 is ambiguous.

BTW, is there actually an easy way to make git show show all options for
an ambiguous SHA1?

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say programmer or something like that.
-- Linus Torvalds
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] checkpatch: Add test for commit id formatting style in commit log

2014-08-10 Thread Andrew Morton
On Sun, 10 Aug 2014 14:28:01 -0700 Joe Perches j...@perches.com wrote:

  On Thu, Jul 3, 2014 at 12:00 AM, Joe Perches j...@perches.com wrote:
   Commit logs have various forms of commit id references.
  
   Try to standardize on a 12 character long lower case
   commit id along with a description of parentheses and
   the quoted subject line
  
   ie: commit 0123456789ab (commit description)
  
  Now this is in mainline, checkpatch starts complaining about my too long
  (40 chars) commit IDs in commit messages :-(
  
  40 chars may be too long (but it's quick to copy-and-paste, as git show
  shows that by default), but 12 sounds a bit short, as that's only 48 bits.
 
 Right now, this test allows 12 to 16 byte length commit ids
 without emitting a warning.
 
 Andrew wanted this test, I don't care how long the commit id
 is in the commit log.

Well, I mainly wanted to stop having to add commit description when
people forget it.  The length check was perhaps a bit anal.  How about
we make it 12 or more?


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] checkpatch: Add test for commit id formatting style in commit log

2014-08-10 Thread Joe Perches
On Sun, 2014-08-10 at 23:08 +0200, Geert Uytterhoeven wrote:
 Hi Joe,

Hi Geert.

 On Thu, Jul 3, 2014 at 12:00 AM, Joe Perches j...@perches.com wrote:
  Commit logs have various forms of commit id references.
 
  Try to standardize on a 12 character long lower case
  commit id along with a description of parentheses and
  the quoted subject line
 
  ie: commit 0123456789ab (commit description)
 
 Now this is in mainline, checkpatch starts complaining about my too long
 (40 chars) commit IDs in commit messages :-(
 
 40 chars may be too long (but it's quick to copy-and-paste, as git show
 shows that by default), but 12 sounds a bit short, as that's only 48 bits.

Right now, this test allows 12 to 16 byte length commit ids
without emitting a warning.

Andrew wanted this test, I don't care how long the commit id
is in the commit log.

 According to the Birthday Paradox (en.wikiipedia.org/wiki/Birthday_problem),
 there's a probability of 50% of a collision if you use 48 bits IDs in a
 repository with ca. 16 milion (2^24) objects. A Linux kernel repository
 counts ca. 4 million objects, so we're getting close...
 
 So soon we'll get error: short SHA1 is ambiguous.
 
 BTW, is there actually an easy way to make git show show all options for
 an ambiguous SHA1?

Not so far as I know, but I'm nothing like a git expert.

The script I used before adding this to checkpatch was:

$ cat format_commit.sh 
#!/bin/bash

regex1=^error: short SHA1 $1 is ambiguous\.
regex2=fatal: ambiguous argument '$1': unknown revision or path not in the 
working tree\.

tmp=$(mktemp --tmpdir format_commit.X)

git log --format='%H (%s)' -1 $1  $tmp 21

read line  $tmp

rm -f $tmp

if [[ $line =~ $regex1 ]] ; then
echo checking commits $1...
git rev-list --remotes | grep -i ^$1 |
while read line ; do
git log --format='%H (%s)' -1 $line | 
echo commit $(cut -c 1-12,41-)
done
elif [[ $line =~ $regex2 ]] ; then
echo No matching commit
exit 1
else
echo commit $(echo $line | cut -c1-12,41-)
fi

exit 0
$

so that using $ format_commit.sh 1234 looks
at _all_ the commit references by using git rev-list
then greps that output for the matches, but it is
darn slow...

$ time ./format_commit.sh 1234
checking commits 1234...
commit 1234351cba95 (xfs: introduce xlog_copy_iovec)
commit 1234471e2d11 (perf header: Fix numa topology printing)
commit 1234f4bada54 (hwrng: Kconfig: remove dependency for atmel-rng driver)
commit 12340313cf94 (MAINTAINERS: add new cgroup list to CC notice)
commit 12346037a718 (UBIFS: dump more in the lprops debugging check)
commit 12342c475f5d (iwlwifi: proper monitor support)
commit 1234010684bb (Add notation that the Asus W5F laptop has a short cable 
instead of 80-wire.)
commit 123411f2d0da ([CPUFREQ] dprintf format fixes in 
cpufreq/speedstep-centrino.c)

real0m24.535s
user0m21.668s
sys 0m5.332s


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] checkpatch: Add test for commit id formatting style in commit log

2014-08-10 Thread Joe Perches
On Sun, 2014-08-10 at 14:35 -0700, Andrew Morton wrote:
 On Sun, 10 Aug 2014 14:28:01 -0700 Joe Perches j...@perches.com wrote:
   On Thu, Jul 3, 2014 at 12:00 AM, Joe Perches j...@perches.com wrote:
Commit logs have various forms of commit id references.
   
Try to standardize on a 12 character long lower case
commit id along with a description of parentheses and
the quoted subject line
   
ie: commit 0123456789ab (commit description)
   
   Now this is in mainline, checkpatch starts complaining about my too long
   (40 chars) commit IDs in commit messages :-(
   
   40 chars may be too long (but it's quick to copy-and-paste, as git show
   shows that by default), but 12 sounds a bit short, as that's only 48 bits.
  
  Right now, this test allows 12 to 16 byte length commit ids
  without emitting a warning.
  
  Andrew wanted this test, I don't care how long the commit id
  is in the commit log.
 
 Well, I mainly wanted to stop having to add commit description when
 people forget it.  The length check was perhaps a bit anal.  How about
 we make it 12 or more?

Fine by me, just change the 16 to 40
---
 scripts/checkpatch.pl | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 31a731e..b385bcb 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2133,7 +2133,7 @@ sub process {
 # Check for improperly formed commit descriptions
if ($in_commit_log 
$line =~ /\bcommit\s+[0-9a-f]{5,}/i 
-   $line !~ /\b[Cc]ommit [0-9a-f]{12,16} \(/) {
+   $line !~ /\b[Cc]ommit [0-9a-f]{12,40} \(/) {
$line =~ /\b(c)ommit\s+([0-9a-f]{5,})/i;
my $init_char = $1;
my $orig_commit = lc($2);
@@ -2141,7 +2141,7 @@ sub process {
my $desc = 'commit description';
($id, $desc) = git_commit_info($orig_commit, $id, 
$desc);
ERROR(GIT_COMMIT_ID,
- Please use 12 to 16 chars for the git commit ID 
like: '${init_char}ommit $id (\$desc\)'\n . $herecurr);
+ Please use 12 or more chars for the git commit 
ID like: '${init_char}ommit $id (\$desc\)'\n . $herecurr);
}
 
 # Check for added, moved or deleted files


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html