On 24/10/20 7:07 am, Joe Perches wrote:
> On Sat, 2020-10-24 at 05:38 +0530, Aditya Srivastava wrote:
>> A quick evaluation on v5.6..v5.8 showed that this fix reduces
>> REPEATED_WORD warnings from 2797 to 907.
>
> How many of these 907 remaining are still false positive?
>
>> A quick manual check found all cases are related to hex output or
>> list command outputs in commit messages.
>
> You mean 1890 of the 2797 are now no longer reported and all 1890
> were false positives yes?
>

Yes. In v5.6..v5.8 there were 2797 REPEATED_WORD warnings; after these
changes they are reduced to 907. However, many of these 907 must have
been fixed by Dwaipayan's patch. I'll replace it with 1890 instead,
which is clearer.

>>  		pos($rawline) = 1 if (!$in_commit_log);
>>  		while ($rawline =~ /\b($word_pattern) (?=($word_pattern))/g) {
>>
>> @@ -3074,6 +3076,17 @@ sub process {
>>  			next if ($start_char =~ /^\S$/);
>>  			next if (index(" \t.,;?!", $end_char) == -1);
>>
>> +			# avoid repeating hex occurrences like 'ff ff fe 09 ...'
>> +			my %allow_repeated_words = (
>> +				add => '',
>> +				added => '',
>> +				bad => '',
>> +				be => '',
>> +			);
>
> If perl caches this local hash declaration, fine,
> but I think it better to use 'our %allow_repeated_words'
> and move it so it's only declared using the file scope.
>

I ran checkpatch over a few commits and it was working fine, but I'll
move it to file scope using 'our'. That should do as well.

>> +			if ($first =~ /\b[0-9a-f]{2,}\b/) {
>
> This regex matches only lower case so it wouldn't match "Add".
>
> I think this regex would be clearer using
> /^[0-9a-f]+$/i or /^[A-Fa-f0-9]+$/
>

Missed it. Will do.

Thanks
Aditya
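
For reference, a minimal standalone sketch of the two review suggestions taken together: the allow-list declared once at file scope with 'our', and the anchored, case-insensitive hex regex. This is only an illustration, not the actual checkpatch.pl change. The helper name is_hex_repeat is hypothetical, the lc() lookup is an assumption added so that "Add" matches the lower-case key, and the hash holds only the four entries quoted above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# File-scope allow-list, declared once with 'our' as suggested in the review.
# Only the four entries quoted in the thread are listed here.
our %allow_repeated_words = (
	add => '',
	added => '',
	bad => '',
	be => '',
);

# Hypothetical helper (not part of checkpatch.pl): returns 1 when a repeated
# word looks like hex output and is not an allow-listed English word.
sub is_hex_repeat {
	my ($first) = @_;

	# Anchored and case-insensitive, so "FE" and "Add" are both considered.
	return 0 unless $first =~ /^[0-9a-f]+$/i;

	# Assumption: keys are lower case, so normalise before the lookup.
	return 0 if exists $allow_repeated_words{lc $first};

	return 1;
}

# Show how a few sample words would be classified.
printf "%-6s %s\n", $_, is_hex_repeat($_) ? "hex, skip" : "word, check"
	for qw(ff FE Add be the);
```

With this shape, a repeated "ff ff" in a hex dump would be skipped, while a repeated "Add Add" or "be be" would still reach the REPEATED_WORD check.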