On Sun, 20 Sep 2020, Joe Perches wrote:

> On Sun, 2020-09-20 at 21:52 +0530, Dwaipayan Ray wrote:
> > On Sun, Sep 20, 2020 at 8:39 PM Joe Perches <[email protected]> wrote:
> > > On Sun, 2020-09-20 at 14:47 +0530, Dwaipayan Ray wrote:
> > > > Checkpatch did not handle cases where the author From: header
> > > > was split into multiple lines. The author identity could not
> > > > be resolved and checkpatch generated a false NO_AUTHOR_SIGN_OFF
> > > > warning.
> > > 
> > I think there won't be any problem. Is my
> > observation correct?
> 
> Likely true, it probably doesn't matter.
> Still, I'd imagine it doesn't hurt either.
> 
> > > What I have does a bit more by saving any post-folding
> > > 
> > > "From: <name and email address>"
> > > 
> > > and comparing that to any "name and perhaps different
> > > email address" in a Signed-off-by: line.
> > > 
> > > A new message is emitted if the name matches but the
> > > email address is different.
> > > 
> > > Perhaps it's reasonable to apply your patch and then
> > > update it with something like the below:
> > > ---
> > >  scripts/checkpatch.pl | 32 ++++++++++++++++++++++++++++----
> > >  1 file changed, 28 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> > > index 3e474072aa90..1ecc179e938d 100755
> > > --- a/scripts/checkpatch.pl
> > > +++ b/scripts/checkpatch.pl
> > > @@ -1240,6 +1240,15 @@ sub same_email_addresses {
> > >                $email1_address eq $email2_address;
> > >  }
> > > 
> > > +sub same_email_names {
> > > +       my ($email1, $email2) = @_;
> > > +
> > > +       my ($email1_name, $name1_comment, $email1_address, $comment1) = 
> > > parse_email($email1);
> > > +       my ($email2_name, $name2_comment, $email2_address, $comment2) = 
> > > parse_email($email2);
> > > +
> > > +       return $email1_name eq $email2_name;
> > > +}
> > > +
> > >  sub which {
> > >         my ($bin) = @_;
> > > 
> > > @@ -2679,20 +2688,32 @@ sub process {
> > >                 }
> > > 
> > >  # Check the patch for a From:
> > > -               if (decode("MIME-Header", $line) =~ /^From:\s*(.*)/) {
> > > +               if ($line =~ /^From:\s*(.*)/i) {
> > >                         $author = $1;
> > > -                       $author = encode("utf8", $author) if ($line =~ 
> > > /=\?utf-8\?/i);
> > > +                       my $curline = $linenr;
> > > +                       while (defined($rawlines[$curline]) && 
> > > $rawlines[$curline++] =~ /^\s(\s+)?(.*)/) {
> > > +                               $author .= ' ' if (defined($1));
> > > +                               $author .= "$2";
> > > +                       }
> > > +                       if ($author =~ /=\?utf-8\?/i) {
> > > +                               $author = decode("MIME-Header", $author);
> > > +                               $author = encode("utf8", $author);
> > > +                       }
> > > +
> > >                         $author =~ s/"//g;
> > >                         $author = reformat_email($author);
> > >                 }
> > > 
> > >  # Check the patch for a signoff:
> > >                 if ($line =~ /^\s*signed-off-by:\s*(.*)/i) {
> > > +                       my $sig = $1;
> > >                         $signoff++;
> > >                         $in_commit_log = 0;
> > >                         if ($author ne '') {
> > > -                               if (same_email_addresses($1, $author)) {
> > > -                                       $authorsignoff = 1;
> > > +                               if (same_email_addresses($sig, $author)) {
> > > +                                       $authorsignoff = "1";
> > > +                               } elsif (same_email_names($sig, $author)) 
> > > {
> > > +                                       $authorsignoff = $sig;
> > >                                 }
> > >                         }
> > >                 }
> > > @@ -6937,6 +6958,9 @@ sub process {
> > >                 } elsif (!$authorsignoff) {
> > >                         WARN("NO_AUTHOR_SIGN_OFF",
> > >                              "Missing Signed-off-by: line by nominal 
> > > patch author '$author'\n");
> > > +               } elsif ($authorsignoff ne "1") {
> > > +                       WARN("NO_AUTHOR_SIGN_OFF",
> > > +                            "From:/SoB: email address mismatch: 'From: 
> > > $author' != 'Signed-off-by: $authorsignoff'\n");
> > >                 }
> > >         }
> > > 
> > > 
> > 
> > Yes, this is definitely more logical !
> > I was actually hoping to talk with you on this.
> 
> Hope granted, but only via email... (smile)
> 
> > The code you sent better handles name mismatches when
> > email addresses are same. But I also have found several
> > such commits in which the author have signed off using
> > a different email address than the one which he/she used
> > to send the patch.
> > 
> > For example, Lukas checked commits between v5.4 and
> > v5.8 and he found:
> >     175 Missing Signed-off-by: line by nominal patch authorrep
> >     'Daniel Vetter <[email protected]>'
> > 
> > Infact in all of those commits he signed off using a different
> > mail, Daniel Vetter <[email protected]>.
> > 
> > So is it possible to resolve these using perhaps .mailmap
> > entries? Or should only the name mismatch part be better
> > handled? Or perhaps both?
> 
> Dunno.  It certainly can be improved...
> Try adding some more logic and see what you come up with.
> 
> btw:
> 
> The most frequent NO_AUTHOR_SIGN_OFF messages for v5.7..v5.8 are
> 
>      98 WARNING: From:/SoB: email address mismatch: 'From: Daniel Vetter 
> <[email protected]>' != 'Signed-off-by: Daniel Vetter 
> <[email protected]>'
>      24 WARNING: From:/SoB: email address mismatch: 'From: Thinh Nguyen 
> <[email protected]>' != 'Signed-off-by: Thinh Nguyen 
> <[email protected]>'
>      19 WARNING: From:/SoB: email address mismatch: 'From: Wolfram Sang 
> <[email protected]>' != 'Signed-off-by: Wolfram Sang 
> <[email protected]>'
>      11 WARNING: From:/SoB: email address mismatch: 'From: Luke Nelson 
> <[email protected]>' != 'Signed-off-by: Luke Nelson 
> <[email protected]>'
>       8 WARNING: From:/SoB: email address mismatch: 'From: Christophe Leroy 
> <[email protected]>' != 'Signed-off-by: Christophe Leroy 
> <[email protected]>'
>       6 WARNING: From:/SoB: email address mismatch: 'From: Davidlohr Bueso 
> <[email protected]>' != 'Signed-off-by: Davidlohr Bueso <[email protected]>'
>       5 WARNING: Missing Signed-off-by: line by nominal patch author '"Paul 
> A. Clarke" <[email protected]>'
>       4 WARNING: Missing Signed-off-by: line by nominal patch author 'Huang 
> Ying <[email protected]>'
>       3 WARNING: Missing Signed-off-by: line by nominal patch author 
> '"Stephan Müller" <[email protected]>'
>

Great minds think alike.

I did a similar investigation on Friday after the first discussion with 
Dwaipayan:

https://lore.kernel.org/linux-kernel-mentees/alpine.DEB.2.21.2009181238230.14717@felia/T/#m1bf5f7ca876d33d4d53e492b5d8a6232437c921f

I hope Dwaipayan can come up with a '.AUTHOR_SIGN_OFF.mailmap' file that 
we can use to distinguish the known developers that knowingly and 
intentionally use different identities vs. the 'newbies' that should 
validly be warned.

We will see if Dwaipayan can come up with a good idea how to handle that.

Lukas

> For the Missing Signed-off-by: lines above,
> even after decoding, the email matches but
> the names do not.
> 
> From: "Paul A. Clarke" <[email protected]>
> [...]
> Signed-off-by: Paul Clarke <[email protected]>
> 
> From: Huang Ying <[email protected]>
> [...]
> Signed-off-by: "Huang, Ying" <[email protected]>
> 
> From: =?UTF-8?q?Stephan=20M=C3=BCller?= <[email protected]>
> [...]
> Signed-off-by: Stephan Mueller <[email protected]>
> 
> > Also, I would like to know if there are any more changes
> > required for the current patch or if it is good to go?
> 
> I think it's fine.
> 
> cheers, Joe
> 
> 

Reply via email to