Re: [PATCH] checkpatch: Flag spam header (X-Spam-Report) to prevent spurious warnings
You can disregard this, I missed the other thread and that looks fine. On 03/24/2017 01:14 PM, John 'Warthog9' Hawley wrote: > On 03/21/2017 11:31 AM, Joe Perches wrote: >> On Tue, 2017-03-21 at 09:30 -0700, John 'Warthog9' Hawley (VMware) wrote: >>> Spamassassin sticks a long (~79 character) long string after a >>> line that has a single space in it. The line with space causes >>> checkpatch to erroniously think that it's in the content body, as >>> opposed to headers and thus flag a mail header as an unwrapped long >>> comment line. >> >> If the spammassassin header is like >> >> email-header-n: foo >> email-header-m: bar >> >> X-Spam-Report: bar >> >> Does that form follow rfc 5322? > > It does look that way > >> If it does then any email header could have that >> form and the header wrapping test should be >> updated from >> >> if ($in_header_lines && $realfile =~ /^$/ && >> !($rawline =~ /^\s+\S/ || >>$rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i)) { >> $in_header_lines = 0; >> $in_commit_log = 1; >> $has_commit_log = 1; >> } >> >> to something like >> >> if ($in_header_lines && $realfile =~ /^$/ && >> !($rawline =~ /^ (?:\s*\S|$)/ || >>$rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i)) { >> > > So this seems to fix the specific issue we were tripping over, but in > doing so causes some other problems on some of the other headers in the > message we are using for testing (3 new warnings, and an error), > specifically flagging a: > - Warning on 'Received:' header > - Warning on 'DKIM-Signature:' header (twice, for both leading and > trailing white space on To: and From:) > - Erroring on the DKIM-Signature as well > > Noting the DKIM-Signature, in this case, is also a multi-line header message > > This resolves the issue we were seeing, and doesn't (at least in my test > cases), cause any new errors: > > if ($in_header_lines && $realfile =~ /^$/ && > !( > ( > $rawline =~ /^\s+\S*/ > && > $rawline !~ /^[\r\n]+$/ > ) > || > $rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i) > ) { > > as the line that's causing issues, /^ [\r\n]+$/, wouldn't get > incorrectly caught as the end of the headers. > > - John 'Warthog9' Hawley >
Re: [PATCH] checkpatch: Flag spam header (X-Spam-Report) to prevent spurious warnings
You can disregard this, I missed the other thread and that looks fine. On 03/24/2017 01:14 PM, John 'Warthog9' Hawley wrote: > On 03/21/2017 11:31 AM, Joe Perches wrote: >> On Tue, 2017-03-21 at 09:30 -0700, John 'Warthog9' Hawley (VMware) wrote: >>> Spamassassin sticks a long (~79 character) long string after a >>> line that has a single space in it. The line with space causes >>> checkpatch to erroniously think that it's in the content body, as >>> opposed to headers and thus flag a mail header as an unwrapped long >>> comment line. >> >> If the spammassassin header is like >> >> email-header-n: foo >> email-header-m: bar >> >> X-Spam-Report: bar >> >> Does that form follow rfc 5322? > > It does look that way > >> If it does then any email header could have that >> form and the header wrapping test should be >> updated from >> >> if ($in_header_lines && $realfile =~ /^$/ && >> !($rawline =~ /^\s+\S/ || >>$rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i)) { >> $in_header_lines = 0; >> $in_commit_log = 1; >> $has_commit_log = 1; >> } >> >> to something like >> >> if ($in_header_lines && $realfile =~ /^$/ && >> !($rawline =~ /^ (?:\s*\S|$)/ || >>$rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i)) { >> > > So this seems to fix the specific issue we were tripping over, but in > doing so causes some other problems on some of the other headers in the > message we are using for testing (3 new warnings, and an error), > specifically flagging a: > - Warning on 'Received:' header > - Warning on 'DKIM-Signature:' header (twice, for both leading and > trailing white space on To: and From:) > - Erroring on the DKIM-Signature as well > > Noting the DKIM-Signature, in this case, is also a multi-line header message > > This resolves the issue we were seeing, and doesn't (at least in my test > cases), cause any new errors: > > if ($in_header_lines && $realfile =~ /^$/ && > !( > ( > $rawline =~ /^\s+\S*/ > && > $rawline !~ /^[\r\n]+$/ > ) > || > $rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i) > ) { > > as the line that's causing issues, /^ [\r\n]+$/, wouldn't get > incorrectly caught as the end of the headers. > > - John 'Warthog9' Hawley >
Re: [PATCH] checkpatch: Flag spam header (X-Spam-Report) to prevent spurious warnings
On 03/21/2017 11:31 AM, Joe Perches wrote: > On Tue, 2017-03-21 at 09:30 -0700, John 'Warthog9' Hawley (VMware) wrote: >> Spamassassin sticks a long (~79 character) long string after a >> line that has a single space in it. The line with space causes >> checkpatch to erroniously think that it's in the content body, as >> opposed to headers and thus flag a mail header as an unwrapped long >> comment line. > > If the spammassassin header is like > > email-header-n: foo > email-header-m: bar > > X-Spam-Report: bar > > Does that form follow rfc 5322? It does look that way > If it does then any email header could have that > form and the header wrapping test should be > updated from > > if ($in_header_lines && $realfile =~ /^$/ && > !($rawline =~ /^\s+\S/ || > $rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i)) { > $in_header_lines = 0; > $in_commit_log = 1; > $has_commit_log = 1; > } > > to something like > > if ($in_header_lines && $realfile =~ /^$/ && > !($rawline =~ /^ (?:\s*\S|$)/ || > $rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i)) { > So this seems to fix the specific issue we were tripping over, but in doing so causes some other problems on some of the other headers in the message we are using for testing (3 new warnings, and an error), specifically flagging a: - Warning on 'Received:' header - Warning on 'DKIM-Signature:' header (twice, for both leading and trailing white space on To: and From:) - Erroring on the DKIM-Signature as well Noting the DKIM-Signature, in this case, is also a multi-line header message This resolves the issue we were seeing, and doesn't (at least in my test cases), cause any new errors: if ($in_header_lines && $realfile =~ /^$/ && !( ( $rawline =~ /^\s+\S*/ && $rawline !~ /^[\r\n]+$/ ) || $rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i) ) { as the line that's causing issues, /^ [\r\n]+$/, wouldn't get incorrectly caught as the end of the headers. - John 'Warthog9' Hawley
Re: [PATCH] checkpatch: Flag spam header (X-Spam-Report) to prevent spurious warnings
On 03/21/2017 11:31 AM, Joe Perches wrote: > On Tue, 2017-03-21 at 09:30 -0700, John 'Warthog9' Hawley (VMware) wrote: >> Spamassassin sticks a long (~79 character) long string after a >> line that has a single space in it. The line with space causes >> checkpatch to erroniously think that it's in the content body, as >> opposed to headers and thus flag a mail header as an unwrapped long >> comment line. > > If the spammassassin header is like > > email-header-n: foo > email-header-m: bar > > X-Spam-Report: bar > > Does that form follow rfc 5322? It does look that way > If it does then any email header could have that > form and the header wrapping test should be > updated from > > if ($in_header_lines && $realfile =~ /^$/ && > !($rawline =~ /^\s+\S/ || > $rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i)) { > $in_header_lines = 0; > $in_commit_log = 1; > $has_commit_log = 1; > } > > to something like > > if ($in_header_lines && $realfile =~ /^$/ && > !($rawline =~ /^ (?:\s*\S|$)/ || > $rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i)) { > So this seems to fix the specific issue we were tripping over, but in doing so causes some other problems on some of the other headers in the message we are using for testing (3 new warnings, and an error), specifically flagging a: - Warning on 'Received:' header - Warning on 'DKIM-Signature:' header (twice, for both leading and trailing white space on To: and From:) - Erroring on the DKIM-Signature as well Noting the DKIM-Signature, in this case, is also a multi-line header message This resolves the issue we were seeing, and doesn't (at least in my test cases), cause any new errors: if ($in_header_lines && $realfile =~ /^$/ && !( ( $rawline =~ /^\s+\S*/ && $rawline !~ /^[\r\n]+$/ ) || $rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i) ) { as the line that's causing issues, /^ [\r\n]+$/, wouldn't get incorrectly caught as the end of the headers. - John 'Warthog9' Hawley
Re: [PATCH] checkpatch: Flag spam header (X-Spam-Report) to prevent spurious warnings
On Wed, 2017-03-22 at 23:01 -0700, Darren Hart wrote: > I still haven't figured out why we test for this specific set of patterns. Why > is a line that starts with a space and ends with a newline considered still > in_header_lines. Or more specifically, why aren't we just testing for an empty > line (RFC 5322 Section 2.1, defining the separation of headers and the body). Because of the From: and commit: lines sometimes used to describe patches sent on behalf of others.
Re: [PATCH] checkpatch: Flag spam header (X-Spam-Report) to prevent spurious warnings
On Wed, 2017-03-22 at 23:01 -0700, Darren Hart wrote: > I still haven't figured out why we test for this specific set of patterns. Why > is a line that starts with a space and ends with a newline considered still > in_header_lines. Or more specifically, why aren't we just testing for an empty > line (RFC 5322 Section 2.1, defining the separation of headers and the body). Because of the From: and commit: lines sometimes used to describe patches sent on behalf of others.
Re: [PATCH] checkpatch: Flag spam header (X-Spam-Report) to prevent spurious warnings
On Wed, Mar 22, 2017 at 11:17:33AM -0700, Joe Perches wrote: > On Wed, 2017-03-22 at 08:25 -0700, Darren Hart wrote: > > On Tue, Mar 21, 2017 at 11:31:08AM -0700, Joe Perches wrote: > > > On Tue, 2017-03-21 at 09:30 -0700, John 'Warthog9' Hawley (VMware) wrote: > > > > Spamassassin sticks a long (~79 character) long string after a > > > > line that has a single space in it. The line with space causes > > > > checkpatch to erroniously think that it's in the content body, as > > > > opposed to headers and thus flag a mail header as an unwrapped long > > > > comment line. > > > > > > If the spammassassin header is like > > > > > > email-header-n: foo > > > email-header-m: bar > > > > > > X-Spam-Report: bar > > > > The specific content of the X-Spam-Report that triggers this for me, > > from this patch for example, is: > > > > === 8< === > > X-Spam-Report: SpamAssassin version 3.4.1 on casper.infradead.org summary: > > Content analysis details: (-1.9 points, 5.0 required) > > > > pts rule name description > > -- > > -- > > -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% > > [score: 0.] > > X-TUID: alGBIuPZmqOj > > > > === >8 === > > > > The long ... line is over 75 characters and triggers the test > > for long commit_log lines. > > > > > > > > Does that form follow rfc 5322? > > > > By my reading, this is governed by the long header fields defined by > > 2.2.3, with whitespace folding defined as "a CRLF may be inserted before > > any WSP." > > > > > > > > If it does then any email header could have that > > > form and the header wrapping test should be > > > > Yes, agreed. > > > > So the logic we want is: > > > > If we are in headers and we detect a CRLF and the next line starts with a > > WSP, > > then we are still in headers (and therefor not in the commit log). The CRLF > > information does not appear to be available as it is replaced with just \n. > > > > > updated from > > > > > > if ($in_header_lines && $realfile =~ /^$/ && > > > !($rawline =~ /^\s+\S/ || > > > $rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i)) { > > > $in_header_lines = 0; > > > $in_commit_log = 1; > > > $has_commit_log = 1; > > > } > > > > > > to something like > > > > > > if ($in_header_lines && $realfile =~ /^$/ && > > > !($rawline =~ /^ (?:\s*\S|$)/ || > > > > Hrm... lines that start with maybe a space followed by a : ... Why did you > > introduce that part of the check? > > The regex doesn't care about colons. > It's a perl non-capturing group. > https://perldoc.perl.org/perlretut.html#Non-capturing-groupings Aha, thank you for the pointer. > > > Looking at this more closely, I was also not clear why the original test > > looked > > for several spaces followed by non-space. What case is this for? > > Not several spaces, one or more spaces then a non-space. > The only change here is allowing an initial space followed > by either: > > 1: optional spaces, then non-space. > 2: EOL > > I supposed you could argue that case 2 should > also allow optional spaces before EOL and the > test should be > > if ($in_header_lines && $realfile =~ /^$/ && > !($rawline =~ /^\s+(?:\S|$)/ || I still haven't figured out why we test for this specific set of patterns. Why is a line that starts with a space and ends with a newline considered still in_header_lines. Or more specifically, why aren't we just testing for an empty line (RFC 5322 Section 2.1, defining the separation of headers and the body). -- Darren Hart VMware Open Source Technology Center
Re: [PATCH] checkpatch: Flag spam header (X-Spam-Report) to prevent spurious warnings
On Wed, Mar 22, 2017 at 11:17:33AM -0700, Joe Perches wrote: > On Wed, 2017-03-22 at 08:25 -0700, Darren Hart wrote: > > On Tue, Mar 21, 2017 at 11:31:08AM -0700, Joe Perches wrote: > > > On Tue, 2017-03-21 at 09:30 -0700, John 'Warthog9' Hawley (VMware) wrote: > > > > Spamassassin sticks a long (~79 character) long string after a > > > > line that has a single space in it. The line with space causes > > > > checkpatch to erroniously think that it's in the content body, as > > > > opposed to headers and thus flag a mail header as an unwrapped long > > > > comment line. > > > > > > If the spammassassin header is like > > > > > > email-header-n: foo > > > email-header-m: bar > > > > > > X-Spam-Report: bar > > > > The specific content of the X-Spam-Report that triggers this for me, > > from this patch for example, is: > > > > === 8< === > > X-Spam-Report: SpamAssassin version 3.4.1 on casper.infradead.org summary: > > Content analysis details: (-1.9 points, 5.0 required) > > > > pts rule name description > > -- > > -- > > -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% > > [score: 0.] > > X-TUID: alGBIuPZmqOj > > > > === >8 === > > > > The long ... line is over 75 characters and triggers the test > > for long commit_log lines. > > > > > > > > Does that form follow rfc 5322? > > > > By my reading, this is governed by the long header fields defined by > > 2.2.3, with whitespace folding defined as "a CRLF may be inserted before > > any WSP." > > > > > > > > If it does then any email header could have that > > > form and the header wrapping test should be > > > > Yes, agreed. > > > > So the logic we want is: > > > > If we are in headers and we detect a CRLF and the next line starts with a > > WSP, > > then we are still in headers (and therefor not in the commit log). The CRLF > > information does not appear to be available as it is replaced with just \n. > > > > > updated from > > > > > > if ($in_header_lines && $realfile =~ /^$/ && > > > !($rawline =~ /^\s+\S/ || > > > $rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i)) { > > > $in_header_lines = 0; > > > $in_commit_log = 1; > > > $has_commit_log = 1; > > > } > > > > > > to something like > > > > > > if ($in_header_lines && $realfile =~ /^$/ && > > > !($rawline =~ /^ (?:\s*\S|$)/ || > > > > Hrm... lines that start with maybe a space followed by a : ... Why did you > > introduce that part of the check? > > The regex doesn't care about colons. > It's a perl non-capturing group. > https://perldoc.perl.org/perlretut.html#Non-capturing-groupings Aha, thank you for the pointer. > > > Looking at this more closely, I was also not clear why the original test > > looked > > for several spaces followed by non-space. What case is this for? > > Not several spaces, one or more spaces then a non-space. > The only change here is allowing an initial space followed > by either: > > 1: optional spaces, then non-space. > 2: EOL > > I supposed you could argue that case 2 should > also allow optional spaces before EOL and the > test should be > > if ($in_header_lines && $realfile =~ /^$/ && > !($rawline =~ /^\s+(?:\S|$)/ || I still haven't figured out why we test for this specific set of patterns. Why is a line that starts with a space and ends with a newline considered still in_header_lines. Or more specifically, why aren't we just testing for an empty line (RFC 5322 Section 2.1, defining the separation of headers and the body). -- Darren Hart VMware Open Source Technology Center
Re: [PATCH] checkpatch: Flag spam header (X-Spam-Report) to prevent spurious warnings
On Wed, 2017-03-22 at 08:25 -0700, Darren Hart wrote: > On Tue, Mar 21, 2017 at 11:31:08AM -0700, Joe Perches wrote: > > On Tue, 2017-03-21 at 09:30 -0700, John 'Warthog9' Hawley (VMware) wrote: > > > Spamassassin sticks a long (~79 character) long string after a > > > line that has a single space in it. The line with space causes > > > checkpatch to erroniously think that it's in the content body, as > > > opposed to headers and thus flag a mail header as an unwrapped long > > > comment line. > > > > If the spammassassin header is like > > > > email-header-n: foo > > email-header-m: bar > > > > X-Spam-Report: bar > > The specific content of the X-Spam-Report that triggers this for me, > from this patch for example, is: > > === 8< === > X-Spam-Report: SpamAssassin version 3.4.1 on casper.infradead.org summary: > Content analysis details: (-1.9 points, 5.0 required) > > pts rule name description > -- > -- > -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% > [score: 0.] > X-TUID: alGBIuPZmqOj > > === >8 === > > The long ... line is over 75 characters and triggers the test > for long commit_log lines. > > > > > Does that form follow rfc 5322? > > By my reading, this is governed by the long header fields defined by > 2.2.3, with whitespace folding defined as "a CRLF may be inserted before > any WSP." > > > > > If it does then any email header could have that > > form and the header wrapping test should be > > Yes, agreed. > > So the logic we want is: > > If we are in headers and we detect a CRLF and the next line starts with a WSP, > then we are still in headers (and therefor not in the commit log). The CRLF > information does not appear to be available as it is replaced with just \n. > > > updated from > > > > if ($in_header_lines && $realfile =~ /^$/ && > > !($rawline =~ /^\s+\S/ || > > $rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i)) { > > $in_header_lines = 0; > > $in_commit_log = 1; > > $has_commit_log = 1; > > } > > > > to something like > > > > if ($in_header_lines && $realfile =~ /^$/ && > > !($rawline =~ /^ (?:\s*\S|$)/ || > > Hrm... lines that start with maybe a space followed by a : ... Why did you > introduce that part of the check? The regex doesn't care about colons. It's a perl non-capturing group. https://perldoc.perl.org/perlretut.html#Non-capturing-groupings > Looking at this more closely, I was also not clear why the original test > looked > for several spaces followed by non-space. What case is this for? Not several spaces, one or more spaces then a non-space. The only change here is allowing an initial space followed by either: 1: optional spaces, then non-space. 2: EOL I supposed you could argue that case 2 should also allow optional spaces before EOL and the test should be if ($in_header_lines && $realfile =~ /^$/ && !($rawline =~ /^\s+(?:\S|$)/ || ...
Re: [PATCH] checkpatch: Flag spam header (X-Spam-Report) to prevent spurious warnings
On Wed, 2017-03-22 at 08:25 -0700, Darren Hart wrote: > On Tue, Mar 21, 2017 at 11:31:08AM -0700, Joe Perches wrote: > > On Tue, 2017-03-21 at 09:30 -0700, John 'Warthog9' Hawley (VMware) wrote: > > > Spamassassin sticks a long (~79 character) long string after a > > > line that has a single space in it. The line with space causes > > > checkpatch to erroniously think that it's in the content body, as > > > opposed to headers and thus flag a mail header as an unwrapped long > > > comment line. > > > > If the spammassassin header is like > > > > email-header-n: foo > > email-header-m: bar > > > > X-Spam-Report: bar > > The specific content of the X-Spam-Report that triggers this for me, > from this patch for example, is: > > === 8< === > X-Spam-Report: SpamAssassin version 3.4.1 on casper.infradead.org summary: > Content analysis details: (-1.9 points, 5.0 required) > > pts rule name description > -- > -- > -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% > [score: 0.] > X-TUID: alGBIuPZmqOj > > === >8 === > > The long ... line is over 75 characters and triggers the test > for long commit_log lines. > > > > > Does that form follow rfc 5322? > > By my reading, this is governed by the long header fields defined by > 2.2.3, with whitespace folding defined as "a CRLF may be inserted before > any WSP." > > > > > If it does then any email header could have that > > form and the header wrapping test should be > > Yes, agreed. > > So the logic we want is: > > If we are in headers and we detect a CRLF and the next line starts with a WSP, > then we are still in headers (and therefor not in the commit log). The CRLF > information does not appear to be available as it is replaced with just \n. > > > updated from > > > > if ($in_header_lines && $realfile =~ /^$/ && > > !($rawline =~ /^\s+\S/ || > > $rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i)) { > > $in_header_lines = 0; > > $in_commit_log = 1; > > $has_commit_log = 1; > > } > > > > to something like > > > > if ($in_header_lines && $realfile =~ /^$/ && > > !($rawline =~ /^ (?:\s*\S|$)/ || > > Hrm... lines that start with maybe a space followed by a : ... Why did you > introduce that part of the check? The regex doesn't care about colons. It's a perl non-capturing group. https://perldoc.perl.org/perlretut.html#Non-capturing-groupings > Looking at this more closely, I was also not clear why the original test > looked > for several spaces followed by non-space. What case is this for? Not several spaces, one or more spaces then a non-space. The only change here is allowing an initial space followed by either: 1: optional spaces, then non-space. 2: EOL I supposed you could argue that case 2 should also allow optional spaces before EOL and the test should be if ($in_header_lines && $realfile =~ /^$/ && !($rawline =~ /^\s+(?:\S|$)/ || ...
Re: [PATCH] checkpatch: Flag spam header (X-Spam-Report) to prevent spurious warnings
On Tue, Mar 21, 2017 at 11:31:08AM -0700, Joe Perches wrote: > On Tue, 2017-03-21 at 09:30 -0700, John 'Warthog9' Hawley (VMware) wrote: > > Spamassassin sticks a long (~79 character) long string after a > > line that has a single space in it. The line with space causes > > checkpatch to erroniously think that it's in the content body, as > > opposed to headers and thus flag a mail header as an unwrapped long > > comment line. > > If the spammassassin header is like > > email-header-n: foo > email-header-m: bar > > X-Spam-Report: bar The specific content of the X-Spam-Report that triggers this for me, from this patch for example, is: === 8< === X-Spam-Report: SpamAssassin version 3.4.1 on casper.infradead.org summary: Content analysis details: (-1.9 points, 5.0 required) pts rule name description -- -- -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% [score: 0.] X-TUID: alGBIuPZmqOj === >8 === The long ... line is over 75 characters and triggers the test for long commit_log lines. > > Does that form follow rfc 5322? By my reading, this is governed by the long header fields defined by 2.2.3, with whitespace folding defined as "a CRLF may be inserted before any WSP." > > If it does then any email header could have that > form and the header wrapping test should be Yes, agreed. So the logic we want is: If we are in headers and we detect a CRLF and the next line starts with a WSP, then we are still in headers (and therefor not in the commit log). The CRLF information does not appear to be available as it is replaced with just \n. > updated from > > if ($in_header_lines && $realfile =~ /^$/ && > !($rawline =~ /^\s+\S/ || > $rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i)) { > $in_header_lines = 0; > $in_commit_log = 1; > $has_commit_log = 1; > } > > to something like > > if ($in_header_lines && $realfile =~ /^$/ && > !($rawline =~ /^ (?:\s*\S|$)/ || Hrm... lines that start with maybe a space followed by a : ... Why did you introduce that part of the check? Looking at this more closely, I was also not clear why the original test looked for several spaces followed by non-space. What case is this for? > $rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i)) { > > Thanks, -- Darren Hart VMware Open Source Technology Center
Re: [PATCH] checkpatch: Flag spam header (X-Spam-Report) to prevent spurious warnings
On Tue, Mar 21, 2017 at 11:31:08AM -0700, Joe Perches wrote: > On Tue, 2017-03-21 at 09:30 -0700, John 'Warthog9' Hawley (VMware) wrote: > > Spamassassin sticks a long (~79 character) long string after a > > line that has a single space in it. The line with space causes > > checkpatch to erroniously think that it's in the content body, as > > opposed to headers and thus flag a mail header as an unwrapped long > > comment line. > > If the spammassassin header is like > > email-header-n: foo > email-header-m: bar > > X-Spam-Report: bar The specific content of the X-Spam-Report that triggers this for me, from this patch for example, is: === 8< === X-Spam-Report: SpamAssassin version 3.4.1 on casper.infradead.org summary: Content analysis details: (-1.9 points, 5.0 required) pts rule name description -- -- -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% [score: 0.] X-TUID: alGBIuPZmqOj === >8 === The long ... line is over 75 characters and triggers the test for long commit_log lines. > > Does that form follow rfc 5322? By my reading, this is governed by the long header fields defined by 2.2.3, with whitespace folding defined as "a CRLF may be inserted before any WSP." > > If it does then any email header could have that > form and the header wrapping test should be Yes, agreed. So the logic we want is: If we are in headers and we detect a CRLF and the next line starts with a WSP, then we are still in headers (and therefor not in the commit log). The CRLF information does not appear to be available as it is replaced with just \n. > updated from > > if ($in_header_lines && $realfile =~ /^$/ && > !($rawline =~ /^\s+\S/ || > $rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i)) { > $in_header_lines = 0; > $in_commit_log = 1; > $has_commit_log = 1; > } > > to something like > > if ($in_header_lines && $realfile =~ /^$/ && > !($rawline =~ /^ (?:\s*\S|$)/ || Hrm... lines that start with maybe a space followed by a : ... Why did you introduce that part of the check? Looking at this more closely, I was also not clear why the original test looked for several spaces followed by non-space. What case is this for? > $rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i)) { > > Thanks, -- Darren Hart VMware Open Source Technology Center
Re: [PATCH] checkpatch: Flag spam header (X-Spam-Report) to prevent spurious warnings
On Tue, 2017-03-21 at 09:30 -0700, John 'Warthog9' Hawley (VMware) wrote: > Spamassassin sticks a long (~79 character) long string after a > line that has a single space in it. The line with space causes > checkpatch to erroniously think that it's in the content body, as > opposed to headers and thus flag a mail header as an unwrapped long > comment line. If the spammassassin header is like email-header-n: foo email-header-m: bar X-Spam-Report: bar Does that form follow rfc 5322? If it does then any email header could have that form and the header wrapping test should be updated from if ($in_header_lines && $realfile =~ /^$/ && !($rawline =~ /^\s+\S/ || $rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i)) { $in_header_lines = 0; $in_commit_log = 1; $has_commit_log = 1; } to something like if ($in_header_lines && $realfile =~ /^$/ && !($rawline =~ /^ (?:\s*\S|$)/ || $rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i)) {
Re: [PATCH] checkpatch: Flag spam header (X-Spam-Report) to prevent spurious warnings
On Tue, 2017-03-21 at 09:30 -0700, John 'Warthog9' Hawley (VMware) wrote: > Spamassassin sticks a long (~79 character) long string after a > line that has a single space in it. The line with space causes > checkpatch to erroniously think that it's in the content body, as > opposed to headers and thus flag a mail header as an unwrapped long > comment line. If the spammassassin header is like email-header-n: foo email-header-m: bar X-Spam-Report: bar Does that form follow rfc 5322? If it does then any email header could have that form and the header wrapping test should be updated from if ($in_header_lines && $realfile =~ /^$/ && !($rawline =~ /^\s+\S/ || $rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i)) { $in_header_lines = 0; $in_commit_log = 1; $has_commit_log = 1; } to something like if ($in_header_lines && $realfile =~ /^$/ && !($rawline =~ /^ (?:\s*\S|$)/ || $rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i)) {
[PATCH] checkpatch: Flag spam header (X-Spam-Report) to prevent spurious warnings
Spamassassin sticks a long (~79 character) long string after a line that has a single space in it. The line with space causes checkpatch to erroniously think that it's in the content body, as opposed to headers and thus flag a mail header as an unwrapped long comment line. This flags when X-Spam-Report is found, till an e-mail body indicator (blank line, /^[/n/r]*/) is found, blocking setting $in_commit_log. When found, unsets itself allowing the rest of the detection to function normally. Reported-by: Darren Hart (VMware)Signed-off-by: John 'Warthog9' Hawley (VMware) --- scripts/checkpatch.pl | 15 ++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl index baa3c7b..d2ce89a 100755 --- a/scripts/checkpatch.pl +++ b/scripts/checkpatch.pl @@ -2134,6 +2134,7 @@ sub process { my $signoff = 0; my $is_patch = 0; my $in_header_lines = $file ? 0 : 1; + my $found_spam_header = 0; my $in_commit_log = 0; #Scanning lines before patch my $has_commit_log = 0; #Encountered lines before patch my $commit_log_possible_stack_dump = 0; @@ -2279,6 +2280,12 @@ sub process { my $rawline = $rawlines[$linenr - 1]; +#if we encounter a spamassassin mail header, mark it + if ($in_header_lines == 1 && $line =~ /^X-Spam-Report:/) { + #mail header found, this needs to be flagged + $found_spam_header = 1; + } + #extract the line range in the file after the patch is applied if (!$in_commit_log && $line =~ /^\@\@ -\d+(?:,\d+)? \+(\d+)(,(\d+))? \@\@/) { @@ -2627,9 +2634,15 @@ sub process { # Check if it's the start of a commit log # (not a header line and we haven't seen the patch filename) + if ($in_header_lines && $found_spam_header && $line =~ /^[\n\r]*$/) { + # we are now past the header info that could be confusing + $found_spam_header = 0; + } + if ($in_header_lines && $realfile =~ /^$/ && !($rawline =~ /^\s+\S/ || - $rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i)) { + $rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i) && + !$found_spam_header) { $in_header_lines = 0; $in_commit_log = 1; $has_commit_log = 1; -- 2.5.5
[PATCH] checkpatch: Flag spam header (X-Spam-Report) to prevent spurious warnings
Spamassassin sticks a long (~79 character) long string after a line that has a single space in it. The line with space causes checkpatch to erroniously think that it's in the content body, as opposed to headers and thus flag a mail header as an unwrapped long comment line. This flags when X-Spam-Report is found, till an e-mail body indicator (blank line, /^[/n/r]*/) is found, blocking setting $in_commit_log. When found, unsets itself allowing the rest of the detection to function normally. Reported-by: Darren Hart (VMware) Signed-off-by: John 'Warthog9' Hawley (VMware) --- scripts/checkpatch.pl | 15 ++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl index baa3c7b..d2ce89a 100755 --- a/scripts/checkpatch.pl +++ b/scripts/checkpatch.pl @@ -2134,6 +2134,7 @@ sub process { my $signoff = 0; my $is_patch = 0; my $in_header_lines = $file ? 0 : 1; + my $found_spam_header = 0; my $in_commit_log = 0; #Scanning lines before patch my $has_commit_log = 0; #Encountered lines before patch my $commit_log_possible_stack_dump = 0; @@ -2279,6 +2280,12 @@ sub process { my $rawline = $rawlines[$linenr - 1]; +#if we encounter a spamassassin mail header, mark it + if ($in_header_lines == 1 && $line =~ /^X-Spam-Report:/) { + #mail header found, this needs to be flagged + $found_spam_header = 1; + } + #extract the line range in the file after the patch is applied if (!$in_commit_log && $line =~ /^\@\@ -\d+(?:,\d+)? \+(\d+)(,(\d+))? \@\@/) { @@ -2627,9 +2634,15 @@ sub process { # Check if it's the start of a commit log # (not a header line and we haven't seen the patch filename) + if ($in_header_lines && $found_spam_header && $line =~ /^[\n\r]*$/) { + # we are now past the header info that could be confusing + $found_spam_header = 0; + } + if ($in_header_lines && $realfile =~ /^$/ && !($rawline =~ /^\s+\S/ || - $rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i)) { + $rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i) && + !$found_spam_header) { $in_header_lines = 0; $in_commit_log = 1; $has_commit_log = 1; -- 2.5.5