On Fri, 12 Jun 2026 15:08:07 -0400
Matthew Gee <[email protected]> wrote:

> Previous review-patch.py would detect and report and error or warning
> based off of the occurrence of the headers of the error and warning
> sections. This led to consistent false positives as often AI reviewers
> will create the header but put "none" or similar filler text within
> the following body.
> 
> This patch updates the code in order to check if the AI review has a
> body with error or warnings to fix and not just filler text. This is
> done by querying multiple lines at once and adjusting the regex to
> filter out headers followed only by filler text or formatting in the
> review.
> 
> These changes were tested against 10+ AI review outputs with several
> variations in formatting and filler text in order to catch a good
> variety of cases to make sure code reviews with actual errors or
> warnings are caught and not missed.
> 
> Signed-off-by: Matthew Gee <[email protected]>
> ---

AI saw some stuff which I missed.

Matthew,

The windowing (tee/islice/chain) is aligned correctly, but the new
matching has a few problems.

The bigger issue is that this only recognizes markdown '#' headers now.
The old scan matched an optional '#' prefix, '**' bold, and '<h1-3>' too.
rgx_should_match requires "#+\s", so plain-text "Errors:" / "**Errors**"
and HTML "<h2>Errors</h2>" no longer match. Default --format is text and
html is supported, so reviews in those formats classify clean (exit 0)
even when they report errors, which silently breaks the 2/3 exit codes
compare-patch-reviews.sh relies on. The '#'/'**'/'<h>' alternatives need
to stay.

The 3-line concatenation also reintroduces the false positive you're
trying to kill. curr+next+next_next are joined with no separator, so

        ## Errors
        None
        ## Warnings

collapses to "## errorsnone## warnings"; the 'none...$' filler anchor
fails because the next header follows, and has_errors gets set. Any
output that doesn't blank-line-separate sections hits this.

And in the other direction, content more than two lines below a header
is missed: two blank lines between "## Errors" and the body leave
stripped == "## errors", the '$' branch matches, and a real finding is
suppressed. The heuristic ends up sensitive to exact line spacing both
ways.

Minor: iter[str] / iter[str | None] aren't valid generics (use the
already-imported Iterator); the vars are also re-annotated with
conflicting types and annotate the iterators rather than the loop
targets. mypy will reject these. Run black too - the new for/zip line
and the stripped assignment are over width.

Suggest keeping header matching in all three formats, then scanning past
blank lines to the first non-empty body line and suppressing only if
it's filler (none/n/a/empty). That decouples detection from spacing
instead of fixing it to a 3-line shape.

Reply via email to