andreahlert opened a new pull request, #65: URL: https://github.com/apache/airflow-steward/pull/65
## Summary

`skill-validator`'s `slugify()` was using `re.sub(r"[\s]+", "-", text)`, which collapses runs of whitespace into a single dash. GitHub's anchor renderer (and doctoc, which generates our TOCs) replaces each whitespace character one-for-one, so a heading whose text contains an em-dash, like `## Mode B — conversational mentoring`, has `mode-b--conversational-mentoring` as its real anchor (em-dash strips to `""`, leaving two adjacent spaces, each becoming a dash). The validator was producing `mode-b-conversational-mentoring` (single dash) and reporting `anchor 'X' not found` against the doctoc-generated TOC anchor it had just slugified differently.

Running `skill-validate` locally on `main` cuts from 191 violations to 153 with this single one-character fix, all of the dropped violations being false positives of this exact shape.

## Change

- `tools/skill-validator/src/skill_validator/__init__.py`: drop the `+` quantifier from `ANCHOR_SPACE_PATTERN` so each whitespace becomes its own dash.
- `tools/skill-validator/tests/test_validator.py`: update the existing `test_multiple_spaces` expectation to match the actual GitHub algorithm (it had been pinning the bug), and add `test_em_dash_in_heading` for the canonical case.

## Why now

This unblocks wiring `skill-validator` into prek + CI in a follow-up. With the false-positive noise removed, the remaining violations are real (broken links, missing-frontmatter keys, anchor renames the skills did not follow) and worth gating on.

## What this does not fix

- The 13 skills missing `license: Apache-2.0` frontmatter (see #pending sibling PR).
- The 13 broken links to deleted `code-review.instructions.md`.
- The 4 unsubstituted `<...>` placeholders in `pr-management-triage/comment-templates.md` (need a decision on adopter-config convention vs. literal substitution).
- Wiring the validator itself into prek + CI.

Each is a separate, scoped follow-up.
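For reviewers who want to see the difference concretely, here is a minimal sketch of the old and new whitespace handling. It is not the validator's actual code: `slugify_sketch`, `STRIP_PATTERN`, `SPACE_COLLAPSING`, and `SPACE_ONE_FOR_ONE` are hypothetical names, and the punctuation-stripping regex is an assumption about how the em-dash gets removed. Only the `[\s]+` pattern and the expected anchors come from the description above.

```python
import re

# Hypothetical stand-ins for illustration; the real patterns live in
# tools/skill-validator/src/skill_validator/__init__.py and may differ.
STRIP_PATTERN = re.compile(r"[^\w\s-]")   # assumed: drops punctuation such as the em-dash
SPACE_COLLAPSING = re.compile(r"[\s]+")   # old behaviour: a run of whitespace becomes one dash
SPACE_ONE_FOR_ONE = re.compile(r"[\s]")   # new behaviour: each whitespace becomes its own dash


def slugify_sketch(heading: str, space_pattern: re.Pattern) -> str:
    """Lowercase the heading, strip punctuation, then turn whitespace into dashes."""
    text = heading.strip().lower()
    text = STRIP_PATTERN.sub("", text)
    return space_pattern.sub("-", text)


# The heading from the PR description, with the em-dash written as an escape.
heading = "Mode B \u2014 conversational mentoring"

old_anchor = slugify_sketch(heading, SPACE_COLLAPSING)
new_anchor = slugify_sketch(heading, SPACE_ONE_FOR_ONE)

print(old_anchor)  # mode-b-conversational-mentoring
print(new_anchor)  # mode-b--conversational-mentoring
```

Removing the em-dash leaves two adjacent spaces, so the only difference between the two anchors is whether those spaces are collapsed into one dash or mapped one-for-one.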
