andygrove opened a new pull request, #4571: URL: https://github.com/apache/datafusion-comet/pull/4571
## Which issue does this PR close?

No dedicated issue. This adds tooling that supports keeping the expression support page accurate, complementing the release-prep step that calls for verifying that page.

## Rationale for this change

`docs/source/user-guide/latest/expressions.md` is the source of truth for which Spark expressions Comet supports and at what status. It is hand-maintained, so it drifts: a newly registered expression can be missing, a status can be stale after a serde change, or a row can linger after a serde is removed. There was no repeatable procedure for checking the whole page against the registered serdes. The existing `audit-comet-expression` skill audits one expression deeply, but nothing swept the page for coverage and status accuracy.

## What changes are included in this PR?

A new project skill at `.claude/skills/audit-expression-page/SKILL.md`. It guides a whole-page audit of `expressions.md` along three dimensions and offers to fix the page:

- **Missing coverage:** every expression registered in `QueryPlanSerde` (the per-category maps that build `exprSerdeMap`, plus `aggrSerdeMap`) appears on the page, resolving serde classes to SQL names via Spark's function registry. Operator-injected and shim-wired expressions are handled explicitly.
- **Status accuracy:** each row's status matches the runtime behavior, classified from `getSupportLevel`, the `allowIncompatible` default, and the `convert` fallback branches. The skill is explicit that `getIncompatibleReasons` / `getCompatibleNotes` only generate documentation text and do not by themselves drive fallback, so classification is anchored on `getSupportLevel`, and a disagreement between the two is itself reported.
- **Stale entries:** rows marked supported that no longer resolve to a registered serde, or whose status contradicts registration.

The skill reads the status legend from the page at runtime, so it stays correct as the legend evolves.
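
The three checks above amount to a set comparison between the rows on the page and the serdes registered in code. The following Python sketch is illustrative only and is not part of this PR (the skill itself is a Markdown procedure). The function `audit_page`, the `Finding` type, the status labels, and the `expr_*` names are all hypothetical placeholders; it assumes the page rows and the code-derived classification have already been extracted into plain dictionaries keyed by SQL name.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    kind: str    # "missing", "status_mismatch", or "stale"
    name: str    # SQL name of the expression
    detail: str  # human-readable evidence for the report


def audit_page(
    page_rows: dict[str, str],
    registered: dict[str, str],
    supported_statuses: set[str],
) -> list[Finding]:
    """Compare documented rows against registered serdes.

    page_rows:          SQL name -> status shown on the page (hypothetical input)
    registered:         SQL name -> status classified from code (hypothetical input)
    supported_statuses: statuses the page legend treats as "supported"
    """
    findings: list[Finding] = []

    # Missing coverage: registered in code but absent from the page.
    for name in sorted(registered.keys() - page_rows.keys()):
        findings.append(
            Finding("missing", name, f"registered as '{registered[name]}', no row on page")
        )

    # Status accuracy: present in both, but the statuses disagree.
    for name in sorted(registered.keys() & page_rows.keys()):
        if page_rows[name] != registered[name]:
            findings.append(
                Finding(
                    "status_mismatch",
                    name,
                    f"page says '{page_rows[name]}', code implies '{registered[name]}'",
                )
            )

    # Stale entries: marked supported on the page but no longer registered.
    for name in sorted(page_rows.keys() - registered.keys()):
        if page_rows[name] in supported_statuses:
            findings.append(
                Finding("stale", name, f"page says '{page_rows[name]}', no registered serde")
            )

    return findings


if __name__ == "__main__":
    # Placeholder data; real names and statuses would come from the page and the code.
    page = {"expr_a": "supported", "expr_b": "supported", "expr_d": "supported"}
    code = {"expr_a": "supported", "expr_b": "incompatible", "expr_c": "supported"}
    for f in audit_page(page, code, supported_statuses={"supported", "incompatible"}):
        print(f"{f.kind:16} {f.name:8} {f.detail}")
```

Passing `supported_statuses` in as a parameter, rather than hard-coding it, mirrors the idea of reading the legend from the page at runtime.
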
The skill takes an optional `[category]` argument to scope a run to one registry category.

## How are these changes tested?

This adds one Markdown skill file and changes no code. It was verified by dry-running the skill end to end on two categories (`agg_funcs` and `math_funcs`) against the actual repository, confirming that each of the three checks is followable, that the classification anchors on the right methods, and that the skill surfaces real coverage and status discrepancies with code evidence. Gaps found in the first dry run (category-name mapping, the `getIncompatibleReasons` versus `getSupportLevel` distinction, and alias and operator-injected handling) were fixed and re-verified.

`prettier --check` passes on the new file.
