zhengruifeng opened a new pull request, #56400: URL: https://github.com/apache/spark/pull/56400
### What changes were proposed in this pull request?

This PR adds a component-reconciliation step to `dev/merge_spark_pr.py`. When a PR is merged, the script already normalizes the component tags in the PR title (e.g. `[SQL]`, `[CORE]`). This change compares the *primary* components implied by the title against the primary components on the linked JIRA ticket and, on a mismatch, prompts the committer to:

- **overwrite** the JIRA ticket's primary components with the PR title's,
- **append** the PR title's primary components to the ticket, or
- **keep** the JIRA ticket unchanged (the default).

Only primary components participate in the comparison. Non-primary title tags (e.g. `[TEST]`) and non-primary JIRA components (e.g. `Optimizer`) are ignored when deciding whether the sets differ, and are preserved by both update paths.

The JIRA issue summary printed during a merge now also lists the ticket's components.

Supporting helpers were added: `Component.find_by_jira_name` (reverse lookup from a JIRA component name to its registry entry) and a `primary_only` flag on `jira_components_from_title_tags`.

### Why are the changes needed?

The PR title and the JIRA ticket can drift out of sync on which components a change touches. Today the merge tool resolves the ticket without checking, so a committer has to notice and fix component mismatches by hand. Surfacing the difference at merge time, with a safe default of leaving JIRA untouched, makes it easy to keep the two consistent without forcing any change.

### Does this PR introduce _any_ user-facing change?

No. `dev/merge_spark_pr.py` is a committer-only tool.

### How was this patch tested?

The script runs its doctests on startup via `doctest.testmod()`. New doctests were added for `jira_components_from_title_tags` (including the `primary_only` path) and `Component.find_by_jira_name`; the full suite passes (68 examples). Formatting was verified with `black 26.3.1` (the repo's pinned version) against the root `pyproject.toml`.
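
For reviewers, the reconciliation rule described under "What changes were proposed" can be illustrated roughly as follows. This is a minimal sketch, not the code in this PR: `PRIMARY`, `primary_of`, `reconcile_components`, and the `choice` strings are hypothetical names, and the membership of the primary set is an assumption made only for illustration (the real registry lives in `dev/merge_spark_pr.py`).

```python
# Hypothetical set of primary JIRA component names. This is an assumption for
# the sketch; it is NOT the registry used by dev/merge_spark_pr.py.
PRIMARY = {"SQL", "Spark Core", "PySpark"}


def primary_of(components):
    """Return the primary components among `components`, as a set."""
    return {c for c in components if c in PRIMARY}


def reconcile_components(title_components, jira_components, choice="keep"):
    """Return the component list the JIRA ticket should end up with.

    `title_components` are JIRA component names derived from the PR title tags;
    `jira_components` are the names currently on the ticket. Only primary
    components are compared, and non-primary JIRA components are always kept.
    """
    title_primary = primary_of(title_components)
    jira_primary = primary_of(jira_components)

    # No mismatch, or the committer chose the default: leave the ticket as-is.
    if title_primary == jira_primary or choice == "keep":
        return list(jira_components)

    non_primary = [c for c in jira_components if c not in PRIMARY]

    if choice == "overwrite":
        # Replace the ticket's primary components, preserve the rest.
        return sorted(title_primary) + non_primary
    if choice == "append":
        # Add only the title's primary components the ticket lacks.
        return list(jira_components) + sorted(title_primary - jira_primary)
    raise ValueError(f"unknown choice: {choice!r}")
```

In this sketch a title implying `SQL` against a ticket carrying `Spark Core` and `Optimizer` is a mismatch: "overwrite" swaps `Spark Core` for `SQL`, "append" adds `SQL` alongside it, and both paths leave `Optimizer` in place.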
### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Opus 4.8)

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
