zhengruifeng opened a new pull request, #56400:
URL: https://github.com/apache/spark/pull/56400

   ### What changes were proposed in this pull request?
   
   This PR adds a component-reconciliation step to `dev/merge_spark_pr.py`. 
When a PR is merged, the script already normalizes the component tags in the PR 
title (e.g. `[SQL]`, `[CORE]`). This change compares the *primary* components 
implied by the title against the primary components on the linked JIRA ticket 
and, on a mismatch, prompts the committer to:
   
   - **overwrite** the JIRA ticket's primary components with the PR title's,
   - **append** the PR title's primary components to the ticket, or
   - **keep** the JIRA ticket unchanged (the default).
   
   Only primary components participate in the comparison. Non-primary title 
tags (e.g. `[TEST]`) and non-primary JIRA components (e.g. `Optimizer`) are 
ignored when deciding whether the sets differ, and are preserved by both update 
paths. The JIRA issue summary printed during a merge now also lists the 
ticket's components.
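
The comparison and the three choices described above can be sketched roughly as follows. This is an illustration only, not the code in `dev/merge_spark_pr.py`: the `PRIMARY` set, the function names, and the component names used here are hypothetical, and the ordering of the resulting list is an assumption.

```python
from typing import Iterable, List, Set

# Hypothetical set of "primary" JIRA component names. The real registry
# lives in dev/merge_spark_pr.py and is not reproduced here.
PRIMARY: Set[str] = {"SQL", "Spark Core", "PySpark"}


def primary_only(components: Iterable[str]) -> Set[str]:
    """Keep only the components that are considered primary."""
    return {c for c in components if c in PRIMARY}


def needs_reconciliation(title: Iterable[str], jira: Iterable[str]) -> bool:
    """True when the primary components of title and ticket differ."""
    return primary_only(title) != primary_only(jira)


def reconcile(title: Iterable[str], jira: Iterable[str], choice: str = "keep") -> List[str]:
    """Return the ticket's new component list for the chosen action."""
    title_primary = primary_only(title)
    current = list(jira)
    if choice == "overwrite":
        # Replace primary components, preserve non-primary ones.
        non_primary = [c for c in current if c not in PRIMARY]
        return sorted(title_primary) + non_primary
    if choice == "append":
        # Add the title's primary components that the ticket lacks.
        return current + sorted(title_primary - set(current))
    if choice == "keep":
        # Default: leave the ticket untouched.
        return current
    raise ValueError(f"unknown choice: {choice!r}")
```

For instance, under these assumptions a title implying `SQL` against a ticket tagged `Spark Core` and `Optimizer` would be flagged as a mismatch, and `overwrite` would yield `SQL` plus the preserved `Optimizer`.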
   
   Supporting helpers were added: `Component.find_by_jira_name` (reverse lookup 
from a JIRA component name to its registry entry) and a `primary_only` flag on 
`jira_components_from_title_tags`.
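
A minimal sketch of how these two helpers might look. Only the names `Component.find_by_jira_name`, `jira_components_from_title_tags`, and `primary_only` come from this PR; the fields, the `REGISTRY` list, its entries, and the matching rules are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Component:
    tag: str        # title tag without brackets, e.g. "SQL" (assumed field)
    jira_name: str  # JIRA component name (assumed field)
    primary: bool   # whether the component is primary (assumed field)

    @classmethod
    def find_by_jira_name(cls, name: str) -> Optional["Component"]:
        """Reverse lookup from a JIRA component name to its registry entry."""
        for component in REGISTRY:
            if component.jira_name == name:
                return component
        return None


# Hypothetical registry entries, for illustration only.
REGISTRY: List[Component] = [
    Component("SQL", "SQL", True),
    Component("CORE", "Spark Core", True),
    Component("TEST", "Tests", False),
]


def jira_components_from_title_tags(
    tags: Iterable[str], primary_only: bool = False
) -> List[str]:
    """Map title tags to JIRA component names, optionally primary ones only."""
    by_tag = {c.tag: c for c in REGISTRY}
    names: List[str] = []
    for tag in tags:
        component = by_tag.get(tag.strip("[]").upper())
        if component is None:
            continue  # unknown tag: skipped in this sketch
        if primary_only and not component.primary:
            continue
        if component.jira_name not in names:
            names.append(component.jira_name)
    return names
```

In this sketch, passing `primary_only=True` drops `[TEST]` from the result, which is what lets the reconciliation step compare primary components only.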
   
   ### Why are the changes needed?
   
   The PR title and the JIRA ticket can drift out of sync on which components a 
change touches. Today the merge tool resolves the ticket without comparing its components to the title's, so a 
committer has to notice and fix component mismatches by hand. Surfacing the 
difference at merge time, with a safe default of leaving JIRA untouched, makes 
it easy to keep the two consistent without forcing any change.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No. `dev/merge_spark_pr.py` is a committer-only tool.
   
   ### How was this patch tested?
   
   The script runs its doctests on startup via `doctest.testmod()`. New 
doctests were added for `jira_components_from_title_tags` (including the 
`primary_only` path) and `Component.find_by_jira_name`; the full suite passes 
(68 examples). Formatting was verified with `black 26.3.1` (the repo's pinned 
version) against the root `pyproject.toml`.
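
For readers unfamiliar with the pattern, a doctest is an example embedded in a docstring that is executed and compared against its recorded output. The sketch below uses a hypothetical helper, `strip_brackets`, and runs its examples with `doctest.DocTestFinder` and `doctest.DocTestRunner` so that it is self-contained; the script itself is described above as calling `doctest.testmod()`, which does the same for every docstring in the module.

```python
import doctest


def strip_brackets(tag: str) -> str:
    """Normalize a title tag (hypothetical helper, for illustration).

    >>> strip_brackets("[SQL]")
    'SQL'
    >>> strip_brackets(" [core] ")
    'CORE'
    """
    return tag.strip().strip("[]").upper()


def run_examples(func) -> doctest.TestResults:
    """Run the doctest examples found in a single function's docstring."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # module=False tells the finder not to look up the defining module.
    tests = finder.find(
        func, name=func.__name__, module=False, globs={func.__name__: func}
    )
    for test in tests:
        runner.run(test)
    return runner.summarize(verbose=False)
```

`run_examples(strip_brackets)` returns a `TestResults` tuple whose `failed` and `attempted` fields report how many examples failed and how many ran.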
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   Generated-by: Claude Code (Opus 4.8)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

