This is an automated email from the ASF dual-hosted git repository.

zhengruifeng pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new a7886fc57726 [SPARK-57097][INFRA] Prevent merging Epic/Umbrella PRs in merge script
a7886fc57726 is described below

commit a7886fc577266c17de452871706e3d2be3402214
Author: Ruifeng Zheng <[email protected]>
AuthorDate: Thu May 28 14:48:38 2026 +0800

    [SPARK-57097][INFRA] Prevent merging Epic/Umbrella PRs in merge script
    
    ### What changes were proposed in this pull request?
    
    Add a hard-fail check to `dev/merge_spark_pr.py`: after fetching the JIRA summary for each `SPARK-NNNN` referenced in the PR title, abort the merge if any linked issue is of type `Epic` or `Umbrella`. All offenders are collected in one pass and reported together so the committer can fix every blocker before re-running.
    
    Sample failure message:
    
        Cannot merge PR #56137. Linked JIRA(s) SPARK-1234 (Umbrella), SPARK-5678 (Epic) are Umbrella or Epic tickets and MUST not be resolved by a single PR. File Sub-task(s) under SPARK-1234 (Umbrella), SPARK-5678 (Epic) and update the PR title to reference the Sub-task(s) instead.
    
    ### Why are the changes needed?
    
    When the merge script completes a merge, it transitions the linked JIRA(s) to **Resolved** and stamps a **Fix Version** on them.
    
    PRs frequently end up referencing the wrong ticket - it is common for a contributor to drop the parent Epic/Umbrella ID into the title when they meant to create or reference a Sub-task. When that slips past the committer, the script silently auto-closes the Epic/Umbrella with a Fix Version, leaving the parent ticket in an incorrect **Resolved** state with a Fix Version it should not have. Cleaning that up later (reopening the Epic, removing the stray Fix Version, retransitioning) is m [...]
    
    Recent examples (all Umbrellas auto-resolved by the merge script):
    
    - [SPARK-54137](https://issues.apache.org/jira/browse/SPARK-54137) "Prepare Apache Spark 4.2.0" - the release-prep umbrella. PR #53445 was titled `[SPARK-54137][SQL][CONNECT] Remove redundant observed-metrics responses` at merge time. The script stamped the umbrella **Resolved / Fix Version 4.2.0** on 2025-12-22 - within 13 seconds of the merge commit. A maintainer had to hand-reopen the umbrella and clear the Fix Version on 2026-01-06; the PR was retitled to its proper Sub-task SPARK [...]
    - [SPARK-54119](https://issues.apache.org/jira/browse/SPARK-54119) "Metrics & semantic modeling in Spark" - Umbrella stuck at **Resolved / Fix Version 4.2.0** after 4 linked PRs: #55449, #55487, #55983, #56010.
    - [SPARK-56395](https://issues.apache.org/jira/browse/SPARK-56395) "SPIP: NEAREST BY Top-K Ranking Join" - Umbrella stuck at **Resolved / Fix Version 4.2.0** after 5 linked PRs: #55629, #55681, #55682, #55688, #55873.
    
    Failing the merge before the JIRA transition runs forces a quick title fix (or a Sub-task creation) at merge time, and keeps Epic/Umbrella status accurate.
    
    ### Does this PR introduce _any_ user-facing change?
    
    No. Committer tooling only.
    
    ### How was this patch tested?
    
    - `python3 -m py_compile dev/merge_spark_pr.py` succeeds.
    - Existing doctests still pass: `python3 -m doctest dev/merge_spark_pr.py` - 57/57.
    - The new check only runs when `asf_jira` is initialized, leaving the no-JIRA path unchanged, and the per-ID fetch-error path is preserved.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    Generated-by: Claude Code (Opus 4.7)
    
    Closes #56137 from zhengruifeng/merge-script-block-umbrella-epic-dev3.
    
    Authored-by: Ruifeng Zheng <[email protected]>
    Signed-off-by: Ruifeng Zheng <[email protected]>
---
 dev/merge_spark_pr.py | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/dev/merge_spark_pr.py b/dev/merge_spark_pr.py
index f263c449333d..f484ad819501 100755
--- a/dev/merge_spark_pr.py
+++ b/dev/merge_spark_pr.py
@@ -1260,11 +1260,29 @@ def main():
 
     if asf_jira is not None:
         jira_ids = re.findall("SPARK-[0-9]{4,5}", title)
+        # Epic / Umbrella tickets group related work and must not be resolved by a single PR.
+        # Collect every offender so the committer sees the full list in one shot rather than
+        # discovering them one-by-one across repeated merge attempts.
+        blocking_issue_types = {"Epic", "Umbrella"}
+        blockers = []
         for jira_id in jira_ids:
             try:
-                print_jira_issue_summary(asf_jira.issue(jira_id))
+                issue = asf_jira.issue(jira_id)
             except Exception:
                 print_error("Unable to fetch summary of %s" % jira_id)
+                continue
+            print_jira_issue_summary(issue)
+            issue_type = issue.fields.issuetype.name
+            if issue_type in blocking_issue_types:
+                blockers.append((jira_id, issue_type))
+        if blockers:
+            ids_str = ", ".join("%s (%s)" % (jid, t) for jid, t in blockers)
+            fail(
+                "Cannot merge PR #%s. Linked JIRA(s) %s are Umbrella or Epic "
+                "tickets and MUST not be resolved by a single PR. File "
+                "Sub-task(s) under %s and update the PR title to reference "
+                "the Sub-task(s) instead." % (pr_num, ids_str, ids_str)
+            )
 
     print("\n=== Pull Request #%s ===" % pr_num)
     print("title\t%s\nsource\t%s\ntarget\t%s\nurl\t%s" % (title, pr_repo_desc, target_ref, url))

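Note on which IDs the new check sees: the context line in the hunk above collects them with `re.findall("SPARK-[0-9]{4,5}", title)`. The sketch below only illustrates that pattern against a few titles. The helper name `extract_jira_ids` and the second and third titles are hypothetical and do not appear in `dev/merge_spark_pr.py`.

```python
import re

# Same pattern as the context line in the hunk above.
JIRA_ID_PATTERN = "SPARK-[0-9]{4,5}"


def extract_jira_ids(title):
    """Hypothetical helper: return every matching JIRA ID in a PR title, in order."""
    return re.findall(JIRA_ID_PATTERN, title)


# Title of this commit: one ID.
single = extract_jira_ids(
    "[SPARK-57097][INFRA] Prevent merging Epic/Umbrella PRs in merge script"
)
# Hypothetical title with two IDs: both are returned, so both would be checked.
multiple = extract_jira_ids("[SPARK-1234][SPARK-5678] Hypothetical title with two IDs")
# Hypothetical title without an ID: nothing to check.
none_found = extract_jira_ids("[MINOR] Hypothetical title without a JIRA ID")
```

Since every matched ID is looped over, a title that names a Sub-task together with its parent Umbrella would presumably still be blocked because of the parent.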

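The collect-then-fail behaviour of the hunk above can be tried without a JIRA connection. What follows is a minimal sketch under stated assumptions, not the script's actual structure: `FakeJira`, `find_blockers` and `format_failure` are hypothetical names, and the fake client only mimics the `issue.fields.issuetype.name` attribute path that the diff reads. The real script also prints an error for IDs it cannot fetch and calls `fail(...)` instead of returning a string.

```python
from types import SimpleNamespace

BLOCKING_ISSUE_TYPES = {"Epic", "Umbrella"}


class FakeJira:
    """Hypothetical stand-in for the JIRA client: maps issue ID to issue type."""

    def __init__(self, types_by_id):
        self._types_by_id = types_by_id

    def issue(self, jira_id):
        if jira_id not in self._types_by_id:
            raise KeyError(jira_id)  # simulates a failed fetch
        name = self._types_by_id[jira_id]
        # Mirror the attribute path read in the diff: issue.fields.issuetype.name
        return SimpleNamespace(
            fields=SimpleNamespace(issuetype=SimpleNamespace(name=name))
        )


def find_blockers(jira_client, jira_ids):
    """Collect (id, type) for every Epic/Umbrella; skip IDs that cannot be fetched."""
    blockers = []
    for jira_id in jira_ids:
        try:
            issue = jira_client.issue(jira_id)
        except Exception:
            # The real script reports the fetch error here before moving on.
            continue
        issue_type = issue.fields.issuetype.name
        if issue_type in BLOCKING_ISSUE_TYPES:
            blockers.append((jira_id, issue_type))
    return blockers


def format_failure(pr_num, blockers):
    """Build the failure text with the same format string as the diff."""
    ids_str = ", ".join("%s (%s)" % (jid, t) for jid, t in blockers)
    return (
        "Cannot merge PR #%s. Linked JIRA(s) %s are Umbrella or Epic "
        "tickets and MUST not be resolved by a single PR. File "
        "Sub-task(s) under %s and update the PR title to reference "
        "the Sub-task(s) instead." % (pr_num, ids_str, ids_str)
    )


client = FakeJira(
    {"SPARK-1234": "Umbrella", "SPARK-5678": "Epic", "SPARK-9999": "Sub-task"}
)
# SPARK-0000 is unknown to the fake client and exercises the fetch-error path.
blockers = find_blockers(
    client, ["SPARK-1234", "SPARK-5678", "SPARK-9999", "SPARK-0000"]
)
message = format_failure(56137, blockers)
```

With these inputs both the Umbrella and the Epic are reported in a single message, while the Sub-task and the unfetchable ID are not treated as blockers.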