zhengruifeng commented on code in PR #56026:
URL: https://github.com/apache/spark/pull/56026#discussion_r3295918145
##########
dev/merge_spark_pr.py:
##########
@@ -700,77 +701,275 @@ def resolve_jira_issues(title, merge_branches, comment):
resolve_jira_issue(merge_branches, comment, jira_id)
-def standardize_jira_ref(text):
- """
- Standardize the [SPARK-XXXXX] [MODULE] prefix
- Converts "[SPARK-XXX][mllib] Issue", "[MLLib] SPARK-XXX. Issue" or "SPARK XXX [MLLIB]: Issue" to
- "[SPARK-XXX][MLLIB] Issue"
-
- >>> standardize_jira_ref(
- ... "[SPARK-5821] [SQL] ParquetRelation2 CTAS should check if delete is successful")
- '[SPARK-5821][SQL] ParquetRelation2 CTAS should check if delete is successful'
- >>> standardize_jira_ref(
- ... "[SPARK-4123][Project Infra][WIP]: Show new dependencies added in pull requests")
- '[SPARK-4123][PROJECT INFRA][WIP] Show new dependencies added in pull requests'
- >>> standardize_jira_ref("[MLlib] Spark 5954: Top by key")
- '[SPARK-5954][MLLIB] Top by key'
- >>> standardize_jira_ref("[SPARK-979] a LRU scheduler for load balancing in TaskSchedulerImpl")
- '[SPARK-979] a LRU scheduler for load balancing in TaskSchedulerImpl'
- >>> standardize_jira_ref(
- ... "SPARK-1094 Support MiMa for reporting binary compatibility across versions.")
- '[SPARK-1094] Support MiMa for reporting binary compatibility across versions.'
- >>> standardize_jira_ref("[WIP] [SPARK-1146] Vagrant support for Spark")
- '[SPARK-1146][WIP] Vagrant support for Spark'
- >>> standardize_jira_ref(
- ... "SPARK-1032. If Yarn app fails before registering, app master stays aroun...")
- '[SPARK-1032] If Yarn app fails before registering, app master stays aroun...'
- >>> standardize_jira_ref(
- ... "[SPARK-6250][SPARK-6146][SPARK-5911][SQL] Types are now reserved words in DDL parser.")
- '[SPARK-6250][SPARK-6146][SPARK-5911][SQL] Types are now reserved words in DDL parser.'
- >>> standardize_jira_ref(
- ... 'Revert "[SPARK-48591][PYTHON] Simplify the if-else branches with F.lit"')
- 'Revert "[SPARK-48591][PYTHON] Simplify the if-else branches with F.lit"'
- >>> standardize_jira_ref("Additional information for users building from source code")
- 'Additional information for users building from source code'
+
+class Component:
+ """A Spark PR-title tag, paired with its canonical JIRA component name.
+
+ ``jira_name`` is the canonical name of the SPARK JIRA component (e.g.
+ "Documentation"); empty for status markers like [MINOR] that are not
+ JIRA components but are still recognized in PR titles.
+
+ ``tag`` is the preferred PR-title abbreviation (uppercase, no brackets,
+ e.g. "DOC"). ``aliases`` lists other accepted spellings that resolve to
+ the same component (e.g. "DOCS", "DOCUMENTATION" -> "DOC").
+
+ ``primary`` marks components whose presence alone satisfies the merge-time
+ requirement. Non-primary JIRA components (e.g. [TEST], [PS], [SHUFFLE])
Review Comment:
Done: updated to use `[DEPLOY]` (still non-primary). Thanks!
##########
dev/merge_spark_pr.py:
##########
Review Comment:
Correction to my earlier reply: in the latest revision, `[PROJECT INFRA]`
(with a space) is not actually reachable from `Title.parse`. The bracket-tag
regex `[A-Za-z0-9._-]+` excludes spaces, so `[PROJECT INFRA]` would have stayed
in the body text. To keep the registry honest about what is reachable, I
dropped the with-space aliases in the latest commit; the underscore form
`[PROJECT_INFRA]` still resolves to `[INFRA]`. A separate dry run over the
latest 1000 commits and 200 open PRs found no uses of the with-space form,
so the drop should be safe.
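
For anyone who wants to see the reachability point concretely, here is a minimal sketch. It is not the code in `dev/merge_spark_pr.py`: `leading_tags`, `TAG_RE`, and the `ALIASES` table are hypothetical names, and the only piece taken from the PR is the character class `[A-Za-z0-9._-]+`, which I assume is wrapped in literal brackets.

```python
import re

# Assumption: the character class quoted above, wrapped in literal brackets.
# The real Title.parse may build or apply its pattern differently.
TAG_RE = re.compile(r"\[([A-Za-z0-9._-]+)\]")

# Hypothetical alias table, only to illustrate the underscore form resolving.
ALIASES = {"PROJECT_INFRA": "INFRA"}


def leading_tags(title):
    """Consume bracket tags from the start of a title; return (tags, body)."""
    tags = []
    pos = 0
    while True:
        match = TAG_RE.match(title, pos)
        if match is None:
            break  # next token is not a bracket tag without spaces
        tag = match.group(1).upper()
        tags.append(ALIASES.get(tag, tag))
        pos = match.end()
        # Skip whitespace between consecutive tags.
        while pos < len(title) and title[pos].isspace():
            pos += 1
    return tags, title[pos:]


# The with-space form never matches, so it stays in the body text.
print(leading_tags("[SPARK-4123][PROJECT INFRA] Show new dependencies"))
# The underscore form matches and resolves through the alias table.
print(leading_tags("[SPARK-4123][PROJECT_INFRA] Show new dependencies"))
```

Under these assumptions, a with-space alias in the registry could never be looked up, because the tag is never extracted in the first place.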
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]