zhengruifeng commented on PR #56026:
URL: https://github.com/apache/spark/pull/56026#issuecomment-4518681755

   ### End-to-end pipeline impact on real-world titles
   
   Ran the full new title pipeline (parse → normalize via registry → 
prompt-if-no-primary → dedup → move version tags to head → move FOLLOWUP last → 
warn on unknown tags) against the same sample.
   
   | Outcome | Commits (983) | PRs (172) |
   |---|---|---|
   | Need prompt for primary | 40 | 11 |
   | Title would be modified | 190 | 37 |
   | Has unknown tag(s) (warning only) | 9 | 7 |
   
   #### Breakdown of "modified" titles
   
   | Kind | Commits | PRs |
   |---|---|---|
   | Tag rename (`TESTS→TEST`, `FOLLOW-UP→FOLLOWUP`, `DOCS→DOC`, `WEBUI→UI`, `EXAMPLES→EXAMPLE`, `PYSPARK→PYTHON`, `SHELL→REPL`) | 187 | 25 |
   | Reorder only (`FOLLOWUP` → last, version tag → head) | 3 | 4 |
   | Whitespace / case fixes (e.g. missing space, lowercase inner tags) | 0 | 8 |
   
   Examples:
   - Reorder: `[SPARK-55897][SQL][4.0] ...` → `[SPARK-55897][4.0][SQL] ...`
   - Whitespace: `[SPARK-56998]Add SECURITY.md ...` → `[SPARK-56998] Add SECURITY.md ...`
   - Case: `[SPARK-56962][SS][RTM][StreamingShuffle][Part2] ...` → `[SPARK-56962][SS][RTM][STREAMINGSHUFFLE][PART2] ...`
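
As a rough illustration of the parse → normalize → dedup → reorder steps, here is a minimal Python sketch that reproduces the three examples above. It is not the code from this PR: `normalize_title`, the regexes, and the rule that a version tag looks like `4.0` are assumptions made for illustration. Only the seven aliases from the rename row are included, and the prompt-if-no-primary and unknown-tag steps are left out.

```python
import re

# Alias -> primary tag. These seven renames are the ones listed in the
# "Tag rename" row; the real registry in the PR may contain more.
ALIASES = {
    "TESTS": "TEST",
    "FOLLOW-UP": "FOLLOWUP",
    "DOCS": "DOC",
    "WEBUI": "UI",
    "EXAMPLES": "EXAMPLE",
    "PYSPARK": "PYTHON",
    "SHELL": "REPL",
}

JIRA_RE = re.compile(r"^SPARK-\d+$")
VERSION_RE = re.compile(r"^\d+\.\d+$")  # assumption: version tags look like "4.0"
# Leading run of [TAG] groups, followed by the rest of the title.
LEADING_TAGS_RE = re.compile(r"^\s*((?:\[[^\]]+\]\s*)+)(.*)$", re.DOTALL)


def normalize_title(title):
    """Hypothetical helper: uppercase, alias, dedup, and reorder leading tags."""
    m = LEADING_TAGS_RE.match(title)
    if m is None:
        return title  # no leading [TAG] block, leave the title alone
    raw_tags = re.findall(r"\[([^\]]+)\]", m.group(1))
    rest = m.group(2).strip()

    jira, versions, others, followup = [], [], [], []
    seen = set()
    for raw in raw_tags:
        tag = raw.strip().upper()      # case fix
        tag = ALIASES.get(tag, tag)    # alias -> primary
        if tag in seen:                # dedup after aliasing
            continue
        seen.add(tag)
        if JIRA_RE.match(tag):
            jira.append(tag)
        elif VERSION_RE.match(tag):
            versions.append(tag)
        elif tag == "FOLLOWUP":
            followup.append(tag)
        else:
            others.append(tag)

    # JIRA id first, then version tags, other tags, and FOLLOWUP last.
    prefix = "".join(f"[{t}]" for t in jira + versions + others + followup)
    return prefix + (" " + rest if rest else "")
```

Deduplication runs after aliasing so that a title carrying both `[DOCS]` and `[DOC]` collapses to a single `[DOC]`.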
   
   #### Unknown tags
   
   Recurring (worth considering for the registry):
   
   | Tag | Commits | PRs |
   |---|---|---|
   | `DML` | 3 | 2 |
   | `GEO` | 3 | 0 |
   | `UDF` | 0 | 2 |
   | `RTM` | 0 | 2 |
   | `SHS` | 1 | 0 |
   
   One-offs / typos:
   - `WEBIUI` — typo of `WEBUI` (already aliased to primary `UI`)
   - `REVERT`, `SPARK`, `STREAMINGSHUFFLE`, `PART1`, `PART2` — non-standard or 
one-off
   
   For unknown tags the script just prints a warning and continues — the 
committer can decide whether to fix or proceed.
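
A minimal sketch of that warn-and-continue behaviour, for illustration only. `KNOWN_TAGS` is a hypothetical subset (the rename targets named above plus `SQL` and `SS`, which are assumed to be registered); the function names and the warning text are also assumptions, not the PR's actual code.

```python
import re
import sys

# Hypothetical registry subset, for illustration only. These are the rename
# targets named in this comment plus SQL and SS (assumed to be known tags).
KNOWN_TAGS = {"SQL", "SS", "TEST", "FOLLOWUP", "DOC", "UI", "EXAMPLE", "PYTHON", "REPL"}

JIRA_RE = re.compile(r"^SPARK-\d+$")
VERSION_RE = re.compile(r"^\d+\.\d+$")  # assumption: version tags look like "4.0"


def find_unknown_tags(tags, known=KNOWN_TAGS):
    """Return tags that are not JIRA ids, not version tags, and not in the registry."""
    unknown = []
    for tag in tags:
        if JIRA_RE.match(tag) or VERSION_RE.match(tag):
            continue
        if tag not in known and tag not in unknown:
            unknown.append(tag)
    return unknown


def warn_unknown_tags(tags, known=KNOWN_TAGS):
    """Print one warning per unknown tag to stderr and return them; never raises."""
    unknown = find_unknown_tags(tags, known)
    for tag in unknown:
        print(f"WARNING: unknown tag [{tag}]", file=sys.stderr)
    return unknown
```

Returning the list instead of raising keeps the decision with the committer, who can fix the title or proceed as-is.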
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

