rusackas opened a new pull request, #40716:
URL: https://github.com/apache/superset/pull/40716

   ### SUMMARY
   
   The translation-regression check 
(`scripts/translations/check_translation_regression.py`, run by the 
`Translations` / `babel-extract` workflow) **false-positives on any PR that 
intentionally removes a translatable string** — and the failure is unfixable at 
the PR level.
   
   **Root cause:** the check keyed on a drop in the **translated** count. The 
baseline is computed from `master` source (where the string still exists), so 
removing a string lowers the translated count by 1 per language → flagged as a 
regression. By that metric, a deletion is indistinguishable from a *rename*: both lower the 
translated count by 1. No `.po` edit can fix it, because the string is 
legitimately gone from the PR's source. (Seen on #40643, which removes the 
"Registration hash" string as part of a security fix — its catalogs are already 
correct, yet the check can never pass.)
   
   **Fix:** key the check on the **increase in fuzzy entries** instead of a 
drop in translated count. `babel_update.sh` uses `pybabel update 
--ignore-obsolete`, so:
   
   | Source change | Effect on catalogs | Should flag? | New check |
   |---|---|---|---|
   | **Rename / reword** a string | old translation fuzzy-matched onto new msgid → `#, fuzzy` | ✅ yes (translation invalidated) | fuzzy ↑ → flagged |
   | **Delete** a string | dropped entirely, no fuzzy created | ❌ no (intentional) | fuzzy unchanged → clean |
   | **Add** a string | new untranslated entry | ❌ no | fuzzy unchanged → clean |
   
   This ignores deletions, still catches the real regressions (renames 
stranding translations), honours the contributor fix-path (resolving the 
fuzzies clears the count → check passes), and is immune to count-masking when a 
PR both adds and removes strings.
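
The reasoning above can be checked with a toy model. This is only a sketch, not the code in `check_translation_regression.py`: catalogs are modelled as plain dicts, the fuzzy-match outcome of `pybabel update` is written in by hand rather than computed, and the names `Counts`, `count`, `old_rule`, and `new_rule` are hypothetical. The placeholder strings stand in for real translations.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    translated: int
    fuzzy: int


def count(catalog: dict[str, tuple[str, bool]]) -> Counts:
    """Catalog maps msgid -> (msgstr, is_fuzzy); fuzzy entries are not counted as translated."""
    translated = sum(1 for msgstr, is_fuzzy in catalog.values() if msgstr and not is_fuzzy)
    fuzzy = sum(1 for _, is_fuzzy in catalog.values() if is_fuzzy)
    return Counts(translated, fuzzy)


def old_rule(before: Counts, after: Counts) -> bool:
    """Previous behaviour: flag any drop in the translated count."""
    return after.translated < before.translated


def new_rule(before: Counts, after: Counts) -> bool:
    """Proposed behaviour: flag only an increase in fuzzy entries."""
    return after.fuzzy > before.fuzzy


# Baseline catalog, as generated from the base branch (placeholder translations).
base = {
    "Save": ("<tr> Save", False),
    "Registration hash": ("<tr> Registration hash", False),
}

scenarios = {
    # Delete: with --ignore-obsolete the entry disappears, no fuzzy is created.
    "delete": {"Save": ("<tr> Save", False)},
    # Rename: the old translation is carried onto the new msgid and marked fuzzy.
    "rename": {
        "Save": ("<tr> Save", False),
        "Registration token": ("<tr> Registration hash", True),
    },
    # Add: a new, untranslated entry.
    "add": {**base, "Export": ("", False)},
    # Rename plus a newly translated string: translated count is unchanged overall.
    "rename+add": {
        "Save": ("<tr> Save", False),
        "Registration token": ("<tr> Registration hash", True),
        "Export": ("<tr> Export", False),
    },
}

before = count(base)
results = {
    name: (old_rule(before, count(cat)), new_rule(before, count(cat)))
    for name, cat in scenarios.items()
}

for name, (old, new) in results.items():
    print(f"{name:12s} old_rule={old!s:5s} new_rule={new}")
```

In this model the old rule flags the deletion and misses the `rename+add` case, while the new rule flags exactly the two scenarios that contain a rename.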
   
   ### Changes
   
   - `scripts/translations/check_translation_regression.py` — count translated 
**and** fuzzy per language; a regression is now `after_fuzzy > before_fuzzy`. 
Baseline JSON gains a `fuzzy` field (legacy integer baselines are tolerated). 
Report + log wording updated to say deletions aren't flagged. Producer and 
consumer of the baseline are the same script, so no workflow change is needed.
   - `tests/unit_tests/scripts/translations/check_translation_regression_test.py` — 
**new**; covers deletion (not flagged), rename (flagged), 
addition-doesn't-mask, no-change, legacy-int baseline, report generation, and 
`msgfmt` stat parsing (incl. the omitted-fuzzy-clause case). 8/8 pass.
   - `superset/translations/requirements.txt` — `Babel==2.9.1` → `2.17.0` to 
match `requirements/base.txt` (what CI actually installs). The stale pin made 
local `babel_update.sh` runs emit ~500k-line spurious reformatting diffs.
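
As an illustration of the stat parsing and baseline handling described in the list above, here is a minimal sketch. It is not the script's actual code: the function names are hypothetical, the baseline JSON shape (language code mapped to either a dict or a bare integer) is an assumption, and treating a legacy integer entry as having zero fuzzies is one possible policy, not necessarily the one the PR implements. The parsing relies on `msgfmt --statistics` omitting any clause whose count is zero.

```python
from __future__ import annotations

import json
import re

# msgfmt --statistics prints e.g.
#   "120 translated messages, 3 fuzzy translations, 7 untranslated messages."
# and leaves out the fuzzy/untranslated clauses when their count is zero.
_TRANSLATED = re.compile(r"(\d+) translated message")
_FUZZY = re.compile(r"(\d+) fuzzy translation")


def parse_msgfmt_stats(line: str) -> dict[str, int]:
    """Extract translated and fuzzy counts; a missing clause counts as 0."""
    translated = _TRANSLATED.search(line)
    fuzzy = _FUZZY.search(line)
    return {
        "translated": int(translated.group(1)) if translated else 0,
        "fuzzy": int(fuzzy.group(1)) if fuzzy else 0,
    }


def load_baseline(raw: str) -> dict[str, dict[str, int]]:
    """Normalise a baseline; legacy entries are bare translated counts.

    Assumption: a legacy integer entry is treated as fuzzy == 0.
    """
    normalised = {}
    for lang, entry in json.loads(raw).items():
        if isinstance(entry, int):
            normalised[lang] = {"translated": entry, "fuzzy": 0}
        else:
            normalised[lang] = {
                "translated": int(entry.get("translated", 0)),
                "fuzzy": int(entry.get("fuzzy", 0)),
            }
    return normalised


def fuzzy_regressions(
    baseline: dict[str, dict[str, int]], current: dict[str, dict[str, int]]
) -> list[str]:
    """Return languages whose fuzzy count rose relative to the baseline."""
    return sorted(
        lang
        for lang, after in current.items()
        if lang in baseline and after["fuzzy"] > baseline[lang]["fuzzy"]
    )


baseline = load_baseline('{"fr": {"translated": 10, "fuzzy": 1}, "de": 8}')
current = {
    "fr": parse_msgfmt_stats("9 translated messages, 1 fuzzy translation."),
    "de": parse_msgfmt_stats("8 translated messages, 2 fuzzy translations."),
}
print(fuzzy_regressions(baseline, current))
```

Here `fr` loses a translated string without gaining a fuzzy (a deletion, so not flagged), while `de` gains fuzzies against a legacy integer baseline and is reported.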
   
   ### TESTING INSTRUCTIONS
   
   ```bash
   python -m pytest tests/unit_tests/scripts/translations/check_translation_regression_test.py
   ```
   
   Once this merges, #40643 (and any future string-removal PR) will pass the 
Translations check without manual override.
   
   ### ADDITIONAL INFORMATION
   
   - [ ] Has associated issue:
   - [ ] Required feature flags:
   - [ ] Changes UI
   - [ ] Includes DB Migration
   - [ ] Introduces new feature or API
   - [ ] Removes existing feature or API
   
   🤖 Generated with [Claude Code](https://claude.com/claude-code)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

