rusackas opened a new pull request, #40716: URL: https://github.com/apache/superset/pull/40716
### SUMMARY

The translation-regression check (`scripts/translations/check_translation_regression.py`, run by the `Translations` / `babel-extract` workflow) **false-positives on any PR that intentionally removes a translatable string**, and the failure is unfixable at the PR level.

**Root cause:** the check keyed on a drop in the **translated** count. The baseline is computed from `master` source (where the string still exists), so removing a string lowers the translated count by 1 per language → flagged as a regression. By count alone, a *deletion* is indistinguishable from a *rename*: both lower the translated count by 1. No `.po` edit can fix a deletion, because the string is legitimately gone from the PR's source. (Seen on #40643, which removes the "Registration hash" string as part of a security fix: its catalogs are already correct, yet the check can never pass.)

**Fix:** key the check on an **increase in fuzzy entries** instead of a drop in translated count. `babel_update.sh` uses `pybabel update --ignore-obsolete`, so:

| Source change | Effect on catalogs | Should flag? | New check |
|---|---|---|---|
| **Rename / reword** a string | old translation fuzzy-matched onto new msgid → `#, fuzzy` | ✅ yes (translation invalidated) | fuzzy ↑ → flagged |
| **Delete** a string | dropped entirely, no fuzzy created | ❌ no (intentional) | fuzzy unchanged → clean |
| **Add** a string | new untranslated entry | ❌ no | fuzzy unchanged → clean |

This ignores deletions, still catches the real regressions (renames stranding translations), honours the contributor fix-path (resolving the fuzzies brings the fuzzy count back down → check passes), and is immune to count-masking when a PR both adds and removes strings.

### Changes

- `scripts/translations/check_translation_regression.py`: count translated **and** fuzzy per language; a regression is now `after_fuzzy > before_fuzzy`. Baseline JSON gains a `fuzzy` field (legacy integer baselines are tolerated). Report + log wording updated to say deletions aren't flagged.
  Producer and consumer of the baseline are the same script, so no workflow change is needed.
- `tests/unit_tests/scripts/translations/check_translation_regression_test.py`: **new**; covers deletion (not flagged), rename (flagged), addition-doesn't-mask, no-change, legacy-int baseline, report generation, and `msgfmt` stat parsing (incl. the omitted-fuzzy-clause case). 8/8 pass.
- `superset/translations/requirements.txt`: `Babel==2.9.1` → `2.17.0` to match `requirements/base.txt` (what CI actually installs). The stale pin made local `babel_update.sh` runs emit ~500k-line spurious reformatting diffs.

### TESTING INSTRUCTIONS

```bash
python -m pytest tests/unit_tests/scripts/translations/check_translation_regression_test.py
```

Once this merges, #40643 (and any future string-removal PR) will pass the Translations check without manual override.

### ADDITIONAL INFORMATION

- [ ] Has associated issue:
- [ ] Required feature flags:
- [ ] Changes UI
- [ ] Includes DB Migration
- [ ] Introduces new feature or API
- [ ] Removes existing feature or API

🤖 Generated with [Claude Code](https://claude.com/claude-code)

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
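
For readers who want the fuzzy-based rule from the SUMMARY in concrete form, here is a minimal sketch. It is not the code in `check_translation_regression.py`. The function names (`parse_msgfmt_stats`, `find_regressions`) and the baseline shape (`{"lang": {"translated": n, "fuzzy": n}}`) are hypothetical. Skipping languages whose baseline is a legacy bare integer is an assumption about what "tolerated" could mean, not a statement of what the script does.

```python
import re
from typing import Dict, List, Union

# Hypothetical names; the real script's structure may differ.
_TRANSLATED_RE = re.compile(r"(\d+) translated message")
_FUZZY_RE = re.compile(r"(\d+) fuzzy translation")

Stats = Dict[str, int]
Baseline = Dict[str, Union[int, Stats]]


def parse_msgfmt_stats(line: str) -> Stats:
    """Extract counts from a `msgfmt --statistics` summary line.

    msgfmt leaves out the fuzzy clause when a catalog has no fuzzy
    entries, so a missing clause is read as zero.
    """
    translated = _TRANSLATED_RE.search(line)
    fuzzy = _FUZZY_RE.search(line)
    return {
        "translated": int(translated.group(1)) if translated else 0,
        "fuzzy": int(fuzzy.group(1)) if fuzzy else 0,
    }


def find_regressions(before: Baseline, after: Dict[str, Stats]) -> List[str]:
    """Return languages whose fuzzy count increased relative to the baseline."""
    regressions = []
    for lang in sorted(after):
        base = before.get(lang)
        # Assumption: a legacy integer baseline (translated count only) or a
        # missing language gives nothing to compare fuzzy against, so skip it.
        if base is None or isinstance(base, int):
            continue
        if after[lang]["fuzzy"] > base.get("fuzzy", 0):
            regressions.append(lang)
    return regressions


before = {"fr": {"translated": 100, "fuzzy": 3}}
# Deletion: translated drops, fuzzy unchanged, so nothing is flagged.
print(find_regressions(before, {"fr": {"translated": 99, "fuzzy": 3}}))  # []
# Rename: the old translation is fuzzy-matched, so fuzzy rises and is flagged.
print(find_regressions(before, {"fr": {"translated": 99, "fuzzy": 4}}))  # ['fr']
```

Comparing only the fuzzy counts is what makes the rule insensitive to additions and deletions. Both change the translated or untranslated totals, but only a rename or reword leaves a `#, fuzzy` entry behind.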
