rusackas commented on code in PR #40716:
URL: https://github.com/apache/superset/pull/40716#discussion_r3350446646
##########
scripts/translations/check_translation_regression.py:
##########
@@ -169,26 +220,32 @@ def cmd_compare(
report_path: Optional[str] = None,
) -> None:
with open(before_path) as f:
- before: dict[str, int] = json.load(f)
+ before_raw: dict[str, object] = json.load(f)
+ before = {lang: _normalize(entry) for lang, entry in before_raw.items()}
after = get_counts(translations_dir)
+ # A regression is an *increase* in fuzzy entries: the PR's source diff
+ # renamed/reworded strings, leaving their committed translations stranded.
+ # A plain drop in the translated count is NOT used — deleting a string
+ # lowers it identically to a rename but is a legitimate change, and with
+ # `pybabel update --ignore-obsolete` a deletion creates no fuzzy entry.
regressions: list[tuple[str, int, int]] = []
- for lang, before_count in sorted(before.items()):
- after_count = after.get(lang, 0)
- if after_count < before_count:
- regressions.append((lang, before_count, after_count))
+ for lang, before_stats in sorted(before.items()):
+ after_stats = after.get(lang, {"translated": 0, "fuzzy": 0})
+ if after_stats["fuzzy"] > before_stats["fuzzy"]:
+ regressions.append((lang, before_stats["fuzzy"], after_stats["fuzzy"]))
Review Comment:
Good catch on the underlying scenario, though the fix needed to be narrower
than "treat any missing baseline language as a regression." Intentionally
deleting an entire catalog is a legitimate change (the same reason a drop in
translated count is not flagged), so failing on every absence would defeat the
purpose of this PR.
The real gap was the *uncountable* case: a `.po` that is still present but
fails `msgfmt` (malformed/corrupt) was silently skipped in `get_counts`, making
it indistinguishable from a deletion and so passing as "no regression." Fixed
in f713d309d7: `get_counts` now records count failures, and `cmd_compare`
treats a baseline language whose catalog is present-but-uncountable as a hard
failure, while a genuinely deleted catalog still passes. Added unit tests for
all three cases (deleted catalog passes, uncountable baseline catalog fails,
uncountable non-baseline catalog ignored).
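
For readers skimming the thread, the three cases can be sketched roughly as follows. This is a minimal illustration, not the code from f713d309d7: the function name `classify_languages` and the `uncountable` set are hypothetical stand-ins for whatever `get_counts` and `cmd_compare` actually record, and the stats dictionaries simply mirror the `{"translated": ..., "fuzzy": ...}` shape visible in the diff above.

```python
Stats = dict[str, int]


def classify_languages(
    before: dict[str, Stats],
    after: dict[str, Stats],
    uncountable: set[str],
) -> tuple[list[tuple[str, int, int]], list[str]]:
    """Return (fuzzy regressions, hard failures) for the baseline languages.

    `uncountable` is assumed to hold languages whose catalog is still on
    disk but could not be counted (for example because `msgfmt` rejected it).
    """
    regressions: list[tuple[str, int, int]] = []
    hard_failures: list[str] = []

    # Only baseline languages are inspected, so an uncountable catalog
    # that has no baseline entry is ignored.
    for lang, before_stats in sorted(before.items()):
        if lang in uncountable:
            # Present but uncountable: must not pass as "no regression".
            hard_failures.append(lang)
            continue

        # A genuinely deleted catalog falls back to zero counts and passes.
        after_stats = after.get(lang, {"translated": 0, "fuzzy": 0})
        if after_stats["fuzzy"] > before_stats["fuzzy"]:
            regressions.append(
                (lang, before_stats["fuzzy"], after_stats["fuzzy"])
            )

    return regressions, hard_failures


# Hypothetical data: "de" gains fuzzy entries, "fr" was deleted,
# "it" is corrupt, and "xx" is corrupt but has no baseline.
baseline = {
    "de": {"translated": 10, "fuzzy": 1},
    "fr": {"translated": 8, "fuzzy": 2},
    "it": {"translated": 5, "fuzzy": 0},
}
current = {"de": {"translated": 9, "fuzzy": 3}}
regressions, hard_failures = classify_languages(baseline, current, {"it", "xx"})
```

With this data, `de` is reported as a fuzzy regression, `it` as a hard failure, and both `fr` (deleted) and `xx` (no baseline) pass without comment. Keeping the uncountable languages in a separate set is what lets a deletion and a corrupt file be told apart.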
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]