rusackas commented on PR #38948:
URL: https://github.com/apache/superset/pull/38948#issuecomment-4437208612
Thanks for the effort on this, @FrancescoCastaldi: adding ~1,900 missing
Italian translations is genuinely useful work. Before we can merge, though, I
went through the diff and found some issues that would break the Italian UI at
runtime. I want to flag them so we can get this in shape.
### 🚨 ~49 translations break Python format placeholders
The translator (the PR body mentions CodeAnt-AI) appears to have translated
the **placeholder names themselves** as if they were regular words. A few
examples:
| msgid | Translation in this PR |
|---|---|
| `%(name)s.csv` | `%(nome)s.csv` |
| `%(rows)d rows returned` | `%(righe)d righe restituite` |
| `Either the username "%(username)s"…` | `…%(nomeutente)s…` |
| `Invalid reference to column: "%(column)s"` | `…%(colonna)s` |
| `Error: %(error)s` | `Errore: %(errore)s` |
| `Explore - %(table)s` | `Explore - %(tabella)s` |
Every one of these will crash at runtime with `KeyError` when Python tries
`"…" % {"name": x}` and finds `%(nome)s` instead. The placeholder tokens
(`%(name)s`, `%s`, `%d`, `{var}`) need to be left untouched; only the
surrounding human text should be translated.
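A minimal reproduction of the failure, assuming the string is interpolated with Python's `%` operator and a dict keyed by the original placeholder name. The `render` helper and the sample value are hypothetical, not Superset code:

```python
def render(template: str, **values) -> str:
    """Interpolate named placeholders the way `"..." % {...}` does."""
    return template % values


msgid = "%(rows)d rows returned"
good_msgstr = "%(rows)d righe restituite"  # placeholder left untouched
bad_msgstr = "%(righe)d righe restituite"  # placeholder name translated, as in this PR

print(render(msgid, rows=5))
print(render(good_msgstr, rows=5))

try:
    render(bad_msgstr, rows=5)
except KeyError as exc:
    # The caller still passes `rows`, so the lookup of `righe` fails.
    print(f"KeyError: {exc}")
```

The calling code never changes with the locale, so the key it supplies is always the English name from the msgid.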
### 🚨 ~7 cases of mismatched `%s`/`%d` counts, including some where the wrong source string was translated
Some of these look like the AI conflated different msgids:
- `'An error occurred while fetching datasets: %s'` → translated as "Si è
verificato un errore durante il recupero dei **nomi delle funzioni**" (=
"fetching function names"): different msgid's content, and loses the `%s`.
- `'Copy of %s'` → translated as "Copia **la query sulla partizione negli
appunti**" (= "Copy the partition query to clipboard"): completely unrelated.
- `'%s Selected'` → "%s selezionato (%s fisico, %s virtuale)": invents two
extra `%s`s.
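Count mismatches fail differently from renamed placeholders. The sketch below assumes the positional strings are filled with Python's `%` operator (frontend strings may go through a JavaScript sprintf instead, which I have not checked). The `count_positional` helper is hypothetical, and its regex is a simplification that ignores the rarely used space flag:

```python
import re

# Matches "%%" (escaped literal) or a positional specifier such as %s, %d, %5.2f.
# Named specifiers like %(rows)d are not matched because "(" follows the "%".
SPEC = re.compile(r"%(?:%|[-#0+]*\d*(?:\.\d+)?[a-zA-Z])")


def count_positional(text: str) -> int:
    """Count positional % specifiers, skipping the escaped '%%'."""
    return sum(1 for match in SPEC.findall(text) if match != "%%")


msgid = "%s Selected"
msgstr = "%s selezionato (%s fisico, %s virtuale)"
print(count_positional(msgid), count_positional(msgstr))  # 1 3

# Too many specifiers: the caller supplies one value, the translation wants three.
try:
    msgstr % (4,)
except TypeError as exc:
    print(f"TypeError: {exc}")

# Too few specifiers: the translation dropped the %s entirely.
try:
    "Copia la query sulla partizione negli appunti" % ("My chart",)
except TypeError as exc:
    print(f"TypeError: {exc}")
```

Both directions raise `TypeError` rather than `KeyError`, so a check that only compares placeholder names would miss them.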
### 🚨 ~5 broken `{var}` placeholders (str.format style)
Same problem in `str.format`-style strings: `{name}` → `{nome}`, `{up}` →
`{su}`, `{down}` → `{giù}`. Crashes at `.format()` time.
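For the brace style, the standard library can list the field names directly. In this sketch the two strings are hypothetical stand-ins (the real msgids are not quoted above); only the `{up}`/`{down}` → `{su}`/`{giù}` renames come from the diff:

```python
from string import Formatter


def brace_fields(text: str) -> set[str]:
    """Return the field names used in a str.format-style template."""
    return {name for _, name, _, _ in Formatter().parse(text) if name}


msgid = "{up} up, {down} down"   # hypothetical source string
msgstr = "{su} su, {giù} giù"    # hypothetical translation with renamed fields

# Fields the caller will supply but the translation no longer references.
print(sorted(brace_fields(msgid) - brace_fields(msgstr)))

try:
    msgstr.format(up=3, down=1)
except KeyError as exc:
    print(f"KeyError: {exc}")
```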
### Stats from the diff
- ~1,930 new translations (was empty → now filled): most of the value
- ~1,088 changed translations (filled → different filled): worth an
Italian-speaker eyeball pass for prose quality
- ~13 emptied: these are actually fine, they're cleanups of incorrect
`#, fuzzy` markers
- **~61 known-broken** from the placeholder / content audit above
### Path forward
A few options:
1. **Run a format check such as gettext's `msgfmt --check`** (or a small ad-hoc script) over the file before
re-submitting; it'll flag placeholder mismatches between msgid and msgstr
automatically.
2. **Re-run the AI translation** with a prompt that explicitly forbids
touching `%(name)s` / `%s` / `%d` / `{name}` tokens, and feeds one msgid at a
time so adjacent strings can't leak across.
3. **Spot-audit** the 7 cases I called out where the prose suggests an
adjacent msgid was translated instead.
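As a rough idea of what the ad-hoc script in option 1 could look like, here is a sketch that compares placeholder tokens between a msgid and its msgstr. It is not an existing Superset tool: the `placeholders` and `check` names are hypothetical, the regex is a heuristic that ignores the space flag, plural forms are not handled, and the pairs are hard-coded from this comment rather than read from the `.po` file (a PO parser would be needed for that).

```python
import re
from collections import Counter

TOKEN = re.compile(
    r"%%|\{\{|\}\}"                               # escaped literals, filtered out below
    r"|%\([^)]+\)[-#0+]*\d*(?:\.\d+)?[a-zA-Z]"    # named: %(rows)d
    r"|%[-#0+]*\d*(?:\.\d+)?[a-zA-Z]"             # positional: %s, %d
    r"|\{[^{}]*\}"                                # str.format: {name}
)
ESCAPES = {"%%", "{{", "}}"}


def placeholders(text: str) -> Counter:
    """Count each placeholder token in text, ignoring escaped literals."""
    return Counter(tok for tok in TOKEN.findall(text) if tok not in ESCAPES)


def check(msgid: str, msgstr: str):
    """Return a description of the mismatch, or None if placeholders agree."""
    if not msgstr:  # untranslated entries have nothing to compare
        return None
    want, got = placeholders(msgid), placeholders(msgstr)
    if want == got:
        return None
    missing = sorted((want - got).elements())
    unexpected = sorted((got - want).elements())
    return f"missing {missing}, unexpected {unexpected}"


pairs = [
    ("%(name)s.csv", "%(nome)s.csv"),
    ("%s Selected", "%s selezionato (%s fisico, %s virtuale)"),
    ("Copy of %s", "Copia la query sulla partizione negli appunti"),
    ("Error: %(error)s", "Errore: %(error)s"),  # what a fixed entry would look like
]

for msgid, msgstr in pairs:
    problem = check(msgid, msgstr)
    if problem:
        print(f"{msgid!r}: {problem}")
```

Comparing token counts rather than sets means an invented extra `%s` is reported even though the token itself also appears in the msgid.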
Happy to help unblock once the placeholder integrity is restored. Thanks
again for taking this on.