rusackas commented on PR #38948:
URL: https://github.com/apache/superset/pull/38948#issuecomment-4437208612

   Thanks for the effort on this @FrancescoCastaldi: adding ~1,900 missing 
Italian translations is genuinely useful work. Before we can merge, though, I 
went through the diff and found some issues that would break the Italian UI at 
runtime. I want to flag them so we can get this in shape.
   
   ### 🚨 ~49 translations break Python format placeholders
   
   The translator (the PR body mentions CodeAnt-AI) appears to have translated 
the **placeholder names themselves** as if they were regular words. A few 
examples:
   
   | msgid | Translation in this PR |
   |---|---|
   | `%(name)s.csv` | `%(nome)s.csv` |
   | `%(rows)d rows returned` | `%(righe)d righe restituite` |
   | `Either the username "%(username)s"…` | `…%(nomeutente)s…` |
   | `Invalid reference to column: "%(column)s"` | `…%(colonna)s` |
   | `Error: %(error)s` | `Errore: %(errore)s` |
   | `Explore - %(table)s` | `Explore - %(tabella)s` |
   
   Every one of these will crash at runtime with `KeyError` when Python tries 
`"…" % {"name": x}` and finds `%(nome)s` instead. The placeholder tokens 
(`%(name)s`, `%s`, `%d`, `{var}`) need to be left untouched; only the 
surrounding human text should be translated.
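
To make the failure concrete, here is a minimal sketch. It assumes the string is rendered with Python `%` formatting and a dict keyed by the original placeholder name. The `render` helper and the value `42` are made up for illustration; the msgid/msgstr pair comes from the table above.

```python
# Hypothetical helper and values; only the msgid/msgstr pair is from the PR.
msgid = "%(rows)d rows returned"
broken_msgstr = "%(righe)d righe restituite"  # placeholder name was translated
fixed_msgstr = "%(rows)d righe restituite"    # placeholder name left untouched

params = {"rows": 42}  # the calling code only supplies the original key


def render(template, values):
    """Apply %-formatting and report a missing key instead of crashing."""
    try:
        return template % values
    except KeyError as exc:
        return f"KeyError: {exc}"


print(render(msgid, params))
print(render(broken_msgstr, params))
print(render(fixed_msgstr, params))
```

Only the second call fails, because `righe` is not a key the caller ever passes.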
   
   ### 🚨 ~7 cases of mismatched `%s`/`%d` counts, including some where the wrong source string was translated
   
   Some of these look like the AI conflated different msgids:
   
   - `'An error occurred while fetching datasets: %s'` → translated as "Si è 
verificato un errore durante il recupero dei **nomi delle funzioni**" (= 
"fetching function names"): a different msgid's content, and it loses the `%s`.
   - `'Copy of %s'` → translated as "Copia **la query sulla partizione negli 
appunti**" (= "Copy the partition query to clipboard"): completely unrelated.
   - `'%s Selected'` → "%s selezionato (%s fisico, %s virtuale)", which invents 
two extra `%s`s.
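
A rough sketch of what a count mismatch does under Python positional `%` formatting. This assumes the strings are rendered by Python with a tuple of arguments; strings rendered by the frontend may fail differently. `try_format` and the argument values are hypothetical.

```python
def try_format(template, args):
    """Apply positional %-formatting; return the error text if it fails."""
    try:
        return template % args
    except TypeError as exc:
        return f"TypeError: {exc}"


# One argument is supplied, which is what the msgid '%s Selected' expects.
ok = try_format("%s Selected", (3,))

# The msgstr from this PR asks for three values, so one argument is too few.
too_many_slots = try_format("%s selezionato (%s fisico, %s virtuale)", (3,))

# The msgstr quoted above for 'Copy of %s' has no %s, so the argument is unused.
no_slot = try_format("Copia la query sulla partizione negli appunti", ("Sales",))

for result in (ok, too_many_slots, no_slot):
    print(result)
```

Both mismatched cases raise `TypeError` rather than `KeyError`, so they are easy to miss when grepping logs for the named-placeholder failures.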
   
   ### 🚨 ~5 broken `{var}` placeholders (str.format style)
   
   Same problem in `str.format`-style strings: `{name}` → `{nome}`, `{up}` → 
`{su}`, `{down}` → `{giù}`. Crashes at `.format()` time.
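
One way to spot these is to compare the field names on each side using the standard library's `string.Formatter`. This is a sketch: the two sentences are hypothetical stand-ins, and only the `{up}`/`{down}` renames come from the PR.

```python
from string import Formatter


def field_names(template):
    """Return the set of named replacement fields in a str.format template."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


# Hypothetical sentences; only the placeholder renames are from the PR.
msgid = "Moved {up} and {down}"
msgstr = "Spostato {su} e {giù}"

missing = field_names(msgid) - field_names(msgstr)
unexpected = field_names(msgstr) - field_names(msgid)
print("missing from msgstr:", sorted(missing))
print("not in msgid:", sorted(unexpected))

try:
    msgstr.format(up=1, down=2)
except KeyError as exc:
    print("format failed with KeyError:", exc)
```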
   
   ### Stats from the diff
   
   - ~1,930 new translations (was empty → now filled): most of the value
   - ~1,088 changed translations (filled → different filled): worth an 
Italian-speaker eyeball pass for prose quality
   - ~13 emptied: these are actually fine, they're cleanups of incorrect 
`#, fuzzy` markers
   - **~61 known-broken** from the placeholder / content audit above
   
   ### Path forward
   
   A few options:
   
   1. **Run a format check** such as `msgfmt --check-format` from GNU gettext 
(or a small ad-hoc script) over the file before re-submitting. For entries 
flagged as format strings, it flags placeholder mismatches between msgid and 
msgstr automatically.
   2. **Re-run the AI translation** with a prompt that explicitly forbids 
touching `%(name)s` / `%s` / `%d` / `{name}` tokens, and feeds one msgid at a 
time so adjacent strings can't leak across.
   3. **Spot-audit** the 7 cases I called out where the prose suggests an 
adjacent msgid was translated instead.
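
For the ad-hoc script route, something along these lines might be enough. It is a sketch under stated assumptions: the regex is a simplified pattern and not a full printf or `str.format` grammar, the function names are hypothetical, and the (msgid, msgstr) pairs are hard-coded here. In practice they would come from a PO parser such as Babel's `read_po`.

```python
import re
from collections import Counter

# Simplified pattern: %(name)s, %s, %d, %.2f and {name}. Not a full grammar.
PLACEHOLDER = re.compile(r"%(?:\(\w+\))?(?:\.\d+)?[sdf]|\{\w*\}")


def placeholders(text):
    """Return a multiset of placeholder tokens, ignoring escaped '%%'."""
    return Counter(PLACEHOLDER.findall(text.replace("%%", "")))


def find_mismatches(pairs):
    """Yield translated pairs whose placeholder multisets differ."""
    for msgid, msgstr in pairs:
        if msgstr and placeholders(msgid) != placeholders(msgstr):
            yield msgid, msgstr


# Sample pairs based on the examples above; the second one is a corrected
# form written for illustration, and the last one is untranslated.
pairs = [
    ("%(name)s.csv", "%(nome)s.csv"),
    ("Error: %(error)s", "Errore: %(error)s"),
    ("%s Selected", "%s selezionato (%s fisico, %s virtuale)"),
    ("Copy of %s", "Copia la query sulla partizione negli appunti"),
    ("Explore - %(table)s", ""),
]

bad = list(find_mismatches(pairs))
for msgid, msgstr in bad:
    print(f"MISMATCH: {msgid!r} -> {msgstr!r}")
```

Comparing multisets rather than sets also catches the invented or dropped `%s` cases. Empty msgstr entries are skipped because an untranslated string falls back to the msgid.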
   
   Happy to help unblock once the placeholder integrity is restored. Thanks 
again for taking this on.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

