Written Fri, 25 May 2012 22:49:37 +0300, Romain d'Alverny
<[email protected]>:
On Fri, May 25, 2012 at 6:41 PM, Yuri Chornoivan <[email protected]> wrote:
Just out of curiosity, a couple of questions.
1. What was the reason for choosing an unsustainable format (.lang) that is
not recognized by the existing toolchains?
My past experience with gettext (4 years, I dropped it about 3 years
ago, things may have changed since) has been terrible (mostly, because
gettext is [was?] not thread-safe and the server implementation made
that hugely problematic). So we had erratic server behaviour regarding
locale management on our websites.
I am not talking about the server side at all. Why not extract all the
strings from these .lang files into a reliable format? It should be dead
easy with the current format (drop two lines and extract every line with
";" at the beginning). It can be done even in PHP. Then generate the .lang
files for the locales using cron with minimal task priority.
No mistakes in quotation, no formatting breaks, no extra manual job, no
additional load on server for every single translation update, no
additional pages with lists, no wiki explanations, just tell the
translators the name of the directory in SVN and the single pot file
(generated by cron).
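
The extraction step described above can be sketched roughly as follows. This
is only an illustration in Python, not an existing Mageia script: the function
names are hypothetical, and it assumes the first two lines of a .lang file are
the ones to drop and that every source string sits on a line starting with ";".

```python
def lang_to_sources(lang_text, header_lines=2):
    """Collect source strings (lines starting with ';') from .lang text.

    Assumption: the first `header_lines` lines are a header to be dropped.
    """
    seen = set()
    sources = []
    for line in lang_text.splitlines()[header_lines:]:
        if line.startswith(";"):
            source = line[1:].strip()
            # Keep each source string once, in order of first appearance
            if source and source not in seen:
                seen.add(source)
                sources.append(source)
    return sources


def escape_po(text):
    """Escape backslashes and double quotes for a PO/POT string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def render_pot(sources):
    """Render a minimal POT template: a header entry plus one msgid each."""
    blocks = ['msgid ""\nmsgstr ""\n"Content-Type: text/plain; charset=UTF-8\\n"\n']
    for source in sources:
        blocks.append('msgid "%s"\nmsgstr ""\n' % escape_po(source))
    return "\n".join(blocks)
```

A cron job could run this over the English files and commit the resulting
template; going back from translated .po files to .lang would be the reverse
loop.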
I could have migrated to .po files nonetheless and used a library
reimplementing gettext over it. But frankly, the past struggles with
gettext (in a Web context - it does its job perfectly in a local app
context) made me want to try something else, lighter (so less featured
as well, obviously). I know Mozilla used to have .lang files too; I
checked it out a bit and thought it would be worth a try, and
perhaps we would be able to reuse their own libs in this regard.
2. Consider someone decided to change something on a webpage. How can we
(translators) know about the changes (Manually parse svn diffs, copy and
paste diffs from PHP page? What about fuzzy matches?) ?
First, someone would be identified - I commit about 95% of what goes
on www, the other 5 percent are between obgr and dams. That's for the
code, content, layout, etc. (I'm not happy with this either). I quickly
wrote report and diff utilities so you can have a look (and we can
rework them, just tell me how), but I expect to take a sneak peek at
Mozilla's codebase.
Thanks. Great work, imho.
But with all due respect, Mozilla (currently) has some language teams with
more members than all Mageia translators combined. They can afford to
waste their resources, at least in the current state.
So, when content is modified or added on a webpage, that would happen in
English only (that's the pivot language). The string will need to be
extracted from the source and injected into the lang file, and the diff
will then appear on the report page. No need for anyone to play with
PHP code, you will only edit the .lang files.
That's what I see now:
украї́нська мо́ва (wrong, by the way. It should be "Українська". I wrote
to Oliver about this, but nobody cares :'( ):
3 / 95 100% (looks like a cipher, 3 files, 95 messages, I guess)
2 untranslated (heh? ";page_title" and ";Mageia 2". Penalty for
copy-pasting diff to the .lang, I guess ;) )
Français (fr):
3 / 93 98%
7 missing 8 untranslated (15/98*100 + 98% = 113.3%)
looking further:
OK+1 (must be bonus points for 13.3% ;) )
And there won't be fuzzy matches. The string will be blankly added, or
removed.
Very sad. It's not a big problem to translate even the whole of KDE or
GNOME with their docs and wikis (believe me, at least about KDE ;) ). The
problem is to keep them translated.
The Rosetta situation, where a whole string is discarded over one comma or
one space, is rather unacceptable without at least a minimal translation
memory.
P.S. Written while struggling to work out where to place the new strings
from http://www.mageia.org/langs/report.php
I am thinking about leaving this untranslated in the future.
? didn't understand. What was the issue?
There are no big issues now. But it will end badly for the following
reasons:
1. Copy-pasting from the report page can break (and will break) the
sequence of messages, making a total mess of outdated/new/existing
translations.
2. No translation memory, hence no uniform translations and no time saved
when something has already been translated earlier.
3. Fragmentation with minimal scaling. What is the future? A directory with
dozens of files from different releases (2.uk.lang, 2-1.uk.lang,
3.uk.lang, 3beta.uk.lang, etc.)? Dozens of directories for different
releases? A table with 70 language columns and dozens of rows?
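
To illustrate what point 2 is asking for, a very small translation memory
can be sketched with Python's standard difflib module. This is only an
assumption-laden illustration: the function name and the 0.8 similarity
cutoff are hypothetical choices, and real PO tools do this in a more
refined way.

```python
import difflib


def suggest(new_source, memory, cutoff=0.8):
    """Return (old_source, old_translation) for the closest match, or None.

    `memory` maps previously translated source strings to translations.
    The cutoff is an arbitrary similarity ratio chosen for this sketch.
    """
    matches = difflib.get_close_matches(new_source, list(memory), n=1, cutoff=cutoff)
    if not matches:
        return None
    return matches[0], memory[matches[0]]
```

With something like this, a source string changed by one comma or one space
would come back with its old translation as a suggestion instead of being
blankly dropped.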
Note that I'm all for improving and easing the translation process for you
- but being alone, I do it based on what my experience has been as well -
that also explains why I chose a dead boring, simple, homemade
framework for the Web site, and not an existing CMS - the situation
wouldn't be the same with an active 3 or 4-seat Web team.
And again, tell me if I'm missing something or if something can be
improved for you.
Ok.
1. If, for lack of manpower, nothing can be done about the
above-mentioned issues, can we have an (automatic) RSS/Atom feed for the
report page to automate the updating process, or automatic messages to
this list (with no need to send warnings manually) whenever English
strings are changed or added?
2. If it is not hard to do, can the .lang files be regenerated
automatically (keeping the order of the current English pivot)?
Identifying untranslated strings can be harder, but at least it will
keep the logical order of the messages.
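
As a rough sketch of request 2 (in Python, with hypothetical function names,
and assuming each ";" source line is followed by its translation on the next
line), the regeneration could look like this:

```python
def parse_lang(text):
    """Map each source string (';' line) to the line that follows it."""
    translations = {}
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if line.startswith(";"):
            source = line[1:].strip()
            nxt = lines[i + 1].strip() if i + 1 < len(lines) else ""
            # A following ';' line means no translation was provided
            translations[source] = "" if nxt.startswith(";") else nxt
    return translations


def regenerate(pivot_sources, old_text):
    """Rebuild a locale file in pivot order, reusing existing translations.

    Returns the new file text and the list of sources with no translation.
    """
    old = parse_lang(old_text)
    entries = []
    missing = []
    for source in pivot_sources:
        target = old.get(source, "")
        if not target:
            missing.append(source)
        entries.append(";%s\n%s\n" % (source, target))
    return "\n".join(entries), missing
```

Strings that disappeared from the English pivot are simply not written out,
and new ones show up with an empty translation line in the right place.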
Thanks for your answer and efforts to improve the current workflow.
Best regards,
Yuri