On Friday 26 October 2007 16:54, you wrote:
ET Hi, everyone,
ET
ET In response to Xavier Alvarez' request on 10/25 for
ET translators and coordinators, I decided to get off the
ET sidelines and take a look at OLPC's new Pootle-based L10N
ET infrastructure.
ET
ET Here are a few things I noticed which I think will be of
ET general interest and concern:
ET
ET (0) CASING/NAMING OF PO FILES PROBLEM:
The 'rule' is quite simple (but not necessarily as intuitive as
may be expected): given that we are bundling several d.l.o
projects into pootle-projects, we need to ensure that we never
have (or at least minimize the possibility of having) 2 POT files
with the same name.
Solution? We prefix whatever filename is used for the POT in d.l.o
with the name of its project...
    journal-activity.Journal.po
    [d.l.o project].[POT filename].po
Thus, any 'inconsistencies' are really the product of other
inconsistencies... they just happen to be more evident (and ugly)
within Pootle.
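To make the rule concrete, here is a rough Python sketch of the naming scheme described above. The function name and its parameters are made up for illustration; this is not the script actually used to populate Pootle.

```python
def pootle_po_name(dlo_project: str, pot_filename: str) -> str:
    """Build a Pootle PO filename as <d.l.o project>.<POT base name>.po.

    Hypothetical helper: it only illustrates the prefixing rule.
    The casing of the POT name is kept exactly as found in d.l.o.
    """
    base = pot_filename
    if base.endswith(".pot"):
        base = base[: -len(".pot")]  # drop the .pot extension
    return f"{dlo_project}.{base}.po"


# Journal.pot in project journal-activity keeps its upper-case J,
# write.pot in project write keeps its lower-case w.
print(pootle_po_name("journal-activity", "Journal.pot"))
print(pootle_po_name("write", "write.pot"))
```

Since the casing is copied straight from the POT name in d.l.o, any mismatch there shows up unchanged in Pootle.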
ET
ET (Upper/Lower) Casing of names of po files is
ET inconsistent: For example, in Core there is
ET journal-activity.Journal.po with upper case J for
ET the 2nd occurrence of Journal but then why isn't
ET write.write.po written write.Write.po?
ET
ET This is a small point, but consistent and intuitive
ET naming of these PO files will help everyone. Or am I just
ET failing to understand or intuit what the pattern is supposed
ET to be here?
ET
ET (1) INCONSISTENT NUMBER OF MSGIDs ACROSS DIFFERENT
ET LANGUAGES:
Yes and no.
The numbers shown in the statistics do not represent the quantity
of MSGIDs but of WORDS in the file. So I presume that for
untranslated strings it counts the MSGID words, and for translated
strings, the MSGSTR words. Thus two languages with everything
translated and up to date may still show different numbers
(although conceptually they are the same). BTW, it does show the
number of strings at other 'statistic levels'.
Yes, I was quite baffled too... translators are more worried about
the word-count than 'lines of code'... ;)
In http://solar.laptop.org:5080/projects/xo_core/
Language             Trans.     Fuzzy    Untrans.   Total
Portuguese (Brazil)  162 42%    4 1%     213 56%    379
Spanish              219 62%    0 0%     132 37%    351
While in each language+project
[pt_BR] 8 files, 162/379 words (42%) translated [118/247 strings]
[es]    8 files, 219/351 words (62%) translated [157/234 strings]
Note that even so, there's still a difference in the number of
strings... see below.
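A small Python sketch of the counting I presume is going on (MSGSTR words for translated strings, MSGID words for untranslated ones). This is only my guess at the behaviour, not Pootle's actual code, and the sample entries are made up.

```python
def word_stats(entries):
    """Count words and strings for a list of (msgid, msgstr) pairs.

    Assumption: translated entries contribute their msgstr words,
    untranslated ones (empty msgstr) contribute their msgid words.
    """
    translated_words = total_words = translated_strings = 0
    for msgid, msgstr in entries:
        if msgstr:
            words = len(msgstr.split())
            translated_words += words
            translated_strings += 1
        else:
            words = len(msgid.split())
        total_words += words
    return {
        "translated_words": translated_words,
        "total_words": total_words,
        "translated_strings": translated_strings,
        "total_strings": len(entries),
    }


# Same two msgids, fully translated in both (made-up) catalogs.
lang_a = [("Save file", "Guardar el archivo"), ("Quit", "Salir")]
lang_b = [("Save file", "Enregistrer"), ("Quit", "Quitter")]
print(word_stats(lang_a))  # 4 words, 2 strings
print(word_stats(lang_b))  # 2 words, 2 strings
```

Under that assumption the word totals differ between languages even when the string counts match, so only a difference in the string counts points to a real mismatch.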
ET
ET The other day when I looked at write.write.po for
ET French, there were only 10 messages in the catalog. Today, I
ET see that there are 36 messages which looks a lot closer to
ET what I myself get from xgettext toolbar.py on the latest
ET code.
ET However, when I checked write.write.po for Thai today,
ET I see that it still has only 10 messages.
ET
ET Solution (Or at least A Question Posing As A Possible
ET Solution):
ET
ET Does everyone agree that there needs to be a way that
ET all of the .po files for all languages get updated with the
ET latest messages extracted via xgettext from the latest
ET codebase (toolbar.py, etc.)?
Yes, there's a problem. Reviewing what you've noted, the problem
appears to be a mix of things. Just for the record, we are
sticking to the POT files found in d.l.o git (not fedora).
1) the POT in dlo only has 9 strings
http://dev.laptop.org/git?p=projects/write;a=blob_plain;f=po/write.pot;hb=HEAD
2) the POT creation dates have probably been tampered with
externally, so it's impossible to determine which one makes sense
without going into the source code:
FR.PO    POT-Creation-Date: 2007-06-21 17:33+0200\n
DLO POT  POT-Creation-Date: 2007-06-21 17:33+0200\n
I personally believe that developers should generate the POT file
and make sure that it's in d.l.o git.
Overall, I find these inconsistencies a direct result of the messy
flow we've had with t.fp.o. As a matter of fact, I've been trying
to process the tickets in d.l.o holding PO submissions and things
haven't been very nice. The current situation is:
0) only some projects have been injected into Pootle
(core and bundled activities, with few exceptions like Etoys)
1) d.l.o POT files are being considered the standard
2) d.l.o PO files have been injected but not fully verified
2.1) many have lost their (UTF-8) encoding
2.2) many PO files seem not to correspond to their POT (1)
3) tickets (submitting PO files) seem to have the issues noted in (2)
On top of that, some of the quirks and particularities of the
tools do seem to get in the way, but I think that most stem from
the fact that we don't have a 'base' POT population.
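For point (2.2), one rough way to check whether a PO file still corresponds to its POT is to compare their sets of msgids. A minimal Python sketch follows, assuming simple catalogs (no msgctxt, no plural forms, no unescaping); the function names are hypothetical and the parser is deliberately naive, so treat it as an illustration rather than a replacement for the gettext tools.

```python
def msgids(po_text):
    """Return the set of non-empty msgids found in PO/POT text."""
    ids = set()
    current = None  # pieces of the msgid being read, or None
    for raw in po_text.splitlines():
        line = raw.strip()
        if line.startswith("msgid "):
            # Strip the keyword and the surrounding double quotes.
            current = [line[len("msgid "):].strip()[1:-1]]
        elif line.startswith('"') and current is not None:
            current.append(line[1:-1])  # continuation line of the msgid
        elif current is not None:
            ids.add("".join(current))   # msgid ended (msgstr, blank, ...)
            current = None
    if current is not None:
        ids.add("".join(current))
    ids.discard("")  # the header entry has an empty msgid
    return ids


def compare_po_to_pot(pot_text, po_text):
    """Return (missing, stale): msgids only in the POT, only in the PO."""
    pot_ids, po_ids = msgids(pot_text), msgids(po_text)
    return pot_ids - po_ids, po_ids - pot_ids


# Made-up sample catalogs, just to exercise the functions.
SAMPLE_POT = '''
msgid ""
msgstr ""

msgid "Bold"
msgstr ""

msgid "Italic"
msgstr ""
'''

SAMPLE_PO = '''
msgid ""
msgstr ""

msgid "Bold"
msgstr "Gras"

msgid "Underline"
msgstr "Souligner"
'''

missing, stale = compare_po_to_pot(SAMPLE_POT, SAMPLE_PO)
print("missing from PO:", sorted(missing))
print("not in POT:", sorted(stale))
```

A PO with both lists empty matches its POT; anything else suggests it was built from a different POT.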
Still working on it,
Xavier
PS: The issue regarding lists is an interesting one that I think
may be much broader than the XO... :)
...snip...
ET
ET Questions, suggestions, ideas, etc. are all welcome!
ET
ET
ET Cheers,
ET Xavier
ET
ET