Re: [Design] Proposal: have an en-US localization egg

Philippe Bossut Fri, 22 Sep 2006 22:29:57 -0700

Hi,

Heikki Toivonen wrote:

Philippe Bossut wrote:

I'd like Chandler to allow the use of a translation egg for English
(yes, American English aka en-US). The string in the code would be the
default "geek speak English", usable but could differ from the final
reference. The proper English strings would be delivered via the message
factory.


I am not sure I understand fully what you propose. Do you mean we would
extract all strings from Chandler and create an exact duplicate of the
strings in en_US locale egg? Or do you propose we change Chandler code
to use some other kinds of identifiers so that some locale egg is
required to see any text in the UI?

I'm not proposing to change Chandler code at all. I'm proposing that,
when running in locale en-US, Chandler loads the relevant localization
file if present, as with any other locale. Actually, it is entirely
possible that this works already (I'll check and log a bug if it
doesn't). What I'm proposing is that we *do have* such a po file
somewhere in our (TBD) localization svn repository, that we (coders and
PPD) remember that we have one and that we ship it along with the en-US
version.

- Issue1: Minute changes in strings in the code end up breaking all
localized eggs
e.g. Changing "Today" to "Today's events" last week broke my French
translation and since "Aujourd'hui" is already longish, I'm not planning
to change this to anything else. It's a minor pain today but imagine
breaking 87 localized languages with a last minute change like that...
painful...


I don't see how having an egg for en_US would help.

Imagine having a translation file for en-US. If someone wants to modify
the string "Today" to "Today's events", this can be done in the po. The
python doesn't need to change and therefore the other localized po files
don't need to be changed. Currently, the string is changed in the python
code and all localized versions now show the English "Today's events"
instead of their relevant translation. Silly, especially if that can be
easily avoided.

Of course it makes sense to do a change in the python code if the
*semantics* of the string or code are changed (e.g. if we had changed
"Today" to "This week", since the info displayed would have been
completely different). But in that particular instance, it was not the
case.

If it makes things easier to understand, imagine a change which is about
fixing a typo in English. Should that break all the localized files? I
don't think so (after all, fixing typos in localized files doesn't break
English, so, ironically, localizers have more slack to do last minute
changes than we do in English...).
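To make the mechanism concrete, here is a minimal sketch of the lookup
described above. It is not Chandler's message factory: a plain dict
stands in for each compiled po file, and the class and variable names
are made up for illustration. The point is only that the string in the
code acts as a stable key while each catalog, en-US included, decides
what is displayed.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Toy catalog: a dict stands in for a compiled po file (assumption)."""

    def __init__(self, catalog):
        super().__init__()
        self._catalog = catalog

    def gettext(self, message):
        # Fall back to the string in the code when the catalog has no entry.
        return self._catalog.get(message, message)


# The string in the Python code stays "Today" and is never edited.
MSGID = "Today"

# Hypothetical catalogs: the wording change lives only in the en-US one.
en_us = DictTranslations({"Today": "Today's events"})
fr = DictTranslations({"Today": "Aujourd'hui"})
no_catalog = gettext.NullTranslations()


def label(translations):
    """Return the text shown in the UI for the unchanged msgid."""
    return translations.gettext(MSGID)


print(label(en_us))
print(label(fr))
print(label(no_catalog))
```

With this arrangement the French entry keeps matching because its key
never changed, and running without any catalog still shows the usable
default from the code.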

What would help, and is actually my preference, is that once we start
getting serious about trying to ship multiple language versions of
Chandler simultaneously we add an Internationalization/Localization
Freeze to the process. This would be a signal to developers to stop
making any changes that affect localization, and signal localizers that
it would be pretty safe to start working on the localization without
fear of strings changing underneath before the release. This is how
Mozilla does things.

+1 It's a good idea to have such a localization freeze. I certainly
don't want to redo the localization several times if I can avoid it, and
starting localization now makes no sense (except to find bugs as Markku
and I did...).

However, this won't prevent last minute changes in English from
happening, e.g., fixing typos. What happens at Mozilla when a typo is
discovered in English at the last minute? Do people balance the pros and
cons of "shipping an ugly typo in English" vs. "annoying the rest of the
87 localizers with a last minute change"? (Actually, you answered that
one later: since Mozilla uses an en_US file, they just don't have this
problem.) What I propose would avoid this entirely (with the same
solution as Mozilla, as it happens).

Of course, we certainly *can* have a super tight process that would
require review of all strings before localization freeze and careful
control of string changes after freeze, etc... But why add process when
we can solve the problem with code and no process?

However, since we are nowhere near the stage (AFAIK) where we want to
ship simultaneous language versions I think even this whole discussion
is almost moot at this point.

I'm talking about this while I'm going through the issue myself and
before it fades away and/or comes up as an idea too late. I'm bringing
it to the list because it's pointless if this file exists but no one
knows about it and people continue to change strings in python code when
there's another solution.

I'm not advocating though that we leave ugly, badly typed strings in
Python just because. What we could agree on is that *after localization
freeze*, changes to the English strings in Python would be forbidden.
All fixes to strings in English will have to be done in the en-US po
file. That would give us some slack as far as checking the strings is
concerned, finishing the Welcome message and starting the localization
process. Does this make more sense?
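As an illustration only, the "no changes to strings in Python after
freeze" rule could be checked mechanically by comparing the msgids
extracted at freeze time with the current ones. The function name and
the sample lists below are hypothetical, and how the msgids get
extracted from the code is left out.

```python
def msgid_changes(frozen_ids, current_ids):
    """Return (removed, added) msgids relative to the frozen snapshot."""
    frozen, current = set(frozen_ids), set(current_ids)
    return sorted(frozen - current), sorted(current - frozen)


# Hypothetical snapshots of the msgids found in the Python code.
frozen = ["Today", "to", "Welcome"]
current = ["Today's events", "to", "Welcome"]

removed, added = msgid_changes(frozen, current)
if removed or added:
    # A removed msgid is one that every localized po file still keys on.
    print("msgids changed after freeze:", removed, "->", added)
```

Any msgid reported as removed is a string whose fix should have gone
into the en-US po file instead of the code.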

I would say that localizers who do not want strings to change under them
should start the localization work AFTER we have made the release. Given
that it is only going to take a day or so to do a localization from
scratch this could still mean many localized versions available in a
very short period of time after a release.

Sure, still, last minute wording changes do happen and are sometimes
required. Case in point: the "Welcome" message.

- Issue2: Some strings are identical in English but different in other
languages.
e.g. "to" in the DV is used twice but would actually need 2 different
strings in French because of the different contexts. Allowing the code
to add context and have "to(in)" for interval time and "to(date)" for
final date would allow me to have "au" and "jusqu'au" separately. Bryan
Stearns showed me another way of doing this though I just can't remember
what the trick was.
(Note that allowing dupes in the po and have 2 or more entries for "to"
wouldn't help me much since I would lack the context to understand which
"to" is what...)


I think this is an issue even with context in the strings. From my
experience translating Mozilla a way back I could not figure out from
the strings where they were actually going to be shown in the UI, which
is the context that really matters. I had to translate, run the app, and
correct places where I had chosen a wrong term.

Sure, but when you have one string and one only while you really, really
need 2 (or more) in your own language, you're screwed no matter what.
It's a fact that I need 2 strings to translate the various "to"
correctly in French. Are you suggesting it doesn't matter? If not, what
is your suggestion to solve the problem?
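For concreteness, here is a minimal sketch of what a context-qualified
lookup could look like. It is not the trick Bryan showed, and it is not
Chandler's API: a dict keyed by (context, msgid) stands in for the po
file, and the context names "interval" and "date" are made up. Standard
gettext expresses the same idea with a msgctxt entry in the po file.

```python
import gettext


class ContextCatalog(gettext.NullTranslations):
    """Toy catalog keyed by (context, msgid); a stand-in for a po file."""

    def __init__(self, catalog):
        super().__init__()
        self._catalog = catalog

    def pgettext(self, context, message):
        # Unknown (context, msgid) pairs fall back to the English string.
        return self._catalog.get((context, message), message)


# Hypothetical French catalog: one English "to", two French strings.
fr = ContextCatalog({
    ("interval", "to"): "au",
    ("date", "to"): "jusqu'au",
})

interval_to = fr.pgettext("interval", "to")
date_to = fr.pgettext("date", "to")
unknown_to = fr.pgettext("other", "to")

print(interval_to, "/", date_to, "/", unknown_to)
```

The English text stays a plain "to" in both places. Only the lookup key
differs, which is what gives the localizer two separate entries along
with a hint about where each one is used.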

Note that Mozilla requires a 'localization' even for en_US because all
strings are specified using ids which are then looked up from the
localization files when the app is running.

Aha! That explains why they don't run into the "last minute evil typo"
issue... Seems to make my point that having an en-US localized file is a
good idea (though in Mozilla's case it seems they just didn't have a
choice).

Beyond not breaking localized eggs, this will allow easy fix of typos
and allow non devs to polish the UI wording without interfering with the
work of localizers.


I grant that this would be a benefit, but I don't see this as a big
problem.

I think you would if you had to redo a job because of one typo in
another language. Both Markku and I pointed out several typos and
capitalization inconsistencies when doing our translations. Sure, we can
fix them now but we'll miss some and we'll introduce new ones. FWIW,
I've seen last minute string changes in *every* single product I've ever
been involved with. I can't believe Chandler will be an exception.

Again, the solution I propose would remove the dependency between fixing
those in English and starting localizations in other languages. How can
that be bad?

One *very* nice side effect for localizers and PPD alike is that changes
to the "Welcome" message will be possible till the last minute without
requiring an impossible synchronization between people scattered around
the planet. Actually, we could (should) even remove this lengthy
message entirely from the .py, replace it in there by _(u"Here goes the
long welcome message") and have this message live only in the en-US po,
making it possible for Pieter et al. to fiddle with it till the last
darn second without interfering with the main svn tree.


I'd say this to be something that must be completed before i18n/l10n
freeze if we go that route. Now I just say: if this is a concern, wait
until after the release to translate this.

What's wrong with having a solution which doesn't have this constraint?
Especially if that solution has no implementation cost? Why add
processes and dependencies when we can avoid them?

I am a little surprised by you making it sound like a big problem that
your translation broke. Do the tools allow you to easily translate the
changed parts, or do you have to start over from scratch? I sincerely
hope the former, which was also the case with Mozilla (Mozilla even
needed custom localization tools developed).

It's not a big problem yet, sure. I'm just projecting myself into the
0.7 Preview time frame and beyond, when it will become one. It's one
thing to annoy a staff member with extra work (although this has a
cost...), it's another to pester an army of volunteers, especially when
the problem can be easily and elegantly avoided altogether. And I'm not
claiming my idea is a stroke of genius BTW. I'm quite sure those guys
will suggest it to us the very first time we break their po file with
one minor change. It's a very simple and natural idea so, why not make
sure it works and agree about it now?


Cheers,
- Philippe
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

Open Source Applications Foundation "Design" mailing list
http://lists.osafoundation.org/mailman/listinfo/design
